For the first example of FDIC failed banks data, you obtained data from the data.gov repository. The FDIC web site also offers data for download with additional information, so you will use it in the second example. We will not review the process of creating the Endeca Studio application for this example, except to revisit the advanced features on the attribute review page during application creation. The data set available from the FDIC is more comprehensive than the one available from data.gov. It includes dollar amounts associated with the failures and the insurance fund used to pay depositors. The data covers the entire history of the FDIC, going back to 1934. It also indicates the type of failure, that is, whether the intervention was only assistance or a full failure of the bank.
The data is not without its issues. There is a column called Location that holds a city and state in a single field, separated by a comma. To address this issue, you can use a feature on the advanced attribute review page, namely, the split feature; this tells the provisioning service to split each value at the comma and store the pieces as separate values of the same attribute. The result is that the Location attribute has values for both state and city, as shown in Figure 1.
FIGURE 1. Splitting a value into multiple values
Once the application is created, both the city and state names appear as values for the attribute Location. Figure 2 shows how these values appear in the available refinements component.
FIGURE 2. Location data with split attributes
This example illustrates the “split value into multiple values” utility, which is part of the advanced attribute review functions. For this data, however, a better approach would be to separate city and state into their own columns prior to loading the Microsoft Excel spreadsheet into Endeca Studio.
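Separating city and state before loading can be done in many ways. The following is a minimal Python sketch, assuming each Location value has the form "City, ST"; the function name split_location, the new City and State keys, and the sample rows are hypothetical and are not part of Endeca Studio or the FDIC file.

```python
def split_location(location):
    """Split a 'City, ST' string into a (city, state) pair."""
    # rpartition splits at the LAST comma, so a city name that itself
    # contains a comma is kept whole.
    city, sep, state = location.rpartition(",")
    if not sep:
        # No comma found: keep the text as the city, leave the state empty.
        return location.strip(), ""
    return city.strip(), state.strip()


# Made-up sample rows, standing in for rows read from the spreadsheet.
rows = [
    {"Location": "Springfield, IL"},
    {"Location": "Austin, TX"},
]

for row in rows:
    row["City"], row["State"] = split_location(row["Location"])
```

With the two pieces stored in their own columns, city and state would load as two distinct attributes rather than as mixed values of a single Location attribute.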
Advanced Visualization of the FDIC Data
This new set of FDIC data contains information on the loss resulting from each bank failure and the total deposits at each failed bank. It also contains data from the start of the FDIC in 1934. Endeca Studio provides a data visualization tool that offers a unique view of this data, as shown in Figure 3.
FIGURE 3. Bubble chart with loss versus total deposits with year ordering
This type of graph is known as a bubble chart. The bubble chart is one of the best ways to visualize many aspects of data simultaneously. Figure 3 plots estimated loss versus total deposits for each year. The diameter of each bubble is determined by the amount of the estimated loss. The legend at the right of the chart lists the years in decreasing order of estimated loss. As you can see, 1989 saw the largest total losses because of bank failures. That was the year in which the most banks failed, owing to the savings and loan debacle of that period. The year 2009 is a close second in terms of estimated losses, but the banks that failed during that year had about six times the total deposits. The Detail drop-down menu allows you to further refine the chart by selecting the attribute Institution Name or State, as shown in Figure 4.
FIGURE 4. Bubble chart with loss versus total deposits with institution name refinement
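The grouping behind these charts can be approximated outside Endeca Studio. The following Python sketch sums estimated loss and total deposits for whichever detail key is chosen and orders the groups by loss, largest first. It is only an illustration of the idea, not how Endeca Studio computes the chart; the function name bubble_points, the field names, and the records are all made up.

```python
from collections import defaultdict


def bubble_points(records, detail_key):
    """Return (group, total_loss, total_deposits) tuples, largest loss first."""
    totals = defaultdict(lambda: {"loss": 0.0, "deposits": 0.0})
    for rec in records:
        group = totals[rec[detail_key]]
        group["loss"] += rec["estimated_loss"]
        group["deposits"] += rec["total_deposits"]
    points = [(key, v["loss"], v["deposits"]) for key, v in totals.items()]
    # Order by total loss, descending, the way the chart legend is ordered.
    points.sort(key=lambda p: p[1], reverse=True)
    return points


# Made-up records; the amounts are not FDIC figures.
records = [
    {"year": 2001, "institution": "Example Bank A",
     "estimated_loss": 10.0, "total_deposits": 100.0},
    {"year": 2001, "institution": "Example Bank B",
     "estimated_loss": 5.0, "total_deposits": 40.0},
    {"year": 2002, "institution": "Example Bank C",
     "estimated_loss": 12.0, "total_deposits": 300.0},
]

by_year = bubble_points(records, "year")
# [(2001, 15.0, 140.0), (2002, 12.0, 300.0)]

by_institution = bubble_points(records, "institution")
# Example Bank C comes first, then Example Bank A, then Example Bank B.
```

Switching detail_key from "year" to "institution" plays the role of the Detail drop-down menu: the same records are regrouped, so each bubble stands for one institution instead of one year.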
In Figure 4, the size of each bubble now represents the size of the failure for an individual institution. The largest single failure was for Indymac Bank FSB in 2008, and this bank also had one of the largest amounts of total deposits. As you can see from the legend, the top three individual institution failures occurred from 2008 to 2010, a period you will probably remember as the most severe economic downturn since the Great Depression. Figure 5 shows how this same chart appears when State is selected in the Detail drop-down menu. You can quickly discern from the tooltip on the largest bubble that Texas had the most losses that year.
FIGURE 5. Bubble chart with loss versus total deposits with state refinement
If you click the year shown in the legend, 1989, you see the bubble chart shown in Figure 6, a detail of the largest bank failures of 1989. You can see from the tooltip that California was second to Texas for bank failures.
FIGURE 6. Bubble chart with loss versus total deposits by state for 1989
The tag cloud shown in Figure 7 illustrates another interesting attribute in the FDIC failed banks data, namely, the insurance fund used to resolve the bank failures.
FIGURE 7. Tag cloud for FDIC insurance fund
The size of each term in the tag cloud reflects the number of records in the data set that carry that value. The FDIC fund was used most often, followed by the Resolution Trust Corporation (RTC). Let’s first select FDIC in the tag cloud and then view the bubble chart shown in Figure 8. Figure 8 shows a slightly different version of the bubble chart you have been exploring, with institutions in place of years. Hence, this bubble chart depicts the largest individual bank failures funded by the FDIC.
FIGURE 8. Bubble chart with loss versus total deposits by institution refined by FDIC
Figure 9 depicts the same bubble chart, except RTC is selected in the tag cloud instead of FDIC.
FIGURE 9. Bubble chart with loss versus total deposits by institution refined by RTC
A comparison of Figures 8 and 9 yields interesting information. You can see that the FDIC funded two large bank failures and a number of smaller bank failures. The RTC funded a large number of bank failures of similar magnitudes, and from the institution names, you can see that these banks were savings and loan institutions.
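The two steps used in this comparison, weighting the tag cloud by record count and then refining by one fund, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Endeca Studio's implementation; the function names, field names, and records are hypothetical, and the amounts are made up.

```python
from collections import Counter


def fund_counts(records):
    """Count records per insurance fund (the weight of each tag cloud term)."""
    return Counter(rec["fund"] for rec in records)


def largest_failures(records, fund, top_n=3):
    """Keep only one fund's records and return the top_n by estimated loss."""
    selected = [rec for rec in records if rec["fund"] == fund]
    selected.sort(key=lambda rec: rec["estimated_loss"], reverse=True)
    return selected[:top_n]


# Made-up records; names and amounts are placeholders, not FDIC data.
records = [
    {"institution": "Example Bank A", "fund": "FDIC", "estimated_loss": 50.0},
    {"institution": "Example Bank B", "fund": "FDIC", "estimated_loss": 5.0},
    {"institution": "Example Savings C", "fund": "RTC", "estimated_loss": 8.0},
    {"institution": "Example Bank D", "fund": "FDIC", "estimated_loss": 20.0},
]

weights = fund_counts(records)
# FDIC appears in 3 records and RTC in 1, so FDIC would be drawn larger.

top_fdic = largest_failures(records, "FDIC", top_n=2)
# Example Bank A (50.0) first, then Example Bank D (20.0).
```

Calling largest_failures with "RTC" instead of "FDIC" corresponds to changing the selection in the tag cloud, which is the only difference between Figures 8 and 9.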
The bubble chart has a remarkable capability for visualizing the FDIC data. What is more surprising is the relative ease with which you can create these bubble charts. Figure 10 depicts the configuration of the bubble chart used in Figures 8 and 9.
FIGURE 10. Bubble chart configuration
As you can see in Figure 10, configuring the bubble chart is relatively simple.