Data visualization example using the pivot table and the stacked bar chart

In this example, you will use publicly available data on airline on-time statistics and delay causes to demonstrate two more excellent tools for data visualization: the pivot table and the stacked bar chart. The data is available from the website of the U.S. Department of Transportation's Bureau of Transportation Statistics. The topic is familiar to anyone who travels by air, and the data groups flight delays into several categories:

  • Delays caused by the airlines, such as late-arriving flight crews
  • Delays caused by airport security
  • Delays caused by weather
  • Delays caused by the National Aviation System (NAS)

The data includes both the number of occurrences of each type of delay and the actual delay time. For this example, you will examine only the number of occurrences. The website lets you specify a time period; here you will examine data for 2013, grouped by month.

The pivot table is similar to the bubble chart in that it allows a large number of attributes to be viewed with one component. The pivot table is purely tabular and allows data for a time period with a number of categories to be viewed. Figure 1 shows how to set up the pivot table.

FIGURE 1. Pivot table configuration

As you can see in Figure 1, you are examining data for each airline on a per-month basis. The metrics are the numbers of occurrences of the four types of delay discussed earlier. Notice the button labeled Swap Rows and Columns underneath the tooltip. With this button you can make the airlines your columns and make the months and delay data your rows. This capability is the reason the table is called a pivot table. Figure 2 shows the pivot table. Note that the months are shown as numbers: 1 for January, 2 for February, and so on.

FIGURE 2. Pivot table for airline delay data
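The effect of the Swap Rows and Columns button can be illustrated outside the tool. The following is a minimal sketch in plain Python, not the component's actual implementation. The records, the carrier names, and the function names `pivot` and `swap_rows_and_columns` are all hypothetical and made up for illustration; they are not taken from the Bureau of Transportation Statistics data.

```python
from collections import defaultdict

# Made-up illustrative records: (carrier, month, delay_type, occurrences).
# These are NOT real figures from the delay data set.
RECORDS = [
    ("Carrier A", 1, "carrier", 120),
    ("Carrier A", 1, "weather", 10),
    ("Carrier A", 2, "carrier", 90),
    ("Carrier B", 1, "carrier", 200),
    ("Carrier B", 1, "weather", 25),
    ("Carrier B", 2, "carrier", 150),
]


def pivot(records):
    """Build a table whose rows are carriers and whose columns are
    (month, delay_type) pairs; each cell holds the summed occurrences."""
    table = defaultdict(lambda: defaultdict(int))
    for carrier, month, delay_type, count in records:
        table[carrier][(month, delay_type)] += count
    return {row: dict(cols) for row, cols in table.items()}


def swap_rows_and_columns(table):
    """Transpose a nested-dict table: column keys become row keys."""
    swapped = defaultdict(dict)
    for row_key, cols in table.items():
        for col_key, value in cols.items():
            swapped[col_key][row_key] = value
    return dict(swapped)


by_carrier = pivot(RECORDS)                    # rows: carriers
by_month = swap_rows_and_columns(by_carrier)   # rows: (month, delay type)
```

Swapping twice returns the original layout, which is why the same data can be viewed either way without loss.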

Figure 3 shows the same table “pivoted.” Note that this view shows more data on one screen. Also note that the data is summed for each category of delay at the bottom of the pivot table. You enable this feature using the View Options menu in the upper-right corner of the pivot table. The pivot table enables large amounts of data to be viewed. Every part of the pivot table is selectable, including row and column headings, as well as the data.

FIGURE 3. “Pivoted” flight delay table with summaries
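The summary row at the bottom of the table amounts to a sum over each column. A small sketch of that idea follows, under the assumption that the table is held as a nested dictionary. The counts, the carrier names, and the helper `add_totals` are hypothetical and serve only as an illustration.

```python
def add_totals(table, label="Total"):
    """Return a copy of a nested-dict table with one extra row
    that holds the sum of every column."""
    totals = {}
    for cols in table.values():
        for col_key, value in cols.items():
            totals[col_key] = totals.get(col_key, 0) + value
    result = {row: dict(cols) for row, cols in table.items()}
    result[label] = totals
    return result


# Made-up counts for two hypothetical carriers (not real data).
table = {
    "Carrier A": {"carrier": 210, "security": 2, "weather": 10, "nas": 80},
    "Carrier B": {"carrier": 350, "security": 3, "weather": 25, "nas": 140},
}

with_totals = add_totals(table)
```

The original table is left unchanged, so the totals can be switched on or off without recomputing the underlying cells.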

The pivot table is useful for displaying large numbers of attribute values when you need to view the actual values, as in a spreadsheet, and compute sums over the data. It is also useful for viewing the actual values of attributes after refinements have been set on other, more visual controls, such as the map component or a graph. You can export the pivot table to a CSV file; this feature is available from the Actions menu in the upper-right corner of the component. To view more rows in the pivot table, set the number of rows to a higher value, as shown in Figure 4, where the value is set to 45.

FIGURE 4. Table height for pivot table
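For readers who want to reproduce a CSV export of tabular data outside the tool, the following sketch uses Python's standard `csv` module. The exact layout of the file the component produces is not described here, so the column order, the `airline` header, and the helper `table_to_csv` are assumptions, and the counts are made up.

```python
import csv
import io


def table_to_csv(table, columns, row_header="airline"):
    """Serialize a nested-dict table to CSV text, one line per row key.
    Missing cells are written as 0."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([row_header, *columns])
    for row_key, cols in table.items():
        writer.writerow([row_key, *(cols.get(c, 0) for c in columns)])
    return buffer.getvalue()


# Made-up counts for two hypothetical carriers (not real data).
table = {
    "Carrier A": {"carrier": 210, "weather": 10},
    "Carrier B": {"carrier": 350, "weather": 25},
}

csv_text = table_to_csv(table, ["carrier", "weather"])
```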

Now that we have covered the pivot table for the airline data, you will examine one final visualization component, the stacked bar graph. The stacked bar graph can also depict all four types of delays. For this example, we have chosen Percentage Stacked Bars, which displays a single uniform vertical bar for each group dimension value, with a proportional section showing the percentage for each chart series. Figure 5 shows the delay data per airline.

FIGURE 5. Stacked bar graph for airline delay data
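A percentage stacked bar normalizes each group so that its sections add up to 100 percent. The arithmetic behind each bar can be sketched as follows. The counts and carrier names are made up for illustration, and `percentage_shares` is a hypothetical helper, not part of the charting component.

```python
def percentage_shares(counts):
    """Convert raw occurrence counts for one group into percentages
    that sum to 100 (or all zeros when the group has no occurrences)."""
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: 100.0 * value / total for key, value in counts.items()}


# Made-up delay counts per hypothetical carrier (not real data).
delays = {
    "Carrier A": {"carrier": 50, "security": 1, "weather": 9, "nas": 40},
    "Carrier B": {"carrier": 30, "security": 0, "weather": 10, "nas": 10},
}

# One bar per carrier; each section's height is that delay type's share.
shares = {name: percentage_shares(counts) for name, counts in delays.items()}
```

Because every bar has the same height, this view makes proportions easy to compare across airlines, at the cost of hiding the absolute number of delays.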

Examining this chart reveals that most airlines have similar proportions for each type of delay. Hawaiian Airlines has a low occurrence of NAS delays, which are defined as “delays and cancellations attributable to the National Aviation System that refer to a broad set of conditions, such as nonextreme weather conditions, airport operations, heavy traffic volume, and air traffic control.” Since Hawaii is far removed from the U.S. mainland and is two to three hours behind U.S. Pacific Time (depending on daylight saving time), Hawaiian Airlines is far less likely to be affected by such delays. What is most fascinating about this graph is the relative insignificance of weather, which is counterintuitive for most of us.
