With the holiday season upon us, lots of people are making travel plans. Quite often these plans are centered around what day within the week a holiday lands on.
This began to make me wonder, what are the busiest travel days in the month of December? One of my favorite ways time series data can be visualized is by using a combination of a calendar and a heat map matrix. An example of this can be found here. So, the next question is, can something this that be built in SAS Visual Analytics? Well, as it turns out, the answer is yes! Using a custom graph I was able to build the report below:
The report above uses data from the Bureau of Transportation Statistics' Airline On-Time Performance Data. For this example report, I downloaded a subset of the available December flight data and appended the files together. The darker colored squares represent the dates in December with the most flights. Using this report, it's very easy to see not only which travel days are the busiest, but how those dates relate to what day of the week they fall on. Pretty cool! And the bonus is, in this post I'll show you how to build this type of graph in SAS Visual Analytics!
Start by creating a new custom graph using the graph builder. First, drag two Schedule Charts onto the reporting canvas and place them on top of each other:
Next go to the "Options" menu on the right. From first drop down menu select "A1". Then select the "Start each graph with its first default color" option from the "Overlaid Graph Colors" drop down list:
Now, still within the "Options" menu, select "Schedule Chart 2" from first drop down menu. Move the spacing slider to be 100% by sliding the button all the way to the right:
Also, again still within the "Options" menu, select "Schedule Chart 1" from first drop down menu. Move the spacing slider to be 0% by sliding the button all the way to the left:
Next select "X Axis" from the top drop down menu and deselect the check boxes next to "Axis label", "Axis line", "Tick values" and "Tick marks":
The last thing we need to change within the "Options" menu is under the "Left Y Axis" section. Select this section from the top drop down list and deselect the "Axis Label" and "Tick Marks" check boxes:
Now we need to configure the roles for our custom graph. Select the "Roles" menu on the right hand side of the screen. Click the three dots next to the "Shared Role 1" role and select "Edit":
Change the Classification to "Datetime". Rename the role to "Weekday" and press "OK":
Next click the three dots next to the "Schedule Chart 2 Start" Role. Choose "Create shared role" -> "Schedule Chart 2 Finish".
At the next screen, name your new shared role "Week" and press OK:
Next click the three dots next to the "Schedule Chart 1 Start" Role. Choose "Use Shared Role" -> "Week".
Now click the three dots next to the "Schedule Chart 1 Finish" role and select "Edit":
Rename this role "Next Week" and press OK:
Now we need to add a label role to the second schedule chart. Under the "Schedule Chart 2" section, click "Add Role".
Choose "Data Label" for the role type, name the role "Day" and press OK:
We will also need to add a data tip role to the first schedule chart. Under the "Schedule Chart 1" section, click "Add Role":
Choose "Data Tip" for the role type, name the role "Data Tip" and press OK:
The last role we need to add is a lattice. Under the "Data Driven Lattice Roles" section, click the "Add Role" button:
For role type choose "Lattice Column". Name the role "Year" and press "OK":
And you're done building your custom graph! Save your custom graph and give it a name.
As I mentioned before, the data from the report shown at the top of this post can be obtained from Bureau of Transportation Statistics' Airline On-Time Performance Data. However, if you do not wish to extract the data form the Airline On-Time Performance Data, there is some simulated data attached to this post which can also work in this example.
There is a variable in the data called "FL_Date" which represents the date of each observation (or flight). The first step was to summarize how many flights occurred on each day. This is achieved via the code below:
create table Source_data as select
FL_Date, Count(1) as Total_Flights
from in.flight_data group by FL_Date;
Next we will need to add some columns to our data. Specifically we will need to add:
We will also rename FL_Date to be Day. Additionally, we need to apply a sasformat to the column Day which will display the day of the month on which the date lands. Incredibly, all this can be sourced from the single column "FL_Date". The code to do this is below.
/* get the current week of the year */
Week = week(FL_Date,'U');
/* get the next week of the year */
Next_Week = week(FL_Date,'U')+1;
/* get the day of the week */
weekday = FL_Date;
format weekday weekdate9.;
/* get the Year*/
Year = FL_Date;
format Year Year4.;
/* format the FL_Date to day of year */
format FL_Date day.;
/* rename FL_Date to Day */
rename FL_Date = Day;
The last thing we need to do is group our data by how many flights have occurred on each day. Fortunately, PROC RANK will do this for us. The code below uses PROC RANK to group the dates into ranks and add those ranks back to the "Add_Columns" data set:
proc sort data=Add_Columns out=sorted;
by year day;
proc rank data=sorted descending out=ranks groups=5;
create table December_Travel as select t1.*,
put(t2.Total_Flights+1,z1.) as Rank
from Add_Columns as t1 left join ranks as t2 on (t1.day=t2.day);
And that's all that's needed for the data prep! A subset of the final output data set created from the simulated data set (that is attached to this post) is listed below:
Now the fun part: building the report! All that's left for us is to import the custom graph to a Visual Analytics report.
Add the data set 'December_Travel' to your report and apply the roles as shown:
The last thing we need to do is add a display rule so the days with more flights will be a darker color. We will use the "Rank" variable we added to our final data set for this. To do this, go to the Display Rule menu and create a new display rule based on the variable "Rank" (using the value of "1" as the darker color). Below are the hex color values for the display rule I created in my example report:
And that's it! For my final report I sorted the Y-Axis by the weekday in descending order. I also gave my report's tab an appropriate title:
This example was created in SAS Visual Analytics 8.3. The data from the report above can be obtained from Bureau of Transportation Statistics' Airline On-Time Performance Data. However, if you do not wish to extract the data form the Airline On-Time Performance Data, there is some simulated data attached to this post which can also work in this example.
The attachments for this post are:
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.