Path Analysis is one of the many statistical tools used to show the relationships between data sets. It connects the dots between variables, and sheds light on how customers move along pathways. In my last post, I stressed the value in not only using SAS solutions like Customer Intelligence 360 to collect data, but to take that data to the next level using tools like SAS Enterprise Guide, SAS Studio, and SAS Visual Analytics.
In my recent course I looked at data collected from SAS 360 Discover and how that data can be manipulated using both SAS Enterprise Guide and SAS Studio. I then use these data sets to create reporting objects in SAS Visual Analytics like a Path Analysis (Figure 1) and Decision Tree. In this post, I'll focus on the process I used to manipulate my data in SAS Enterprise Guide to ready it for a Path Analysis.
Figure 1
Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.
There are a few key points you should know before starting this process.
As stated, you may need to query data and create new columns for each input data set before appending them. This can all be done using a simple query task. I'll demonstrate this step using the VISIT_DETAILS table.
There are five columns needed from the VISIT_DETAILS table (Figure 2).
Figure 2
Identity_id and session_id already exist in the input tables so those can easily be dragged and dropped into the query. The last three columns, Time Stamp, Code, and Activity, are computed columns. Here's how to create those columns.
The Time Stamp will always be the dttm column of the input table. In the case of the VISIT_DETAILS table that column is visit_dttm. Create the computed column and use advanced expression to record the visit_dttm column in the computed column whenever a visit is present, like in Figure 3. The t1 represents the input table.
Figure 3
The Activity column works the same way. In the case of the VISIT_DETAILS table, create another computed column with an advanced expression to record the origination_type_nm when it is present (as in Figure 4). This will pull the traffic source associated with the visit.
Figure 4
The Code column will identify the category in which the identified activity falls in. As stated earlier, this code can be used in the creation of the path analysis as a filter. In the case of the VISIT_DETAILS table we'll use the code of TS to stand for Traffic Source. The expression is pictured in Figure 5. This is a Case expression stating when the origination_type_nm exists then the column will be populated with TS.
Figure 5
At this point all tables are queried. It's important to note that the CUSTOM_EVENTS table holds all events considered to be site conversions, so you may want to filter that table to show only site conversions. After doing that, your process flow may look something like Figure 6.
Figure 6
You'll use the Append Task to append all your tables, by simply opening the Task, selecting all your tables, and clicking Run. In the end you'll have one table including all your input tables, ready to be transformed into a Path Analysis.
Figure 7
If you would like to learn more about Customer Intelligence 360 data tables and creating reports, please visit any of these sites below:
Find more articles from SAS Global Enablement and Learning here.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.