BookmarkSubscribeRSS Feed

Preparing Data for a Path Analysis in SAS Enterprise Guide

Started 3 weeks ago by
Modified 3 weeks ago by
Views 156

Path Analysis is one of the many statistical tools used to show the relationships between data sets. It connects the dots between variables, and sheds light on how customers move along pathways. In my last post, I stressed the value in not only using SAS solutions like Customer Intelligence 360 to collect data, but to take that data to the next level using tools like SAS Enterprise Guide, SAS Studio, and SAS Visual Analytics.


In my recent course I looked at data collected from SAS 360 Discover and how that data can be manipulated using both SAS Enterprise Guide and SAS Studio. I then use these data sets to create reporting objects in SAS Visual Analytics like a Path Analysis (Figure 1) and Decision Tree. In this post, I'll focus on the process I used to manipulate my data in SAS Enterprise Guide to ready it for a Path Analysis.



Figure 1


Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.


Key Points


There are a few key points you should know before starting this process.


  1. This process will involve appending tables. This simply means that we will concatenate the tables resulting in a data set containing every row and variable in all input tables.
  2. In order to append tables, every input table must have the same columns, so some initial querying may be needed prior to the append.
  3. There are a few attributes you will need to create the Path Analysis later. These are identity_id, session_id, an activity like a page view or a traffic source, the time stamp for each activity, and an activity code like TS for traffic source or PV for page view. This code categorizes the activities and can be used to filter data later when creating the path analysis.


Querying the Data


As stated, you may need to query data and create new columns for each input data set before appending them. This can all be done using a simple query task. I'll demonstrate this step using the VISIT_DETAILS table.


There are five columns needed from the VISIT_DETAILS table (Figure 2).                                                                                



Figure 2


Identity_id and session_id already exist in the input tables so those can easily be dragged and dropped into the query. The last three columns, Time Stamp, Code, and Activity, are computed columns. Here's how to create those columns.


Time Stamp


The Time Stamp will always be the dttm column of the input table. In the case of the VISIT_DETAILS table that column is visit_dttm. Create the computed column and use advanced expression to record the visit_dttm column in the computed column whenever a visit is present, like in Figure 3. The t1 represents the input table.



Figure 3




The Activity column works the same way. In the case of the VISIT_DETAILS table, create another computed column with an advanced expression to record the origination_type_nm when it is present (as in Figure 4). This will pull the traffic source associated with the visit.                                   



 Figure 4




The Code column will identify the category in which the identified activity falls in. As stated earlier, this code can be used in the creation of the path analysis as a filter. In the case of the VISIT_DETAILS table we'll use the code of TS to stand for Traffic Source. The expression is pictured in Figure 5. This is a Case expression stating when the origination_type_nm exists then the column will be populated with TS.



 Figure 5


Appending Tables


At this point all tables are queried. It's important to note that the CUSTOM_EVENTS table holds all events considered to be site conversions, so you may want to filter that table to show only site conversions. After doing that, your process flow may look something like Figure 6.



 Figure 6


You'll use the Append Task to append all your tables, by simply opening the Task, selecting all your tables, and clicking Run. In the end you'll have one table including all your input tables, ready to be transformed into a Path Analysis.



 Figure 7


If you would like to learn more about Customer Intelligence 360 data tables and creating reports, please visit any of these sites below:



Find more articles from SAS Global Enablement and Learning here.

Version history
Last update:
3 weeks ago
Updated by:



Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags