BookmarkSubscribeRSS Feed

Preparing Data for a Path Analysis in SAS Enterprise Guide

Started ‎11-20-2023 by
Modified ‎11-20-2023 by
Views 1,087

Path Analysis is one of the many statistical tools used to show the relationships between data sets. It connects the dots between variables, and sheds light on how customers move along pathways. In my last post, I stressed the value in not only using SAS solutions like Customer Intelligence 360 to collect data, but to take that data to the next level using tools like SAS Enterprise Guide, SAS Studio, and SAS Visual Analytics.

 

In my recent course I looked at data collected from SAS 360 Discover and how that data can be manipulated using both SAS Enterprise Guide and SAS Studio. I then use these data sets to create reporting objects in SAS Visual Analytics like a Path Analysis (Figure 1) and Decision Tree. In this post, I'll focus on the process I used to manipulate my data in SAS Enterprise Guide to ready it for a Path Analysis.

 

Border_01_MT_Path-Analysis-1024x498.png

Figure 1

 

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

Key Points

 

There are a few key points you should know before starting this process.

 

  1. This process will involve appending tables. This simply means that we will concatenate the tables resulting in a data set containing every row and variable in all input tables.
  2. In order to append tables, every input table must have the same columns, so some initial querying may be needed prior to the append.
  3. There are a few attributes you will need to create the Path Analysis later. These are identity_id, session_id, an activity like a page view or a traffic source, the time stamp for each activity, and an activity code like TS for traffic source or PV for page view. This code categorizes the activities and can be used to filter data later when creating the path analysis.

 

Querying the Data

 

As stated, you may need to query data and create new columns for each input data set before appending them. This can all be done using a simple query task. I'll demonstrate this step using the VISIT_DETAILS table.

 

There are five columns needed from the VISIT_DETAILS table (Figure 2).                                                                                

 

02_MT_Query-Builder.png

Figure 2

 

Identity_id and session_id already exist in the input tables so those can easily be dragged and dropped into the query. The last three columns, Time Stamp, Code, and Activity, are computed columns. Here's how to create those columns.

 

Time Stamp

 

The Time Stamp will always be the dttm column of the input table. In the case of the VISIT_DETAILS table that column is visit_dttm. Create the computed column and use advanced expression to record the visit_dttm column in the computed column whenever a visit is present, like in Figure 3. The t1 represents the input table.

                                                     

Border_03_MT_visit_dttm.png

Figure 3

 

Activity

 

The Activity column works the same way. In the case of the VISIT_DETAILS table, create another computed column with an advanced expression to record the origination_type_nm when it is present (as in Figure 4). This will pull the traffic source associated with the visit.                                   

 

Border_04_MT_origination_type_nm.png

 Figure 4

 

Code

 

The Code column will identify the category in which the identified activity falls in. As stated earlier, this code can be used in the creation of the path analysis as a filter. In the case of the VISIT_DETAILS table we'll use the code of TS to stand for Traffic Source. The expression is pictured in Figure 5. This is a Case expression stating when the origination_type_nm exists then the column will be populated with TS.

                             

Border_05_MT_Code.png

 Figure 5

 

Appending Tables

 

At this point all tables are queried. It's important to note that the CUSTOM_EVENTS table holds all events considered to be site conversions, so you may want to filter that table to show only site conversions. After doing that, your process flow may look something like Figure 6.

 

Border_06_MT_Process-Flow-1.png

 Figure 6

 

You'll use the Append Task to append all your tables, by simply opening the Task, selecting all your tables, and clicking Run. In the end you'll have one table including all your input tables, ready to be transformed into a Path Analysis.

                                                                                                   

07_MT_Path-Analysis-Table.png

 Figure 7

 

If you would like to learn more about Customer Intelligence 360 data tables and creating reports, please visit any of these sites below:

 

 

Find more articles from SAS Global Enablement and Learning here.

Version history
Last update:
‎11-20-2023 01:12 PM
Updated by:
Contributors

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags