Hi there,
I am performing a market basket analysis using PROC ASSOC and PROC SEQUENCE. My data contains information on user sessions (session_id): pages visited (page_type), time spent on each page (time) and money spent on each page (sum).
data sample;
  infile datalines dsd truncover;
  input session_id :13. page_type :$4. sum :8. time :6. n :2.;
  datalines4;
30001,6001,10,0.1,1
30001,6001,1,0.4,2
30001,6005,7,3,3
30002,6002,34,0.2,1
30003,6002,2,12,1
30003,6003,5,0.7,2
30003,6002,0.55,3,3
30003,6005,0.9,3,4
;;;;
run;
The output dataset from PROC SEQUENCE contains only the generated rules (every sequence of visited pages that occurs in the data) and statistics such as count and support, which I do not need at the moment.
However, I need to calculate statistics for each sequence, such as the total time spent (time) and the total money spent (sum). That is the problem.
Is there a way to do so?
Thank you in advance.
Please note that direct use of the ASSOC and SEQUENCE procedures, which are used by the Association node in SAS Enterprise Miner, is not supported. The only supported approaches to Market Basket Analysis are the following:
1. The Market Basket Analysis Node in SAS Enterprise Miner
2. The MBANALYSIS procedure available via Visual Data Mining and Machine Learning on SAS Viya
For more detail on visualizing your data using SAS Visual Data Mining and Machine Learning, check out the blog at
Having said that, Market Basket Analysis, like Association and Sequence Analysis, does not concern itself with anything other than the occurrence of an event. It does not matter whether an event happened fifty times at a specific time point or only once. Nor does it matter whether the pattern happened fifty times in a transaction, or what other variables are included in the data, since they will be ignored.
What you are asking for relates to rolling up your data and summarizing certain amounts, which can be done with the SUMMARY procedure in many cases. Here is an example using the first few lines of your data:
/*** BEGIN SAS CODE ***/
/* Read the sample rows with space-delimited list input */
data sample;
  input session_id $ page_type $ sum time n;
  cards;
30001 6001 10.00 0.1 1
30001 6001 1.00 0.4 2
30001 6005 7.00 3.0 3
30002 6002 34.00 0.2 1
30003 6002 2.00 12.0 1
30003 6003 5.00 0.7 2
30003 6002 0.55 3.0 3
30003 6005 0.90 3.0 4
;
run;

proc print data=sample;
  title 'input data';
run;

/* Total money (sum) and time per session. */
/* BY processing requires the data to be sorted by session_id. */
proc summary data=sample sum;
  by session_id;
  var sum time;
  output out=rollup sum=tot_sum tot_time;
run;

proc print data=rollup;
  title 'output from SUMMARY procedure';
run;

/* Clear the title */
title;
/*** END SAS CODE ***/
which generates the output below my signature.
Hope this helps!
Doug
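The roll-up above totals by session. If totals per sequence are wanted instead, one possible extension is sketched below. It is not part of the original reply and rests on an assumption: the "sequence" is taken to be the full ordered path of pages in a session (ordered by n), which is not the same thing as the sub-sequence rules PROC SEQUENCE reports. The dataset names sorted, session_paths and path_totals, the variable path, and the ' > ' separator are all made up for illustration.

```sas
/* Assumes the SAMPLE dataset created above. */
/* Order rows within each session by visit number n. */
proc sort data=sample out=sorted;
  by session_id n;
run;

/* One row per session: its ordered page path plus money and time totals. */
data session_paths;
  length path $200;                 /* hypothetical maximum path length */
  retain path tot_sum tot_time;
  set sorted;
  by session_id;
  if first.session_id then do;
    path     = page_type;
    tot_sum  = sum;
    tot_time = time;
  end;
  else do;
    path     = catx(' > ', path, page_type);
    tot_sum  = tot_sum + sum;
    tot_time = tot_time + time;
  end;
  if last.session_id then output;
  keep session_id path tot_sum tot_time;
run;

/* Combine sessions that followed the same path. */
proc summary data=session_paths nway;
  class path;
  var tot_sum tot_time;
  output out=path_totals(drop=_type_ rename=(_freq_=n_sessions)) sum=;
run;

proc print data=path_totals;
run;
```

Each row of path_totals then describes one distinct path, the number of sessions that followed it, and the summed money and time for those sessions. Totals for sub-sequences inside a session would need a different approach.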