# Calculating Statistics in market basket analysis

Hi there,

I am performing a market basket analysis using PROC ASSOC and PROC SEQUENCE. My data contains information on user sessions (session_id): pages visited (page_type), time spent on each page (time) and money spent on each page (sum).

```
data sample;
  infile datalines dsd truncover;
  input session_id :13. page_type :$4. sum :8. time :6. n :2.;
datalines4;
30001,6001,10,0.1,1
30001,6001,1,0.4,2
30001,6005,7,3,3
30002,6002,34,0.2,1
30003,6002,2,12,1
30003,6003,5,0.7,2
30003,6002,0.55,3,3
30003,6005,0.9,3,4
;;;;
```

The dataset produced by PROC SEQUENCE contains only the generated rules (all sequences of visited pages that occur in the data) and statistics I do not need at the moment, such as count and support.

However, I need to calculate statistics for each sequence, such as the total time spent (time) and the total money spent (sum), and I do not see how to get them.

Is there a way to do so?

Accepted Solutions
Solution
‎02-21-2018 09:32 AM
SAS Employee
Posts: 231

## Re: Calculating Statistics in market basket analysis

Please note that direct use of the ASSOC and SEQUENCE procedures, which are used by the Association node in SAS Enterprise Miner, is not supported.  The only supported approaches to Market Basket Analysis are the following:

1.  The Market Basket Analysis Node in SAS Enterprise Miner

2.  The MBANALYSIS procedure available via Visual Data Mining and Machine Learning on SAS Viya

For more detail on visualizing your data using SAS Visual Data Mining and Machine Learning, check out the blog at

Having said that, Market Basket Analysis, like Association and Sequence Analysis, does not concern itself with anything other than the occurrence of an event.  It does not matter whether an event happened fifty times at a specific time point or only once, or whether a pattern happened fifty times within a transaction.  Nor does it matter what other variables are included in the data, since they will be ignored.

What you are asking for relates to rolling up your data and summarizing certain amounts, which can be done with the SUMMARY procedure in many cases.  Here is an example using the first few lines of your data:

/*** BEGIN SAS CODE ***/

data sample;

input session_id \$ page_type \$ sum time n;
cards;
30001 6001 10.00 0.1 1
30001 6001 1.00 0.4 2
30001 6005 7.00 3.0 3
30002 6002 34.00 0.2 1
30003 6002 2.00 12.0 1
30003 6003 5.00 0.7 2
30003 6002 0.55 3.0 3
30003 6005 0.90 3.0 4
;
run;

proc print data=sample;
title 'input data';
run;

proc summary data=sample sum;
by session_id;
var sum time;
output out=rollup sum=tot_sum tot_time;
run;

proc print data=rollup;
title 'output from SUMMARY procedure';
run;

title;
run;

/*** END SAS CODE ***/

which generates the output below my signature.
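
The same per-session totals can be cross-checked without a BY statement. The following is only a minimal sketch using PROC SQL, not part of the supported workflow described above; the output table name `rollup_sql` and the column name `n_pages` are made up for illustration. The totals in the closing comment were worked out by hand from the eight input rows.

```sas
/* Same eight rows as in the SUMMARY example */
data sample;
  input session_id $ page_type $ sum time n;
  datalines;
30001 6001 10.00 0.1 1
30001 6001 1.00 0.4 2
30001 6005 7.00 3.0 3
30002 6002 34.00 0.2 1
30003 6002 2.00 12.0 1
30003 6003 5.00 0.7 2
30003 6002 0.55 3.0 3
30003 6005 0.90 3.0 4
;
run;

/* GROUP BY does not require the input to be sorted by session_id */
proc sql;
  create table rollup_sql as
  select s.session_id,
         count(*)    as n_pages,   /* hypothetical column name */
         sum(s.sum)  as tot_sum,
         sum(s.time) as tot_time
  from sample as s
  group by s.session_id;
quit;

proc print data=rollup_sql noobs;
  title 'per-session totals from PROC SQL';
run;
title;

/* Hand-computed totals for these rows:
   30001: n_pages=3, tot_sum=18,   tot_time=3.5
   30002: n_pages=1, tot_sum=34,   tot_time=0.2
   30003: n_pages=4, tot_sum=8.45, tot_time=18.7 */
```

Unlike the BY statement in PROC SUMMARY, the GROUP BY clause does not depend on the rows already being ordered by `session_id`.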

Hope this helps!

Doug
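
If totals are needed per sequence rather than per session, one possible approach is to build the sequences from the transaction data directly, since the generated rules carry no measure variables. The sketch below assumes that a "sequence" means two consecutive page views within a session, ordered by `n`; that is an assumption made for illustration and may differ from how the rules in your output were formed. The table names `pairs` and `seq_stats` and the derived column names are hypothetical.

```sas
/* Same eight rows as in the SUMMARY example */
data sample;
  input session_id $ page_type $ sum time n;
  datalines;
30001 6001 10.00 0.1 1
30001 6001 1.00 0.4 2
30001 6005 7.00 3.0 3
30002 6002 34.00 0.2 1
30003 6002 2.00 12.0 1
30003 6003 5.00 0.7 2
30003 6002 0.55 3.0 3
30003 6005 0.90 3.0 4
;
run;

proc sql;
  /* Assumption: a sequence is a pair of consecutive page views.
     Pair each view (n) with the next view (n + 1) in the same session. */
  create table pairs as
  select a.session_id,
         a.page_type     as page_from,
         b.page_type     as page_to,
         a.sum  + b.sum  as pair_sum,
         a.time + b.time as pair_time
  from sample as a
       inner join sample as b
       on  a.session_id = b.session_id
       and b.n = a.n + 1;

  /* Roll the pairs up to one row per distinct page_from -> page_to sequence */
  create table seq_stats as
  select p.page_from,
         p.page_to,
         count(*)         as n_occurrences,
         sum(p.pair_sum)  as tot_sum,
         sum(p.pair_time) as tot_time
  from pairs as p
  group by p.page_from, p.page_to;
quit;

proc print data=seq_stats noobs;
  title 'totals per two-page sequence (sketch)';
run;
title;
```

Sessions with a single page view, such as 30002 here, produce no pairs and therefore do not appear in `seq_stats`. Longer sequences would need additional joins or a different construction.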

☑ This topic is solved.