Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Calculating Statistics in market basket analysis

Senior User

Hi there,

I am performing a market basket analysis using PROC ASSOC and PROC SEQUENCE. My data contains information on user sessions (session_id): pages visited (page_type), time spent on each page (time) and money spent on each page (sum).

data sample;
  infile datalines dsd truncover;
  input session_id :13. page_type:$4. sum:8. time:6. n:2.;
datalines4;
30001,6001,10,0.1,1
30001,6001,1,0.4,2
30001,6005,7,3,3
30002,6002,34,0.2,1
30003,6002,2,12,1
30003,6003,5,0.7,2
30003,6002,0.55,3,3
30003,6005,0.9,3,4
;;;;;;;;


The output dataset from PROC SEQUENCE contains only the generated rules (all sequences of visited pages that occur in the data) along with statistics such as count and support, which I do not need at the moment.


However, I need to calculate statistics for each sequence, such as the total time spent (time) and the total money spent (sum). That is the problem.

Is there a way to do so?

Thank you in advance.


Accepted Solutions
Wednesday
SAS Employee

Re: Calculating Statistics in market basket analysis


Please note that direct use of the ASSOC and SEQUENCE procedures, which are used by the Association node in SAS Enterprise Miner, is not supported. The only supported approaches to Market Basket Analysis are the following:

   1.  The Market Basket Analysis Node in SAS Enterprise Miner

   2.  The MBANALYSIS procedure available via Visual Data Mining and Machine Learning on SAS Viya

 

For more detail on visualizing your data using SAS Visual Data Mining and Machine Learning, check out the blog at

https://blogs.sas.com/content/sgf/2018/01/17/visualizing-the-results-of-a-market-basket-analysis-in-...

 

Having said that, Market Basket Analysis, like Association and Sequence Analysis, does not concern itself with anything other than the occurrence of an event. It does not matter whether an event happened fifty times at a specific time point or only once, nor whether the pattern happened fifty times in a transaction. Any other variables included in the data are simply ignored.
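
As a rough illustration of that point (a sketch only, not what the Association node runs internally): for plain association analysis, the input is effectively reduced to which page types occurred in which session. The dataset name events is made up for this example, and sequence analysis would additionally keep the order of the visits.

```sas
/* Sample rows from the question */
data sample;
    input session_id $ page_type $ sum time n;
datalines;
30001 6001 10.00 0.1 1
30001 6001 1.00 0.4 2
30001 6005 7.00 3.0 3
30002 6002 34.00 0.2 1
30003 6002 2.00 12.0 1
30003 6003 5.00 0.7 2
30003 6002 0.55 3.0 3
30003 6005 0.90 3.0 4
;
run;

/* Keep one row per (session, page type): repeated visits and the
   sum/time columns play no part in which items occur together.
   "events" is a hypothetical dataset name. */
proc sort data=sample(keep=session_id page_type) out=events nodupkey;
    by session_id page_type;
run;

proc print data=events;
run;
```

With the eight sample rows this leaves six distinct session/page pairs; the amounts and times are gone, which is why they have to be summarized separately.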

 

What you are asking for amounts to rolling up your data and totaling certain amounts, which in many cases can be done with the SUMMARY procedure. Here is an example using the first few lines of your data:

 

/*** BEGIN SAS CODE ***/

/* Read the sample data; session_id and page_type are read as character */
data sample;
    input session_id $ page_type $ sum time n;
cards;
30001 6001 10.00 0.1 1
30001 6001 1.00 0.4 2
30001 6005 7.00 3.0 3
30002 6002 34.00 0.2 1
30003 6002 2.00 12.0 1
30003 6003 5.00 0.7 2
30003 6002 0.55 3.0 3
30003 6005 0.90 3.0 4
;
run;

proc print data=sample;
    title 'input data';
run;

/* Total money (sum) and time per session.
   BY processing requires the data to be sorted by session_id. */
proc summary data=sample sum;
    by session_id;
    var sum time;
    output out=rollup sum=tot_sum tot_time;
run;

proc print data=rollup;
    title 'output from SUMMARY procedure';
run;

/* Clear the title */
title;

/*** END SAS CODE ***/

 

which generates the output below my signature. 
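
In case the data are not already sorted by session_id, here is a variant of the roll-up above. It is only a sketch: it uses a CLASS statement with the NWAY option instead of BY, and adds a second roll-up by session and page type. The output dataset names rollup_session and rollup_page are made up for this example.

```sas
/* Sample rows from the question */
data sample;
    input session_id $ page_type $ sum time n;
datalines;
30001 6001 10.00 0.1 1
30001 6001 1.00 0.4 2
30001 6005 7.00 3.0 3
30002 6002 34.00 0.2 1
30003 6002 2.00 12.0 1
30003 6003 5.00 0.7 2
30003 6002 0.55 3.0 3
30003 6005 0.90 3.0 4
;
run;

/* CLASS does not need sorted input. NWAY keeps only the rows for the
   full combination of CLASS variables, so no overall-total row appears. */
proc summary data=sample nway;
    class session_id;
    var sum time;
    output out=rollup_session(drop=_type_ _freq_) sum=tot_sum tot_time;
run;

/* Same idea, one row per session and page type */
proc summary data=sample nway;
    class session_id page_type;
    var sum time;
    output out=rollup_page(drop=_type_ _freq_) sum=tot_sum tot_time;
run;

proc print data=rollup_session;
run;

proc print data=rollup_page;
run;
```

For the sample rows, session 30001 should come out with tot_sum 18 and tot_time 3.5 in rollup_session. The per-page table could serve as a starting point for attaching totals to individual rules, though matching rules back to sessions is a separate step not shown here.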

 

Hope this helps!

Doug

 

[Image: SUMMARY_procedure_results.JPG]

 

