BookmarkSubscribeRSS Feed

Market Basket Analysis (Part 3): Scoring Transactional Data in SAS Viya

Started 2 weeks ago by
Modified 2 weeks ago by
Views 135

In my previous post, I discussed how to perform association rule mining on a transaction database using the MBANALYSIS procedure. As a next step you would be interested in using this model to score the new data. Note that the only scoring recipe that is obtained from the MBANALYSIS procedure is in the form of analytic store table. This analytic store can then be used with the ASTORE procedure to score new data.

 

In this post, I will walk you through the steps that can simplify and streamline the scoring process. The scoring process using PROC ASTORE is not straightforward and requires some data pre-processing to obtain desired results. Refer to the code below extracted from my previous post that is used to run the association analysis on the transaction data set ProdSales.

 

proc mbanalysis data=casuser.ProdSales items=2 support=1;
output out=casuser.out outfreq=casuser.outfreq outrule=casuser.outrule;
customer Customer;
target Item;
hierarchy data = casuser.BaseLevel casuser.ParentLevel;
savestate rstore=casuser.StateArm;
run;

 

Notice the RSTORE= option in the SAVESTATE statement that creates analytic store for the model and saves it as a binary object in a data table named StateArm. I will be using this analytic store in the ASTORE procedure to score new data. The demo to follow assumes that the model has been created successfully using above code and the analytic store has been saved. Recall that this analysis resulted in 128 rules. The information about the rules is produced below for reference:

 

01_MS_OutRule-3.jpg

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

Steps to Perform Scoring

 

  • Launch SAS Studio and submit following program to start a CAS session and assign libraries.

     

    cas;
    caslib _ALL_ assign;

 

  • Identify the data that needs to be scored. For this demonstration, I will create a transaction table containing transactions from four customers at a grocery store and load it into the CASUSER library. Submit the following DATA steps to achieve this

     

    data casuser.Transaction;
    length Customer $ 10 Item $ 20 ;
    retain Customer;
    customer='A';
    Item='Low Fat Milk'; output;
    Item='Chicken'; output;
    Item='Cake'; output;
    Item='Frozen Pizza'; output;
    Item='Ice Cream'; output;
    Item='Curd'; output;
    customer='B';
    Item='Chips'; output;
    Item='Swiss Cheese'; output;
    Item='Frozen Pizza'; output;
    Item='Pancakes'; output;
    customer='C';
    Item='Skim Milk'; output;
    Item='Cheddar Cheese'; output;
    Item='Oatmeal'; output;
    Item='Cake'; output;
    Customer='D';
    Item='Frozen Pizza'; output;
    Item='Chicken'; output;
    Item='Ice Cream';output;
    Item='Eggs'; output;
    run;

 

The picture below shows the snapshot of Transaction table.

 

02_MS_TransactionTable1.jpg

 

Now this transaction table cannot be used in its current form for the ASTORE procedure. Why is that? Because the transactional data table is in long format and PROC ASTORE accepts input in wide format. So, the input data format must be converted from long to wide before the data are provided as input to PROC ASTORE. To do this, I use TRANSPOSE procedure but do not forget to sort the data before transposing it.

 

  • Submit the following piece of code to sort and transpose the transaction table

     

    proc sort data=CASUSER.TRANSACTION out=WORK.SORTed;
    by Customer; run; proc transpose data=WORK.SORTed out=CASUSER.Transpose1 prefix=Item_; var Item; by Customer; run;

 

The successful completion of run produces a table in wide format as below

 

03_MS_TransactionTable_Transposed-1024x252.jpg

 

  • Next, I use this transposed table (Transpose1) as input to the ASTORE procedure to perform scoring by submitting the following piece of code

     

    proc astore;
    score data=casuser.Transpose1
    out=casuser.scored1
    rstore=casuser.StateArm;
    quit;

     

The PROC ASTORE statement invokes the procedure and does not require any options. The SCORE statement enables you to score the data using the previously trained model. The DATA= casuser.Transpose1 in SCORE statement names the input data table for PROC ASTORE to use. The OUT= casuser.scored1 specifies the output data table. The RSTORE=casuser.StateArm specifies the table that contains the analytic store used for scoring the data. The successful execution of the code generates a scored table in a wide format similar to the following:

 

04_MS_ScoredtableView1-1024x262.jpg

 

05_MS_ScoredtableView2-1024x254.jpg

 

The scoring model appears to list all 128 rule indices of the association rules as columns (id1 through id128), assigning the corresponding "RULEID" values from the rule set to them.

 

  • To maintain consistency, the output of PROC ASTORE needs to be converted from wide to long format. To do this, I again use the TRANSPOSE procedure and submit following code

     

    proc sort data=CASUSER.SCORED1 out=WORK.TableSorted;
    by Customer;
    run;
    
    proc transpose data=WORK.TableSorted out=casuser.Scored_Transaction (drop=_LABEL_ _NAME_ rename=(RuleID1=RuleID)) prefix=RuleID ;
    var id1-id128;
    by Customer;
    run;     

 

  • The resulting table now contains two columns and 512 rows (128*4=512), i.e. 128 rules times 4, the number of customers.

     

    06_MS_Scored_longForm1-1024x321.jpg

     

    All 128 rules are mapped to each customer in the transaction data. For any rule that applies based on the items in the basket, the scoring algorithm assigns the corresponding RuleID value. If a rule does not apply to a customer based on the items in their basket, the algorithm returns a value of zero.

     

    For example, in first row the RuleID column contains a value 1, which means it corresponds to the RuleID 1 that represents a rule “Dessert --> Cake”. You can make use of these rules for recommendations (cross-sell/ up-sell) knowing the items already purchased by the customers. Of course, these recommendations should be based on metrics like support, lift, support-lift and confidence because not all rules are useful.

     

    07_MS_Scored_longForm2-1-1024x316.jpg

     

    You will see several zero values in the RuleID column for customer A. This indicates that many of the 128 rules, generated by the previously trained model, do not match given the items in customer A’s basket. In row 130, for customer B, the RuleID column contains the value 4, indicating it corresponds to RuleID 4, which represents the rule "Breakfast → Dinner."

     

    08_MS_Scored_longform3-1024x301.jpg

     

    A similar interpretation can be applied to the other entries in the scored table.

     

    Note: - You may want to reference the OutRule table which is reproduced below for convenience.

     

    09_MS_OutRule_Sorted-1024x316.jpg

     

    You can utilize the scored table and the values in the RuleID column to identify the appropriate rule for making product recommendations. When there are multiple rules with the same item on the LHS, prioritize the recommendations based on metrics such as lift, support-lift and confidence.

     

     

    Find more articles from SAS Global Enablement and Learning here.

Contributors
Version history
Last update:
2 weeks ago
Updated by:

sas-innovate-2026-white.png



April 27 – 30 | Gaylord Texan | Grapevine, Texas

Registration is open

Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!

Register now

SAS AI and Machine Learning Courses

The rapid growth of AI technologies is driving an AI skills gap and demand for AI talent. Ready to grow your AI literacy? SAS offers free ways to get started for beginners, business leaders, and analytics professionals of all skill levels. Your future self will thank you.

Get started

Article Tags