The most basic and important question of a direct marketer is "Who to contact?". The other questions of "How often to contact?", "When to contact?", and "What in the contact?" can only be addressed once the basic "Who" question is answered. As the costs of acquiring new customers are increasingly difficult and expensive, direct marketers often try to identify the "Who's" in the existing customer base file. Since each contact costs money and time, you want to contact customer who are ready or most likely to purchase. To contact cost effectively requires creating the optimal mix of customer segment. After each campaign the efficiency of the contact mix must be measured and adjusted for each segment. Over several cycles this process results in a contact plan that balances retention and cost. Here comes Recency, Frequency, and Monetary (further referred to as RFM) to help direct marketers to develop their contact plans.
RFM analysis puts your customers into segments based on the following simply theory:
Follow along with these 8 quick and easy steps and you're on your way to RFM success!
We’ll start with the dataset. There are four variables you need to run the RFM analysis: a customer identifier, transaction date, monetary amount, and quantity (not necessary for Transaction-level data). If you don’t have these variables, stop! Go find the proper variables, then come back and come back to step 2.
In SAS Enterprise Guide, navigate to Tasks > Data Mining > Recency, Frequency, and Monetary analysis.
When the RFM Analysis window appears, select Transaction-level Data. I know to choose Transaction-level because every data record in my dataset represents a single transaction as opposed to a single customer. Notice lines 3, 5, and 6 have transaction information for Customer 183. Select the Data tab on left side panel.
Assign the variables to the task roles as shown below. You can do this a few ways: drag and drop the variables to the corresponding task role, select a variable and click the “Add” arrow and then select which role to assign, or right click the variable and click “Assign to Role.” Choose which method you like best.
If you’d like to include customers with incomplete transaction records, or change the how customer’s individual transactions are combined, you can do so at the bottom of this window in the section labeled “Aggregation of transactions”. This section will not appear if we selected Customer data in step 2 because the transactions for each customer have already been combined in the dataset.
On this same page, you can edit the task filter. The task filter uses conditional logic to select only the observations you want to analyze.
For example, if you only want to analyze sales greater than $50, or only want to analyze customers who have ordered a product within the last year, this is where you define that condition. You can also have more than one condition. The task filter becomes very useful when you have a master dataset and would like to use it for multiple analyses without creating a new dataset for every analysis.
You also have the ability to filter on variables that are in the dataset but are not being used in the analysis, such as gender, age, or income. I chose to filter on two variables: Employee_ID and Quantity. I want to exclude rows where Employee_ID = 9999999, which represent internet sales in my dataset and I only want to focus on in-store purchases. I also want to only include rows where Quantity>1 to focus my analysis on repeat customers.
Select the Binning tab on left panel. For simplicity, let’s select Independent binning. The scores from independent binning are easier to interpret. If you want to learn more about independent and nested binning, stay tuned for my next article!
Select the number of bins that makes that most sense for your unique business problem. Please note that the “Number of bins” for Independent Binning really means the maximum number of possible bins. The binning algorithm will try to create as many bins as you request, but sometimes the data prohibits this from happening. If you have more questions on why you’re seeing less bins than you expect for your specific dataset, I encourage you to open a support track here and I’ll be more than happy to work with you.
Select the Results tab on the left side panel. By default, the RFM scores will be saved in your Work library. Remember that this library is temporary and only exists in the current SAS session! If you exit SAS Enterprise Guide and reopen it, the RFM scores table will no longer exist. You must recreate your analysis which can be tedious and time consuming.
If you don’t want to have to recreate the analysis every time you want to see this table, click “Edit” to save the RFM scores table in a permanent library. When you edit the location, you have the ability to rename the table. I find myself always renaming the table because, after two or three analyses, it is increasingly difficult to remember which table corresponds to which analysis.
You can also change the names of the output dataset variables if you so choose. Remember to stick to SAS variable naming conventions. RFM uses validvarname=v7 naming conventions. A SAS variable name must:
And now, select the Plots tab on the left side panel. Select any combination of plots to visualize your analysis. If you only want to see the output table and don’t care about the visuals, that’s fine too. You can deselect all of the plots and just get the output RFM scores table. Even if you choose not to include the charts and heat map, I highly suggest including the “Summary of RFM Segments.” This table shows the metadata (data about data) for the RFM scores.
When you’re ready, click Run. Yes, it’s that easy!
**As always, don’t forget to check the log for errors before you look at your results.**
The “Summary of RFM Segments” table will be the first thing you’ll see in the Results tab. The first column displays the RFM scores in descending order. These are the customers who have bought most recent, most often, and spent the most money. Can we convince them to buy even more? Maybe! The table also tells you how many customers have each RFM score, what percent of customers have that score, and median values for that group’s recency, frequency, and monetary variables. This is metadata – data about data.
The Output Data tab will display the actual RFM scores table that you have saved. This is a SAS dataset that you can use with other SAS products and solutions. A few ideas on moving forward include performing cluster analyses to further group these customers to narrow down marketing techniques, or perhaps creating business rules to designate which customers receive which advertisements based on certain conditions. There’s so much stuff you can do!
Alternatively, you can run the exact same analysis and get the same results just a little bit of SAS code. This is helpful if you want to quickly change your analysis without having to click through the windows in SAS Enterprise Guide. You can copy and paste the code generated by the analysis and run it in any SAS code editor (Base SAS Windowing Environment, SAS Studio, or SAS Enterprise Guide). The results will be exactly the same. Go ahead and try it!
/*This proc is only needed if you want to filter your input dataset.*/ PROC SQL; CREATE VIEW WORK.RFM_PREP01 AS SELECT * FROM SASUSER.CUSTOMER_ORDERS WHERE Employee_ID NOT = 9999999 AND Quantity > 1; QUIT;RUN; %aaRFM; %EM_RFM_CONTROL ( Mode = T, InData = WORK.RFM_PREP01, CustomerID = Customer_ID, N_R_Grp = 5, N_F_Grp = 5, N_M_Grp = 5, BinMethod = I, PurchaseDate = Order_Date, PurchaseAmt = Total_Retail_Price, SetMiss = Y, SummaryFunc = SUM, MostRecentDate = , NPurchase = , TotPurchaseAmt = , MonetizationMap = Y, BinChart = Y, BinTable = Y, OutData = SASUSER.RFM_RESULTS, Recency_Score = recency_score, Frequency_Score = frequency_score, Monetary_Score = monetary_score, RFM_Score = rfm_score ); %_eg_conditional_dropds(WORK.RFM_PREP01);
The general framework and list of accepted values for the macro and input variables is below.
%macro EM_RFM_CONTROL ( Mode =, /* Mode of analysis: T = Transactional Data, C = Customer Data */ InData =, /* Input data */ CustomerID =, /* Customer ID */ /* RFM Analysis Specifications */ N_R_Grp = 5, /* Number of Most_Recent_Purchase_Date groups */ N_F_Grp = 5, /* Number of Frequency groups */ N_M_Grp = 5, /* Number of Monetary groups */ BinMethod = N, /* Method for binning: N = Nested and I = Independent */ /* Input Variables and Specifications Only for Transactional Data Mode (will be ignored if Mode is Customer Data) */ FreqVar =, /* Optional replication frequency variable */ PurchaseDate =, /* Purchase date */ PurchaseAmt =, /* Purchase amount */ SetMiss = Y, /* Y = If a customer ID has missing purchase date or missing purchase amount, then set all results to missing regardless of any complete records, N = Otherwise */ SummaryFunc = SUM, /* Summary function: SUM, MEAN, MEDIAN, or MAX */ /* Input Variables for Customer Data Mode, but Output Variables for Transactional Data Model */ MostRecentDate =, /* Most recent purchase date (Output for Transactional) */ NPurchase =, /* Number of purchases (Output for Transactional) */ TotPurchaseAmt =, /* Total amount of purchases (Output for Transactional) */ /* Display options for both Modes */ MonetizationMap = Y, /* Show Monetization map (Y/N)? */ BinChart = Y, /* Chart of bin counts (Y/N)? */ BinTable = N, /* Table of bin counts (Y/N)? */ /* Output options for both Modes */ OutData =, /* (Out) Output customer level data with scores */ Recency_Score =, /* (Out) Recency score (1=least recent, N_R_Grp=most recent) */ Frequency_Score =, /* (Out) Frequency score (1=least frequent, N_F_Grp=most frequent) */ Monetary_Score =, /* (Out) Monetary score (1=least amount, N_M_Grp=most amount) */ RFM_Score = /* (Out) RFM Score (= 100 * Recency_score + 10 * Frequency_Score + Monetary_Score) */ );
For more information on the macros used in this data mining analysis, see the aaRFM.sas7bdat dataset in the SASHELP library. This dataset includes more descriptive information about what certain aspects of the macro mean and more information on error messages. A common error may resemble the following error message:
ERROR: Template 'xxxxx' was unable to write to the template store!
If you encounter this particular error during your analysis, don't panic. Follow the steps in this SAS note to help you get around this error and get more information on deleting templates in the SAS/STAT 9.4 User’s Guide.
Do you have questions? Good! I have answers.