This is part 2 of a series about creating customer loyalty lifestyle segmentation in SAS Viya. Part 1 outlines the steps needed in SAS Data Explorer and SAS Studio.
SAS Visual Analytics
Step 1: Launch SAS Visual Analytics: Open the SAS Visual Analytics application (from the SAS Viya home menu, select Explore and Visualize Data). This will open the Visual Analytics interface in your web browser. After loading, you should see the Visual Analytics start page or an interface ready to create reports.
Step 2: Create a New Report: In the top-right corner, click New Report.
This will create a blank report canvas for your analysis.
Step 3: Add the Customer Data to the Report: In the VA interface, look at the Data pane on the left side, which says “To begin, add or import data.” Click the Add Data button (labeled Add Data or shown as a + Data icon).
In the data selection dialog that appears, search for your table. You should find Customer_Lifestyle_Summary in the list of available tables. Select this table and click Add or double-click it to add it to the report. After loading, the Data pane will list Customer_Lifestyle_Summary as the active data source, and you will see all its fields categorized under Measures (numeric fields like Total_Spend, Transaction_Count, etc.) and Categories (categorical fields like Loyalty_Tier, Gender, etc.).
Double-check that all the important variables are present. If any field’s role is not what you expect (for instance, if Customer_ID appears under Measures but you want it treated as an ID/Category), you can adjust it: expand the data item’s options by clicking the arrow to the right of it and change its classification.
Step 4: Verify the Data in VA: Before performing clustering, do a quick sanity check to ensure the data came through correctly. You can create a simple table or chart in VA:
- Drag a List Table object from the Objects pane onto the canvas.
- From the Data pane, select a few key fields (e.g., Customer_ID, Total_Spend, Transaction_Count, Pct_Spend_Travel, Loyalty_Tier) and drag them into the list table to populate it.
- The table will display these fields for a subset of customers. Scroll through or sort the table by different columns (for example, sort by Total_Spend descending) to see if the information makes sense. Check that high Total_Spend customers also have a reasonable Transaction_Count or Avg_Transaction_Value, that percentage fields like Pct_Spend_Travel are within 0–1 range (or 0–100% if formatted accordingly), and that categorical fields (Loyalty_Tier, Gender) look consistent.
- Everything should mirror what was in the SAS Studio data. If something looks off (say, all zeros in a column that shouldn’t be, or missing values), you might need to revisit the data prep. Assuming the data looks correct, you can proceed with confidence. (You can remove the validation table from the report or simply add new pages for the analysis.)
- Note that list tables aggregate data automatically in Visual Analytics. To see one row per customer, add a primary key such as Customer_ID, or enable the “Detail data” option in the list table’s Options pane on the right side of the screen.
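The checks described in Step 4 can also be expressed in code, which may help if you want to repeat them outside the report. The following is a minimal sketch in plain Python, not part of the Visual Analytics workflow. The sample rows, the `validate` helper, and the expected tier labels (Gold, Silver, Bronze) are all assumptions made for illustration; only the column names come from the steps above.

```python
# Hypothetical sample rows standing in for Customer_Lifestyle_Summary.
rows = [
    {"Customer_ID": "C001", "Total_Spend": 5200.0, "Transaction_Count": 40,
     "Pct_Spend_Travel": 0.45, "Loyalty_Tier": "Gold"},
    {"Customer_ID": "C002", "Total_Spend": 800.0, "Transaction_Count": 25,
     "Pct_Spend_Travel": 0.05, "Loyalty_Tier": "Bronze"},
    {"Customer_ID": "C003", "Total_Spend": 0.0, "Transaction_Count": 0,
     "Pct_Spend_Travel": None, "Loyalty_Tier": "Silver"},
    {"Customer_ID": "C004", "Total_Spend": 1500.0, "Transaction_Count": 12,
     "Pct_Spend_Travel": 1.7, "Loyalty_Tier": "gold"},
]

# Assumed set of valid tier labels; adjust to match your data.
EXPECTED_TIERS = {"Gold", "Silver", "Bronze"}


def validate(rows):
    """Return (Customer_ID, problem) tuples for rows that look off."""
    problems = []
    for row in rows:
        cid = row["Customer_ID"]
        pct = row["Pct_Spend_Travel"]
        if pct is None:
            problems.append((cid, "Pct_Spend_Travel is missing"))
        elif not 0.0 <= pct <= 1.0:
            problems.append((cid, "Pct_Spend_Travel outside 0-1"))
        if row["Total_Spend"] > 0 and row["Transaction_Count"] == 0:
            problems.append((cid, "spend without transactions"))
        if row["Loyalty_Tier"] not in EXPECTED_TIERS:
            problems.append((cid, "unexpected Loyalty_Tier"))
    return problems


# Equivalent of sorting the list table by Total_Spend descending.
ranked = sorted(rows, key=lambda r: r["Total_Spend"], reverse=True)

problems = validate(rows)
for cid, problem in problems:
    print(cid, problem)
```

Any row reported here would be a reason to revisit the data preparation before clustering.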
Step 5: Perform Customer Segmentation (Clustering): Now, use Visual Analytics to find clusters (segments) in the customer data based on the features we created.
- In the Objects pane, find the Statistics objects (or use the search within Objects) and drag a Cluster object onto the report canvas. This adds a clustering analysis visual.
- With the cluster object selected, look at the Roles pane on the right side of the screen. Assign the input variables for clustering: add the measures we want to base the clusters on. These should include the key behavioral metrics such as Total_Spend, Transaction_Count, Avg_Transaction_Value, Distinct_Merchant_Categories, Pct_Spend_Travel, Pct_Spend_Dining, and Online_Txn_Count. Do not include unique identifiers like Customer_ID (which would trivially separate every customer) or fields like Loyalty_Tier or Age in the clustering variables, since we want the clusters to be formed purely on behavior/lifestyle attributes.
- By default, Visual Analytics might automatically choose a number of clusters, or it may require you to specify it. Check the cluster object's Options panel for the "Number of clusters" setting. You can leave it at an automatic selection or set a specific number (e.g., 4 or 5) based on how many segments you expect or want to examine.
- Once the variables (roles) are set, the clustering will run and the object will display a visualization of the clusters. Typically, Visual Analytics shows a scatter plot (or scatter plot matrix) with points colored by cluster membership. Each point represents a customer plotted along two of the feature dimensions, and the color indicates which cluster (segment) they belong to. If you expand the object, you can view a Details tab for the cluster object where you can see the cluster statistics (like the mean values of each input variable for Cluster 1, Cluster 2, etc.).
- Ensure that multiple clusters have been identified (e.g., you see different color groupings, not just one uniform group). If you only see one cluster, you may need to increase the number of clusters or check that the input roles are assigned correctly. If the clusters are too granular (more segments than you need), reduce the number of clusters. When you adjust the cluster count, the analysis updates in real time.
- Examine the cluster centroids (mean values): for instance, you might see one cluster has much higher Total_Spend and high Pct_Spend_Travel (indicating a segment of high-value travelers), while another cluster has lower spend but a high number of transactions (indicating frequent low-value shoppers). These statistics give an initial understanding of each segment’s profile. At this stage, the customers are segmented into groups based on similar behavior patterns.
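To make the idea behind Step 5 concrete, here is a small sketch of k-means style clustering in plain Python. It is an illustration only: the article does not state which algorithm the Cluster object uses internally, so treat k-means as an assumption. The six sample customers, the `standardize` and `kmeans` helpers, and the deterministic farthest-point seeding (chosen here so the result is reproducible) are all hypothetical.

```python
import math

# Hypothetical feature rows: [Total_Spend, Transaction_Count, Pct_Spend_Travel]
features = [
    [9000.0, 30, 0.60], [8500.0, 28, 0.55],   # high spend, travel heavy
    [2000.0, 120, 0.05], [2200.0, 110, 0.08],  # many small purchases
    [500.0, 8, 0.10], [600.0, 10, 0.12],       # low spend, low activity
]


def standardize(rows):
    """Scale each column to mean 0, std 1 so no single measure dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: math.dist(point, centroids[j]))


def kmeans(points, k, max_iter=100):
    # Deterministic farthest-point seeding (an assumption for this sketch):
    # start from the first point, then repeatedly add the point that is
    # farthest from its nearest already-chosen centroid.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # move the centroid to the mean of its members
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


labels, _ = kmeans(standardize(features), k=3)

# Report cluster means in the original units, like the Details tab.
for j in range(3):
    members = [row for row, lab in zip(features, labels) if lab == j]
    means = [round(sum(col) / len(members), 2) for col in zip(*members)]
    print(f"Cluster {j + 1}: n={len(members)}, means={means}")
```

Standardizing first matters because Total_Spend is measured in much larger units than a percentage field, and unscaled distances would be driven almost entirely by spend.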
Step 6: Profile and Visualize the Segments: With clusters formed, the final step is to interpret and communicate the characteristics of each customer segment using visualizations.
- Derive Cluster Membership: First, create a new data item for the cluster assignment if VA hasn't already. In many versions, the cluster object can generate a Cluster ID variable for each observation. Find the option (often under the cluster object's Options or via a right-click) to derive cluster ID items. This will add a categorical variable (say, Cluster) to the data as well as other statistics, assigning each customer a cluster label (such as 1, 2, 3, ...). Now you can use this Cluster ID variable and statistics in other visuals.
- Examine the Parallel Coordinates Plot: The parallel coordinates plot in the Cluster object shows how each cluster is distributed across the input variables. Use it to see what types of customers make up each cluster and to choose a descriptive label for each one.
- Segment Size: Add a bar chart showing how many customers are in each cluster. Drag a Bar Chart onto the canvas. Set the category axis to the Cluster variable, and the measure to a count of Customer_ID (or simply use the automatic count of records, Frequency). This will display a bar for each cluster (segment) with its size. For example, you might find Cluster 1 has 200 customers, Cluster 2 has 500, etc. This helps gauge the volume of each segment.
- Compare Cluster Metrics: Create visuals to compare feature values across clusters. For instance, you can have a bar chart of average Total_Spend by Cluster (Cluster on the x-axis, and the bar height showing the average Total_Spend for customers in that cluster). Similarly, you could create a bar chart for average Transaction_Count by cluster, or average Pct_Spend_Travel by cluster. These charts make it easy to see which segment spends the most, which segment has the highest transaction frequency, which segment spends a large portion on travel or dining, etc. You might observe, for example, that Cluster 1 has the highest Total_Spend and a large Travel spend percentage, whereas Cluster 3 has the lowest Total_Spend and is skewed towards Dining expenses. Such differences are the hallmarks of distinct lifestyles.
- Demographic Breakdown: To add context, examine how demographic or categorical variables differ by cluster. For example, create a stacked bar chart with Cluster on the x-axis and segments of the bar representing Loyalty_Tier distribution (Gold/Silver/Bronze) within each cluster. This could reveal, for instance, that Cluster 1 is mostly Gold-tier members, whereas Cluster 3 might be mostly Bronze. You can do similar analysis for Gender or Age groups if those fields are available, by creating charts or cross-tabulations (e.g., a crosstab of Cluster by Age_group count).
- Interactive Exploration: Leverage VA’s interactivity to dig deeper. For instance, you can set the segment size chart to filter the page, so that clicking the bar for Cluster 1 updates all other charts (spend, frequency, etc.) to show values for Cluster 1 only. To set this up, open the Actions pane on the right side of the screen. With an individual object selected, you can choose which other objects it filters or highlights. Alternatively, you can use Automatic actions on all objects to have Visual Analytics add the interactions for you. This interactive filtering is a powerful way to isolate each segment and note its characteristics. Repeat it for each cluster and write down its profile.
- Interpretation: Summarize what defines each segment. For example, you might find:
  - Cluster 1: Very high average spend and a large share of spend on travel, with most members in top loyalty tiers and heavy use of online channels – these could be labeled “Affluent Online Travelers.”
  - Cluster 2: Moderate spend but the highest transaction count (many small purchases, a big share on dining and everyday categories) – perhaps “Frequent Everyday Spenders.”
  - Cluster 3: Low spend and low transaction count, predominantly branch usage and lower loyalty tiers – maybe “Occasional Traditionalists.”
  - Cluster 4: Mid-level spend with balanced category spend and mixed channel usage – e.g., “Diversified Regulars.”
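The profiling charts in Step 6 boil down to a few grouped aggregations. The sketch below shows the same calculations in plain Python, assuming each customer already carries a derived Cluster label. The sample records and the variable names (`segment_size`, `avg_spend`, `tier_mix`, `cluster_1`) are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical customers with a derived Cluster label.
customers = [
    {"Customer_ID": "C001", "Cluster": 1, "Total_Spend": 9000.0, "Loyalty_Tier": "Gold"},
    {"Customer_ID": "C002", "Cluster": 1, "Total_Spend": 8500.0, "Loyalty_Tier": "Gold"},
    {"Customer_ID": "C003", "Cluster": 2, "Total_Spend": 2000.0, "Loyalty_Tier": "Silver"},
    {"Customer_ID": "C004", "Cluster": 2, "Total_Spend": 2200.0, "Loyalty_Tier": "Gold"},
    {"Customer_ID": "C005", "Cluster": 2, "Total_Spend": 1800.0, "Loyalty_Tier": "Silver"},
    {"Customer_ID": "C006", "Cluster": 3, "Total_Spend": 500.0, "Loyalty_Tier": "Bronze"},
]

# Segment size: number of customers per cluster (the bar chart of counts).
segment_size = Counter(c["Cluster"] for c in customers)

# Compare cluster metrics: average Total_Spend per cluster.
spend_by_cluster = defaultdict(list)
for c in customers:
    spend_by_cluster[c["Cluster"]].append(c["Total_Spend"])
avg_spend = {cl: sum(v) / len(v) for cl, v in spend_by_cluster.items()}

# Demographic breakdown: Loyalty_Tier counts within each cluster
# (the stacked bar chart).
tier_mix = defaultdict(Counter)
for c in customers:
    tier_mix[c["Cluster"]][c["Loyalty_Tier"]] += 1

# Interactive exploration: the equivalent of filtering the page to Cluster 1.
cluster_1 = [c for c in customers if c["Cluster"] == 1]

for cl in sorted(segment_size):
    print(cl, segment_size[cl], avg_spend[cl], dict(tier_mix[cl]))
```

Reading the three summaries side by side for one cluster is essentially what the interpretation bullets above do in words.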
Use the features of Visual Analytics to explore the data in depth, identify the profile of each cluster, and build an interactive report from your findings.