This is part of a new series of technical use cases, designed to help SAS users solve a particular business or technical challenge.
The challenge: How to segment customers based on their lifestyle, spending habits, and engagement patterns to develop targeted loyalty programs that increase customer retention, cross-sell opportunities, and brand advocacy, while optimizing the ability to offer personalized rewards and services that align with each customer segment.
Products used: SAS Studio and Visual Analytics (Visual Statistics provides advanced segmentation and predictive analytics to improve the accuracy of lifestyle segmentation). The Visual Analytics components are covered in Part 2 of this series.
SAS Data Explorer
Step 1: Import Customer Data into CAS: Open Manage Data in SAS Viya (this launches SAS Data Explorer)
In the Import tab, choose the source as Local Files (if uploading from your computer) and select the raw data files – for example, a customers.csv (with demographic and loyalty info) and a transactions.csv (with transaction histories). Specify the target CAS library (e.g., CASUSER for your personal session) and table name for each or accept defaults. Click Import to load each file individually or Import All to load all files at once. The data will be loaded into SAS Viya's in-memory environment (CAS) as tables (e.g., CASUSER.CUSTOMERS and CASUSER.TRANSACTIONS).
If a table with the same name already exists, you can either import under a different name or check the Replace option to overwrite it. Result: The raw customer profile and transaction data are now available in CAS as in-memory tables.
SAS Studio
Step 2: Verify Data in SAS Studio and start a CAS session: Launch SAS Studio (from the SAS landing page or applications menu, select Develop Code and Flows)
From the Start Page, select “Program in SAS” to create a new SAS program, enter the following code and press F3 to run it:
cas;
caslib _ALL_ assign;
Check the Log for confirmation messages (you should see notes such as "The session CASAUTO connected successfully" and a list of assigned caslibs). Now test the setup to ensure you can work with the CAS data: for example, run a simple proc print data=CASUSER.CUSTOMERS (obs=5); to print the first 5 rows of the CUSTOMERS table. If the output appears as expected, your CAS session is active and the data is accessible through SAS code. From this point on, any DATA steps or CAS-enabled procedures you run against tables loaded into CAS will execute in the CAS environment; code that references tables not in CAS runs locally on the SAS Compute server.
In SAS Studio's left pane, expand the Libraries section and locate the CAS library (e.g., CASUSER) where you imported the data. You should see your tables (e.g., CUSTOMERS and TRANSACTIONS). Open each table to inspect the contents (for instance, right-click on CUSTOMERS and choose Open). Verify that all expected columns are present and correctly populated: for example, the CUSTOMERS table should have fields like Customer_ID, Name, Age, Gender, Location, Loyalty_Tier, etc., and the TRANSACTIONS table should have Customer_ID (to link to customers), Transaction_Date, Transaction_Amount, Merchant_Category, Channel_Type, etc.
Check a few sample rows to ensure values look reasonable (e.g., numeric fields contain numbers, dates are in proper format). Also confirm the row count roughly matches the source data (e.g., if transactions.csv had 50,000 records, the CAS table shows ~50,000 rows). This can be done with the following code:
proc contents data=casuser.customers;
run;
Or,
proc fedsql sessref=casauto;
select count(*) from casuser.customers;
quit;
This step confirms the data was imported correctly and is accessible for processing.
Step 3: (Optional) Load Data via PROC CASUTIL or PROC IMPORT: As an alternative to the interactive import in Step 1, you can use SAS code to load data into CAS (useful for automation or repeatable scripts). For example, to load the transactions CSV via code, you could use:
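A sketch of such a load step is shown below. The file path is a placeholder; point it at wherever your CSV actually resides on the server.

```sas
/* Load transactions.csv into CAS as CASUSER.TRANSACTIONS.         */
/* The path below is a placeholder - adjust it to your file's      */
/* actual location.                                                */
proc casutil;
   load file="/path/to/transactions.csv"
        outcaslib="casuser" casout="transactions" replace;
quit;
```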
This code loads the CSV file from the specified path into the CASUSER caslib as a table named TRANSACTIONS (replacing any existing table with that name).
You would execute a similar proc casutil step for the customers.csv file. After running, check the SAS log for a note confirming the load (e.g., it will report the number of rows and columns loaded and the table name). The end result is the same: the data resides in CAS tables ready for use. This step is optional if you already imported the data via the GUI, but it's good to know for creating automated workflows.
Step 4: Clean and Merge Data: Prepare the data for analysis by cleaning inconsistencies and combining where necessary. First, standardize column names and data types — ensure that all field names are valid and easy to read (for example, SAS automatically converts spaces to underscores, so "Annual Income" becomes Annual_Income; rename any oddly formatted names if needed).
Verify numeric fields are truly numeric in CAS (if a column imported as character, convert it with an INPUT function in a DATA step, or re-import with more rows scanned for type detection, for example via the GUESSINGROWS= option in PROC IMPORT, so the import makes a better guess). Address missing values and outliers as appropriate (you might, for instance, fill missing ages with a median or flag them, and check for negative transaction amounts or other anomalies).
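As a minimal sketch of such a cleanup, assume an age column imported as character under the hypothetical name Age_Char, and assume a median age of 40 (both the column name and the value are illustrative, not from the source data):

```sas
/* Convert a character age field to numeric and handle missing values. */
/* Age_Char and the median value 40 are hypothetical placeholders.     */
data casuser.customers_clean;
   set casuser.customers;
   Age = input(Age_Char, 8.);          /* character-to-numeric conversion */
   Age_Missing_Flag = missing(Age);    /* flag rows that had no age       */
   if Age_Missing_Flag then Age = 40;  /* fill with an assumed median     */
   drop Age_Char;
run;
```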
Next, merge the customer and transaction tables to create a unified dataset. Typically, you perform a join on the Customer_ID field. In SAS Studio, you can use a PROC FEDSQL step or a DATA step merge. For example, using PROC FEDSQL:
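One possible form of that join is sketched below; the transaction column names follow the schema described earlier and are assumptions, so adjust them to your actual data.

```sas
/* Join customer profiles to their transactions on Customer_ID.      */
/* Column names (Transaction_Amount, Channel_Type, etc.) are assumed. */
proc fedsql sessref=casauto;
   create table casuser.customer_combined as
   select c.*,
          t.Transaction_Date,
          t.Transaction_Amount,
          t.Merchant_Category,
          t.Channel_Type
   from casuser.customers c
   left join casuser.transactions t
   on c.Customer_ID = t.Customer_ID;
quit;
```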
This left join keeps every customer record (c) and pairs it with that customer's transaction records (t).
The result CASUSER.CUSTOMER_COMBINED will have multiple rows per customer (one for each transaction, including customers with no transactions if any). After merging, perform some quick checks on this combined table: verify that for a given Customer_ID, the demographic fields (like Name, Loyalty_Tier) are consistent across all their transaction rows, and confirm that the number of rows matches the number of transactions (to ensure no duplication or loss occurred during the join). You can check for duplication using this code:
proc sort data=casuser.customers
out=casuser.dupe_check
nodupkey
dupout=casuser.dupes;
by customer_id;
run;
This code checks for duplicate customer IDs in CASUSER.CUSTOMERS, writes a deduplicated copy to a new table, and saves the duplicate rows to a separate CAS table. If the DUPES table contains rows, duplicates exist. The original table remains intact and unmodified. If you wish to deduplicate the original table in place, remove the OUT= option from PROC SORT.
By the end of this step, you have a clean, integrated dataset of customer information and their transactions, which will serve as the basis for calculating segmentation features.
Step 5: Feature Engineering – Create Customer-Level Metrics: Derive aggregated metrics that capture each customer’s engagement and spending habits, for use in segmentation. We need to summarize the transaction data up to the customer level. Using the combined dataset from Step 4, calculate features for each Customer_ID. Key metrics include:
Total_Spend: total monetary amount the customer has spent (sum of all transaction amounts)
Transaction_Count: total number of transactions the customer made (frequency of purchase).
Avg_Transaction_Value: average amount per transaction (Total_Spend divided by Transaction_Count).
Distinct_Merchant_Categories: number of unique merchant categories in which the customer has made purchases (breadth of spending interests).
Pct_Spend_Travel / Pct_Spend_Dining: percentage of the customer's spend devoted to specific categories like Travel or Dining (for example, if $200 of $1000 total is travel-related, Pct_Spend_Travel = 20%). These indicate lifestyle preferences (e.g., a high travel percentage might identify a frequent traveler).
Online_Txn_Count: number of transactions conducted via online channels (versus in-branch), as an indicator of digital engagement. You could also compute an online transaction ratio (Online_Txn_Count / Transaction_Count).
Last_Transaction_Date: the most recent transaction date for the customer, which can be transformed into a recency measure (e.g., days since last purchase).
Also, carry forward key demographic or profile attributes for reference (such as Age, Gender, or Loyalty_Tier) into the customer-level table (since these are constant per customer, you might take a MIN or MAX in the aggregation just to include them).
You can obtain these metrics using a single SQL query with GROUP BY, or with SAS procedures. For example, a PROC FEDSQL might look like:
proc fedsql sessref=casauto;
create table casuser.customer_features as
select Customer_ID,
sum(Transaction_Amount) as Total_Spend,
count(*) as Transaction_Count,
avg(Transaction_Amount) as Avg_Transaction_Value,
count(distinct Merchant_Category) as Distinct_Merchant_Categories,
sum(case when Merchant_Category = 'Travel' then Transaction_Amount else 0 end) / sum(Transaction_Amount) as Pct_Spend_Travel,
sum(case when Merchant_Category = 'Dining' then Transaction_Amount else 0 end) / sum(Transaction_Amount) as Pct_Spend_Dining,
sum(case when Channel_Type = 'Online' then 1 else 0 end) as Online_Txn_Count,
cast(sum(case when Channel_Type = 'Online' then 1 else 0 end) as double) / count(*) as Online_Txn_Ratio,
max(Transaction_Date) as Last_Transaction_Date
from casuser.customer_combined
group by Customer_ID
;
quit;
This creates a new table CASUSER.CUSTOMER_FEATURES with one row per Customer_ID and the aggregated values.
After running such a step, inspect the resulting table (e.g., view a few rows or run PROC MEANS) to ensure the calculations make sense. For instance, check that for each customer, Total_Spend roughly equals Avg_Transaction_Value * Transaction_Count (barring rounding errors), and that percentages like Pct_Spend_Travel are between 0 and 1 (or 0–100% if formatted as percent). Now you have a concise customer-level feature dataset that quantitatively describes each customer's behavior and lifestyle, ready for clustering analysis.
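For example, a quick PROC MEANS over the new table (variable names match the query above) surfaces out-of-range values at a glance; the percentage and ratio columns should fall between 0 and 1:

```sas
/* Sanity-check the aggregated features: look for missing values,  */
/* negative spend, or ratios outside the 0-1 range.                */
proc means data=casuser.customer_features n nmiss min mean max;
   var Total_Spend Transaction_Count Avg_Transaction_Value
       Pct_Spend_Travel Pct_Spend_Dining Online_Txn_Ratio;
run;
```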
Step 6: Promote the Customer Feature Table: The new customer-level table (e.g., CUSTOMER_FEATURES) currently exists in your session’s CAS library and would disappear when the session ends. To use it in Visual Analytics and keep it accessible, promote it to global status in CAS. In SAS Studio, run a PROC CASUTIL promote action. For example:
proc casutil;
promote casdata='customer_features' incaslib='casuser' outcaslib='casuser';
quit;
This command takes the session-scoped CASUSER.CUSTOMER_FEATURES table and promotes it to a globally scoped table of the same name in the CASUSER library.
Note that even after promotion, tables in CASUSER are visible only to you, since CASUSER is your personal library. To share the data with other users, promote it to a global caslib that they can access.
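For example, assuming a shared caslib named Public exists in your environment (a common default, but check with your administrator):

```sas
/* Promote the session table into a shared caslib so other users    */
/* and Visual Analytics reports can see it. 'Public' is assumed.    */
proc casutil;
   promote casdata='customer_features' incaslib='casuser'
           outcaslib='public';
quit;
```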
In part 2, we'll show you how to create segmentation reports in SAS Visual Analytics.