95student
Calcite | Level 5

I'm not fluent in English, so I hope you understand 🙂

 

I am working with a customer card usage history dataset to learn SAS Enterprise Miner (eminer).

 

I have two questions.

 

1. Each row of this dataset is one customer transaction, and the columns include the customer's age, usage date, payment amount, and so on.

To make the analysis easier, I aggregated the data so that each row holds one customer's total transactions. To put the variables on a similar scale, I converted them to percentages. (For example, I grouped the purchase date by season and created variables such as buy_spring and buy_summer.) Is this the right approach?

 

 

 

2. I am using a Cluster node followed by a Segment Profile node, but the Segment Profile chart does not fully show up. Is something wrong, or is the variable worth calculated by eminer too small? The image is at the bottom.

ajssfsfsd.PNG

 

Thank you!

2 REPLIES
DougWielenga
SAS Employee

1. Each row of this dataset is one customer transaction, and the columns include the customer's age, usage date, payment amount, and so on. To make the analysis easier, I aggregated the data so that each row holds one customer's total transactions. To put the variables on a similar scale, I converted them to percentages. (For example, I grouped the purchase date by season and created variables such as buy_spring and buy_summer.) Is this the right approach?

 

The Segment Profile node attempts to predict segment (cluster/group) membership from the available input variables. It uses a tree-based method to decide how to split the variables, so changes to the scale of the variables should not be critical. Transformations can lead to slightly different splits, which can make the results look different, but that is the nature of trees in general: they can look very different when fit to data sets that differ only slightly. Either way, the node should be useful for understanding how the input variables relate to segment membership. It sounds as if you have summarized the data to one row per customer, which makes sense if you are trying to work at the customer level. It is not clear how you are modifying the original input variables to create the variables you are using, but there is no 'right' or 'wrong' way to do so; some approaches simply create more useful information than others. The usefulness of any segment profile results must be assessed against your specific business goals.
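
As a rough illustration of the one-row-per-customer summary discussed above, here is a minimal Python sketch (it is not Enterprise Miner code). The sample transactions, the `season_of` and `summarize` helpers, the month-to-season mapping, and the choice to express each season as a percentage of the customer's total spend are all assumptions made for illustration; the actual transformation used in the question is not known.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction rows: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2020, 3, 15), 40.0),
    ("C1", date(2020, 7, 2), 60.0),
    ("C2", date(2020, 12, 24), 25.0),
    ("C2", date(2020, 1, 5), 75.0),
    ("C2", date(2020, 4, 9), 100.0),
]

SEASONS = ("spring", "summer", "autumn", "winter")


def season_of(d):
    """Map a date to a season (assumed Northern Hemisphere, by calendar month)."""
    if d.month in (3, 4, 5):
        return "spring"
    if d.month in (6, 7, 8):
        return "summer"
    if d.month in (9, 10, 11):
        return "autumn"
    return "winter"


def summarize(rows):
    """Collapse transactions to one record per customer with seasonal % shares."""
    totals = defaultdict(lambda: dict.fromkeys(SEASONS, 0.0))
    for customer, purchase_date, amount in rows:
        totals[customer][season_of(purchase_date)] += amount

    summary = {}
    for customer, by_season in totals.items():
        total = sum(by_season.values())
        record = {"total_amount": total}
        for season in SEASONS:
            # Share of the customer's spend in this season, on a 0-100 scale.
            record[f"buy_{season}"] = 100.0 * by_season[season] / total if total else 0.0
        summary[customer] = record
    return summary


customer_summary = summarize(transactions)
for customer, record in sorted(customer_summary.items()):
    print(customer, record)
```

Because the shares for each customer sum to 100, variables built this way sit on a common scale, although a tree-based profiler does not strictly need that.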

 

2. I am using a Cluster node followed by a Segment Profile node, but the Segment Profile chart does not fully show up. Is something wrong, or is the variable worth calculated by eminer too small?

 

I did some searching but did not find anyone reporting this specific behavior.  It would be expected that not all variables would be helpful but I would expect some of the variables to be helpful, particularly if the segments were created by a process in SAS Enterprise Miner.  There are things that might cause unusual behavior in the software such as...

... expired SAS software license

... data anomalies that are causing numerical issues (e.g. extremely small range of values for one or more variables)

... configuration issues such as using an unsupported version of Java

... large percentage of missing values (consider imputing some of the values)

 

One good test in this situation is to build a similar flow against sample data, such as the data available by opening a SAS Enterprise Miner project and clicking on

 

    Help --> Generate Sample Data Sources...

 

For example, add the Home Equity data source (SAMPSIO.HMEQ) to a new diagram and then add a Cluster node followed by a Segment Profile node to the flow. Run the flow and see if you get all of the expected results. If you still have problems seeing all of the profile plots, you likely have a configuration issue. If the flow does produce a full set of results, you might be dealing with a data-specific issue involving either the input data itself or the segment variable you are using. In that case, review your input data for potential problems such as unusually wide formatting, high percentages of missing values, an extremely large number of levels, very small numbers of observations in some subset of segments, etc. If you are unable to locate the problem, you can also contact SAS Technical Support to assist you.
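
The data review suggested above could be sketched roughly as follows. This is plain Python rather than anything built into Enterprise Miner, and the sample rows, the column names, and the `profile_column` and `segment_sizes` helpers are hypothetical. It only reports the percentage of missing values, the number of distinct levels, the numeric range, and the segment sizes; what counts as "too many" or "too small" is left to your judgment.

```python
from collections import Counter

# Hypothetical per-customer rows; None marks a missing value.
rows = [
    {"segment": 1, "buy_spring": 40.0, "age": 31, "region": "N"},
    {"segment": 1, "buy_spring": 40.0, "age": None, "region": "S"},
    {"segment": 2, "buy_spring": 40.0, "age": 45, "region": "N"},
    {"segment": 3, "buy_spring": 40.0, "age": None, "region": None},
]


def profile_column(rows, name):
    """Report missing %, distinct levels, and (for numeric columns) the range."""
    values = [r.get(name) for r in rows]
    present = [v for v in values if v is not None]
    report = {
        "pct_missing": 100.0 * (len(values) - len(present)) / len(values) if values else 0.0,
        "n_levels": len(set(present)),
    }
    numeric = [v for v in present if isinstance(v, (int, float))]
    # Only report a range when every non-missing value is numeric.
    if numeric and len(numeric) == len(present):
        report["range"] = max(numeric) - min(numeric)
    return report


def segment_sizes(rows, segment_var="segment"):
    """Count observations per segment to spot very small segments."""
    return Counter(r[segment_var] for r in rows)


reports = {name: profile_column(rows, name) for name in ("buy_spring", "age", "region")}
sizes = segment_sizes(rows)

for name, report in reports.items():
    print(name, report)
print("segment sizes:", dict(sizes))
```

In this made-up sample, `buy_spring` has a range of zero and `age` is half missing, which are the kinds of anomalies worth checking in your own data before rerunning the flow.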

 

Hope this helps!

Doug 

 

MeraHumaira
Calcite | Level 5

Ensure that the role of your cluster variable is set to 'Segment'.

 

MeraHumaira_0-1629100948511.png

 
