Hi
I have been doing some cluster analysis on our customers to learn a bit more about the different types of customer for marketing purposes.
I used a basic proc fastclust for this and have 7 clusters that I am happy with. They were built on a sample of our data, with a few rules applied
to avoid a lot of people going into one meaningless cluster. I used only one year of activity, and customers in the sample had to have ordered
in both 6-month seasonal periods. The nature of our business made these rules necessary.
What I want to do now is apply the clusters to the whole database. That means both people who have ordered in the last year and those
who haven't. It also needs to cover those who may not have ordered in both of the last 6-month periods.
I have gathered all the necessary variables for the full population (for this I have just used the last year of activity since their last order).
How do I now apply the clusters so that everyone is allocated properly? Is there an algorithm I can get from the cluster analysis that will allocate
everybody to their nearest cluster? I have looked at loads of cluster analysis material online, but it all focuses on the cluster analysis itself,
not the step I am on now.
Thanks
Stephen
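/* Step 1: build the 7 clusters on the training sample and save the cluster statistics */
proc fastclust data=training_data maxclusters=7 outstat=CLUSTERS;
var ...;
run;
/* Step 2: INSTAT= reads the saved statistics; no new clustering iterations are run, */
/* each observation is simply assigned to its nearest cluster and written to OUT=   */
proc fastclust data=data_to_be_assigned_to_clusters instat=CLUSTERS out=result;
var ...; /* same variables as in step 1 */
run;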
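Once the scoring step has run, it may be worth a quick look at how the full population was spread across the clusters. The sketch below is only a suggestion and assumes the OUT= data set is called result, as above, and that the default variable names CLUSTER and DISTANCE written by PROC FASTCLUST have not been renamed.

```sas
/* How many customers were assigned to each cluster? */
proc freq data=result;
  tables cluster;
run;

/* Distance from each customer to the seed of their assigned cluster. */
/* Large values may point to customers who fit their cluster poorly,  */
/* for example those who fall outside the rules used for the sample.  */
proc means data=result n mean max;
  class cluster;
  var distance;
run;
```

If one cluster ends up with most of the customers who were excluded from the training sample, that would suggest those customers need separate treatment rather than being forced into the existing 7 clusters.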
Oh wow, much simpler than I thought.
Thanks, I will give that a go.