Quantopic
Obsidian | Level 7

Hi all,

 

I need to perform a cluster analysis in SAS to build a scoring model, using procedures such as PROC CLUSTER or PROC FASTCLUS.

 

My dataset contains a set of continuous variables, and I must cluster all of them so that I can compute the Population Stability Index and check the frequency of each cluster against a target variable. The target is a dummy variable equal to 1 if the counterparty defaulted and 0 otherwise.

 

Searching online, I found that SAS offers many options for cluster analysis, but I have not been able to use them to compute the clusters and get a new dataset containing the cluster assignments.

 

I tried to run the following code:

 

 

PROC CLUSTER DATA = MYLIBRARY.MYDATASET METHOD = AVERAGE CCC PSEUDO
OUT = NEWDATASET;

PLOTS (MAXPOINTS = 200)=DEN(HEIGHT=RSQ);

VAR CONTINUOUS_VAR1 CONTINUOUS_VAR2 CONTINUOUS_VAR3;

ID TARGET_VARIABLE;
RUN;

 

but I did not get the new dataset containing the clustered variables or the cutoff values to build the clusters.

 

Can you help me, please?

 

Any help will be appreciated!

 

Accepted Solution
Rick_SAS
SAS Super FREQ

It looks like you are trying to imitate the Getting Started example for the CLUSTER procedure. That's fine, but PROC CLUSTER, which performs hierarchical clustering, is more complex than the FASTCLUS procedure, which performs k-means clustering. Try this example to get started. If you need hierarchical (tree-based) clustering, you can revisit PROC CLUSTER later:

 

proc fastclus data=sashelp.iris out=Clust maxclusters=3;
   var SepalWidth SepalLength PetalWidth PetalLength;
   ID species;
run;

proc sgscatter data=Clust;
   matrix SepalWidth SepalLength PetalWidth PetalLength / group=cluster;
run;
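
The same pattern could be applied to the data in the question. The following is only a sketch under assumptions. The dataset and variable names are taken from the original post, while the work datasets (Std, Clust, Tree, TreeClust) and the choice of 5 clusters are illustrative placeholders. Standardizing first with PROC STDIZE is an added step, not part of the answer above; it is included because k-means is sensitive to the scale of the inputs. The last two steps show one way to get cluster assignments from PROC CLUSTER: as far as I know it has no OUT= option, so it writes the tree to OUTTREE= and PROC TREE then cuts the tree into a chosen number of clusters.

```sas
/* Names from the question: MYLIBRARY.MYDATASET, CONTINUOUS_VAR1-3, TARGET_VARIABLE. */
/* Std, Clust, Tree, TreeClust and the value 5 are illustrative assumptions.         */

/* Standardize the inputs so that no variable dominates the distance */
proc stdize data=MYLIBRARY.MYDATASET out=Std method=std;
   var CONTINUOUS_VAR1 CONTINUOUS_VAR2 CONTINUOUS_VAR3;
run;

/* k-means: OUT= is a copy of the input with CLUSTER and DISTANCE added */
proc fastclus data=Std out=Clust maxclusters=5;
   var CONTINUOUS_VAR1 CONTINUOUS_VAR2 CONTINUOUS_VAR3;
run;

/* Counts and row percentages of the target within each cluster */
proc freq data=Clust;
   tables cluster*TARGET_VARIABLE / nocol nopercent;
run;

/* Hierarchical alternative: write the tree, then cut it with PROC TREE */
proc cluster data=Std method=average ccc pseudo outtree=Tree;
   var CONTINUOUS_VAR1 CONTINUOUS_VAR2 CONTINUOUS_VAR3;
   copy TARGET_VARIABLE;   /* carry the target into the tree dataset */
run;

proc tree data=Tree nclusters=5 out=TreeClust noprint;
   copy TARGET_VARIABLE;   /* keep the target next to the CLUSTER variable */
run;
```

The target is passed with COPY rather than ID here because an ID variable is meant to label individual observations, and a 0/1 flag would give almost every observation the same label. The number of clusters is a modeling choice; the CCC and pseudo-F statistics printed by PROC CLUSTER can help compare candidates.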

