Quantopic
Obsidian | Level 7

Hi all,

 

I need to perform a cluster analysis in SAS to build a scoring model, using procedures such as PROC CLUSTER and PROC FASTCLUS.

 

My dataset contains a set of continuous variables, and I must cluster them all so that I can compute the Population Stability Index and check the frequency of each of them against a target variable. The target is a dummy variable equal to 1 if the counterparty defaulted and 0 otherwise.

 

Browsing the internet, I noticed that SAS offers many solutions for cluster analysis, but I was not able to use them to compute the clusters and get a new dataset with the clustered values.

 

I tried to run the following code:

 

 

PROC CLUSTER DATA = MYLIBRARY.MYDATASET METHOD = AVERAGE CCC PSEUDO
OUT = NEWDATASET;

PLOTS (MAXPOINTS = 200)=DEN(HEIGHT=RSQ);

VAR CONTINUOUS_VAR1 CONTINUOUS_VAR2 CONTINUOUS_VAR3;

ID TARGET_VARIABLE;
RUN;

 

but I did not get the new dataset containing the clustered variables or the cutoff values to build the clusters.

 

Can you help me, please?

 

Any help will be appreciated!

 

1 ACCEPTED SOLUTION

Rick_SAS
SAS Super FREQ

It looks like you are trying to imitate the Getting Started example for the CLUSTER procedure. That's fine, but PROC CLUSTER, which performs hierarchical clustering, is more complex than the FASTCLUS procedure, which performs k-means clustering. Try this example to get started. The OUT= data set contains the original variables plus a CLUSTER variable that assigns each observation to a cluster. If you need tree-based (hierarchical) models, you can revisit PROC CLUSTER later:

 

proc fastclus data=sashelp.iris out=Clust maxclusters=3;
   var SepalWidth SepalLength PetalWidth PetalLength;
   ID species;
run;

proc sgscatter data=Clust;
   matrix SepalWidth SepalLength PetalWidth PetalLength / group=cluster;
run;
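
As a follow-up sketch (not part of the original reply), the cluster assignments can be cross-tabulated against a class variable, which is roughly what the question asks for with the default flag. Here Species stands in for the target variable. The second half shows one way to get cluster assignments out of PROC CLUSTER: it writes the tree to an OUTTREE= data set, and PROC TREE then cuts the tree into a chosen number of clusters. The data set names Tree and TreeClust and the choice of three clusters are assumptions made for illustration.

```sas
/* Frequency of each k-means cluster against the class variable.      */
/* Clust is the OUT= data set created by PROC FASTCLUS above.         */
proc freq data=Clust;
   tables Cluster*Species / nocol nopercent;
run;

/* Hierarchical alternative: PROC CLUSTER stores the tree in OUTTREE= */
/* (Tree is an illustrative name). COPY carries Species along.        */
proc cluster data=sashelp.iris method=average outtree=Tree noprint;
   var SepalWidth SepalLength PetalWidth PetalLength;
   copy Species;
run;

/* Cut the tree into 3 clusters (an assumed number) and write one row */
/* per observation with its CLUSTER value to TreeClust.               */
proc tree data=Tree nclusters=3 out=TreeClust noprint;
   copy SepalWidth SepalLength PetalWidth PetalLength Species;
run;

/* Same cross-tabulation for the hierarchical clusters. */
proc freq data=TreeClust;
   tables Cluster*Species / nocol nopercent;
run;
```

For your own data, replace sashelp.iris with your data set, the VAR list with your continuous variables, and Species with the default indicator. If the variables are on very different scales, standardizing them before clustering (for example with PROC STDIZE) is usually worth considering.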
