Time Series Clustering Example

06-25-2016 07:37 AM

Hi all,

I need to perform a cluster analysis to build a scoring model in SAS, using procedures such as PROC CLUSTER and PROC FASTCLUS.

I have a set of continuous variables in my dataset, and I must cluster all of them to compute the Population Stability Index and to check the frequency of each of them against a target variable: a dummy variable equal to 1 if the counterparty defaulted and 0 otherwise.
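
For reference, the Population Stability Index mentioned above is usually computed over the bins (here, the clusters) of a variable. A common form is sketched below. The notation is an assumption and does not come from this thread: $k$ is the number of bins, $e_i$ is the share of observations in bin $i$ in the baseline sample, and $a_i$ is the share in the sample being compared.

```latex
\mathrm{PSI} = \sum_{i=1}^{k} \left( a_i - e_i \right) \ln\frac{a_i}{e_i}
```

Each term is non-negative, so the index is zero only when the two distributions agree in every bin.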

Browsing the internet, I saw that SAS offers many solutions for cluster analysis, but I was not able to use them to compute the clusters and get a new dataset with the clustered values.

I tried to run the following code:

```
PROC CLUSTER DATA = MYLIBRARY.MYDATASET METHOD = AVERAGE CCC PSEUDO
OUT = NEWDATASET;
PLOTS (MAXPOINTS = 200)=DEN(HEIGHT=RSQ);
VAR CONTINUOUS_VAR1 CONTINUOUS_VAR2 CONTINUOUS_VAR3;
ID TARGET_VARIABLE;
RUN;
```

but I did not get the new dataset containing the clustered variables or the cutoff values to build the clusters.

Can you help me, please?

Any help will be appreciated!

07-03-2016 06:22 AM

06-25-2016 12:17 PM

It looks like you are trying to imitate the Getting Started example for the CLUSTER procedure. That's fine, but PROC CLUSTER is more complex than the FASTCLUS procedure, which performs k-means clustering. Try this example to get started. If you need hierarchical (tree-based) clustering, you can revisit PROC CLUSTER later:

```
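/* k-means clustering into 3 clusters; OUT= adds a CLUSTER variable to each row */
proc fastclus data=sashelp.iris out=Clust maxclusters=3;
   var SepalWidth SepalLength PetalWidth PetalLength;
   id Species;
run;

/* scatter plot matrix of the inputs, colored by assigned cluster */
proc sgscatter data=Clust;
   matrix SepalWidth SepalLength PetalWidth PetalLength / group=cluster;
run;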
```
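
If you do come back to the hierarchical route, here is a minimal sketch on the same sashelp.iris data. PROC CLUSTER writes its result as a tree data set through OUTTREE=, and PROC TREE then cuts that tree into a chosen number of clusters and writes one row per observation. The data set names Tree and HierClust, the choice of three clusters, and the use of Species as a stand-in for a target variable are illustrative assumptions, not part of the original question.

```
/* Step 1: build the hierarchy; OUTTREE= stores the tree, not cluster labels */
proc cluster data=sashelp.iris method=average ccc pseudo print=15 outtree=Tree;
   var SepalWidth SepalLength PetalWidth PetalLength;
   copy Species;   /* carry Species along into the tree data set */
run;

/* Step 2: cut the tree into 3 clusters; OUT= gets a CLUSTER variable per row */
proc tree data=Tree nclusters=3 out=HierClust noprint;
   copy SepalWidth SepalLength PetalWidth PetalLength Species;
run;

/* Step 3: cross-tabulate cluster membership against the stand-in target */
proc freq data=HierClust;
   tables cluster*Species / nopercent norow nocol;
run;
```

The same PROC FREQ step should also work on the Clust data set from PROC FASTCLUS, since both outputs contain a CLUSTER variable.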


