aahlers2
Calcite | Level 5

I have never used PROC SIMILARITY and would like some advice from the SAS Community. I have time series data (1970-2013) for 40 states. I want to cluster these 40 states into 3-4 groups based on the similarity of their time series. I would also like a way to represent the resulting groupings visually (e.g., a cluster tree). My CSV file has 40 columns (one per state) and 43 rows (years). The SAS example for time series clustering seems fairly incomplete, and I am having trouble finding the syntax for what I want to do. Can someone provide any advice?

Thanks,

AAA

Ruiwen
SAS Employee

I would recommend using the Time Series Similarity node in Enterprise Miner (if that is an option for you), which outputs both the clusters and the hierarchy structure.

Of course, you can do the same thing in SAS (SAS/ETS installation required) with some coding of your own. You can obtain the similarity matrix from PROC SIMILARITY and then call PROC CLUSTER or PROC CORR to analyze the association and clustering. The basic syntax for your specific need looks like this:

PROC SIMILARITY data=InputData out=OutputData outsum=OutSummary;
     ID timeid interval=... accumulate=...;
     TARGET series1 series2 ...;
RUN;

You don't need a BY statement if you don't have a Cross ID variable.
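As a rough sketch of how the template above might be filled in for the data described in the question, the following assumes a file named states.csv with a numeric column called year plus one numeric column per state. The file name, the year column, the data set names, and the choices of NORMALIZE= and MEASURE= are all assumptions made for illustration, not something stated in the thread.

```sas
/* Hypothetical input: states.csv with a numeric YEAR column */
/* and one numeric column per state.                         */
proc import datafile="states.csv" out=work.InputData dbms=csv replace;
   getnames=yes;
run;

/* The ID statement expects a SAS date value, so convert the */
/* integer year to January 1 of that year (assumed layout).  */
data work.InputTS;
   set work.InputData;
   timeid = mdy(1, 1, year);
   format timeid year4.;
   drop year;
run;

/* Compare every state series with every other one. The      */
/* pairwise measures are written to the OUTSUM= data set.    */
/* NORMALIZE= and MEASURE= are illustrative choices only.    */
proc similarity data=work.InputTS out=work.OutputData outsum=work.OutSummary;
   id timeid interval=year accumulate=none;
   target _numeric_ / normalize=standard measure=mabsdev;
run;
```

Standardizing each series first means the comparison is driven by the shape of the series rather than its level; if differences in level should matter, NORMALIZE=NONE may be the better choice.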

You can then feed the OutSummary data set to PROC CLUSTER as a TYPE=DISTANCE data set.
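A minimal sketch of that last step, assuming OutSummary holds one row per series (named in the _INPUT_ column), one numeric column per series, and a _STATUS_ column, could look like the following. The data set names DistMatrix, Tree, and Groups, the average-linkage method, and the choice of four clusters are illustrative assumptions.

```sas
/* Mark the summary table as a distance matrix.              */
/* _STATUS_ is not part of the matrix, so it is dropped.     */
data work.DistMatrix(type=distance);
   set work.OutSummary;
   drop _status_;
run;

/* Hierarchical clustering on the distances. METHOD= is      */
/* required; average linkage is just one reasonable choice.  */
/* The ID statement labels each leaf with the series name.   */
proc cluster data=work.DistMatrix method=average outtree=work.Tree noprint;
   id _input_;
run;

/* Draw the cluster tree and cut it into 4 groups.           */
/* work.Groups records the cluster assigned to each series.  */
proc tree data=work.Tree nclusters=4 out=work.Groups horizontal;
run;
```

Changing NCLUSTERS= to 3 gives the coarser grouping mentioned in the question, and PROC PRINT on work.Groups lists which state falls into which cluster.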

    
