aahlers2

I have never used PROC SIMILARITY and would like some advice from the SAS Community. I have time series data (1970-2013) for 40 states. I want to cluster these 40 states into 3-4 groups based on the similarity of their time series. Additionally, I would like to find a way to visually represent the derived groupings (e.g., a cluster tree). My csv file has 40 columns (one per state) and 43 rows (years). The SAS example for time series clustering appears fairly incomplete, and I am having trouble finding the syntax for what I want to do. Can someone provide any advice?

Thanks,

AAA

Ruiwen
SAS Employee

I would recommend using the Time Series Similarity node in Enterprise Miner (if that is an option for you), which outputs the clusters and the hierarchy structure.

Of course, you can do the same thing in SAS (SAS/ETS installation required) with some coding of your own. You can obtain the similarity matrix from PROC SIMILARITY and then call PROC CLUSTER or PROC CORR to analyze the association and cluster the series. The basic syntax for your specific need looks like this:

PROC SIMILARITY data=InputData out=OutputData outsum=OutSummary;
     ID timeid interval=... accumulate=...;
     TARGET series1 series2 ...;
RUN;

You don't need a BY statement if you don't have a cross ID variable.

You can then feed the OutSummary data set to PROC CLUSTER as a TYPE=DISTANCE data set.
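As a rough sketch of how these pieces might fit together for the 40-state case, something like the following could work. Everything named here is an assumption for illustration: the data set `states`, the SAS date variable `date`, and the series names `state1-state40` are hypothetical, and the `normalize=`, `measure=`, and `method=` choices are just one reasonable option, not a recommendation. The `_INPUT_` and `_STATUS_` columns are what I would expect in the OUTSUM= data set, so it is worth checking the actual column names with PROC CONTENTS before relying on this.

```sas
/* Sketch only: "states", "date", and state1-state40 are hypothetical names. */
/* Assumes one row per year, with "date" holding a SAS date value.           */
proc similarity data=states outsum=OutSummary;
   id date interval=year;
   target state1-state40 / normalize=standard measure=mabsdev;
run;

/* Flag the summary as a distance matrix for PROC CLUSTER.                   */
/* Assumption: _STATUS_ is a bookkeeping column, not a distance column.      */
data DistMatrix(type=distance);
   set OutSummary;
   drop _status_;
run;

/* Hierarchical clustering on the pairwise distances.                        */
/* Assumption: _INPUT_ holds the series name for each row of the matrix.     */
proc cluster data=DistMatrix method=ward outtree=Tree;
   id _input_;
run;

/* Draw the cluster tree and cut it into 4 groups (saved in Groups).         */
proc tree data=Tree nclusters=4 out=Groups horizontal;
run;
```

Changing `nclusters=` to 3 would give the three-group solution from the same tree, so the clustering itself does not need to be rerun.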

    
