Karteek
Calcite | Level 5
Hello Folks,

Has anyone tried using PROC SIMILARITY, or any other technique, for clustering time series with similar patterns?

Thanks,
Karteek
5 REPLIES
tailee
SAS Employee
At the M2009 conference, a paper was presented on PROC SIMILARITY and time series clustering. You may want to look at it: Mining Transactional and Time Series Data (Michael Leonard & Meredith John, SAS).
oloolo
Fluorite | Level 6
Check my SCSUG 2009 paper, "SVD Filtered Temporal Usage Pattern Analysis and Clustering".

The main idea is to apply SVD to a Toeplitz matrix, as in the SSA (singular spectrum analysis) technique used in meteorology.

You can obtain a copy either at SCSUG.org or at my blog:

http://www.sas-programming.com/2009/09/svd-filtered-temporal-usage-pattern.html
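For readers who have not seen SSA before, here is a minimal sketch of the basic idea in Python with NumPy. It is not the method from the paper above, only a generic illustration: the function name `ssa_smooth` and the parameters `window` and `rank` are made up for this example. It embeds the series in a trajectory matrix (a Hankel matrix, which becomes a Toeplitz matrix if the column order is reversed), keeps the leading singular components, and averages back along the anti-diagonals to get a filtered series.

```python
import numpy as np

def ssa_smooth(x, window, rank):
    """Basic SSA filter (illustrative sketch, not the paper's algorithm)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory matrix: column j holds the lagged window x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    # Keep only the `rank` leading singular components.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    low = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging: entry (i, j) of `low` estimates x[i + j],
    # so average all entries that map to the same time index.
    total = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        total[i:i + k] += low[i]
        counts[i:i + k] += 1
    return total / counts

# Hypothetical test series: one sine wave plus random noise.
rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(2.0 * np.pi * t / 25.0)
noisy = clean + rng.normal(scale=0.3, size=t.size)
smoothed = ssa_smooth(noisy, window=40, rank=2)

rmse_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
rmse_smoothed = np.sqrt(np.mean((smoothed - clean) ** 2))
print(rmse_noisy, rmse_smoothed)
```

A single sine wave has a rank-2 trajectory matrix, which is why `rank=2` is used here. In practice the window length and the number of components would have to be chosen for the data at hand, and the filtered series (or the leading components) could then be passed to a clustering step.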


jonathan_jmp
Calcite | Level 5
By no means am I an expert on the matter. I haven't used PROC Similarity, so I can't comment about the proc.

A friend of mine told me once he was clustering a number of series according to their measurements across time. That didn't sit right with me, so I created some data to see how it would work.

I started by creating several series of data with no pattern across time, but with different averages. Even when I told the software to standardize the data, the series that had similar means were clustered together.

I then created data with patterns across time, with some series sharing the same pattern. The series that had different patterns across time were given similar means. The clustering grouped the series that had similar means, even though their patterns across time were different. Also, the series that had similar patterns were not grouped together, because their means were different.

I then took that same data and made all the means equal. When I did the clustering, the series with similar patterns were grouped together.
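The experiment above can be reproduced with a small sketch in Python (NumPy and SciPy rather than JMP). The four series and the helper `ward_labels` are made up for illustration, and the sketch assumes that a "standardize" option rescales each column (time point) across series, which may not be exactly what JMP does. Under that assumption, column standardization leaves the per-series mean differences in place, whereas subtracting each series' own mean removes them.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def ward_labels(data, n_clusters=2):
    """Ward hierarchical clustering of the rows of `data` (hypothetical helper)."""
    tree = linkage(data, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")

t = np.linspace(0.0, 2.0 * np.pi, 50)
pattern_a = np.sin(t)
pattern_b = -np.sin(t)

# Rows are series, columns are time points.
series = np.vstack([
    pattern_a,          # pattern A, mean near 0
    pattern_b,          # pattern B, mean near 0
    pattern_a + 10.0,   # pattern A, mean near 10
    pattern_b + 10.0,   # pattern B, mean near 10
])

# 1) Raw values: the mean difference dominates the distances.
raw = ward_labels(series)

# 2) Standardize each column (time point) across the series.
col_std = (series - series.mean(axis=0)) / series.std(axis=0)
by_column = ward_labels(col_std)

# 3) Subtract each series' own mean, so only the shape remains.
centered = series - series.mean(axis=1, keepdims=True)
by_shape = ward_labels(centered)

print("raw:            ", raw)
print("column-standard:", by_column)
print("row-centered:   ", by_shape)
```

In the first two runs, series 1 and 2 end up together and series 3 and 4 end up together, which is a grouping by mean. In the third run, series 1 and 3 are grouped, as are series 2 and 4, which is a grouping by pattern. Dividing each series by its own standard deviation as well would also remove differences in amplitude, if those are not of interest.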

So, to make a long story short: how do you quantify a pattern across time so that you can cluster series that are similar? That's a hard question. The patterns have to be more obvious than the differences in series averages, or in anything else you don't want to cluster on.

I'm sure there's more to this, but I'm not educated enough on the subject to comment further.
oloolo
Fluorite | Level 6
Could you detail the clustering algorithm and process you used in your analysis?
Was it plain k-means clustering on raw or simply standardized data?

jonathan_jmp
Calcite | Level 5
It was nothing special. I don't want to make more out of it than it was.

I just created several series of data, and tried to cluster the series, with their values across time used as the clustering variables. I used JMP, and used the default Ward clustering. I know there are other clustering methods, but I didn't try them. JMP's clustering platform has a Standardize Data option. I tried it with and without that option.
