topic Re: Time Series Similarity Clustering in SAS Data Science

Time Series Similarity Clustering

Karteek — Wed, 18 Nov 2009 19:24:09 GMT

Hello Folks,

Has anyone tried using PROC SIMILARITY or any other technique for clustering similar time series patterns?

Thanks,
Karteek

Re: Time Series Similarity Clustering

tailee — Fri, 20 Nov 2009 17:11:53 GMT

At the M2009 conference, a paper on PROC SIMILARITY and time series clustering was presented. You may want to look at it: Mining Transactional and Time Series Data (Michael Leonard & Meredith John, SAS).

Re: Time Series Similarity Clustering

oloolo — Wed, 07 Apr 2010 00:52:50 GMT

Check my SCSUG 2009 paper:
SVD Filtered Temporal Usage Pattern Analysis and Clustering

The main idea is SVD on a Toeplitz matrix, like the SSA technique used in meteorology.

You can obtain a copy either at SCSUG.org or on my blog:

http://www.sas-programming.com/2009/09/svd-filtered-temporal-usage-pattern.html

Re: Time Series Similarity Clustering

jonathan_jmp — Thu, 08 Apr 2010 15:29:26 GMT

By no means am I an expert on the matter. I haven't used PROC Similarity, so I can't comment about the proc.

A friend of mine told me once he was clustering a number of series according to their measurements across time. That didn't sit right with me, so I created some data to see how it would work.

I started by creating several series of data with no pattern across time, but with different averages. Even when I told the software to standardize the data, the series that had similar means were clustered together.

I then created data with patterns across time, with some series having the same pattern. I gave similar means to the series that had different patterns across time. The clustering grouped the series that had similar means, even though their patterns across time were different. Also, the series that had similar patterns were not grouped, because their means were different.

I then took that same data and made all the means equal. When I did the clustering, the groups with similar patterns were put together.
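The effect described above can be reproduced with a small sketch. The toy series, their names, and the use of Python are assumptions for illustration, not the data or software from the original experiment. It compares plain Euclidean distance on the raw values with the same distance after subtracting each series' own mean.

```python
import math

# Hypothetical toy data: two shapes (rising, falling) at two levels (low, high).
T = 8
rising = [t / (T - 1) for t in range(T)]   # goes from 0 up to 1
falling = [1 - v for v in rising]          # goes from 1 down to 0

series = {
    "rise_low": [v + 0.0 for v in rising],
    "fall_low": [v + 0.0 for v in falling],
    "rise_high": [v + 10.0 for v in rising],
    "fall_high": [v + 10.0 for v in falling],
}


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center(row):
    """Subtract the series' own mean, so only its shape over time remains."""
    m = sum(row) / len(row)
    return [x - m for x in row]


def nearest(name, data):
    """Name of the series closest to `name` under Euclidean distance."""
    others = (k for k in data if k != name)
    return min(others, key=lambda k: euclidean(data[name], data[k]))


centered = {k: center(v) for k, v in series.items()}

raw_nn = nearest("rise_low", series)         # driven by the level of the series
centered_nn = nearest("rise_low", centered)  # driven by the pattern across time
print("raw:", raw_nn, "| centered:", centered_nn)
```

On this toy data, the nearest neighbor of the low rising series is the low falling series before centering and the high rising series after centering, which matches the behavior reported in the post.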

So, to make a long story short: how do you quantify a pattern across time so that you can cluster series that are similar? That's a hard question. The patterns have to be more obvious than differences in series averages, or other things you don't want to cluster on.

I'm sure there's more to this, but I'm not educated enough on the subject to comment further.

Re: Time Series Similarity Clustering

oloolo — Sat, 10 Apr 2010 00:54:41 GMT

Could you detail the clustering algorithm and process you used in your analysis?
Was it plain k-means clustering on raw or simply standardized data?

Re: Time Series Similarity Clustering

jonathan_jmp — Mon, 12 Apr 2010 19:18:21 GMT

It was nothing special. I don't want to make more out of it than it was.

I just created several series of data, and tried to cluster the series, with their values across time used as the clustering variables. I used JMP, and used the default Ward clustering. I know there are other clustering methods, but I didn't try them. JMP's clustering platform has a Standardize Data option. I tried it with and without that option.
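A rough sketch of that setup, for anyone who wants to experiment: each series is one row, its values across time are the clustering variables, and a naive Ward-style agglomerative clustering is run on them. This is not JMP's implementation. The toy data, the function names, and the assumption that a "standardize" option rescales each time-point column (rather than each series) are all hypothetical and only meant to illustrate why the option might not remove differences in series means.

```python
import statistics

# Hypothetical toy data: rows are series, columns are time points.
T = 8
rising = [t / (T - 1) for t in range(T)]
falling = [1 - v for v in rising]
names = ["rise_low", "fall_low", "rise_high", "fall_high"]
rows = [
    rising,
    falling,
    [v + 10.0 for v in rising],
    [v + 10.0 for v in falling],
]


def standardize_columns(data):
    """Z-score each time point across series (assumed meaning of 'standardize')."""
    cols = list(zip(*data))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in data]


def standardize_rows(data):
    """Z-score each series across time, removing its own level and scale."""
    out = []
    for r in data:
        m, s = statistics.mean(r), statistics.pstdev(r)
        out.append([(x - m) / s for x in r])
    return out


def ward_clusters(data, labels, k):
    """Naive agglomerative clustering that always merges the pair of clusters
    with the smallest Ward cost (increase in within-cluster sum of squares)."""
    clusters = [([lab], list(r)) for lab, r in zip(labels, data)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                ni, nj = len(mi), len(mj)
                d2 = sum((a - b) ** 2 for a, b in zip(ci, cj))
                cost = ni * nj / (ni + nj) * d2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        centroid = [(ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj)]
        rest = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters = rest + [(mi + mj, centroid)]
    return [sorted(members) for members, _ in clusters]


by_columns = ward_clusters(standardize_columns(rows), names, 2)
by_rows = ward_clusters(standardize_rows(rows), names, 2)
print("column-standardized:", by_columns)
print("row-standardized:   ", by_rows)
```

With this toy data, column-wise standardization still groups the series by level (low with low, high with high), while standardizing each series across time groups them by shape (rising with rising, falling with falling).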