How to get a Similarity Matrix


06-25-2013 12:55 PM

In the article *An Introduction to Similarity Analysis Using SAS* by Leonard et al., similarity matrices are introduced in these terms:

Similarity measures can be used to compare several time sequences to form a *similarity matrix*. This situation usually arises in *time series clustering*. For example, given *K* time sequences, a (KxK) symmetric matrix can be constructed whose *ij*th element contains the similarity measure between the *i*th and *j*th sequence.
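In symbols, the matrix described in the quoted passage could be written roughly as follows. The notation is my own assumption, not the article's: $x_1, \dots, x_K$ stand for the time sequences and $d$ for whichever similarity measure is chosen, assumed here to be symmetric.

```latex
S =
\begin{pmatrix}
s_{11} & \cdots & s_{1K} \\
\vdots & \ddots & \vdots \\
s_{K1} & \cdots & s_{KK}
\end{pmatrix},
\qquad
s_{ij} = d(x_i, x_j),
\qquad
s_{ij} = s_{ji} \quad (i, j = 1, \dots, K).
```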

That's a neat idea. However, Proc Similarity (in SAS/ETS 9.3) doesn't allow the same series to be listed as both an input and a target sequence. What is the best way to get a similarity matrix with Proc Similarity?

PG


Accepted Solutions


06-25-2013 02:45 PM

Hello -

This example might be useful - additional information can be found here: http://support.sas.com/documentation/cdl/en/etsug/63939/HTML/default/viewer.htm#etsug_similarity_sec...

Thanks,

Udo

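/* Number the products 1, 2, ... so that each series gets a short ID. */
data tmp;
   set sashelp.snacks;
   by product;
   retain Series 0;
   if first.product then series + 1;
run;

/* Sort by date so that PROC TRANSPOSE can process BY date. */
proc sort data=tmp out=tmp2;
   by date;
run;

/* One row per date, one column per series (C_1, C_2, ...). */
proc transpose data=tmp2 out=tmp3 prefix=C_ name=reihe label=Etikett;
   by date;
   id series;
   var QtySold;
run;

/* No INPUT statement: the target series are also used as input series, */
/* so OUTSUM= holds one measure for every pair of series.               */
proc similarity data=tmp3 out=_null_ outsum=summary;
   id date interval=day accumulate=total;
   target _numeric_ / normalize=standard measure=mabsdevmax;
run;

/* Flag the summary as a distance matrix for PROC CLUSTER. */
data matrix(type=distance);
   set summary;
   drop _status_;
run;

proc cluster data=matrix outtree=tree method=average;
   id _input_;
run;

/* Cut the tree into four clusters. */
proc tree data=tree out=result nclusters=4;
   id _input_;
run;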
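To check what the steps above produced, something like the following could be run afterwards. This is only a sketch: it assumes the data sets SUMMARY and RESULT already exist in the WORK library, and RESULT_SORTED is a name made up for this illustration.

```sas
/* Look at the pairwise measures: one row per input series (_INPUT_), */
/* one column per target series.                                       */
title "Similarity summary from PROC SIMILARITY";
proc print data=summary noobs;
run;

/* List the series grouped by the cluster assigned by PROC TREE. */
/* RESULT_SORTED is a hypothetical name used only in this sketch. */
proc sort data=result out=result_sorted;
   by cluster;
run;

title "Cluster membership";
proc print data=result_sorted noobs;
   var cluster _input_;
run;
title;
```

Printing SUMMARY first is a quick way to confirm that the matrix is square before handing it to Proc Cluster.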

All Replies



06-25-2013 03:34 PM

Thanks, Udo. That's very helpful. I didn't realize that the INPUT statement is optional and that, in its absence, the target sequences are also used as input sequences. I hadn't read the clustering example because I have a different application in mind.
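For contrast, here is a rough sketch of the case where an INPUT statement is given, so that only the listed inputs are compared with the listed targets. It assumes the transposed data set TMP3 from Udo's example, with series columns named C_1, C_2, and so on. The choice of series and the output name SUMMARY_SUB are purely illustrative.

```sas
/* Hypothetical variation: two chosen inputs against two chosen targets. */
/* The input and target lists are kept disjoint, since a series cannot   */
/* be listed as both.                                                     */
proc similarity data=tmp3 out=_null_ outsum=summary_sub;
   id date interval=day accumulate=total;
   input  C_1 C_2 / normalize=standard;
   target C_3 C_4 / normalize=standard measure=mabsdevmax;
run;

/* Expect one row per input series and one column per target series. */
proc print data=summary_sub noobs;
run;
```

Leaving the INPUT statement out, as in the accepted solution, is what gives the full square matrix.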

PG



06-26-2013 10:53 AM

PS: My colleague posted a very interesting blog post today on The DO Loop, "How to color clusters in a dendrogram", which might be of interest.