09-24-2013 03:32 PM
Hi, I have a simple question: I want to find the distance between clusters.
I have a cluster data set:
date        cluster_id  number
11/30/2000  1           7
12/31/2000  1           8
11/30/2000  2           6
12/31/2000  2           5
There are potentially 100 cluster_ids.
I want to compute the euclidean distance between each pair of clusters,
across all dates and all cluster_ids:
dist_i_j = sum over all dates of (number_i - number_j)^2
The final output should look like:
cluster_id  with_cluster_id  dist_i_j
1           2                10
2           1                10
I get the 10 by summing the squared differences across all dates (2 in this example):
10 = (7-6)^2 + (8-5)^2 = 1 + 9
Thanks so much for your help!
09-24-2013 05:49 PM
Your distance measure is the squared euclidean distance between clusters, with observations matched by date.
I think proc distance, proc corr, or proc fastclus could be used. There's always SQL.
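To make the SQL idea concrete, here is a rough sketch of the self-join, run through Python's built-in sqlite3 module so it can be tried without SAS. The table name clusters and the function name pairwise_sq_dist are made up for illustration; the column names come from the sample data in the question. I would expect the same SELECT to carry over to proc sql with little change, but I have not run it there.

```python
import sqlite3

# Sample rows from the question: (date, cluster_id, number).
rows = [
    ("11/30/2000", 1, 7),
    ("12/31/2000", 1, 8),
    ("11/30/2000", 2, 6),
    ("12/31/2000", 2, 5),
]


def pairwise_sq_dist(rows):
    """Return (cluster_id, with_cluster_id, dist_i_j) for every ordered pair."""
    con = sqlite3.connect(":memory:")
    # "clusters" is a hypothetical table name, not one from the thread.
    con.execute(
        "CREATE TABLE clusters (date TEXT, cluster_id INTEGER, number REAL)"
    )
    con.executemany("INSERT INTO clusters VALUES (?, ?, ?)", rows)
    # Join the table to itself on date, skip a cluster paired with itself,
    # then sum the squared differences over all matching dates.
    query = """
        SELECT a.cluster_id AS cluster_id,
               b.cluster_id AS with_cluster_id,
               SUM((a.number - b.number) * (a.number - b.number)) AS dist_i_j
        FROM clusters AS a
        JOIN clusters AS b
          ON a.date = b.date AND a.cluster_id <> b.cluster_id
        GROUP BY a.cluster_id, b.cluster_id
        ORDER BY a.cluster_id, b.cluster_id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


result = pairwise_sq_dist(rows)
for cluster_id, with_cluster_id, dist in result:
    print(cluster_id, with_cluster_id, dist)
```

One thing to watch: because this is an inner join on date, a date that is missing for one of the two clusters simply drops out of that pair's sum. With 100 cluster_ids the output has 100 * 99 = 9900 rows, one per ordered pair.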
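proc distance will work with a few extra steps, most likely reshaping the data to one row per cluster first, since it computes distances between rows.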
How many clusters are you likely to have? Will you know that ahead of time, or is it dynamic?