Distance of mixed data, which can be used in Proc Cluster

Hello all,

I am having trouble clustering a mixed data set.
The data consist of 4 variables (3 categorical and 1 numeric) and 1 ID. Each categorical variable has more than 2 levels and is nominal (not ordinal).
First, I try to compute a distance matrix between the IDs (120 IDs) with Proc Distance.
Second, I want to use this distance matrix to find clusters with Proc Cluster.


Here is the data and code I use:

data reference;
/* technology needs more than the default 8 characters ("ribbon-Si" has 9) */
input year $ title $ ghg technology :$12. cluster refnum $;
cards;
2000 title1 60 p-Si 1 ref1
2000 title1 30 p-Si 1 ref2
2000 title1 20 p-Si 1 ref3
2000 title1 50 a-Si 1 ref4
2000 title1 20 a-Si 1 ref5
2000 title1 10 a-Si 1 ref6
2006 title2 30 ribbon-Si 5 ref7
2006 title2 35 p-Si 5 ref8
2006 title2 45 m-Si 5 ref9
1996 title3 167 p-Si 7 ref10
1996 title3 98 p-Si 7 ref11
1996 title4 164 m-Si 7 ref12
1996 title4 228 p-Si 7 ref13
2004 title5 41 m-Si 2 ref14
2004 title5 37 m-Si 2 ref15
2004 title5 56 m-Si 2 ref16
2004 title5 38 m-Si 2 ref17
2004 title5 37 m-Si 2 ref18
2004 title5 39 m-Si 2 ref19
2004 title5 53 m-Si 2 ref20
;
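/* DGOWER (1 - Gower similarity) accepts mixed nominal and interval variables */
proc distance data=reference method=dgower out=some11;
var nominal(year title technology) interval(ghg);
id refnum;
run;

/* some11 is a TYPE=DISTANCE data set, so it is read as a distance matrix */
proc cluster data=some11 outtree=tree method=average pseudo;
id refnum;
run;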
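
For reference, one common dissimilarity for mixed nominal and numeric variables is Gower's coefficient. Whether it is the right choice for this data is an assumption on my part, and the symbols below (d_ij, d_ijk, R_k, p) are introduced only for illustration. Each nominal variable contributes 0 when two rows match and 1 otherwise, and the numeric variable contributes its absolute difference divided by its range R_k. With equal weights over the p = 4 variables:

```latex
d_{ij} = \frac{1}{p}\sum_{k=1}^{p} d_{ijk}, \qquad
d_{ijk} =
\begin{cases}
0 & \text{if } k \text{ is nominal and } x_{ik} = x_{jk} \\
1 & \text{if } k \text{ is nominal and } x_{ik} \neq x_{jk} \\
\dfrac{|x_{ik} - x_{jk}|}{R_k} & \text{if } k \text{ is numeric}
\end{cases}
```

As a rough check on the 20 rows listed above, ghg runs from 10 to 228, so its range is 218. ref1 and ref4 agree on year and title, differ on technology, and differ by 10 in ghg, which gives d = (0 + 0 + 1 + 10/218) / 4 ≈ 0.261.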


Can you help me produce a correct distance matrix and clusters from this mixed data?

Thank you in advance,

Jin