Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
89 78 23 3
90 91 92 93
run;
Hi, I want to create a cluster number from the variables a, b, c and d. A row should get an existing cluster id if at least one of its values matches a value already in that cluster.
For example, the rows I have marked with cluster id 1 all have something in common. If a row matches nothing, it should get a new number, and if all four values are blank,
each such row should also get its own new number. This is the output I want:
output
a b c d cluster_id
1 2 3 4 1
2 3 4 1 1
5 6 7 8 2
10 11 12 1 1
. . . . 3
. . . . 4
89 78 23 3 1
90 91 92 93 5
I don't understand what you are trying to do. Your example, IMHO, didn't provide enough information.
output
a b c d cluster_id
1 2 3 4 1 *cluster id is 1 because it is the initial row*
2 3 4 1 1 *cluster id is 1 because 2, 3 and 4 are already in cluster id 1*
5 6 7 8 2 *cluster id is 2 because nothing matches cluster id 1*
10 11 12 1 1 *cluster id is 1 because the value 1 matches cluster id 1*
. . . . 3 *cluster id is 3 because the row is all blank*
. . . . 4 *cluster id is 4 because the row is all blank*
89 78 23 3 1 *cluster id is 1 because 3 matches a value in cluster id 1, so cluster 1 is assigned again*
90 91 92 93 5 *cluster id is 5 because it has no relation to the rows above*
I'm not sure if this is what you want, and I am sure that the following can be simplified, but does it come close to what you are trying to accomplish?
Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
. . . .
. . . .
89 78 23 3
90 91 92 93
;
run;
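data want (keep=cluster a b c d);
  set temp;
  /* values(v) holds the cluster assigned to value v;          */
  /* this assumes every value is an integer from 1 to 99999    */
  array values(99999);
  array inval(*) a b c d;
  retain values:;
  if _n_ eq 1 then do;
    /* first row: start cluster 1 and register its values */
    hcluster+1;
    do i=1 to dim(inval);
      if not missing(inval(i)) then do;
        values(inval(i))=hcluster;
      end;
    end;
    cluster=hcluster;
  end;
  else do;
    if missing(max(of inval(*))) then do;
      /* all four values are missing: the row gets its own new cluster */
      hcluster+1;
      cluster=hcluster;
    end;
    else do;
      /* look for the first value that already belongs to a cluster */
      do i=1 to dim(inval);
        /* skip missing values, which are not valid array subscripts */
        if not missing(inval(i)) then do;
          if not missing(values(inval(i))) then do;
            cluster=values(inval(i));
            i=dim(inval)+1; /* force the loop to end */
          end;
        end;
      end;
      if missing(cluster) then do;
        /* nothing matched: start a new cluster and register the values */
        hcluster+1;
        cluster=hcluster;
        do i=1 to dim(inval);
          if not missing(inval(i)) then values(inval(i))=cluster;
        end;
      end;
    end;
  end;
run;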
I prefer to use a hash table.
Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
. . . .
. . . .
89 78 23 3
90 91 92 93
;
run;
data want(drop=rc i j k _k found flag _flag);
  if _n_ eq 1 then do;
    /* hash keyed on a value (k), storing its cluster number (flag) */
    declare hash ha(hashexp:10);
    ha.definekey('k');
    ha.definedata('flag');
    ha.definedone();
  end;
  set temp;
  array var{*} a b c d;
  /* remember the last value of this row that is already in the hash */
  do i=1 to dim(var);
    k=var{i};
    rc=ha.check();
    if rc eq 0 then do;
      found=1;
      _k=k;
    end;
  end;
  if found then do;
    /* reuse the cluster stored for the matching value */
    k=_k;
    ha.find();
    cluster_id=flag;
  end;
  else do;
    /* no match: start a new cluster and store the non-missing values */
    _flag+1;
    flag=_flag;
    cluster_id=flag;
    do j=1 to dim(var);
      k=var{j};
      if not missing(k) then ha.replace();
    end;
  end;
run;
Ksharp
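Both solutions above register a row's values only when the row starts a new cluster. A row such as 10 11 12 1 joins cluster 1, but 10, 11 and 12 are not remembered, so a later row containing 11 would start a new cluster. If the values of matched rows should count as well, a minimal sketch of one possible variant follows. It reuses the hash approach and the temp data set from above; the names want_chain, v and next_id are made up for illustration, and whether this behavior is wanted is an assumption, since the original question does not say.

```sas
/* Sketch only: assumes values of matched rows should also belong
   to the matched cluster. want_chain, v and next_id are
   hypothetical names, not part of the original posts. */
data want_chain(keep=a b c d cluster_id);
  if _n_ eq 1 then do;
    declare hash ha();
    ha.definekey('k');      /* key: a single value        */
    ha.definedata('flag');  /* data: its cluster number   */
    ha.definedone();
  end;
  set temp;
  array v{*} a b c d;

  /* take the cluster of the first value that was seen before */
  do i = 1 to dim(v);
    k = v{i};
    rc = ha.find();         /* fills flag when k is found */
    if rc eq 0 then do;
      cluster_id = flag;
      leave;
    end;
  end;

  /* no match, or an all-missing row: start a new cluster */
  if missing(cluster_id) then do;
    next_id + 1;
    cluster_id = next_id;
  end;

  /* register every non-missing value of the row under its cluster */
  flag = cluster_id;
  do i = 1 to dim(v);
    k = v{i};
    if not missing(k) then rc = ha.replace();
  end;
run;
```

On the sample data this should give the same cluster ids as the requested output, because no later row depends on 10, 11 or 12. The sketch still does not merge two existing clusters when a single row matches both; it simply takes the first match.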