Help using Base SAS procedures

Reg:Clustering

Regular Contributor
Posts: 229


Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
         
         
89 78 23 3
90 91 92 93
run;

Hi, I want to create a cluster number from the variables a, b, c and d: a row belongs to an existing cluster if at least one of its values matches a value anywhere in that cluster.

For example, every row I have given cluster id 1 has something in common with that cluster. If a row does not match anything, it should get a new number, and if all four values are blank, each such row should also get its own new number, like this.

output
a b c d cluster_id
1 2 3 4  1
2 3 4 1  1
5 6 7 8  2
10 11 12 1 1
           3
           4
89 78 23 3 1
90 91 92 93 5

PROC Star
Posts: 7,467

I don't understand what you are trying to do. Your example, IMHO, didn't provide enough information.

Regular Contributor
Posts: 229

output
a b c d cluster_id
1 2 3 4  1      * cluster id is 1 because this is the first row *
2 3 4 1  1      * cluster id is 1 because 2, 3 and 4 already appear in cluster 1 *
5 6 7 8  2      * cluster id is 2 because none of its values appear in cluster 1 *
10 11 12 1 1    * cluster id is 1 because the value 1 matches cluster 1 *
           3    * cluster id is 3 because the row is all blank *
           4    * cluster id is 4 because the row is all blank *
89 78 23 3 1    * cluster id is 1 because the value 3 matches cluster 1, so cluster 1 is assigned again *
90 91 92 93 5   * cluster id is 5 because it has no relation to the rows above *

PROC Star
Posts: 7,467

I'm not sure if this is what you want, and I am sure that the following can be simplified, but does it come close to what you are trying to accomplish?:

Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
. . . .
. . . .
89 78 23 3
90 91 92 93
;
run;

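data want (keep=cluster a b c d);
  set temp;
  /* values(v) holds the cluster assigned to data value v; this assumes */
  /* every value is a positive integer no larger than 99999             */
  array values(99999);
  array inval(*) a b c d;
  retain values:;
  if _n_ eq 1 then do;
    /* the first row always starts cluster 1 */
    hcluster+1;
    do i=1 to dim(inval);
      if not missing(inval(i)) then values(inval(i))=hcluster;
    end;
    cluster=hcluster;
  end;
  else do;
    if missing(max(of inval(*))) then do;
      /* all four values are blank: the row gets its own cluster */
      hcluster+1;
      cluster=hcluster;
    end;
    else do;
      /* reuse the cluster of the first value that has been seen before; */
      /* missing values are skipped so they are never used as subscripts */
      do i=1 to dim(inval);
        if not missing(inval(i)) then
          if not missing(values(inval(i))) then do;
            cluster=values(inval(i));
            i=dim(inval)+1;
          end;
      end;
      if missing(cluster) then do;
        /* nothing matched: start a new cluster and record its values */
        hcluster+1;
        cluster=hcluster;
        do i=1 to dim(inval);
          if not missing(inval(i)) then values(inval(i))=cluster;
        end;
      end;
    end;
  end;
run;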
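
One way to check the result against the output requested in the question is to key that output in as a second data set and compare the two. This is only a sketch: the data set name `expected` is made up for the illustration, and it assumes the `want` data set from the step above, with its `cluster` variable, still exists.

```sas
/* Expected result, copied from the output listed in the question.   */
/* The data set name "expected" is hypothetical.                      */
data expected;
  input a b c d cluster;
  cards;
1 2 3 4 1
2 3 4 1 1
5 6 7 8 2
10 11 12 1 1
. . . . 3
. . . . 4
89 78 23 3 1
90 91 92 93 5
;
run;

/* Compare the two data sets row by row; variables are matched by name. */
proc compare base=expected compare=want;
run;
```

If the clustering logic matches the request, the PROC COMPARE report should list no unequal values for `cluster`.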

Super User
Posts: 10,018

I prefer to use a hash table.

Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
. . . .
. . . .
89 78 23 3
90 91 92 93
;
run;
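data want(drop=rc i j k _k found flag _flag);
  if _n_ eq 1 then do;
    /* hash table: key k is a data value, flag is its cluster number */
    declare hash ha(hashexp:10);
    ha.definekey('k');
    ha.definedata('flag');
    ha.definedone();
  end;
  set temp;
  array var{*} a b c d;
  /* check whether any value of this row has been seen before */
  do i=1 to dim(var);
    k=var{i};
    rc=ha.check();
    if rc eq 0 then do;
      found=1;
      _k=k;
    end;
  end;
  if found then do;
    /* reuse the cluster stored for the matching value */
    k=_k;
    ha.find();
    cluster_id=flag;
  end;
  else do;
    /* no match: start a new cluster and store its non-missing values */
    _flag+1;
    flag=_flag;
    cluster_id=flag;
    do j=1 to dim(var);
      k=var{j};
      if not missing(k) then ha.replace();
    end;
  end;
run;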



Ksharp
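
Both steps above record a row's values only when the row opens a new cluster, so a row that joins cluster 1 through one shared value does not make its other values known to later rows. If that linking is wanted (an assumption, the question does not say), a minimal sketch with a hash table could look like the following. The names `want_linked`, `seen`, `id` and `next_id` are made up for the illustration.

```sas
/* Sketch only: a matched row also registers its remaining values   */
/* under the cluster it joined (an assumed requirement).            */
data want_linked(keep=a b c d cluster_id);
  length k id 8;
  if _n_ = 1 then do;
    declare hash seen();
    seen.definekey('k');
    seen.definedata('id');
    seen.definedone();
    call missing(k, id);
  end;
  set temp;
  array v{*} a b c d;

  /* Look for the first value that already belongs to a cluster. */
  cluster_id = .;
  do i = 1 to dim(v) while (missing(cluster_id));
    k = v{i};
    if not missing(k) then
      if seen.find() = 0 then cluster_id = id;
  end;

  /* No match, or an all-blank row: open a new cluster. */
  if missing(cluster_id) then do;
    next_id + 1;
    cluster_id = next_id;
  end;

  /* Register every non-missing value that is not yet in the table. */
  do i = 1 to dim(v);
    k = v{i};
    if not missing(k) then
      if seen.check() ne 0 then do;
        id = cluster_id;
        seen.add();
      end;
  end;
run;
```

On the sample data this should give the same cluster_id values as listed in the question. It still does not merge two clusters that a later row happens to connect; that would need a second pass or a union-find style structure.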
