R_Win
Calcite | Level 5


Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
. . . .
. . . .
89 78 23 3
90 91 92 93
;
run;

Hi, I want to create a cluster number from the variables a, b, c and d. A row should get an existing cluster number if at least one of its values matches a value in any row of that cluster.

The rows I have given cluster id 1 have something in common. If a row does not match anywhere, it should get a new number, and if all four values are blank,

each such row should also get its own new number. This is what I want:

output
a  b  c  d  cluster_id
1  2  3  4  1
2  3  4  1  1
5  6  7  8  2
10 11 12 1  1
.  .  .  .  3
.  .  .  .  4
89 78 23 3  1
90 91 92 93 5

art297
Opal | Level 21

I don't understand what you are trying to do. Your example, IMHO, didn't provide enough information.

R_Win
Calcite | Level 5

output
a  b  c  d  cluster_id
1  2  3  4  1   *cluster id is 1 because this is the initial row*
2  3  4  1  1   *cluster id is 1 because 2, 3 and 4 already appear in cluster 1*
5  6  7  8  2   *cluster id is 2 because no value matches cluster 1*
10 11 12 1  1   *cluster id is 1 because the value 1 matches cluster 1*
.  .  .  .  3   *cluster id is 3 because the row is all blank*
.  .  .  .  4   *cluster id is 4 because the row is all blank*
89 78 23 3  1   *cluster id is 1 because the value 3 matches cluster 1, so cluster 1 is assigned again*
90 91 92 93 5   *cluster id is 5 because it has no relation to the rows above*

art297
Opal | Level 21

I'm not sure if this is what you want, and I am sure that the following can be simplified, but does it come close to what you are trying to accomplish?:

Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
. . . .
. . . .
89 78 23 3
90 91 92 93
;
run;

data want (keep=cluster a b c d);
  set temp;
  array values(99999);
  array inval(*) a b c d;
  retain values:;
  if _n_ eq 1 then do;
    hcluster+1;
    do i=1 to dim(inval);
      if not missing(inval(i)) then do;
        values(inval(i))=hcluster;
      end;
    end;
    cluster=hcluster;
  end;
  else do;
    if missing(max(of inval(*))) then do;
      hcluster+1;
      cluster=hcluster;
    end;
    else do;
      do i=1 to dim(inval);
        if not missing(values(inval(i))) then do;
          cluster=values(inval(i));
          i=dim(inval)+1;
        end;
      end;
      if missing(cluster) then do;
        hcluster+1;
        cluster=hcluster;
        do i=1 to dim(inval);
          values(inval(i))=cluster;
        end;
      end;
    end;
  end;
run;
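
The array lookup above could also be sketched with a temporary array, which SAS retains automatically and which does not add 99,999 variables to the data step. This is only a sketch, not a tested replacement: the names want_tmp and seen are made up for illustration, and it assumes every non-missing value is a whole number between 1 and 99999. It also skips missing values inside a partly blank row, which the sample data does not contain.

```sas
/* Sketch only: want_tmp and seen are illustrative names.                  */
/* Assumes every non-missing value is an integer between 1 and 99999.      */
data want_tmp (keep=a b c d cluster);
  set temp;
  array seen(99999) _temporary_;  /* seen(v) = cluster that first used value v */
  array inval(*) a b c d;

  /* take the cluster of the first value that is already registered */
  do i = 1 to dim(inval) while (missing(cluster));
    if not missing(inval(i)) then cluster = seen(inval(i));
  end;

  /* no match, or an all-missing row: open a new cluster and register its values */
  if missing(cluster) then do;
    hcluster + 1;
    cluster = hcluster;
    do i = 1 to dim(inval);
      if not missing(inval(i)) then seen(inval(i)) = cluster;
    end;
  end;
run;
```

Because cluster is a new variable, it is reset to missing at the start of each iteration, so the first row needs no special case.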

Ksharp
Super User

I prefer to use a hash table.

Data Temp;
input a b c d;
cards;
1 2 3 4
2 3 4 1
5 6 7 8
10 11 12 1
. . . .
. . . .
89 78 23 3
90 91 92 93
;
run;
data want(drop=rc i j k _k found flag _flag);
 if _n_ eq 1 then do;
  declare hash ha(hashexp:10);
  ha.definekey('k');
  ha.definedata('flag');
  ha.definedone();
 end;
 set temp;
 array var{*} a b c d;
 do i=1 to dim(var);
  k=var{i};
  rc=ha.check();
  if rc eq 0 then do;
   found=1;
   _k=k;
  end;
 end;
 if found then do;
  k=_k;
  ha.find();
  cluster_id=flag;
 end;
 else do;
  _flag+1;
  flag=_flag;
  cluster_id=flag;
  do j=1 to dim(var);
   k=var{j};
   if not missing(k) then ha.replace();
  end;
 end;
run;
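
Both solutions above register a row's values only when the row opens a new cluster. If values that arrive with a matched row (for example 10, 11 and 12 in the fourth row) should also count toward that cluster for later rows, a variant along these lines might work. That requirement is an assumption and is not stated in the question; the names want_ext, seen, cid and next_id are made up for illustration. The sketch does not merge two existing clusters when one row touches both: the first match wins.

```sas
/* Sketch only: assumes values seen in a matched row should join its cluster. */
data want_ext (keep=a b c d cluster_id);
  if _n_ = 1 then do;
    declare hash seen();          /* key: a value, data: the cluster that owns it */
    seen.defineKey('k');
    seen.defineData('cid');
    seen.defineDone();
    call missing(k, cid);
  end;
  set temp;
  array v{*} a b c d;

  /* the first value that is already registered decides the cluster */
  do i = 1 to dim(v) while (missing(cluster_id));
    k = v{i};
    if not missing(k) then do;
      if seen.find() = 0 then cluster_id = cid;
    end;
  end;

  /* no match, or an all-missing row: open a new cluster */
  if missing(cluster_id) then do;
    next_id + 1;
    cluster_id = next_id;
  end;

  /* register every non-missing value of this row, matched or not;     */
  /* add() leaves a value alone if another cluster already claimed it  */
  cid = cluster_id;
  do i = 1 to dim(v);
    k = v{i};
    if not missing(k) then rc = seen.add();
  end;
run;
```

On the eight sample rows this should give the same cluster ids as requested in the question, since none of the values added by matched rows appear again later.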



Ksharp
