BookmarkSubscribeRSS Feed
My_SAS
Calcite | Level 5

i want to get one number for identical records and i want to get for the incremenatal runs alos

i am having firstdataset

firstdataset:
name age
a     10
b     10
c     11
d     11
e     12


output:

name age  cluster_no
a     10  1
b     10  1
c     11  2
d     11  2
e     12  3


For incremental data
----------------------
In this it should take the max of cluster_no+1 and should do the same if any of the records match with the existing records it should
give the same cluster_no and the incremental data should append to the exsting data

eample data for incrdataset

name age
e     13
f     12
g     14
h     14


Final output of both
name age  cluster_no
a     10  1
b     10  1
c     11  2
d     11  2
e     12  3
e     13  4(This should start with the max(cluster_no+1)
f     12  3(As is was alredy assined in the first dataset)
g     14  5
h     14  5

please help me in this


6 REPLIES 6
art297
Opal | Level 21

You could do it with either an array or a hash object.  Here is one way to do it with an array:

data have (drop=clusters: counter);

  input name $ age;

  array clusters(150);

  retain clusters:;

  if missing(clusters(age)) then do;

    counter+1;

    clusters(age)=counter;

    cluster_num=counter;

  end;

  else cluster_num=clusters(age);

  cards;

a     10

b     10

c     11

d     11

e     12

;

data incrdataset;

  input name $ age;

  cards;

e     13

f     12

g     14

h     14

;

data want (drop=clusters: counter x);

  set have indsname=fname incrdataset;

  array clusters(150);

  retain clusters:;

  if fname ne lag(fname) then do;

    x+1;

    if x gt 1 then counter=max(of clusters(*));

  end;

  if x eq 1 then clusters(age)=cluster_num;

  else do;

    if missing(clusters(age)) then do;

      counter+1;

      clusters(age)=counter;

      cluster_num=counter;

    end;

    else cluster_num=clusters(age);

  end;

run;

Haikuo
Onyx | Level 15

Art, when you put "array" and "hash" in the same sentence, that reminds me that "hash" used to be called "associative array".

Haikuo

joehinson
Calcite | Level 5

Here's one hash solution:

data want(keep=name age cluster_no);

     if _N_=1 then do;

          if(1=2)then set have;

          cluster_no=1;

          declare hash test();

          rc=test.defineKey("age");

          rc=test.defineData("cluster_no");

          test.defineDone();

     end;

    

     do until(done);

          set have end=done;

          rcf=test.find();

          if rcf ne 0 then do;

               rca=test.add();

               cluster_no=test.num_items;

               rcr=test.replace();

               call missing(cluster_no);

          end;

          rcf=test.find();

          output;

     end;

     stop;

run;

Ksharp
Super User

It is easy For Hash Table.

data have ;
input name $ age ;
cards;
a     10  
b     10  
c     11  
d     11  
e     12  
e     13  
f     12  
g     14  
h     14  
;
run;
data want;
 if _n_ eq 1 then do;
  declare hash ha();
   ha.definekey('age');
   ha.definedata('no');
   ha.definedone();
 end;
set have;
if ha.find() ne 0 then do;n+1;no=n;ha.add();end;
drop n;
run;


Ksharp

Haikuo
Onyx | Level 15

Nice, Ksharp! Learn some!

My_SAS
Calcite | Level 5

Thqs Art and ksharp for your reply it worked

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2168 views
  • 2 likes
  • 5 in conversation