DATA Step, Macro, Functions and more

Give one number for matching Records

Reply
Contributor
Posts: 37

Give one number for matching Records

i want to get one number for identical records and i want to get for the incremenatal runs alos

i am having firstdataset

firstdataset:
name age
a     10
b     10
c     11
d     11
e     12


output:

name age  cluster_no
a     10  1
b     10  1
c     11  2
d     11  2
e     12  3


For incremental data
----------------------
In this it should take the max of cluster_no+1 and should do the same if any of the records match with the existing records it should
give the same cluster_no and the incremental data should append to the exsting data

eample data for incrdataset

name age
e     13
f     12
g     14
h     14


Final output of both
name age  cluster_no
a     10  1
b     10  1
c     11  2
d     11  2
e     12  3
e     13  4(This should start with the max(cluster_no+1)
f     12  3(As is was alredy assined in the first dataset)
g     14  5
h     14  5

please help me in this


PROC Star
Posts: 7,363

Re: Give one number for matching Records

You could do it with either an array or a hash object.  Here is one way to do it with an array:

data have (drop=clusters: counter);

  input name $ age;

  array clusters(150);

  retain clusters:;

  if missing(clusters(age)) then do;

    counter+1;

    clusters(age)=counter;

    cluster_num=counter;

  end;

  else cluster_num=clusters(age);

  cards;

a     10

b     10

c     11

d     11

e     12

;

data incrdataset;

  input name $ age;

  cards;

e     13

f     12

g     14

h     14

;

data want (drop=clusters: counter x);

  set have indsname=fname incrdataset;

  array clusters(150);

  retain clusters:;

  if fname ne lag(fname) then do;

    x+1;

    if x gt 1 then counter=max(of clusters(*));

  end;

  if x eq 1 then clusters(age)=cluster_num;

  else do;

    if missing(clusters(age)) then do;

      counter+1;

      clusters(age)=counter;

      cluster_num=counter;

    end;

    else cluster_num=clusters(age);

  end;

run;

Respected Advisor
Posts: 3,124

Re: Give one number for matching Records

Art, when you put "array" and "hash" in the same sentence, that reminds me that "hash" used to be called "associative array".

Haikuo

Contributor
Posts: 45

Re: Give one number for matching Records

Here's one hash solution:

data want(keep=name age cluster_no);

     if _N_=1 then do;

          if(1=2)then set have;

          cluster_no=1;

          declare hash test();

          rc=test.defineKey("age");

          rc=test.defineData("cluster_no");

          test.defineDone();

     end;

    

     do until(done);

          set have end=done;

          rcf=test.find();

          if rcf ne 0 then do;

               rca=test.add();

               cluster_no=test.num_items;

               rcr=test.replace();

               call missing(cluster_no);

          end;

          rcf=test.find();

          output;

     end;

     stop;

run;

Super User
Posts: 9,681

Re: Give one number for matching Records

It is easy For Hash Table.

data have ;
input name $ age ;
cards;
a     10  
b     10  
c     11  
d     11  
e     12  
e     13  
f     12  
g     14  
h     14  
;
run;
data want;
 if _n_ eq 1 then do;
  declare hash ha();
   ha.definekey('age');
   ha.definedata('no');
   ha.definedone();
 end;
set have;
if ha.find() ne 0 then do;n+1;no=n;ha.add();end;
drop n;
run;


Ksharp

Respected Advisor
Posts: 3,124

Re: Give one number for matching Records

Nice, Ksharp! Learn some!

Contributor
Posts: 37

Re: Give one number for matching Records

Thqs Art and ksharp for your reply it worked

Ask a Question
Discussion stats
  • 6 replies
  • 333 views
  • 2 likes
  • 5 in conversation