BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SR11
Obsidian | Level 7

I want to create a Unique ID for each person based on the persons Roll_ID and Batch_ID. Some have both of them and some have either Roll_ID or Batch_ID. So I need to create a Unique ID based on two variables. 

Data I have:

Roll_ID    Batch_ID    VAR1    VAR2

10012      708900          A         B

121001        -                 C         D

122110     540440         E          F

     -           327089         G         H 

      -         3324566        I           J

232255           -              K          L

 

Data I want: 

Roll_ID    Batch_ID    VAR1    VAR2  Unique_ID

10012      708900          A         B             1

121001        -                 C         D            2

122110     540440         E          F            3

     -           327089         G         H            4

      -         3324566        I           J            5

232255           -              K          L            6

 

Thank you 

 

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Reading the want-dataset it seems that each Unique_ID == _n_. If each combination of Rolll_ID and Batch_ID is unique, just use:

data want;
  set have;
  Unique_ID = _n_;
run;

If each combination of Roll_ID and Batch_ID can exists more than once, please post data in usable form, so that providing tested code is possible.

View solution in original post

3 REPLIES 3
andreas_lds
Jade | Level 19

Reading the want-dataset it seems that each Unique_ID == _n_. If each combination of Rolll_ID and Batch_ID is unique, just use:

data want;
  set have;
  Unique_ID = _n_;
run;

If each combination of Roll_ID and Batch_ID can exists more than once, please post data in usable form, so that providing tested code is possible.

Astounding
PROC Star

If it is possible that the same person appears on more than one observation, here's what you would need to do:

proc sort data=have;
   by roll_id batch_id;
run;

data want;
   set have;
   by roll_id batch_id;
   if first.batch_id then unique_id + 1;
run;
RichardDeVen
Barite | Level 11

You want a unique surrogate identifier based on the combination of ROLL_ID and BATCH_ID values.  Suppose each distinct combination is a key to the surrogates -- you would then be able to use a hash to maintain a list of keys and the surrogate.

 

Example:

 

data want;
  set have;
  if _n_ = 1 then do;
    declare hash combos();  * instantiate hash component object;
    combos.defineKey ('Roll_ID', 'Batch_ID');
    combos.defineData ('Combo_ID');
    combos.defineDone();
    call missing (Combo_ID);
  end;

if combos.find() ne 0 then do;
* new combination not previously encountered, create new surrogate id for the combination; Combo_ID = combos.count() + 1; combos.add(); end;

* when .find() returns 0, then key (the combination) exists in the hash
* and the values of the data elements (Combo_ID) is retrieved into the PDV;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 6420 views
  • 2 likes
  • 4 in conversation