I want to create a Unique ID for each person based on the persons Roll_ID and Batch_ID. Some have both of them and some have either Roll_ID or Batch_ID. So I need to create a Unique ID based on two variables.
Data I have:
Roll_ID Batch_ID VAR1 VAR2
10012 708900 A B
121001 - C D
122110 540440 E F
- 327089 G H
- 3324566 I J
232255 - K L
Data I want:
Roll_ID Batch_ID VAR1 VAR2 Unique_ID
10012 708900 A B 1
121001 - C D 2
122110 540440 E F 3
- 327089 G H 4
- 3324566 I J 5
232255 - K L 6
Thank you
Reading the want-dataset it seems that each Unique_ID == _n_. If each combination of Rolll_ID and Batch_ID is unique, just use:
data want;
set have;
Unique_ID = _n_;
run;
If each combination of Roll_ID and Batch_ID can exists more than once, please post data in usable form, so that providing tested code is possible.
Reading the want-dataset it seems that each Unique_ID == _n_. If each combination of Rolll_ID and Batch_ID is unique, just use:
data want;
set have;
Unique_ID = _n_;
run;
If each combination of Roll_ID and Batch_ID can exists more than once, please post data in usable form, so that providing tested code is possible.
If it is possible that the same person appears on more than one observation, here's what you would need to do:
proc sort data=have;
by roll_id batch_id;
run;
data want;
set have;
by roll_id batch_id;
if first.batch_id then unique_id + 1;
run;
You want a unique surrogate identifier based on the combination of ROLL_ID and BATCH_ID values. Suppose each distinct combination is a key to the surrogates -- you would then be able to use a hash to maintain a list of keys and the surrogate.
Example:
data want; set have; if _n_ = 1 then do; declare hash combos(); * instantiate hash component object; combos.defineKey ('Roll_ID', 'Batch_ID'); combos.defineData ('Combo_ID'); combos.defineDone(); call missing (Combo_ID); end;
if combos.find() ne 0 then do;
* new combination not previously encountered, create new surrogate id for the combination; Combo_ID = combos.count() + 1; combos.add(); end;
* when .find() returns 0, then key (the combination) exists in the hash
* and the values of the data elements (Combo_ID) is retrieved into the PDV;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.