Let's say I have a data set of names and I would like to assign each name a unique ID.
Tom
Mary
Jill
Tom
How can I make sure Tom gets the same ID even though he shows up twice in the data set?
You could do this with a hash object that maps each name to its ID:
data have;
input name $;
datalines;
Tom
Mary
Jill
Tom
;
data want;
if _N_ = 1 then do;
dcl hash h();
h.definekey('name');
h.definedata('id');
h.definedone();
end;
set have;
if h.find() ne 0 then do;
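/* find() overwrites id on a match, so a running "id + 1" could reuse an id; */
/* number new names by the count of keys already stored instead */
id = h.num_items + 1;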
h.add();
end;
run;
Result:
Obs  name  id
1    Tom   1
2    Mary  2
3    Jill  3
4    Tom   1
Also, if you don't care whether the IDs are sequential and start from one, you could derive the ID from an MD5 digest of the name, like this:
data have;
input name $;
datalines;
Tom
Mary
Jill
Tom
;
data want;
set have;
id = input(md5(name), pib3.);
run;
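One caveat with the MD5 approach: pib3. reads only the first three bytes of the 16-byte digest, so two different names could in principle end up with the same id. A minimal sketch of a check, assuming the want data set produced by the step above:

```sas
/* Lists every id that is shared by more than one distinct name. */
/* An empty result means no collisions in this data set.         */
proc sql;
    select id, count(distinct name) as n_names
    from want
    group by id
    having count(distinct name) > 1;
quit;
```

If rows come back, a wider informat (for example pib6.) reads more of the digest and makes collisions less likely, though it still cannot rule them out.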
Create a lookup table and make a format that can be applied to all tables.
Example is here:
https://gist.github.com/statgeek/fd94b0b6e78815430c1340e8c19f8644
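As a rough illustration of the lookup-table idea, here is a minimal sketch. It is not taken from the linked gist: the data set names (names, fmt_in) and the format name ($nameid) are hypothetical, and the have data set is assumed to be the one from the earlier answers.

```sas
/* 1. Lookup table: one row per distinct name */
proc sort data=have(keep=name) out=names nodupkey;
    by name;
run;

/* 2. Turn the lookup table into a CNTLIN data set for PROC FORMAT. */
/*    START is the name, LABEL is a sequential id stored as text.   */
data fmt_in;
    set names(rename=(name=start));
    retain fmtname 'nameid' type 'C';   /* character format $nameid (hypothetical name) */
    length label $8;
    label = cats(_n_);
run;

proc format cntlin=fmt_in;
run;

/* 3. Apply the same format to any table that has a name column */
data want;
    set have;
    id = input(put(name, $nameid.), 8.);
run;
```

Because the lookup table is sorted, ids follow alphabetical order here (Jill, Mary, Tom) rather than order of first appearance. The advantage is that the same format can be reused across tables, so a given name maps to the same id everywhere.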