BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
suzannep
Calcite | Level 5

Hello, 

 

I am trying to create a unique identifier using parts of first and last name, DOB and Gender.  I want the first 2 letters of the first_name & the first 2 letters of the last_name & last 2 letters of the last_name & DOB with no slashes & Gender as either M or F.

 

For example:

First_name        Last_name               DOB               Gender            ID_number

John                   Smith             01/02/1995             M                      JOSMTH01021995M

Mary                   Jackson         03/15/1975             F                      MAJAON03151975F 

 

Any help would be greatly appreciated!! 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
data want;
    set have;
    id = cats(substr(first_name,1,2),substr(last_name,1,2),
        reverse(substr(reverse(trim(last_name)),1,2)),compress(dob,'/'),gender);
run;
--
Paige Miller

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

Assuming DOB is an actual numeric SAS date value and not a character string

 

data want;
    set have;
    id = cats(substr(first_name,1,2),substr(last_name,1,2),
        reverse(substr(reverse(trim(last_name)),1,2)),put(dob,mmddyyn8.),gender);
run;

If the last name has less than 4 letters, such as the name "Ty Law", then your ID would begin with TyLaaw (the letter A being repeated by this algorithm)

--
Paige Miller
suzannep
Calcite | Level 5

Thank you! my date is a character string. What is the format to use so that there are no / in the id?

PaigeMiller
Diamond | Level 26
data want;
    set have;
    id = cats(substr(first_name,1,2),substr(last_name,1,2),
        reverse(substr(reverse(trim(last_name)),1,2)),compress(dob,'/'),gender);
run;
--
Paige Miller
ballardw
Super User

I would NOT call this a "unique id". In one data set I work with of about 5000 individuals I had two people with the same first and last name, birth date and gender. I knew they were not the same person from other data such as residence.

 

Since Joel Smutny would get the same letters as John Smith, your algorithm is very likely to have issues with the "uniqueness" if the number of people involved is not even very large.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 4027 views
  • 2 likes
  • 3 in conversation