BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kels123
Quartz | Level 8

I would like to create a variable from first name, last name, and date of birth that looks like this:

 

First letter of first name + 3rd letter of first name + first letter of last name + 3rd letter of last name + DOB (mmddyy) + indicator of race (1-4) + Letter U

 

For example, for a person with this data:

First_name    Last_name   DOB                 Race

Sarah            Smith            09/01/1999      2

 

I want to create a variable that looks like this:

SRSI0901992U

 

Any help would be greatly appreciated. Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I would not recommend using data like that as an ID.  Data errors and other problems (not to mention possible duplicates) will render the ID useless.  It is technically possible:

data want;
  set have;
  length id $100;
  id=upcase(cats(substr(first_name,1,1),substr(first_name,3,1),put(dob,ddmmyy8.),put(race,1.),"U"));
run;

But I really wouldn't recommend it.

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I would not recommend using data like that as an ID.  Data errors and other problems (not to mention possible duplicates) will render the ID useless.  It is technically possible:

data want;
  set have;
  length id $100;
  id=upcase(cats(substr(first_name,1,1),substr(first_name,3,1),put(dob,ddmmyy8.),put(race,1.),"U"));
run;

But I really wouldn't recommend it.

Kels123
Quartz | Level 8

I agree that this identifier is sub-optimal, but unfortunately I need to do a datamatch with a database that only has this identifier available.

 

One quick follow-up question: How do I remove the slashes from the identifier? 

 

I am currently using this code:

 

data match.mydatabase3;

  set match.mydatabase3;

  length URN $100;

 URN=upcase(cats(substr(first_name,1,1),substr(first_name,3,1),substr(last_name,1,1),substr(last_name,3,1),put(dob,MMDDYY8.),put(Gender_forTLCmatch,1.),"U"));

run;

 

Which produces this:  SRSI09/01/992U

 

Instead of this (which is what I want):  SRSI0901992U

 

Thanks very much.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Just add a compress:

 URN=compress(upcase(cats(substr(first_name,1,1),substr(first_name,3,1),substr(last_name,1,1),substr(last_name,3,1),put(dob,MMDDYY8.),put(Gender_forTLCmatch,1.),"U")),"/");
ChrisHemedinger
Community Manager

You didn't say what your real objective is.  Maybe this format is prescribed by someone else, and that's just the format you need. 

[Edit: now I see you're stuck with the format you asked for...but I'll leave this response for others who have a similar situation.]

 

BUT, if the real goal is to anonymize these fields such that the personal info is obscured, yet the original value can be checked/verified against this encoded value, then you probably want to use a one-way hash.

 

SAS supports an industry-standard function (SHA256) that encrypts the value you supply.  Assuming you have the original fields stored off somewhere else in a more secure table, you can always re-connect that personal data with the encrypted data because the calculated hash will be the same.  But a person with just access to the hash value will not be able to determine the original value.

 

Something like:

  

hash = sha256(cats(first_name,last_name,put(dob,ddmmyy8.)));

 

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2274 views
  • 0 likes
  • 3 in conversation