BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kels123
Quartz | Level 8

I would like to create a variable from first name, last name, and date of birth that looks like this:

 

First letter of first name + 3rd letter of first name + first letter of last name + 3rd letter of last name + DOB (mmddyy) + indicator of race (1-4) + Letter U

 

For example, for a person with this data:

First_name    Last_name   DOB                 Race

Sarah            Smith            09/01/1999      2

 

I want to create a variable that looks like this:

SRSI0901992U

 

Any help would be greatly appreciated. Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I would not recommend using data like that as an ID.  Data errors and other problems (not to mention possible duplicates) will render the ID useless.  It is technically possible:

data want;
  set have;
  length id $100;
  id=upcase(cats(substr(first_name,1,1),substr(first_name,3,1),put(dob,ddmmyy8.),put(race,1.),"U"));
run;

But I really wouldn't recommend it.

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I would not recommend using data like that as an ID.  Data errors and other problems (not to mention possible duplicates) will render the ID useless.  It is technically possible:

data want;
  set have;
  length id $100;
  id=upcase(cats(substr(first_name,1,1),substr(first_name,3,1),put(dob,ddmmyy8.),put(race,1.),"U"));
run;

But I really wouldn't recommend it.

Kels123
Quartz | Level 8

I agree that this identifier is sub-optimal, but unfortunately I need to do a datamatch with a database that only has this identifier available.

 

One quick follow-up question: How do I remove the slashes from the identifier? 

 

I am currently using this code:

 

data match.mydatabase3;

  set match.mydatabase3;

  length URN $100;

 URN=upcase(cats(substr(first_name,1,1),substr(first_name,3,1),substr(last_name,1,1),substr(last_name,3,1),put(dob,MMDDYY8.),put(Gender_forTLCmatch,1.),"U"));

run;

 

Which produces this:  SRSI09/01/992U

 

Instead of this (which is what I want):  SRSI0901992U

 

Thanks very much.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Just add a compress:

 URN=compress(upcase(cats(substr(first_name,1,1),substr(first_name,3,1),substr(last_name,1,1),substr(last_name,3,1),put(dob,MMDDYY8.),put(Gender_forTLCmatch,1.),"U")),"/");
ChrisHemedinger
Community Manager

You didn't say what your real objective is.  Maybe this format is prescribed by someone else, and that's just the format you need. 

[Edit: now I see you're stuck with the format you asked for...but I'll leave this response for others who have a similar situation.]

 

BUT, if the real goal is to anonymize these fields such that the personal info is obscured, yet the original value can be checked/verified against this encoded value, then you probably want to use a one-way hash.

 

SAS supports an industry-standard function (SHA256) that encrypts the value you supply.  Assuming you have the original fields stored off somewhere else in a more secure table, you can always re-connect that personal data with the encrypted data because the calculated hash will be the same.  But a person with just access to the hash value will not be able to determine the original value.

 

Something like:

  

hash = sha256(cats(first_name,last_name,put(dob,ddmmyy8.)));

 

It's time to register for SAS Innovate! Join your SAS user peers in Las Vegas on April 16-19 2024.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1789 views
  • 0 likes
  • 3 in conversation