Desktop productivity for business analysts and programmers

Concatenate variables

Accepted Solution Solved
Reply
Contributor
Posts: 23
Accepted Solution

Concatenate variables

What is the best way to concatenate the following: I am trying to create a unique identifier to link two datasets. I am exploring the option of using SSN and DOB {not all observations have a valid SSN recorded; hence why I am adding DOB. DOB is currently in the MMDDYY10. format. SSN is currently in character format.

 

What is the best way to concatenate these two variables into a unique identifier, eliminating blanks and without distorting the date information?

Also, how does this handle missing data (say if a DOB is missing)?

 

Thanks!


Accepted Solutions
Solution
‎05-22-2018 05:51 PM
PROC Star
Posts: 1,604

Re: Concatenate variables

Posted in reply to novinosrin

And to remove the forward '/' 

 

data test;
ssn='123456789';
dob='02jan1993'd;
if not missing(dob) then new_var=cats(ssn,compress(put(dob,mmddyy10. -l),'/'));
else new_var=ssn;
put new_var=;
run;

View solution in original post


All Replies
PROC Star
Posts: 1,604

Re: Concatenate variables

[ Edited ]

are you after

 

 if not missing(dob) then new_var=cats(ssn,put(dob,mmddyy10. -l));

else new_var=ssn;

 

Contributor
Posts: 23

Re: Concatenate variables

Posted in reply to novinosrin

Hi there... so that code did not do anything for the unique id (it simply outputted the ssn without the dob included).

 

Ideally if someone's SSN is 123456789 and their DOB is 01/02/1993 - then the uniqueID would be: 12345678901021993.

 

Does that make sense?

Super User
Posts: 23,354

Re: Concatenate variables

Can you post the exact code you used please and some sample records. Since you don't provide sample data, the solutions are untested, of course.

PROC Star
Posts: 1,604

Re: Concatenate variables

Yes, please provide us a sample of your data, your output, the code you used and your log,

here is the test with my assumptions using your narrative:

 

122 data test;
123 ssn='123456789';
124 dob='02jan1993'd;
125 if not missing(dob) then new_var=cats(ssn,put(dob,mmddyy10. -l));
126 else new_var=ssn;
127 put new_var=;
128 run;

new_var=12345678901/02/1993
NOTE: The data set WORK.TEST has 1 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

 

Solution
‎05-22-2018 05:51 PM
PROC Star
Posts: 1,604

Re: Concatenate variables

Posted in reply to novinosrin

And to remove the forward '/' 

 

data test;
ssn='123456789';
dob='02jan1993'd;
if not missing(dob) then new_var=cats(ssn,compress(put(dob,mmddyy10. -l),'/'));
else new_var=ssn;
put new_var=;
run;
Super User
Posts: 23,354

Re: Concatenate variables

Posted in reply to novinosrin

@novinosrin mmddyy10 -> 10 implies dashes, 8 would not include dashes, so try mmddyy8. instead perhaps?

PROC Star
Posts: 1,604

Re: Concatenate variables

@Reeza Love your attention to detail.. Caffeine effect after 6 cups of large pike at Starbucks on student discount Smiley Happy  Thank you though

Contributor
Posts: 23

Re: Concatenate variables

Posted in reply to novinosrin

Looks like a had a slight typo when I first tried it. This worked for me!

Super User
Posts: 13,358

Re: Concatenate variables

One thought to consider: You might want to include something like an underscore between the ssn portion and the date portion.

123456789_01021993

might be easier to identify the date part when humans need to fix something like a short SSN. Also if the SSN is missing then a value like

_11021993 tells you right away the SSN is completely missing where

11011993 could be confused with an incomplete SSN and a missing DOB.

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 167 views
  • 2 likes
  • 4 in conversation