Hello,
How can we go from
Id | LastName1 | LastName2 | FirstName1 | FirstName2 |
1 | Smith | Garcia | James | Michael |
2 | Jack | Tom | Oliver |
to
Id | LastName | FirstName |
1 | Smith | James |
1 | Smith | Michael |
1 | Garcia | James |
1 | Garcia | Michael |
2 | Jack | Tom |
2 | Jack | Oliver |
Thank you for help !
Regards,
data have;
infile cards expandtabs;
input Id (LastName1 LastName2 FirstName1 FirstName2 ) ($);
cards;
1 Smith Garcia James Michael
2 Jack . Tom Oliver
;
run;
data want;
set have;
if not missing(LastName1) then do;
lastname=LastName1;firstname=FirstName1;output;firstname=FirstName2;output;
end;
if not missing(LastName2) then do;
lastname=LastName2;firstname=FirstName1;output;firstname=FirstName2;output;
end;
drop LastName1 LastName2 FirstName1 FirstName2;
run;
IMO the simplest would be data step with assignment and explicit OUTPUT statements.
First step would be to establish the rest of the rules. You have shown what to do when LASTNAME2 is the only variable with a missing value. You would have to establish what should happen for every pattern of missing vs. nonmissing values. For example, what happens if:
These aren't the only possibilities. They're just examples of what needs to be spelled out before the programming can begin.
We are going to use the final table to send mails.
If a person has different firstnames or lastnames, we have to send as many letters to reach the person.
If the person has 2 firstnames and only one lastname there will be 2 lines etc...
I don't know if i am clear sorry
OK, here's one way to get there.
data want;
set have;
if firstname1 > ' ' then do;
firstname = firstname1;
if lastname1 > ' ' then do;
lastname = lastname1;
output;
end;
if lastname2 > ' ' then do;
lastname = lastname2;
output;
end;
end;
if firstname2 > ' ' then do;
firstname = firstname2;
if lastname1 > ' ' then do;
lastname = lastname1;
output;
end;
if lastname2 > ' ' then do;
lastname = lastname2;
output;
end;
end;
keep id firstname lastname;
run;
It's a little cluttered, but it's easy to write, read, and interpret.
Thanks for your answer.
I knew about this solution is not there a simple way to do it? Because besides firstnames and lastnames, I have 2 adresses 2 birth dates etc...
Thanks
If you re-shape your data, it is likely you could use SQL to get all combinations. But are you sure you want to? So far, you are talking about 4 variables There may be more. That means you have 16 pieces of mail for each observation in your data set. Is that a good idea? If you add a 5th variable, that increases to 32 pieces of mail per observation.
data have;
infile cards expandtabs;
input Id (LastName1 LastName2 FirstName1 FirstName2 ) ($);
cards;
1 Smith Garcia James Michael
2 Jack . Tom Oliver
;
run;
data want;
set have;
if not missing(LastName1) then do;
lastname=LastName1;firstname=FirstName1;output;firstname=FirstName2;output;
end;
if not missing(LastName2) then do;
lastname=LastName2;firstname=FirstName1;output;firstname=FirstName2;output;
end;
drop LastName1 LastName2 FirstName1 FirstName2;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.