Hello I am trying to merge two data sets with a variable that is repeated in both data sets. For example;
data cm;
infile cards;
input pid 1-3 @5 date date9. medication $15-35;
format date date9.;
cards;
101 16Nov2009 GCNAAT
101 17Nov2009 GCCULTURE
102 16Nov2009 GCNAAT
102 17Nov2009 GCNAAT
102 18Nov2009 GCCULTURE
103 18Nov2009GCCULTURE
104 19Nov2009 GCCULTURE
105 17Nov2009 GCNAAT
run;
data ae;
input pid 1-3 @5 date date9. event $ 15-35;
format date mmddyy10.;
cards;
101 16Nov2009 AZE
102 16Nov2009 GEN
102 17Nov2009 FLU
102 18Nov2009MEN
103 17Nov2009 GEN
103 18Nov2009 AZI
105 17Nov2009 MEN
run;
Need your help please.
Thank you for your quick response. This was just an example. I have many variables like age , sex , zip etc in the data sets I am trying to merge. So do I need to list all of the variables in select like you have below?
Expected output?
would like to have table without losing data from both table.
There still needs to be more clarification.
Record 102 has 3 records in table 1 and 3 records in table 2.
How many records do you expect as output? What exactly are you expecting as output?
@Dhana18 wrote:
would like to have merged data set without losing data from both table.
Thank you. I am trying to merge them by id and date, so if any of these (id or date) are present in any of these datasets then I want them in the merged data set.
proc sql;
create table want as
select t1.pid as cm_PID, t1.date as cm_DATE, t1.medication,
t2.pid as ae_PID, t2.date as ae_date, t2.event
from cm as t1
full join ae as t2
on t1.pid=t2.pid and t1.date=t2.date;
quit;
Thank you for your quick response. This was just an example. I have many variables like age , sex , zip etc in the data sets I am trying to merge. So do I need to list all of the variables in select like you have below?
@Dhana18 wrote:
Thank you for your quick response. This was just an example. I have many variables like age , sex , zip etc in the data sets I am trying to merge. So do I need to list all of the variables in select like you have below?
Depends. You can use wild card references but then you'll have issues with any duplicates.
select t1.*, t2.*
This selects all columns but if you have columns with the same name it's problematic.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.