@zzfsimona wrote:
Hi there,
I am using SAS 9.4.
I have multiple datasets titled as 2007q1, 2007q2, 2007q3, ..., 2017q3, 2017q4. I also have a dataset titled Cohort. All dataset contain the variable patid.
I want to merge all those &yr.q&qr. datasets with the dataset Cohort by the common variable patid. There should be only 1 final merged dataset. The merged file should contain all and only patids that are in the dataset Cohort.
Here is my code:
%macro mergedata;
proc sort data=data.cohort; by patid; run;
%do yr=2007 %to 2017;
%do qr=1 %to 4;
proc sort data=in1perc.ses_diag&yr.q&qr.;
by patid;
run;
data data.merged;
merge
data.cohort (in=ina)
in1perc.&yr.q&qr. (in=inb);
by patid;
if ina and inb;
run;
%end;
%end;
%mend;
%mergedata
I know there is something wrong with the ina and inb part, but I cannot figure out how to fix it.
The log file shows that the next dataset in the loop keeps replacing the previous merged dataset, instead of adding into it.
Here is the log for iterations 2007Q4 and 2008Q1. Note how the total observations has decreased in data.Merged.
These two lines don't match:
proc sort data=in1perc.ses_diag&yr.q&qr.; in1perc.&yr.q&qr. (in=inb)
did you mean to use the sored data? as the data set names are not the same.
I suggest that you do with two data sets manually and see if the results are as desired. I suspect not as MERGE is going to replace like-named variable values.
I suspect that what you actually want to do is append all of the 2007q1, 2007q2 etc sets together and then select.
Some thing like
data work.temp;
set in1perc.ses_diag: ;
run;
proc sql;
create table data.merged as
select b.*
from data.cohort as a
left join
work.temp
on a.patid=b.patid
;
quit;
No claims for efficiency except in short code if this actually accomplishes what you want.
The set in1perc.ses_diag: ; will append all data sets in the in1perc library that start with ses_diag.
... View more