BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
zzfsimona
Fluorite | Level 6

Hi there, 

I am using SAS 9.4.

I have multiple datasets titled as 2007q1, 2007q2, 2007q3, ..., 2017q3, 2017q4. I also have a dataset titled Cohort. All dataset contain the variable patid. 

I want to merge all those &yr.q&qr. datasets with the dataset Cohort by the common variable patid. There should be only 1 final merged dataset. The merged file should contain all and only patids that are in the dataset Cohort. 

Here is my code:

%macro mergedata;
	proc sort data=data.cohort; by patid; run;
	
	%do yr=2007 %to 2017;
		%do qr=1 %to 4;
			proc sort data=in1perc.ses_diag&yr.q&qr.; 
				by patid; 
			run;
			data data.merged;
				merge
				data.cohort (in=ina)
				in1perc.&yr.q&qr. (in=inb);
				by patid;
				if ina and inb;
			run;
		%end;
	%end;
%mend;

%mergedata

I know there is something wrong with the ina and inb part, but I cannot figure out how to fix it. 

The log file shows that the next dataset in the loop keeps replacing the previous merged dataset, instead of adding into it.

Here is the log for iterations 2007Q4 and 2008Q1. Note how the total observations has decreased in data.Merged.

 

NOTE: There were 1687 observations read from the data set DATA.COHORT.
NOTE: There were 712195 observations read from the data set IN1PERC.2007Q4.
NOTE: The data set DATA.MERGED has 7185 observations and 17 variables.
NOTE: There were 1687 observations read from the data set DATA.COHORT.
NOTE: There were 731171 observations read from the data set IN1PERC.2008Q1.
NOTE: The data set DATA.MERGED has 7031 observations and 17 variables.

 How do I fix this? Thank you for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

If I'm interpreting the question correctly, you would need to make this sort of change.  First, delete your data.merged data set.  Then change the last part of your loop where you now have:

 

 

 data data.merged;
merge data.cohort (in=ina) in1perc.&yr.q&qr. (in=inb);
by patid; if ina and inb;
run;

 

Instead, replace that section with:

 


data merged;
merge data.cohort (in=ina)
      in1perc.&yr.q&qr. (in=inb);
      by patid;
      if ina and inb;
run;
proc append data=merged base=data.merged;
run;

You can run this, even when DATA.MERGED does not yet exist.  For that case, MERGED will get copied to DATA.MERGED.

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I haven't tested this as in a meeting, however maybe something like - needs all datasets in one library:

data merged;
  merge in1perc.ses_diag:;
  by patid;
  if ina and inb;
run;

Note the : after the prefix part, normally it takes anything with the prefix, but not sure about merge.

 

Question is why do you have lots of same data in dated files in the first place?  Every time when I see this I immediately think that the process before could be improved to give one dataset with date in the data - makes everyones life so much easier.

Astounding
PROC Star

If I'm interpreting the question correctly, you would need to make this sort of change.  First, delete your data.merged data set.  Then change the last part of your loop where you now have:

 

 

 data data.merged;
merge data.cohort (in=ina) in1perc.&yr.q&qr. (in=inb);
by patid; if ina and inb;
run;

 

Instead, replace that section with:

 


data merged;
merge data.cohort (in=ina)
      in1perc.&yr.q&qr. (in=inb);
      by patid;
      if ina and inb;
run;
proc append data=merged base=data.merged;
run;

You can run this, even when DATA.MERGED does not yet exist.  For that case, MERGED will get copied to DATA.MERGED.

ballardw
Super User

@zzfsimona wrote:

Hi there, 

I am using SAS 9.4.

I have multiple datasets titled as 2007q1, 2007q2, 2007q3, ..., 2017q3, 2017q4. I also have a dataset titled Cohort. All dataset contain the variable patid. 

I want to merge all those &yr.q&qr. datasets with the dataset Cohort by the common variable patid. There should be only 1 final merged dataset. The merged file should contain all and only patids that are in the dataset Cohort. 

Here is my code:

%macro mergedata;
	proc sort data=data.cohort; by patid; run;
	
	%do yr=2007 %to 2017;
		%do qr=1 %to 4;
			proc sort data=in1perc.ses_diag&yr.q&qr.; 
				by patid; 
			run;
			data data.merged;
				merge
				data.cohort (in=ina)
				in1perc.&yr.q&qr. (in=inb);
				by patid;
				if ina and inb;
			run;
		%end;
	%end;
%mend;

%mergedata

I know there is something wrong with the ina and inb part, but I cannot figure out how to fix it. 

The log file shows that the next dataset in the loop keeps replacing the previous merged dataset, instead of adding into it.

Here is the log for iterations 2007Q4 and 2008Q1. Note how the total observations has decreased in data.Merged.

 


These two lines don't match:

proc sort data=in1perc.ses_diag&yr.q&qr.; 
 
in1perc.&yr.q&qr. (in=inb)

did you mean to use the sored data? as the data set names are not the same.

 

 

I suggest that you do with two data sets manually and see if the results are as desired. I suspect not as MERGE is going to replace like-named variable values.

I suspect that what you actually want to do is append all of the 2007q1, 2007q2 etc sets together and then select.

Some thing like

data work.temp;
   set  in1perc.ses_diag: ;
run;

proc sql;
   create table data.merged as
   select b.*
   from data.cohort as a
        left join
        work.temp
        on a.patid=b.patid
   ;
quit;

No claims for efficiency except in short code if this actually accomplishes what you want.

 

 

The set in1perc.ses_diag: ; will append all data sets in the in1perc library that start with ses_diag.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 6091 views
  • 0 likes
  • 5 in conversation