BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lousam
Obsidian | Level 7

Hi,

 

I am working with multiple data sets. There are a total of 4 data sets per year (~250 variables total across all 4 datasets and I have to keep all of them),and I have 10 years of data.

 

I need help coming up with a macro to minimize the code I have to write.

 

Here is a list of the 4 data sets for year 2000 (I have 4 data sets each for years 2000-2015):

 

*List of data sets for year 2000;

clin_2000_phys_exam

clin_2000_lab

clin_2000_group

clin_2000_final

 

*List of data sets for year 2001;

clin_2001_phys_exam

clin_2001_lab

clin_2001_group

clin_2001_final

 

......and this repeats until we reach year 2010

 

 

What I am trying to do is something like this for each year:

 

*Merging "clin_2000_phys_exam" data with "clin_2000_lab" data by form;

proc sort data=clin_2000_phys_exam; by form; run;

proc sort data=clin_2000_lab; by form; run; 

 

data phys_lab_2000;

merge clin_2000_phys_exam  clin_2000_lab;

by form;

run;

 

*Merging the newly created "phys_lab_2000" data with "clin_2000_group" and "clin_2000_final" data by ID;

 

proc sort data=phys_lab_2000; by ID; run;

proc sort data=clin_2000_group; by ID; run;

proc sort data=clin_2000_final; by ID; run;

 

data 2000_final data;

merge phys_lab_2000 clin_2000_group clin_2000_final;

by ID; 

run;

 

*Repeat the code above for all 10 years (years 2000-2010) until each year has a merged final dataset;

*Concatenate data together to create a long dataset that contains data for all 10 years;

 

data year200_year2011;

set 2000_final data

2001_final data

2002_final data

2003_final data

2004_final data

2005_final data

2006_final data

2007_final data

2008_final data

2009_final data

20010_final data;

run;

 

I need help coming up with a SAS macro that does this so that I don't have to rewrite the same line of code for each year since the only difference would be the name of the dataset. 

 

I appreciate any help you can provide. 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

It looks like you know precisely what code needs to run for each year.  Here's an example of turning it into a macro.  I'll start with a piece of the code you supplied:

 

proc sort data=phys_lab_2000; by ID; run;

proc sort data=clin_2000_group; by ID; run;

proc sort data=clin_2000_final; by ID; run;

 

data 2000_final data;

merge phys_lab_2000 clin_2000_group clin_2000_final;

by ID; 

run;

 

You might want to add to that code, since you have indicated that more processing does take place for each year.  Turning it into a macro that processes one year would look like this.  It's as simple as replacing "2000" with "&year.":

 

%macro one_year (year=);

proc sort data=phys_lab_&year.; by ID; run;

proc sort data=clin_&year._group; by ID; run;

proc sort data=clin_&year._final; by ID; run;

 

data &year._final data;

merge phys_lab_&year. clin_&year._group clin_&year._final;

by ID; 

run;

%mend one_year;

 

Then actually executing the code looks like this:

 

%one_year (year=2000)

%one_year (year=2001)

 

etc.

 

Note that you will need to change the names of one of your data sets.  Data set names cannot begin with a number, so this would be an invalid name:

 

2000_final data

 

 

View solution in original post

3 REPLIES 3
Astounding
PROC Star

It looks like you know precisely what code needs to run for each year.  Here's an example of turning it into a macro.  I'll start with a piece of the code you supplied:

 

proc sort data=phys_lab_2000; by ID; run;

proc sort data=clin_2000_group; by ID; run;

proc sort data=clin_2000_final; by ID; run;

 

data 2000_final data;

merge phys_lab_2000 clin_2000_group clin_2000_final;

by ID; 

run;

 

You might want to add to that code, since you have indicated that more processing does take place for each year.  Turning it into a macro that processes one year would look like this.  It's as simple as replacing "2000" with "&year.":

 

%macro one_year (year=);

proc sort data=phys_lab_&year.; by ID; run;

proc sort data=clin_&year._group; by ID; run;

proc sort data=clin_&year._final; by ID; run;

 

data &year._final data;

merge phys_lab_&year. clin_&year._group clin_&year._final;

by ID; 

run;

%mend one_year;

 

Then actually executing the code looks like this:

 

%one_year (year=2000)

%one_year (year=2001)

 

etc.

 

Note that you will need to change the names of one of your data sets.  Data set names cannot begin with a number, so this would be an invalid name:

 

2000_final data

 

 

Junyong
Pyrite | Level 9

Though I don't have your data, hope this helps.

/*A sample dataset that mimics the shape of your dataset*/

%macro repeat1;

%do repeat=2000 %to 2010;

data clin_&repeat._phys_exam clin_&repeat._lab(drop=id) clin_&repeat._group(drop=form) clin_&repeat._final(drop=form);
	do form=1,2;
		do id=1,2;
			output clin_&repeat._phys_exam;
			if id=1 then output clin_&repeat._lab;
			if form=1 then output clin_&repeat._group;
			if form=1 then output clin_&repeat._final;
		end;
	end;
run;

%end;

%mend;

%repeat1;

/*The code that does what you need*/

%macro repeat2;

%do repeat=2000 %to 2010;

proc sort data=clin_&repeat._phys_exam;
	by form;
run;

proc sort data=clin_&repeat._lab;
	by form;
run;

data phys_lab_&repeat.;
	merge clin_&repeat._phys_exam clin_&repeat._lab;
	by form;
run;

proc sort data=phys_lab_&repeat.;
	by id;
run;

proc sort data=clin_&repeat._group;
	by id;
run;

proc sort data=clin_&repeat._final;
	by id;
run;

data phys_lab_&repeat.;
	merge phys_lab_&repeat. clin_&repeat._group clin_&repeat._final;
	by id;
run;

proc append base=year2000_year2010 data=phys_lab_&repeat.;
run;

%end;

%mend;

%repeat2;

Though this may not be the best solution, this may do something you mentioned.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1001 views
  • 3 likes
  • 4 in conversation