What function to use for multiple data set?

Occasional Contributor
Posts: 12

Hi,

 

I'm analyzing data from 2000 to 2015, with each year in its own data file. Instead of repeating every procedure for each year, is there a shortcut that lets me analyze all the years with one piece of code? I could try to combine all the files, but each file contains around 80 million observations.

 

All the variables are the same for each year.

 

Thank you.

Super User
Posts: 17,863

Re: What function to use for multiple data set?

There isn't a function for this, but you can write a macro. In its simplest form, a macro generates code.

 

The macro below, called sample, runs proc means on the provided dataset. Note how the parameter is referenced within the macro (&datain). 

 


Options MPRINT SYMBOLGEN;

%macro sample(datain);

Proc means data = &datain;
Run;

%mend;

%sample(sashelp.class);
%sample(sashelp.cars);

 

See a tutorial here:

http://www.ats.ucla.edu/stat/sas/seminars/sas_macros_introduction/

 

Occasional Contributor
Posts: 12

Re: What function to use for multiple data set?

Thank you. I think it's time for me to learn macros.

Moderator
Posts: 238

Re: What function to use for multiple data set?

Is your analysis "yearly", or are you performing one analysis across all 16 years? Based on your request, I'm assuming the latter. If you want to avoid combining the tables first, you could instead create a view and run the analysis on the view, although I'm not sure how well that will perform.
Occasional Contributor
Posts: 12

Re: What function to use for multiple data set?

I am analyzing # of visits for each year separately rather than across all 16 years.

 

For example.

 

# of visits in year 2000 = 1.3 million

# of visits in year 2001 = 1.4 million

# of visits in year 2002 = 1.5 million

etc.

 

I will be analyzing more than just # of visits, but everything will be looked at year by year.

Super User
Posts: 6,946

Re: What function to use for multiple data set?

To run the statistics for each year, you can do:

%macro stats;
%do year = 2000 %to 2015;
proc means data=mylib.data_&year;
.......
run;
%end;
%mend;
%stats;

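Now, if each dataset holds the year in a variable, you could instead concatenate the datasets in a view and use BY-group processing. (BY needs the combined data in yearvar order, which concatenating the files in year order gives you as long as each file holds only its own year.)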

%macro stats;
data work.statview/view=work.statview;
set
%do year = 2000 %to 2015;
  mylib.data_&year
%end;
;
run;

proc means data=work.statview;
by yearvar;
.....
run;
%mend;
%stats;
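
If the year is not stored as a variable, one possible variation is to derive it from the dataset name with the INDSNAME= option of the SET statement. This is only a sketch under assumptions: the libref mylib and the data_ prefix are carried over from the examples above, the view name work.allyears is made up for illustration, and the colon list picks up every dataset in mylib whose name starts with data_, so it only works if nothing else shares that prefix.

```sas
/* Sketch only: mylib, the data_ prefix and work.allyears are assumptions. */
data work.allyears / view=work.allyears;
  /* data_: reads every dataset in mylib whose name starts with data_;  */
  /* src holds the name of the dataset the current row came from        */
  set mylib.data_: indsname=src;
  /* src looks like MYLIB.DATA_2000, so take the piece after the last _ */
  year = input(scan(src, -1, '_'), 4.);
run;

proc means data=work.allyears n mean;
  class year;  /* CLASS does not need sorted input, unlike BY */
run;
```

Without a VAR statement PROC MEANS summarizes every numeric variable, so in practice you would probably list the variables you care about. The view still reads all of the yearly files each time it is used, so the earlier caveat about performance applies here too.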

 

Super User
Posts: 7,407

Re: What function to use for multiple data set?

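As an alternative, you could do it in one simple DATA step:

data _null_;  /* the step only generates code, so no output dataset is needed */
  do i=2000 to 2015;
    /* queue one PROC MEANS per year; the queued code runs after this step ends */
    call execute(cats('proc means data=year_',i,'; var age; output out=yr',i,'; run;'));
  end;
  /* stack the yearly summaries (every WORK dataset whose name starts with yr) */
  call execute('data want; set yr:; run;');
run;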
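
A possible variation on the same idea, sketched under assumptions rather than tested against the real data: instead of hard-coding the range of years, drive CALL EXECUTE from the list of datasets that actually exist. It assumes, as in the example above, that the yearly datasets are in WORK, are named year_2000 and so on, and contain a numeric variable age.

```sas
/* Sketch only: WORK, the YEAR_ prefix and the variable age are assumptions. */
data _null_;
  set sashelp.vtable(where=(libname = 'WORK'));
  if memname =: 'YEAR_';        /* keep names that start with YEAR_ */
  yr = scan(memname, -1, '_');  /* e.g. 2000                        */
  /* CATX puts one blank between the pieces, so keywords do not run together */
  call execute(catx(' ', 'proc means noprint data=', cats('work.', memname), ';'));
  call execute(catx(' ', 'var age; output out=', cats('work.yr', yr), '; run;'));
run;

/* stack the yearly summaries, as in the example above */
data want;
  set yr:;
run;
```

CATX is used here instead of CATS because CATS strips leading blanks from each argument, which would glue a piece such as ' noprint;' onto the dataset name.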