I have 1 dataset with 2 variables:
PROVINCE, which has codes of 10, 24, 32, 48, 52 (code for each province)
INDUSTRY, industry codes
FREQ, which is a number
I want to make that 1 dataset into 5 different ones, each with the name of the prov code.
Obviously I could do this 5 times:
data output.prov10;
set output.have;
where prov=10;
run;
But I do not want to do this manually (will have to do it for several other steps too).
How can I do a do loop to just automate it?
%do prov.first %to prov.last;
%let %newprov=prov;
data output.prov&newprov;
set output.have;
%end;
run;
will this work?
Don't.
This is what BY group processing is for, if absolutely required split the dataset at the last step.
That being said, this is an FAQ since it's how people tend to think if they haven't worked with SAS a lot.
Here's several posts with sample code and examples on how to automate this so you don't need to know the number of levels prior to running the program.
http://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/
http://www.sascommunity.org/wiki/Split_Data_into_Subsets
https://gist.github.com/statgeek/4bfb7574713bedf4e011
For info on how BY group processing works:
I also need to do a PROC EXPORT into excel for each of the datsets. Can this also be done without a do loop?
Depends, do you need a file each or one file with different tabs for each province.
Also, you can build the filter directly into the proc export rather than create datasets.
The WHERE clause would be part of the macro rather than generating different subsets.
proc export data=sashelp.class (WHERE=(SEX='F')) outfile='path to file' dbms=xlsx replace; run;
Wouldn't that require me to do that 5 times? Seems inefficient. I'm trying to build a loop.
Not quite. The macro processor %do does not work quite the same as the data step do loop. The %do must be used inside a defined macro (%macro / %mend statements) and then the macro must be called. Otherwise you will get an error about %do in "open code".
I would also suggest does your process really require separate data sets? You could use BY group processing for most things and WHERE dataset options elsewhere.
%macro doloop (list= ); %let count = %sysfunc(countw(&list)); %put Items in list= &count;/* just to show the numbers of items in the list*/ %do i = 1 %to &count; p=%scan(&list,&i); data output.prov&p ; set output.have; where prov="&p"; run; %end; %mend; %doloop(list=10 24 32 48 52);
Should work. Note the LIST in the macro call cannot have commas because a comma separates parameters. You only need one parameter: the list.
When creating macros it is a very good idea to ensure that each datastep or procedure has it's own RUN statement. Your code would rely on implied datastep boundaries and the nature of more complex macro coding (not always a good idea) makes that fragile.
Thanks, this looks good.
But is there a way to make the loop automatically know how many provinces there are?
But is there a way to make the loop automatically know how many provinces there are?
That's what the first line of @ballardw's macro does, it "automatically" gets a count of province codes in the macro argument LIST.
However, if you must create multiple data sets, I'd suggest having the macro construct one data step to make NP (number of province codes) datasets rather than NP data steps each making one dataset.
%macro doloop (list= );
%let count = %sysfunc(countw(&list));
%put Items in list= &count;/* just to show the numbers of items in the list*/
data
%do i = 1 %to &count;
%let p=%scan(&list,&i);
output.prov&p (where=(prov="&p"))
%end;
;
set output.have;
run;
%mend;
%doloop(list=10 24 32 48 52);
So what is it your trying to do exactly, rather this bits of information. Providing test data in the form of a datastep is especially usefull.
A dataset is a loop intrinsically, so you can use this functionality to avoid long macro lists, messy macro code etc. quite simply:
Say I have a dataset of codes, and for each I want to run a proc means and output to a tagset:
proc means data=have; by code; var xyz; output out=means_res n=n; run;
ods tagsets.excelxp file="c:\temp.xml";
data _null_; set codes_list; call execute('ods tagsets.excelxp options(sheet_name="'||strip(code)||'"); proc report data=means_res nowd; where code="'||strip(code)||'"; columns _all_; run;'); run;
ods tagsets.excelxp close;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.