Hello,
I'm relatively new to SAS and have tried to write a macro for this but without success.
How would I write a macro to use this array in 14 different datasets? (red.red05 is the 2005 dataset - I'd like to use this array in each year up to red.red18). Any suggestions are appreciated!
DATA LIB.NHR05;
SET RED.RED05 (WHERE=(CITY='3'));
ARRAY DIAGNOSIS (2) DIAG_CODE_1 DIAG_CODE_2;
DO i =1 TO 2;
IF DIAGNOSIS(i) =: 'A0' OR DIAGNOSIS(i) =: 'A1' THEN OUTPUT;
END;
RUN;
Hello @Angmar,
Your question requires more details before experts can help. Can you revise your question to include more information?
Review this checklist:
To edit your original message, select the "blue gear" icon at the top of the message and select Edit Message. From there you can adjust the title and add more details to the body of the message. Or, simply reply to this message with any additional information you can supply.
SAS experts are eager to help -- help them by providing as much detail as you can.
This prewritten response was triggered for you by fellow SAS Support Communities member @ballardw
.%macro dothis;
%do yr=5 to 18;
%let yrz=%sysfunc(putn(&yr,z2.));
data lib.nhr&yrz;
set red.red&yrz(where-(city='3'));
array diagnosis (2) diag_code_1 diag_code_2;
do i =1 to 2;
if diagnosis(i) =: 'A0' or diagnosis(i) =: 'A1' then output;
end;
run;
%end;
%mend;
%dothis
As shown above, you need a macro variable &YR that goes from 5 to 18, and another macro variable &YRZ that has a leading zero if necessary. Also, this assumes that your ARRAY code is correct and produces the correct answer, which I have not bothered to check, but certainly you should.
But ... generally creating data sets for each year is not recommended. I would strongly recommend one large data set with all years in it, which can thus be programmed without a macro.
data lib.nhr_all
set red.red05-red.red18 indsname=indsname;
where city='3';
dsname=indsname;
/* Your array statements go here */
run;
So, please simplify your coding and do this without macros.
No. Either copy the code or use macro processor to generate the code.
Note that your code probably does not do what you want. If both variables meet the IF condition do you really want to output that observation twice?
But perhaps you don't need to do any code generation. Do the datasets all have the same structure? Do you really need to keep the results as 14 separate datasets? Why not just run it for all 14 datasets at once?
DATA want;
SET RED.RED05-RED.RED18;
WHERE CITY='3';
ARRAY DIAGNOSIS DIAG_CODE_1 DIAG_CODE_2;
found=0;
DO index =1 TO dim(DIAGNOSIS ) while (not found);
IF DIAGNOSIS[index] in: ('A0','A1') THEN found=1;
END;
if found then output;
RUN;
Sorry, what do you mean 'output that observation twice'?
If you examine your code, you have a DO loop where I goes from 1 to 2. Each time through the loop, you can have the observation output, and so you could have an observation output zero times, or one time, or two times. If you examine your output data sets, you may see that this has indeed happened.
Thanks, I just learned how to do arrays this morning. What I wanted to do was output into one dataset all records that had one diagnosis and/or the other. I see that you created a new variable called 'found' - is this necessary in order to not output the same record multiple times?
@Angmar wrote:
Thanks, I just learned how to do arrays this morning. What I wanted to do was output into one dataset all records that had one diagnosis and/or the other. I see that you created a new variable called 'found' - is this necessary in order to not output the same record multiple times?
There are always many ways to write a program. For example SAS now has a statement to exit a DO loop, LEAVE, and you could use that instead of making a new variable. Try to see if you can adapt the logic below to your situation.
do i=1 to 100 ;
if i=5 then do;
output;
leave;
end;
end;
To just replace the array an the code using in would be
1) identify the code that needs to be reused
2) place that in a macro definition
3) place the reference to the macro into the code
%macro myarray(); ARRAY DIAGNOSIS (2) DIAG_CODE_1 DIAG_CODE_2; DO i =1 TO 2; IF DIAGNOSIS(i) =: 'A0' OR DIAGNOSIS(i) =: 'A1' THEN OUTPUT; END; %mend; DATA LIB.NHR05; SET RED.RED05 (WHERE=(CITY='3')); %myarray() RUN;
Or create the entire set of calls by looping over the variables (HINT: if you had used 4 digit years in the names the code is even simpler)
%macro yrs (start=, end=); %do i=&start. %to &end. ; %let yr= %sysfunc(putn(&i,z2.)); DATA LIB.NHR&yr.; SET RED.RED&yr. (WHERE=(CITY='3')); ARRAY DIAGNOSIS (2) DIAG_CODE_1 DIAG_CODE_2; DO i =1 TO 2; IF DIAGNOSIS(i) =: 'A0' OR DIAGNOSIS(i) =: 'A1' THEN OUTPUT; END; RUN; %end; %mend; %yrs(start=5,end=18)
And if by any chance you would combine these resulting data steps then no macro required at all:
Assumes all variables in common are of the same type and length if character:
DATA LIB.NHRall; SET RED.RED05-Red.Red18; WHERE CITY='3'; ARRAY DIAGNOSIS (2) DIAG_CODE_1 DIAG_CODE_2; DO i =1 TO 2; IF DIAGNOSIS(i) =: 'A0' OR DIAGNOSIS(i) =: 'A1' THEN OUTPUT; END; RUN;
Hi @Angmar
If it is the same array, meaning the same variables in all data sets, then why do you want to use an array. I think this would do the same:
%macro runall;
%do y = 5 to 18;
%let yz=%sysfunc(putn(&y,z2.));
data lib.nhr&yz;
set red.red&yz(where=(city='3' and (DIAG_CODE_1 =: 'A0' or DIAG_CODE_2 =: 'A1')));
run;
%end;
%mend;
%runall;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.