Hi,
I'm using %do year= 2002 %to 2014 to select cases and output resutls. When I select cases, I'm using two year values, say var1=2002 and var2=2003, or var1=2003 and var2=2004, etc.,, What makes this complicate is the year 2 variable (with next year's value) is embended in a date format (e.g. var1='1apr2002'd and var2="31mar2003"d.)
I thought the following code would work, but it didn't (log is attached at the end). I'm using more years of data but here I have show 3 years.
- Am I using Symput wrong - is it that I can't put a macro variabe (&year) in the second argument?
- A related question. It worked if I simply just repeated the same line 13 times using different year values (such as lines 3-5 of the sample SAS code below). What didn't work is I wrote a series of output data set names as DATA d2002-d2014 - SAS didn't regconize all datasets (d2002 d2003 d2004 d2005 etc.) and thus would not output data to some of the dataset. Is there a way to get around?
Sorry that I don't have a sample data for you to run. I'll just take your suggestions and apply to my data and report back.
Many thanks.
DATA d2002 d2003 d2004 subdata2002 subdata2003 subdata2004;
SET mydata
if effective_date <= '31mar2003'd and (end_date >= '1apr2002'd or end_date = .) then output d2002;
if effective_date <= '31mar2004'd and (end_date >= '1apr2003'd or end_date = .) then output d2003;
if effective_date <= '31mar2005'd and (end_date >= '1apr2004'd or end_date = .) then output d2004;
%macro get();
%do year=2002 %to 2004;
call symput('nextyr',1+&year.);
if effective_date <= "31mar&nextyr."d and (end_date >= "1apr&year."d or end_date = .) then output subdata&year;
%end;
%mend;
%get;
RUN;
LOG:
34 %macro get();
35
36 %do year=2002 %to 2003;
37 call symput('nextyr',1+&year.);
38 if effective_date <= "31mar&nextyr."d and (end_date >= "1apr&year."d or end_date = .) then output postal&year;
39 %end;
40 %mend;
41 %get;
NOTE: Line generated by the invoked macro "GET".
41 call symput('nextyr',1+&year.); if effective_date <= "31mar&nextyr."d and (end_date >= "1apr&year."d or end_date = .)
________________
77
41 ! then output postal&year;
MPRINT(GET): call symput('nextyr',1+2002);
WARNING: Apparent symbolic reference NEXTYR not resolved.
ERROR: Invalid date/time/datetime constant "31mar&nextyr."d.
MPRINT(GET): if effective_date <= "31mar&nextyr."d and (end_date >= "1apr2002"d or end_date = .) then output postal2002;
MPRINT(GET): call symput('nextyr',1+2003);
WARNING: Apparent symbolic reference NEXTYR not resolved.
MPRINT(GET): if effective_date <= "31mar&nextyr."d and (end_date >= "1apr2003"d or end_date = .) then output postal2003;
ERROR 77-185: Invalid number conversion on "31mar&nextyr."d.
You seem to be confusing CALL SYMPUT which is a data step statement that can only work WHILE THE DATA STEP IS RUNNING with %LET which can be used to assign macro variable values WHILE THE MACRO IS RUNNING.
%macro get();
%do year=2002 %to 2004;
%let nextyr = %eval(&year +1);
if effective_date <= "31mar&nextyr."d and (end_date >= "01apr&year."d or end_date = .) then output subdata&year;
%end;
%mend;
DATA d2002 d2003 d2004 subdata2002 subdata2003 subdata2004;
SET mydata
if effective_date <= '31mar2003'd and (end_date >= '01apr2002'd or end_date = .) then output d2002;
if effective_date <= '31mar2004'd and (end_date >= '01apr2003'd or end_date = .) then output d2003;
if effective_date <= '31mar2005'd and (end_date >= '01apr2004'd or end_date = .) then output d2004;
%get;
RUN;
You seem to be confusing CALL SYMPUT which is a data step statement that can only work WHILE THE DATA STEP IS RUNNING with %LET which can be used to assign macro variable values WHILE THE MACRO IS RUNNING.
%macro get();
%do year=2002 %to 2004;
%let nextyr = %eval(&year +1);
if effective_date <= "31mar&nextyr."d and (end_date >= "01apr&year."d or end_date = .) then output subdata&year;
%end;
%mend;
DATA d2002 d2003 d2004 subdata2002 subdata2003 subdata2004;
SET mydata
if effective_date <= '31mar2003'd and (end_date >= '01apr2002'd or end_date = .) then output d2002;
if effective_date <= '31mar2004'd and (end_date >= '01apr2003'd or end_date = .) then output d2003;
if effective_date <= '31mar2005'd and (end_date >= '01apr2004'd or end_date = .) then output d2004;
%get;
RUN;
Got it and I just tested and it worked great. Thanks so much.
For the second question, If I write DATA sub2002 - sub2014, SAS will give error messages? Or I have to spell each of the output dataset names DADA sub2002 subs2003 sub2004 sub2005....? Is there a way not to list each one of them?
DATA sub2002 - sub2014;
SET mydata
%macro get();
%do year=2002 %to 2014;
%let nextyr=%eval(&year.+1);
if effective_date <= "31mar&nextyr."d and (end_date >= "1apr&year."d or end_date = .) then output sub&year;
%end;
%mend;
%get;
RUN;
You cannot use ranges of datasets in a DATA statement. You could use macro code to generate the names.
It is probably easier if you generate the whole data step with the macro.
%macro get(start,stop,inputds);
DATA
%do year=&start %to &stop;
sub&year
%end;
;
SET &inputds;
%do year=&start %to &stop;
if effective_date <= "31mar%eval(&year+1)"d
and (end_date >= "1apr&year."d or end_date = .) then
output sub&year
;
%end;
RUN;
%mend get;
%get(start=2002,stop=2014,inputds=mydata);
Thanks. Saved the suggestions and will apply.
BTW it's such a great community to get help and response so quickly!
Thanks. I understood the reasoning and generally I won't do it either. It's a special case that calls for it.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.