Hi, SAS Users-
This seems so basic that I should be able to solve it, but here goes. I want to cycle through a bunch of files and manipulate variables (recode, create, etc.) in each. Some of the files have one prefix ("ed"), some have another ("pdd"), and the files are successive years (in this example, just 2016-2017). The catch is that I would to apply some of the data manipulations only to the "ed" files and other data manipulations only to the "pdd" files using conditional ("IF/THEN/DO") programming. Many of these data manipulations are complicated and themselves require if-then statements.
My strategy is to start basic to see if I can create a double loop with IF/THEN/DO statements. This is what I came up with, and it doesn't work:
data _null_;
x = "pdd ed";
max = countw(x);
do i = 1 to max;
call symput("max",max);
call symput("type"||left(put(i, 1.)),scan(x,i));
end;
run;
%put &max &type1 &type2;
%macro joinfile(startyr,endyr);
data _null_;
%do j=1 %to &max.;
%do i=&startyr. %to &endyr.;
if &&type&j. = "pdd" then
do;
%put yes;
end;
else
do;
%put nope;
end;
%end;
%end;
run ;
%mend joinfile;
%joinfile(16,17);
I get this:
yes
nope
yes
nope
yes
nope
yes
nope
NOTE: Variable pdd is uninitialized.
NOTE: Variable ed is uninitialized.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
The more complicated code doesn't work either, but maybe for reasons distinct from why this doesn't work:
*Define macro variable "type" to specify ED or PDD files in loop below;
data _null_;
x = "pdd ed";
max = countw(x);
do i = 1 to max;
call symput("max",max);
call symput("type"||left(put(i, 1.)),scan(x,i));
end;
run;
%macro joinfile2(startyr,endyr);
%LET NUMDX=25;
%LET NUMEC=5;
%do j=1 %to &max.;
%do i=&startyr. %to &endyr.;
/***CONDITIONAL CODE***/
DATA &&type&j..&i._2;
SET &&type&j..&i.;
IF &&type&j.. = "pdd" THEN
DO
**********************************************************************************
STEP 2 (A): CREATE INJURY SUBSET: FOR HOSPITAL DISCHARGE DATA
SKIP THIS STEP IF YOU USE EMRGENCY DEPARTMENT DATA
***********************************************************************************
** USE PRINCIPAL DIAGNOSIS FIELD (1ST LISTED) TO IDENTIFY INJURY CASES;
**IDENTIFY VALID INJURY CASES IN THE PRINCIPAL DIAGNOSIS FIELD;
IF SUBSTR(ICD10_3F1,1,1)='S' OR 'T07'<=ICD10_3F1<='T34' OR
'T36'<=ICD10_3F1<='T50' OR 'T51'<=ICD10_3F1<='T65' OR
'T66'<=ICD10_3F1<='T77' OR ICD10_3F1='T79' THEN INJDX=1;
ELSE INJDX=0;
** IDENTIFY ADVERSE EFFECTS/UNDERDOSING, AND ASSIGN THEM AS NON-INJURY;
IF 'T36'<=ICD10_3F1<='T50' AND SUBSTR(ICD10_6F1,6,1) IN ('5','6') THEN INJDX=0;
IF ICD10_4F1 IN ('T369','T379','T399','T414','T427','T439','T459','T479','T499')
AND SUBSTR(ICD10_6F1,5,1) IN ('5','6') THEN INJDX=0;
IF ENCT1="S" THEN INJDX=0;
** REMOVE NON-INJURY CASES TO CREATE FINAL INJURY CASE SUBSET (EXCLUDES SEQUELAE);
IF INJDX=0 THEN DELETE;
END;
ELSE
DO;
**************************************************************************************************
STEP 2 (B): CREATE INJURY SUBSET: FOR EMERGENCY DEPARTMENT DATA
SKIP THIS STEP IF YOU USE HOSPITAL DISCHARGE DATA
**************************************************************************************************
** USE INJURY DIGNOSIS CODES IN THE PRINCIPAL DIAGNOSIS AND EXTERNAL CAUSE CODES (IN ALL DIAGNOSIS AND EXTERNAL CAUSE FIELDS) TO IDENTIFY INJURY CASES;
** SAS ARRAY IS TO SEARCH VALID EXTERNAL CAUSE CODES IN ALL DIAGNOSIS AND EXTERNAL CAUSE FIELDS;
ARRAY ICD10F {&NUMDX} $ ICD10F1 - ICD10F&NUMDX; ** ARRAY FOR FORMATTED DIAGNOSIS FIELDS;
ARRAY ICD10_6F {&NUMDX} $ ICD10_6F1 - ICD10_6F&NUMDX; ** ARRAY FOR FORMATTED DIAGNOSIS FIELDS LIMITED TO 6 CHARACTERS;
ARRAY ICD10_5F {&NUMDX} $ ICD10_5F1 - ICD10_5F&NUMDX; ** ARRAY FOR FORMATTED DIAGNOSIS FIELDS LIMITED TO 5 CHARACTERS;
ARRAY ICD10_4F {&NUMDX} $ ICD10_4F1 - ICD10_4F&NUMDX; ** ARRAY FOR FORMATTED DIAGNOSIS FIELDS LIMITED TO 4 CHARACTERS;
ARRAY ICD10_3F {&NUMDX} $ ICD10_3F1 - ICD10_3F&NUMDX; ** ARRAY FOR FORMATTED DIAGNOSIS FIELDS LIMITED TO 3 CHARACTERS;
ARRAY ENCT {&NUMDX} $ ENCT1 - ENCT&NUMDX; ** ARRAY FOR ENCOUNTER CHARACTER;
ARRAY CAUSE_D {&NUMDX} CAUSE_D1 - CAUSE_D&NUMDX; ** ARRAY FOR VALID EXTERNAL CAUSE CODE DERIVED FROM DIAGNOSIS FIELDS;
ARRAY ECODEF {&NUMEC} $ ECODEF1 - ECODEF&NUMEC; ** ARRAY FOR FORMATTED ECODE FIELDS;
ARRAY ECODE_6F {&NUMEC} $ ECODE_6F1 - ECODE_6F&NUMEC; ** ARRAY FOR FORMATTED EXTERNAL CAUSE FIELDS LIMITED TO 6 CHARACTERS;
ARRAY ECODE_5F {&NUMEC} $ ECODE_5F1 - ECODE_5F&NUMEC; ** ARRAY FOR FORMATTED EXTERNAL CAUSE FIELDS LIMITED TO 5 CHARACTERS;
ARRAY ECODE_4F {&NUMEC} $ ECODE_4F1 - ECODE_4F&NUMEC; ** ARRAY FOR FORMATTED EXTERNAL CAUSE FIELDS LIMITED TO 4 CHARACTERS;
ARRAY ECODE_3F {&NUMEC} $ ECODE_3F1 - ECODE_3F&NUMEC; ** ARRAY FOR FORMATTED EXTERNAL CAUSE FIELDS LIMITED TO 3 CHARACTERS;
ARRAY ENCTE {&NUMEC} $ ENCTE1 - ENCTE&NUMEC; ** ARRAY FOR ENCOUNTER CHARACTER;
ARRAY CAUSE_E {&NUMEC} CAUSE_E1 - CAUSE_E&NUMEC; ** ARRAY FOR VALID EXTERNAL CAUSE CODE DEDRIVED FROM EXTERNAL CAUSE FIELDS;
**IDENTIFY INJURY CASES USING INJURY DIAGNOSIS CODES IN THE PRINCIPAL DIAGNOSIS FIELD;
IF SUBSTR(ICD10_3F1,1,1)='S' OR 'T07'<=ICD10_3F1<='T34' OR
'T36'<=ICD10_3F1<='T50' OR 'T51'<=ICD10_3F1<='T65' OR
'T66'<=ICD10_3F1<='T77' OR ICD10_3F1='T79' THEN INJDX=1;
ELSE INJDX=0;
** IDENTIFY ADVERSE EFFECTS/UNDERDOSING, SEQUELAE IN THE PRINCIPLE DIAGNOSIS FIELD, AND ASSIGN THEM AS NON-INJURY;
IF 'T36'<=ICD10_3F1<='T50' AND SUBSTR(ICD10_6F1,6,1) IN ('5','6') THEN INJDX=0;
IF ICD10_4F1 IN ('T369','T379','T399','T414','T427','T439','T459','T479','T499')
AND SUBSTR(ICD10_6F1,5,1) IN ('5','6') THEN INJDX=0;
IF ENCT1='S' THEN INJDX=0;
** IDENTIFY VALID EXTERNAL CAUSE CODES IN ALL DIAGNOSIS FIELDS;
DO I=1 TO &NUMDX;
IF 'V00'<=ICD10_3F{I}<='V99' OR 'W00'<=ICD10_3F{I}<='X58' OR 'X71'<=ICD10_3F{I}<='X83' OR 'X92'<=ICD10_3F{I}<='Y09'
OR 'Y21'<=ICD10_3F{I}<='Y33' OR 'Y35'<=ICD10_3F{I}<='Y38' OR 'T15'<=ICD10_3F{I}<='T19' OR 'T36'<=ICD10_3F{I}<='T50'
OR 'T51'<=ICD10_3F{I}<='T65' OR ICD10_3F{I} IN ('T71', 'T73', 'T74','T76')
OR ICD10_5F{I}='T1491' OR ICD10_4F{I} IN ('T750', 'T752', 'T753') THEN CAUSE_D{I}=1;
ELSE CAUSE_D{I}=0;
** ASSIGN ADVERSE EFFECTS/UNDERDOSING AND SEQUELAE AS NON-VALID ECODE;
IF 'T36'<= ICD10_3F{I} <='T50' AND SUBSTR(ICD10_6F{I},6,1) IN ('5','6') THEN CAUSE_D{I}=0;
IF ICD10_4F{I} IN ('T369','T379','T399','T414','T427','T439','T459','T479','T499')
AND SUBSTR(ICD10_6F{I},5,1) IN ('5','6') THEN CAUSE_D{I}=0;
IF ENCT{I}='S' THEN CAUSE_D{I}=0;
END;
** IDENTIFY VALID EXTERNAL CAUSE CODES IN ALL EXTERNAL CAUSE FIELDS;
DO J=1 TO &NUMEC;
IF 'V00'<=ECODE_3F{J}<='V99' OR 'W00'<=ECODE_3F{J}<='X58' OR 'X71'<=ECODE_3F{J}<='X83' OR 'X92'<=ECODE_3F{J}<='Y09'
OR 'Y21'<=ECODE_3F{J}<='Y33' OR 'Y35'<=ECODE_3F{J}<='Y38' OR 'T15'<=ECODE_3F{J}<='T19' OR 'T36'<=ECODE_3F{J}<='T50'
OR 'T51'<=ECODE_3F{J}<='T65' OR ECODE_3F{J} IN ('T71', 'T73', 'T74','T76')
OR ECODE_5F{J}='T1491' OR ECODE_4F{J} IN ('T750', 'T752', 'T753') THEN CAUSE_E{J}=1;
ELSE CAUSE_E{J}=0;
** ASSIGN ADVERSE EFFECTS/UNDERDOSING AND SQUELAE AS NON-VALID ECODE;
IF 'T36'<= ECODE_3F{J} <='T50' AND SUBSTR(ECODE_6F{J},6,1) IN ('5','6') THEN CAUSE_E{J}=0;
IF ECODE_4F{J} IN ('T369','T379','T399','T414','T427','T439','T459','T479','T499')
AND SUBSTR(ECODE_6F{J},5,1) IN ('5','6') THEN CAUSE_E{J}=0;
IF ENCTE{J}='S' THEN CAUSE_E{J}=0;
END;
**IF A VALID EXTERNAL CAUSE CODE IS FOUND IN ANY DIAGNOSIS OR EXTERNAL CAUSE FIELDS==> INJURY CASE;
IF INJDX=0 THEN DO;
IF SUM(OF CAUSE_D1-CAUSE_D12)>0 OR SUM(OF CAUSE_E1-CAUSE_E2)>0 THEN INJDX=1;
END;
** REMOVE NON-INJURY CASES TO CREATE INJURY CASE SUBSET;
IF INJDX=0 THEN DELETE;
END;
RUN;
%end;
%end;
%mend joinfile2;
%joinfile2(16,17);
When I run that code, I get these errors (repeated four times for each of the 2 x 2 prefix-year combinations I have).
80: LINE and COLUMN cannot be determined.
NOTE: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and COLUMN where the error has occurred.
ERROR 80-322: Expecting a variable name.
76: LINE and COLUMN cannot be determined.
NOTE: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and COLUMN where the error has occurred.
ERROR 76-322: Syntax error, statement will be ignored.
NOTE: Line generated by the invoked macro "JOINFILE2".
165 ,1) IN ('5','6') THEN INJDX=0; IF ENCT1="S" THEN INJDX=0; ** REMOVE NON-INJURY CASES TO
165 ! CREATE FINAL INJURY CASE SUBSET (EXCLUDES SEQUELAE); IF INJDX=0 THEN DELETE; END; ELSE DO;
___
161
ERROR 161-185: No matching DO/SELECT statement.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.PDD16_2 may be incomplete. When this step was stopped there were 0 observations and 284 variables.
NOTE: DATA statement used (Total process time):
real time 0.11 seconds
cpu time 0.06 seconds
I realize this is a long post. I greatly appreciate your forbearance and any thoughts you have on how I can get this to work!
Regards,
David
I suspect that the following is not what you want:
if &&type&j. = "pdd" then
Here SAS is trying to translate &&type&j. into a SAS dataset variable, but I think that you actually want the macro variable, is this correct? This is why you get
Variable pdd is uninitialized
Variable ed is uninitialized
If so, then you need to use quotations on both sides.
if "&&type&j." = "pdd" then
You are also mixing data step code with macro code. So, the %put will always print, no matter if the if case is true or not. Try to add:
if 1 = 2 then do;
%put Impossible!!! ;
end;
So, if you want to see where you are, use put instead of %put.
I suspect that the following is not what you want:
if &&type&j. = "pdd" then
Here SAS is trying to translate &&type&j. into a SAS dataset variable, but I think that you actually want the macro variable, is this correct? This is why you get
Variable pdd is uninitialized
Variable ed is uninitialized
If so, then you need to use quotations on both sides.
if "&&type&j." = "pdd" then
You are also mixing data step code with macro code. So, the %put will always print, no matter if the if case is true or not. Try to add:
if 1 = 2 then do;
%put Impossible!!! ;
end;
So, if you want to see where you are, use put instead of %put.
Thanks, heffo, for your time. Your suggestion helped solve the smaller problem of getting the test code to run, with one adaptation: I had to put "yes" and "nope" in quotation marks. So:
data _null_;
x = "pdd ed";
max = countw(x);
do i = 1 to max;
call symput("max",max);
call symput("type"||left(put(i, 1.)),scan(x,i));
end;
run;
%put &max &type1 &type2;
%macro joinfile(startyr,endyr);
data _null_;
%do j=1 %to &max.;
%do i=&startyr. %to &endyr.;
if "&&type&j." = "pdd" then
do;
put "yes";
end;
else
do;
put "nope";
end;
%end;
%end;
run ;
%mend joinfile;
%joinfile(16,17);
1) AS you use macro program, if you add OPTION SPOOL you'll get better chance to locate error line in the log.
2) In the log there is
ERROR 161-185: No matching DO/SELECT statement.
pay attention to next lines in your code:
** REMOVE NON-INJURY CASES TO CREATE FINAL INJURY CASE SUBSET (EXCLUDES SEQUELAE);
IF INJDX=0 THEN DELETE;
END;
The END; statement has no DO or SELECT and should be canceled.
Thanks, Shmuel, for your time. I will use the option SPOOL ON and see if that helps. I did eliminate the END, but the code still doesn't work. It may be that the nested conditional structures are too complex, and it looks like there are other problems as well.
I'll look more closely at the log. Again, I appreciate the help.
Pack your two different calculations in two separate macros, and then let a data step handle the execution:
data _null_;
x = "pdd ed";
max = countw(x);
call symput("max",max);
do i = 1 to max;
call symput("type"||left(put(i, 1.)),scan(x,i));
end;
run;
%let startyr=16;
%let endyr=17;
data _null_;
do j = 1 to &max.;
do i=&startyr. to &endyr.;
if resolve(cats('&type',put(j,best.))) = "pdd"
then call execute ('%nrstr(%macro1))';
else call execute ('%nrstr(%macro2))';
end;
end;
run;
Some comments on coding in general:
data _null_; x = "pdd ed"; max = countw(x); do i = 1 to max; call symput("max",max); call symput("type"||left(put(i, 1.)),scan(x,i)); end; run;
The highlighted line returns the exact same result each iteration of the loop. This particular example won't mean much but habitual use of such will create very inefficient code that may be measurable later.
Letting macro variables "fall into" the middle of another macro is asking for hard to debug problems later.
%macro joinfile(startyr,endyr); data _null_; %do j=1 %to &max.; %do i=&startyr. %to &endyr.; if &&type&j. = "pdd" then do; %put yes; end; else do; %put nope; end; %end; %end; run ; %mend joinfile;
The highlighted macro variables would be much better off passed as parameters. The type likely passed as a single variable an parsed.
For your example may be something like:
%macro joinfile(startyr,endyr,x); %let max = %sysfunc(countw(&x.)); data _null_; %do j=1 %to &max.; %do i=&startyr. %to &endyr.; %let type = %scan(&x.,&j.); if "&type." = "pdd" then do; put "i= &i." +1 "type= &type." +1 "yes"; end; else do; put "i= &i." +1 "type= &type." +1"nope"; end; %end; %end; run ; %mend; %joinfile(16,17, pdd ed);
I added the &I and &type variables to the PUT statement so you could tell what condition was satisfied easier.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.