Hi all! I am having trouble creating a new variable "disease" that indicates whether the subject has the disease (1) or does not (0), depending on the symptoms listed in 5 different variables. If a patient has sympt1-sympt3 (heartburn, sickness, and spasm) without sympt4 and/or sympt5 (temperature and/or tiredness), then the patient has the disease (1). If the conditions are any different, then disease should equal 0.

CODE:

options nodate nonumber;

****1.IMPORT;
%macro P3 (a, b, c, d);
proc import out= &a
    datafile= "C:\HW5\&b"
    dbms=xlsx replace;
    getnames=yes;
run;
proc sort data=&a;
    by &c &d;
run;
Proc print data = &a;
Run;
%mend P3;

%P3 (PROJECT3_F17, Project3.xlsx, id_no, symptom_no);

***2.REORGANIZE variables: USE ARRAY STATEMENT ALONG WITH FIRST.ID and LAST.ID;
data sympt (drop=symptom_no symptom);
    set PROJECT3_F17;
    retain sympt1-sympt5;
    length sympt1-sympt5 $15;
    array symptoms (5) $15. sympt1-sympt5;
    by id_no;
    if first.id_no then call missing(of symptoms(*));
    symptoms(symptom_no)=symptom;
    if last.id_no then output;
run;
proc print data=sympt;
run;

***3.Set disease=1 if sympt1+sympt2+sympt3 and no sympt4-sympt5 WORK OFF DATA SET SYMPT;
data dis_01 (drop=answer);
    set sympt;
    retain disease;
    array dpthdisease (1) disease;
    by id_no;
    if first.id_no then dpthdisease (symp1-sympt5)=answer;
    If answer EQ sympt1-sympt3 and NE sympt4 OR sympt5 then disease=1;
    else disease=0;
    if last.id_no then output;
run;
proc print data=disease;
run;

LOG:

1    options nodate nonumber;
2    ****1.IMPORT;
3    %macro P3 (a, b, c, d);
4    proc import out= &a
5    datafile= "C:\HW5\&b"
6    dbms=xlsx replace;
7    getnames=yes;
8    run;
9    proc sort data=&a;
10   by &c &d;
11   run;
12   Proc print data = &a;
13   Run;
14   %mend P3;
15
16   %P3 (PROJECT3_F17, Project3.xlsx, id_no, symptom_no);

NOTE: The import data set has 18082 observations and 3 variables.
NOTE: WORK.PROJECT3_F17 data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           1.41 seconds
      cpu time            0.85 seconds

NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: The data set WORK.PROJECT3_F17 has 18082 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.15 seconds
      cpu time            0.01 seconds

NOTE: Writing HTML Body file: sashtml.htm
NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           4.96 seconds
      cpu time            3.20 seconds

17   ***2.REORGANIZE variables: USE ARRAY STATEMENT ALONG WITH FIRST.ID and LAST.ID;
18   data sympt (drop=symptom_no symptom);
19   set PROJECT3_F17;
20   retain sympt1-sympt5;
21   length sympt1-sympt5 $15;
22   array symptoms (5) $15. sympt1-sympt5;
23   by id_no;
24   if first.id_no then call missing(of symptoms(*));
25   symptoms(symptom_no)=symptom;
26   if last.id_no then output;
27   run;

NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: The data set WORK.SYMPT has 12549 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.66 seconds
      cpu time            0.07 seconds

28   proc print data=sympt;
29   run;

NOTE: There were 12549 observations read from the data set WORK.SYMPT.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           3.42 seconds
      cpu time            3.34 seconds

30   ***3.Set disease=1 if sympt1+sympt2+sympt3 and no sympt4-sympt5 WORK OFF DATA SET SYMPT;
31   data dis_01 (drop=answer);
32   set sympt;
33   retain disease;
34   array dpthdisease (1) disease;
35   by id_no;
36   if first.id_no then dpthdisease (symp1-sympt5)=answer;
37   If answer EQ sympt1-sympt3 and NE sympt4 OR sympt5 then disease=1;
     ------
     22
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, -, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT, IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, [, ^=, {, |, ||, ~=.

38   else disease=0;
39   if last.id_no then output;
40   run;

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      36:44   37:18   37:25   37:39   37:49
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DIS_01 may be incomplete. When this step was stopped there were 0 observations and 9 variables.
NOTE: DATA statement used (Total process time):
      real time           0.23 seconds
      cpu time            0.04 seconds

53   proc print data=dis_01;
54   run;

NOTE: No observations in data set WORK.DIS_01.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
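
For reference, the rule I am trying to express might be sketched roughly as below. This is only a sketch and untested. It assumes WORK.SYMPT already has one row per id_no (as step 2 outputs), that a blank symptN means the symptom was not reported, and that "without sympt4 and/or sympt5" means both are absent. DIS_SKETCH is a made-up data set name.

```sas
/* Sketch only, under the assumptions stated above.            */
/* DIS_SKETCH is a hypothetical output data set name.          */
data dis_sketch;
    set sympt;
    /* CMISS counts how many of its arguments are missing.     */
    /* 0 missing among sympt1-sympt3 = all three are present;  */
    /* 2 missing among sympt4, sympt5 = both are absent.       */
    if cmiss(of sympt1-sympt3) = 0 and cmiss(sympt4, sympt5) = 2
        then disease = 1;
    else disease = 0;
run;

/* Quick check of how many subjects fall into each group.      */
proc freq data=dis_sketch;
    tables disease;
run;
```

If the rule should instead allow one of sympt4 or sympt5 to be present, the second condition would need to change accordingly.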