Hi all!
I am having trouble creating a new variable, "disease", that indicates whether a subject has the disease (1) or does not (0), based on the symptoms recorded in 5 different variables. If a patient has sympt1-sympt3 (heartburn, sickness, and spasm) without sympt4 and/or sympt5 (temperature and/or tiredness), then the patient has the disease (disease=1). If the conditions are any different, then disease should equal 0.
CODE:
options nodate nonumber;
****1.IMPORT;
%macro P3 (a, b, c, d);
proc import out= &a
datafile= "C:\HW5\&b"
dbms=xlsx replace;
getnames=yes;
run;
proc sort data=&a;
by &c &d;
run;
Proc print data = &a;
Run;
%mend P3;
%P3 (PROJECT3_F17, Project3.xlsx, id_no, symptom_no);
***2.REORGANIZE variables: USE ARRAY STATEMENT ALONG WITH FIRST.ID and LAST.ID;
data sympt (drop=symptom_no symptom);
set PROJECT3_F17;
retain sympt1-sympt5;
length sympt1-sympt5 $15;
array symptoms (5) $15. sympt1-sympt5;
by id_no;
if first.id_no then call missing(of symptoms(*));
symptoms(symptom_no)=symptom;
if last.id_no then output;
run;
proc print data=sympt;
run;
***3.Set disease=1 if sympt1+sympt2+sympt3 and no sympt4-sympt5 WORK OFF DATA SET SYMPT;
data dis_01 (drop=answer);
set sympt;
retain disease;
array dpthdisease (1) disease;
by id_no;
if first.id_no then dpthdisease (symp1-sympt5)=answer;
If answer EQ sympt1-sympt3 and NE sympt4 OR sympt5 then disease=1;
else disease=0;
if last.id_no then output;
run;
proc print data=dis_01;
run;
LOG:
1 options nodate nonumber;
2 ****1.IMPORT;
3 %macro P3 (a, b, c, d);
4 proc import out= &a
5 datafile= "C:\HW5\&b"
6 dbms=xlsx replace;
7 getnames=yes;
8 run;
9 proc sort data=&a;
10 by &c &d;
11 run;
12 Proc print data = &a;
13 Run;
14 %mend P3;
15
16 %P3 (PROJECT3_F17, Project3.xlsx, id_no, symptom_no);
NOTE: The import data set has 18082 observations and 3 variables.
NOTE: WORK.PROJECT3_F17 data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 1.41 seconds
cpu time 0.85 seconds
NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: The data set WORK.PROJECT3_F17 has 18082 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.15 seconds
cpu time 0.01 seconds
NOTE: Writing HTML Body file: sashtml.htm
NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: PROCEDURE PRINT used (Total process time):
real time 4.96 seconds
cpu time 3.20 seconds
17 ***2.REORGANIZE variables: USE ARRAY STATEMENT ALONG WITH FIRST.ID and LAST.ID;
18 data sympt (drop=symptom_no symptom);
19 set PROJECT3_F17;
20 retain sympt1-sympt5;
21 length sympt1-sympt5 $15;
22 array symptoms (5) $15. sympt1-sympt5;
23 by id_no;
24 if first.id_no then call missing(of symptoms(*));
25 symptoms(symptom_no)=symptom;
26 if last.id_no then output;
27 run;
NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: The data set WORK.SYMPT has 12549 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.66 seconds
cpu time 0.07 seconds
28 proc print data=sympt;
29 run;
NOTE: There were 12549 observations read from the data set WORK.SYMPT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 3.42 seconds
cpu time 3.34 seconds
30 ***3.Set disease=1 if sympt1+sympt2+sympt3 and no sympt4-sympt5 WORK OFF DATA SET SYMPT;
31 data dis_01 (drop=answer);
32 set sympt;
33 retain disease;
34 array dpthdisease (1) disease;
35 by id_no;
36 if first.id_no then dpthdisease (symp1-sympt5)=answer;
37 If answer EQ sympt1-sympt3 and NE sympt4 OR sympt5 then disease=1;
------
22
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, -, /, <, <=,
<>, =, >, ><, >=, AND, EQ, GE, GT, IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, [,
^=, {, |, ||, ~=.
38 else disease=0;
39 if last.id_no then output;
40 run;
NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
36:44 37:18 37:25 37:39 37:49
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DIS_01 may be incomplete. When this step was stopped there were 0
observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 0.23 seconds
cpu time 0.04 seconds
53 proc print data=dis_01;
54 run;
NOTE: No observations in data set WORK.DIS_01.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
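A note on ERROR 22-322 at log line 37: in a SAS expression every comparison operator needs an operand on both sides, so "and NE sympt4" cannot be parsed. SAS appears to read NE as a variable name there and then finds sympt4 where it expected an operator. In addition, sympt1-sympt3 inside an expression is a subtraction, not a variable list, which would explain the "Character values have been converted to numeric values" notes. Below is a minimal sketch of the same condition with each comparison written out in full. The data set name demo and the sample rows are made up for illustration, and it assumes a blank value means the symptom was not reported and that "without sympt4 and/or sympt5" means neither is present.

```sas
/* Hypothetical rows shaped like WORK.SYMPT: one row per id_no,
   blank = symptom not reported (an assumption, not taken from the real data) */
data demo;
    infile datalines dsd truncover;
    input id_no (sympt1-sympt5) (:$15.);
    /* Each comparison names both operands; a logical expression evaluates to 1 or 0 */
    disease = (sympt1 ne ' ' and sympt2 ne ' ' and sympt3 ne ' '
               and sympt4 eq ' ' and sympt5 eq ' ');
    datalines;
1,heartburn,sickness,spasm,,
2,heartburn,sickness,spasm,temperature,
3,heartburn,,spasm,,
;
run;

proc print data=demo noobs;
run;
```

Under these assumptions only id_no 1 should end up with disease=1. The second row has sympt4 filled in and the third is missing sympt2, so both should get 0.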
Is this for a course? This exact question with the same data has been asked and answered on here.
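This is not the answer from the earlier thread, just one way the rule in the question could be coded. It is a sketch under two assumptions: WORK.SYMPT from step 2 already has one row per id_no (the log reports 12549 observations written on last.id_no), and a blank sympt variable means the patient did not report that symptom. With one row per subject, step 3 should not need BY, RETAIN, an array, or first./last. processing. The CMISS function counts missing values among its arguments and accepts character variables.

```sas
/* Assumes WORK.SYMPT exists as built in step 2 of the question */
data dis_01;
    set sympt;
    /* disease=1 only when none of sympt1-sympt3 is missing
       and both sympt4 and sympt5 are missing */
    if cmiss(of sympt1-sympt3) = 0 and cmiss(of sympt4-sympt5) = 2
        then disease = 1;
    else disease = 0;
run;

/* Quick check of how many subjects fall into each group */
proc freq data=dis_01;
    tables disease;
run;
```

If "without sympt4 and/or sympt5" is meant to allow one of the two to be present, the second test would change (for example, cmiss(of sympt4-sympt5) >= 1), so it is worth confirming the intended rule before relying on the counts.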
