aespinarey
Obsidian | Level 7

Hi all!

I am having trouble creating a new variable "disease" that indicates whether the subject has the disease (1) or not (0), based on the symptoms recorded in 5 different variables. If a patient has sympt1-sympt3 (heartburn, sickness, and spasm) without sympt4 and/or sympt5 (temperature and/or tiredness), then the patient has the disease (1). For any other combination, disease should equal 0.

project3 task2 table.PNG

CODE:

options nodate nonumber;
****1.IMPORT;
%macro P3 (a, b, c, d);
proc import out= &a
datafile= "C:\HW5\&b"
dbms=xlsx replace;
getnames=yes;
run;
proc sort data=&a;
by &c &d;
run;
Proc print data = &a;
Run;
%mend P3;

%P3 (PROJECT3_F17, Project3.xlsx, id_no, symptom_no);
***2.REORGANIZE variables: USE ARRAY STATEMENT ALONG WITH FIRST.ID and LAST.ID;
data sympt (drop=symptom_no symptom);
set PROJECT3_F17;
retain sympt1-sympt5;
length sympt1-sympt5 $15;
array symptoms (5) $15. sympt1-sympt5;
by id_no;
if first.id_no then call missing(of symptoms(*));
symptoms(symptom_no)=symptom;
if last.id_no then output;
run;
proc print data=sympt;
run;
***3.Set disease=1 if sympt1+sympt2+sympt3 and no sympt4-sympt5 WORK OFF DATA SET SYMPT;
data dis_01 (drop=answer);
set sympt;
retain disease;
array dpthdisease (1) disease;
by id_no;
if first.id_no then dpthdisease (symp1-sympt5)=answer;
If answer EQ sympt1-sympt3 and NE sympt4 OR sympt5 then disease=1;
else disease=0;
if last.id_no then output;
run;
proc print data=dis_01;
run;

 

 

LOG:

1 options nodate nonumber;
2 ****1.IMPORT;
3 %macro P3 (a, b, c, d);
4 proc import out= &a
5 datafile= "C:\HW5\&b"
6 dbms=xlsx replace;
7 getnames=yes;
8 run;
9 proc sort data=&a;
10 by &c &d;
11 run;
12 Proc print data = &a;
13 Run;
14 %mend P3;
15
16 %P3 (PROJECT3_F17, Project3.xlsx, id_no, symptom_no);

NOTE: The import data set has 18082 observations and 3 variables.
NOTE: WORK.PROJECT3_F17 data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 1.41 seconds
cpu time 0.85 seconds

 

NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: The data set WORK.PROJECT3_F17 has 18082 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.15 seconds
cpu time 0.01 seconds


NOTE: Writing HTML Body file: sashtml.htm

NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: PROCEDURE PRINT used (Total process time):
real time 4.96 seconds
cpu time 3.20 seconds


17 ***2.REORGANIZE variables: USE ARRAY STATEMENT ALONG WITH FIRST.ID and LAST.ID;
18 data sympt (drop=symptom_no symptom);
19 set PROJECT3_F17;
20 retain sympt1-sympt5;
21 length sympt1-sympt5 $15;
22 array symptoms (5) $15. sympt1-sympt5;
23 by id_no;
24 if first.id_no then call missing(of symptoms(*));
25 symptoms(symptom_no)=symptom;
26 if last.id_no then output;
27 run;

NOTE: There were 18082 observations read from the data set WORK.PROJECT3_F17.
NOTE: The data set WORK.SYMPT has 12549 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.66 seconds
cpu time 0.07 seconds


28 proc print data=sympt;
29 run;

NOTE: There were 12549 observations read from the data set WORK.SYMPT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 3.42 seconds
cpu time 3.34 seconds


30 ***3.Set disease=1 if sympt1+sympt2+sympt3 and no sympt4-sympt5 WORK OFF DATA SET SYMPT;
31 data dis_01 (drop=answer);
32 set sympt;
33 retain disease;
34 array dpthdisease (1) disease;
35 by id_no;
36 if first.id_no then dpthdisease (symp1-sympt5)=answer;
37 If answer EQ sympt1-sympt3 and NE sympt4 OR sympt5 then disease=1;
------
22
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, -, /, <, <=,
<>, =, >, ><, >=, AND, EQ, GE, GT, IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, [,
^=, {, |, ||, ~=.

38 else disease=0;
39 if last.id_no then output;
40 run;

NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
36:44 37:18 37:25 37:39 37:49
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DIS_01 may be incomplete. When this step was stopped there were 0
observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 0.23 seconds
cpu time 0.04 seconds

 

53 proc print data=dis_01;
54 run;

NOTE: No observations in data set WORK.DIS_01.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

Reeza
Super User

Is this for a course? This exact question, with the same data, has already been asked and answered here.
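
For reference, the syntax error on log line 37 comes from the comparison itself: every operator needs an operand on both sides, so `and NE sympt4` has nothing to compare against, and `sympt1-sympt3` inside an expression is read as a subtraction rather than a variable list (which is also why the log reports character-to-numeric conversions). Since SYMPT already holds one row per id_no, the BY, RETAIN, and FIRST./LAST. logic is probably not needed in step 3. Below is a minimal sketch of one way to derive the flag. It assumes that a symptom counts as present when its variable is non-blank, and that "without sympt4 and/or sympt5" means neither is present; the data set names sympt_demo and dis_demo and the four sample rows are made up for illustration and are not the course data.

```sas
/* Hypothetical stand-in for WORK.SYMPT: one row per id_no.            */
/* Assumption: a blank symptom variable means the symptom is absent.   */
data sympt_demo;
  infile datalines dsd truncover;
  length id_no 8 sympt1-sympt5 $15;
  input id_no sympt1-sympt5;
  datalines;
1,heartburn,sickness,spasm,,
2,heartburn,sickness,spasm,temperature,
3,heartburn,,spasm,,
4,heartburn,sickness,spasm,,tiredness
;
run;

data dis_demo;
  set sympt_demo;
  /* CMISS counts missing arguments (character or numeric):            */
  /*   0 missing among sympt1-sympt3 -> all three symptoms present     */
  /*   2 missing among sympt4,sympt5 -> neither of the two is present  */
  /* A true comparison evaluates to 1 and a false one to 0.            */
  disease = (cmiss(of sympt1-sympt3) = 0 and cmiss(sympt4, sympt5) = 2);
run;

proc print data=dis_demo;
run;
```

With these four sample rows, only id_no 1 should get disease = 1; the other three get 0. If the rule is meant to allow one of sympt4 or sympt5 to be present, the second comparison would need to change accordingly.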
