Using IF-THEN/ELSE statements to create a flag indicating presence or absence of a disease

Contributor
Posts: 20

Create a new variable called disease and set it to 1 if a person has complaints of heartburns, sickness, and spasm, but no temperature or tiredness.

If the person does not have this exact combination of symptoms, set disease to 0.

Lastly, use PROC FREQ to determine the number and proportion of individuals in the dataset who have the disease of interest.

I do not know how to do this. Any hints or help? I am studying for an exam and need to understand this program.

I have to use IF-THEN statements.

This is what I have so far, but it is not working.

 proc format; 
value symptom_no	1= "heartburns" 
					2= "Sickness"
					3= "Spasm"
					4= "Temperature"
					5= "Tiredness"; 
		
proc sort data=Project3 out= longsort; 
 	by id_no; 
run; 
data new; 
	set longsort; 
	by id_no; 
	Keep id_no sympt1 - sympt5 disease; 
	retain sympt1 - sympt5 disease; 
	disease=0;
	array New_a (1:5) $20 sympt1 - sympt5; 
	If first.id_no then
	do; 
	Do i = 1 to 5; 
		new_a (i) = .; 
		end; 

	new_a (symptom_no) = symptom; 
	if last.id_no then output; 
		run; 
	array New_b (1) disease; 
	If sympt1 ='heartburns' and sympt2='sickness' and sympt3='spasm' then disease='1';
	else disease='0'; 
		end; 
	end; 

		run; 
	proc print data= new; 
	run; 
SAS Super FREQ
Posts: 508

Re: Using IF-THEN/ELSE statements to create a flag indicating presence or absence of a disease

Posted in reply to jessica_join

You need to get rid of the RUN statement in the middle of your data step.  It looks like there is an extra end statement.  This would be a lot easier to look at with reasonable indentation.

 

Occasional Contributor
Posts: 10

Re: Using IF-THEN/ELSE statements to create a flag indicating presence or absence of a disease

Posted in reply to jessica_join

Hi,

 

The structure of your original data is a little unclear, so I've assumed that it simply has a single numeric symptom column (with values 1 to 5) and multiple rows per ID, depending on the number of symptoms. To make the logic more transparent, I've moved away from arrays (for now). I reckon it might look something like:

 

data new; 
  set longsort; 
  by id_no; 
	
  retain heartburn sickness spasm temperature tiredness;
  
  if (first.id_no) then
  do;
    heartburn   =0;
    sickness    =0;
    spasm       =0;
    temperature =0;
    tiredness   =0;
  end;

  if symptom=1 then heartburn  =1; 
  if symptom=2 then sickness   =1; 
  if symptom=3 then spasm      =1;
  if symptom=4 then temperature=1; 
  if symptom=5 then tiredness  =1;

  if (heartburn)    and 
     (sickness)     and 
     (spasm)        and 
     ^(temperature) and 
     ^(tiredness)   then disease =1; else
                         disease =0;
                         
   if (last.id_no);
 run;
 
 proc freq data=new;
   table disease;
 run;
Super User
Posts: 24,027

Re: Using IF-THEN/ELSE statements to create a flag indicating presence or absence of a disease

@Enio This is a homework assignment; she has to use arrays.

Occasional Contributor
Posts: 10

Re: Using IF-THEN/ELSE statements to create a flag indicating presence or absence of a disease

You're right, and that's cool. Hopefully the code above helps to explain what the arrays are trying to do. With arrays it would probably look something like this:

 

data new(drop=symptom i); 
  set longsort; 
  by id_no; 
	
  retain sympt1 - sympt5;
  
  array new_a (1:5) sympt1 - sympt5;
  
  do i = 1 to 5 ;
  
    if (first.id_no) then
    do;
      new_a(i) =0;
    end;

    if symptom=i then new_a(i)  =1; 

  end;
  
  if (sympt1)    and 
     (sympt2)    and 
     (sympt3)    and 
     ^(sympt4)   and 
     ^(sympt5)   then disease =1; else
                      disease =0;
                         
   if (last.id_no);
 run;
 
 proc freq data=new;
   table disease;
 run;
Super User
Posts: 24,027

Re: Using IF-THEN/ELSE statements to create a flag indicating presence or absence of a disease

@Enio some small changes - noted by Warren earlier I think. That END after the line below should be moved up, or the do loop could be simplified.

 

    if symptom=i then new_a(i)  =1; 

end; *This needs to be moved up;

The i reference isn't correct in this case, because the diagnoses are being moved to specific positions. Those are defined by the symptom_no variable.
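
As a rough sketch of that simplification, assuming (as in the code above) that symptom is numeric with values 1 to 5, the symptom value itself can index the array, so the loop is only needed for the reset. The range check is an added safeguard, not something from the original code:

data new(drop=symptom i);
  set longsort;
  by id_no;

  array new_a (1:5) sympt1 - sympt5;
  retain sympt1 - sympt5;

  /* reset all five flags at the start of each person */
  if first.id_no then do i = 1 to 5;
    new_a(i) = 0;
  end;

  /* assumption: symptom is numeric 1-5, so it can index the array directly */
  if 1 <= symptom <= 5 then new_a(symptom) = 1;

  if sympt1 and sympt2 and sympt3 and not sympt4 and not sympt5
    then disease = 1;
    else disease = 0;

  /* keep only the last row per person, once all symptoms have been seen */
  if last.id_no;
run;

If the position is held in a separate variable such as symptom_no, that variable would take the place of symptom as the array index.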

 

Though... symptoms in the previous question were text, but they appear to be numeric here, so I'm slightly confused myself.

 

Previous versions of this question are linked below for your information.

https://communities.sas.com/t5/Base-SAS-Programming/SAS-new-variable/m-p/415698

https://communities.sas.com/t5/Base-SAS-Programming/Array-First-id-retain-last-id/m-p/415699

 

 

SAS Super FREQ
Posts: 508

Re: Using IF-THEN/ELSE statements to create a flag indicating presence or absence of a disease

Also note that a big misconception among many programmers seems to be that you need if/then/else to assign a binary value to a variable.  NOT TRUE!  This works just fine and requires only a single statement.

 

variable = boolean-expression

 

A boolean expression (and/or/eq/ne/gt/ge/lt/le etc) resolves to zero or one.  You can assign that value to a variable.
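
A minimal sketch of that idea, using a small made-up dataset (the name demo, the ID values, and the 0/1 symptom flags are all hypothetical, chosen only for illustration):

data demo;
  /* hypothetical input: one row per person, flags coded 0/1 */
  input id_no sympt1-sympt5;

  /* the whole expression resolves to 1 or 0; no if/then/else needed */
  disease = (sympt1 and sympt2 and sympt3 and not sympt4 and not sympt5);

  datalines;
1 1 1 1 0 0
2 1 1 1 1 0
3 0 1 1 0 0
;
run;

proc freq data=demo;
  tables disease;
run;

Here only id_no 1 gets disease = 1. The second person also has a temperature and the third has no heartburn, so both get 0.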

Super User
Posts: 10,623

Re: Using IF-THEN/ELSE statements to create a flag indicating presence or absence of a disease

Posted in reply to jessica_join

This is my preferred visual coding style:

data new; 
set longsort; 
by id_no; 
keep id_no sympt1 - sympt5 disease; 
retain sympt1 - sympt5 disease; 
array New_a (1:5) $20 sympt1 - sympt5;
disease = 0; 
If first.id_no
then do; 
  do i = 1 to 5; 
    new_a (i) = .; 
  end; 
  new_a (symptom_no) = symptom; 
  if last.id_no then output; 
  if sympt1 ='heartburns' and sympt2='sickness' and sympt3='spasm'
  then disease='1';
  else disease='0'; 
end;
run; 

I removed the surplus end (which sticks out like a beacon when the code is properly formatted) and the erroneous run.

Now you can see that your if last. is within the if first. block, and will only be executed if there's only one row per id_no. I guess that's not what you wanted.
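
One possible restructuring, as a sketch only: it assumes symptom_no is numeric (1 to 5) and symptom is character, as the original code suggests, and it uses lowcase() because the exact spelling and case of the stored values is not known. The first. block only resets the array, and the last. check sits outside it:

data new;
  set longsort;
  by id_no;
  keep id_no sympt1 - sympt5 disease;

  array new_a (1:5) $20 sympt1 - sympt5;
  retain sympt1 - sympt5;

  /* start each person with all five slots blank */
  if first.id_no then call missing(of sympt1 - sympt5);

  /* assumption: symptom_no gives the slot, symptom gives the text */
  new_a(symptom_no) = symptom;

  /* decide and output once per person, on the last row */
  if last.id_no then do;
    if lowcase(sympt1) = 'heartburns' and lowcase(sympt2) = 'sickness'
       and lowcase(sympt3) = 'spasm'
       and missing(sympt4) and missing(sympt5)
      then disease = 1;
      else disease = 0;
    output;
  end;
run;

The two missing() checks cover the "no temperature or tiredness" part of the assignment, which the original condition leaves out.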

 
