Help using Base SAS procedures

Create table to meet certain conditions

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 11
Accepted Solution

Create table to meet certain conditions

Hello. I need help with producing 2 separate tables from the original table, where one table contains observations with specific variables=Yes, but only those variables='Yes.'

 

For example, I only want patients who have HIV='Yes' or cancer='Yes', and nothing else='Yes'.

 

data have;

patientdiabetescholesterolHIVcancerHep B
patient_1 YesYesNoYesYes
patient_2YesNoYesNoNo
patient_3NoNoNoNoYes
patient_4NoNoYesNoNo
patient_5NoYesNoNoYes
patient_6NoNoYesYesNo
patient_7NoNoNoYesNo

 

 

/*only patients where HIV='Yes' or cancer='Yes' and everything else='No'*/

data want1;

set have;

patientdiabetescholesterolHIVcancerHep B
patient_4NoNoYesNoNo
patient_6NoNoYesYesNo
patient_7NoNoNoYesNo

 

 

/*everyone else*/

data want2;

set have;

patientdiabetescholesterolHIVcancerHep B
patient_1 YesYesNoYesYes
patient_2YesNoYesNoNo
patient_3NoNoNoNoYes
patient_5NoYesNoNoYes

 

 

Is there a way to do it in SQL or SAS datastep? Thank you!


Accepted Solutions
Solution
‎01-02-2018 11:07 AM
PROC Star
Posts: 8,112

Re: Create table to meet certain conditions

Here is how you could do it using a data step:

data want1 want2;
  set have;
  if (HIV eq 'Yes' or cancer eq 'Yes') and
    (diabetes NE 'Yes' and cholesterol NE 'Yes' and
     Hep_B ne 'Yes') then output want1;
  else output want2;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post


All Replies
PROC Star
Posts: 1,318

Re: Create table to meet certain conditions

[ Edited ]

why is patient1 in want2 when cancer='Yes' ? 

and for that matter patient2 in want2 when HIV='Yes'

Occasional Contributor
Posts: 11

Re: Create table to meet certain conditions

Posted in reply to novinosrin

because diabetes='Yes', cholesterol='Yes', and Hep B='Yes.'

 

Only patients where either HIV='Yes' or cancer='Yes' and everything else='No.'

PROC Star
Posts: 1,318

Re: Create table to meet certain conditions


data have;
infile datalines truncover;
input (patient diabetes cholesterol HIV cancer Hep_B) (:$10.);
datalines;
patient_1 Yes Yes No Yes Yes
patient_2 Yes No Yes No No
patient_3 No No No No Yes
patient_4 No No Yes No No
patient_5 No Yes No No Yes
patient_6 No No Yes Yes No
patient_7 No No No Yes No
;

 

data want1 want2;
set have;
array a(*) diabetes cholesterol Hep_B;
if cancer='Yes' and (cancer not in a) then _flag=1;
if HIV='Yes' and (hiv not in a) then __flag=1;
if _flag or __flag then output want1;
else if not(_flag) and not(__flag) then output want2;
drop _:;
run;

Highlighted
PROC Star
Posts: 8,112

Re: Create table to meet certain conditions

Posted in reply to novinosrin

@novinosrin: Actually, I like your use of an array in solving this problem, but didn't like your creating two flag variables.

Here is an alternative that uses arrays and, methinks, makes the code both more extensible and easier to read:

data want1 want2;
  set have;
  array musthaveone(*) HIV cancer;
  array canthaveany(*) diabetes cholesterol Hep_B;
  if 'Yes' in musthaveone  and
    'Yes' not in canthaveany then output want1;
  else output want2;
run;

Art, CEO, AnalystFinder.com

 

PROC Star
Posts: 1,318

Re: Create table to meet certain conditions

Thank you Art @art297   Wish the thought of using a constant operand check struck my mind earlier.  I often have this approach to avoid using constant operand as opposed to variable operand thinking to make it more squeaking clean. Nonetheless your demonstration proves otherwise. Well, I have learned now. Cheers! and Thanks again

PROC Star
Posts: 1,318

Re: Create table to meet certain conditions

@art297Art, How about still sticking to one array and making it all compound expression?

 

data want1 want2;
set have;
array a(*) diabetes cholesterol Hep_B;
if (cancer='Yes' and (cancer not in a)) or (HIV='Yes' and (hiv not in a)) then output want1;
else if not((cancer='Yes' and (cancer not in a)) or (HIV='Yes' and (hiv not in a))) then output want2;
run;

PROC Star
Posts: 8,112

Re: Create table to meet certain conditions

Posted in reply to novinosrin

@novinosrin: No! Takes longer to run, is less extensible (IMHO), and is (IMHO) harder to read.

 

Solution
‎01-02-2018 11:07 AM
PROC Star
Posts: 8,112

Re: Create table to meet certain conditions

Here is how you could do it using a data step:

data want1 want2;
  set have;
  if (HIV eq 'Yes' or cancer eq 'Yes') and
    (diabetes NE 'Yes' and cholesterol NE 'Yes' and
     Hep_B ne 'Yes') then output want1;
  else output want2;
run;

Art, CEO, AnalystFinder.com

 

Super User
Posts: 13,029

Re: Create table to meet certain conditions

Are your values actually character 'Yes' and 'No' or are they numeric with a format applied?

 

In many cases splitting data yields less efficient use than having a variable to indicate group membership and using BY processing on that variable.

Super Contributor
Super Contributor
Posts: 260

Re: Create table to meet certain conditions

For completeness since SQL was mentioned:

 


data patients;
  input patient $9. diabetes:$3. cholesterol:$3. hiv:$3. cancer:$3. hep_b:$3.;
  datalines;
patient_1 Yes Yes No Yes Yes
patient_2 Yes No Yes No	No
patient_3 No No No No Yes
patient_4 No No	Yes	No	No
patient_5 No Yes No No Yes
patient_6 No No	Yes	Yes	No
patient_7 No No	No Yes No
;
run;

proc sql;
	create table want1 as 
		select * 
		from patients
		where (diabetes = 'No' and cholesterol = 'No' and hep_b = 'No') and (hiv = 'Yes' or cancer = 'Yes');
quit;

proc sql;
	create table want2 as 
		select * 
		from patients
		where (diabetes = 'Yes' or cholesterol = 'Yes' or hep_b = 'Yes');
quit;

Frequent Contributor
Posts: 78

Re: Create table to meet certain conditions

Hi,

 

The DATA step is an excellent way to achieve what you want as it allows you to create one or more datasets in a single DATA step.

 

In this case you have a dataset, WORK.HAVE that you want to split out into two datasets, WORK.WANT1 and WORK.WANT2.

 

Whilst the method shown below works perfectly well, you may want to review things if the number of ailments (diabetes, cholestorol etc.) increases and/or your selection criteria gets more complex.

 

Having said that, try this out and see if it does what you need. Any record that does not meet the first criterion for inclusion in WORK.WANT1 will be swallowed up by the catch-all ELSE statement and get sent to WORK.WANT2.

 

Notice that the HIV Or Cancer check is inside a pair of parentheses so that this is evaluated separately from the remainder of the IF statement.

 

****************************************************************;
*Read in your Patient dataa.                                    ;
****************************************************************;
Data HAVE;
 Input  patient	$9.
        diabetes $	
        cholesterol	$
        HIV	$
        cancer	$
        Hep_B $
        ;
DATALINES;
patient_1 Yes Yes No Yes Yes
patient_2 Yes No Yes No No
patient_3 No No No No Yes
patient_4 No No Yes No No
patient_5 No Yes No No Yes
patient_6 No No Yes Yes No
patient_7 No No No Yes No
;
Run;

****************************************************************;
*Decide which of the WANTn datasets a record should go to.      ;
****************************************************************;
Data WANT1
     WANT2
     ;

 Set HAVE;

 If (HIV="Yes" Or Cancer="Yes") 
    And
    Diabetes="No"
    And
    Cholesterol="No"
    And
    Hep_B="No"
    Then Output WANT1;

 Else Output WANT2;

Run;

 

Cheers,

 

Downunder Dave

Wellington.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 494 views
  • 13 likes
  • 6 in conversation