Help using Base SAS procedures

how to subset a dataset based on if a range of variables contain specific values?

Reply
Occasional Contributor
Posts: 18

how to subset a dataset based on if a range of variables contain specific values?

Suppose I have a dataset looks like the following:

 

ID .     age  MED1    MED2    MED3
001 .     1      1055        7832    3435
202 .      0     1051        3423    3455
204 .      3      5040         3423 .  3444
213 .       4     9999        3432 .  4546

 

Each patient has three prescriptions and each has a code comes with it. I want to see if patients age less than 5 have prescription MED1-MED3 in code (7832, 3435, 1051). 

I used proc sql the following but it gave me the same dataset. 

PROC SQL;
	CREATE TABLE WANT AS
		SELECT *
		FROM MEDS
                WHERE MED1 OR MED2 OR MED3 IN (7832, 3455, 1051) AND
AGE <= 5;
QUIT;

How can I correct this? Thanks. 

PROC Star
Posts: 1,762

Re: how to subset a dataset based on if a range of variables contain specific values?

Posted in reply to Viveme789

/* if i understand you correctly*/
PROC
SQL; CREATE TABLE WANT AS SELECT * FROM MEDS WHERE MED1=7832 and MED2=3455 and MED3=1051 AND AGE <= 5; QUIT;

 

Occasional Contributor
Posts: 18

Re: how to subset a dataset based on if a range of variables contain specific values?

Posted in reply to novinosrin

I meant as if MED1, MED2 and MED3 has the values in the list. 

Super User
Posts: 23,659

Re: how to subset a dataset based on if a range of variables contain specific values?

Posted in reply to Viveme789

Viveme789 wrote:

Suppose I have a dataset looks like the following:

 

ID .     age  MED1    MED2    MED3
001 .     1      1055        7832    3435
202 .      0     1051        3423    3455
204 .      3      5040         3423 .  3444
213 .       4     9999        3432 .  4546

 

Each patient has three prescriptions and each has a code comes with it. I want to see if patients age less than 5 have prescription MED1-MED3 in code (7832, 3435, 1051). 

I used proc sql the following but it gave me the same dataset. 

PROC SQL;
	CREATE TABLE WANT AS
		SELECT *
		FROM MEDS
                WHERE MED1 OR MED2 OR MED3 IN (7832, 3455, 1051) AND
AGE <= 5;
QUIT;

How can I correct this? Thanks. 


 

SQL doesn't do variable lists or multiple variables well, it would be easier in a data step, otherwise you need to duplicate your condition:

 

WHERE MED1  IN (7832, 3455, 1051) 
OR MED2  IN (7832, 3455, 1051)
OR MED3 IN (7832, 3455, 1051) ...
Occasional Contributor
Posts: 18

Re: how to subset a dataset based on if a range of variables contain specific values?

Can you tell me how to do it in a data step, where I don't have to duplicate the list of values? The problem I have on hand actually have 98 codes in the list and 9 variables.... Thanks. 

Super User
Posts: 6,751

Re: how to subset a dataset based on if a range of variables contain specific values?

Posted in reply to Viveme789

How is your list of 98 codes stored?  Can it be a SAS data set with 98 observations and a single variable?

Occasional Contributor
Posts: 18

Re: how to subset a dataset based on if a range of variables contain specific values?

[ Edited ]
Posted in reply to Astounding

I meant that 9 variables MED1 to MED9 has to have values that contains in the list, which has 98 codes as a criteria to subset the data. For example:

 MED1 OR MED2 OR...OR MED9 IN (1, 2, ....98). Do I make sense?

Super User
Posts: 6,751

Re: how to subset a dataset based on if a range of variables contain specific values?

Posted in reply to Viveme789

100%.

 

But my question remains the same.  Can you put the list of 98 possible values into a SAS data set that contains 98 observations and 1 variable?

Occasional Contributor
Posts: 18

Re: how to subset a dataset based on if a range of variables contain specific values?

Posted in reply to Astounding

Yes

Super User
Posts: 6,751

Re: how to subset a dataset based on if a range of variables contain specific values?

Posted in reply to Viveme789

OK, assuming you have a separate SAS data set with 98 observations, and just one variable (MED):

 

data temp;

set _98_meds;

start=med;

end='In the List';

fmtname='_my98_';

run;

proc format cntlin=temp;

run;

 

This much gives you a format that translates any of the 98 meds into "In the List".  Then use that format to compare to the 9 meds (MED1 through MED9):

 

data want;

set have;

where age <= 5;

array med {9};

do _n_=1 to 9 until (flag='In the List');

   flag = put(med{_n_}, _my98_.);

end;

if flag='In the List';

drop flag;

run;

 

Ask a Question
Discussion stats
  • 9 replies
  • 169 views
  • 0 likes
  • 4 in conversation