BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
srmakwana2009
Fluorite | Level 6

I want to filter data according to if particular subject has treatment like "onsite administration" or "dose administration" I want to filter those subjects who has only "on site administation" only. some subjects have both treatment and some subjects has only one treatment. I want to filter only those subjects which have only one treatment. if i filter data like extrt="on site administration" then those subjects will get filtered which has taken both treatment. I want only those subject which has taken only "on site treatment" I have attached screenshot for better understanding.

 

Please help.

dose tratement.png

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You need to aggregate to the test.  MIN() and MAX() work well for BOOLEAN values.  If the MAX() is TRUE then at least one of the values is TRUE.  If the MIN() is TRUE then all of the values are TRUE. If the MAX() is FALSE then none of the values are TRUE.  If the MIN() is FALSE then at least one of the values is FALSE.

 

So to test if ALL of the values are on site administration the do something like:

select subjid 
     , min(lowcase(EXCAT)='on site administration') as ON_SITE
from have
group by subjid
;

So ON_SITE will be TRUE (have a value of 1) when every observation for that SUBJID was on site administration and FALSE (have a value of zero) otherwise.

 

Note: If you want to filter individual observations based on attributes derived from groups of observations then you will need to process the data twice.  Once to calculate the value you need to select the group you then need to remerge that value back onto all off the observations in the group.  Either by using two steps or perhaps taking advantage of PROC SQL's ability to automatically remerge aggregate statistics back onto all observations in the group.

create table ON_SITE as 
select *
from have
group by subjid
having min(lowcase(EXCAT)='on site administration')
;

 

View solution in original post

4 REPLIES 4
Tom
Super User Tom
Super User

You need to aggregate to the test.  MIN() and MAX() work well for BOOLEAN values.  If the MAX() is TRUE then at least one of the values is TRUE.  If the MIN() is TRUE then all of the values are TRUE. If the MAX() is FALSE then none of the values are TRUE.  If the MIN() is FALSE then at least one of the values is FALSE.

 

So to test if ALL of the values are on site administration the do something like:

select subjid 
     , min(lowcase(EXCAT)='on site administration') as ON_SITE
from have
group by subjid
;

So ON_SITE will be TRUE (have a value of 1) when every observation for that SUBJID was on site administration and FALSE (have a value of zero) otherwise.

 

Note: If you want to filter individual observations based on attributes derived from groups of observations then you will need to process the data twice.  Once to calculate the value you need to select the group you then need to remerge that value back onto all off the observations in the group.  Either by using two steps or perhaps taking advantage of PROC SQL's ability to automatically remerge aggregate statistics back onto all observations in the group.

create table ON_SITE as 
select *
from have
group by subjid
having min(lowcase(EXCAT)='on site administration')
;

 

srmakwana2009
Fluorite | Level 6

Thank you so much . it helped a lot

JackieJ_SAS
SAS Employee

As a general heuristic using a DATA step, what I would do here is:
1. Create a dataset with subjects that took the first treatment.

2. Create a dataset with subjects that took the second treatment.

3. Merge those datasets together to form a third dataset. When doing that, use IN= dataset options, so you know which subjects came from which dataset.

4. Output from the third dataset only those subjects that appeared in exactly one of the datasets.

These steps could be combined into one DATA step, so something like:

proc sort data=original_data out=one(keep=subject trt) nodupkey;

by subject trt;

run;

data three;

merge one(where=(trt="TRT1") in=in1)

       one(where=(trt="TRT2") in=in2);

by subject;

if sum(in1,in2)=1;

keep subject;

run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1463 views
  • 5 likes
  • 3 in conversation