Solved: how to filter data to check which subject has that particular treatmne...

srmakwana2009 · Posted 06-22-2023 11:48 AM

I want to filter data according to if particular subject has treatment like "onsite administration" or "dose administration" I want to filter those subjects who has only "on site administation" only. some subjects have both treatment and some subjects has only one treatment. I want to filter only those subjects which have only one treatment. if i filter data like extrt="on site administration" then those subjects will get filtered which has taken both treatment. I want only those subject which has taken only "on site treatment" I have attached screenshot for better understanding.

Please help.

Tom · Posted 06-22-2023 01:55 PM

You need to aggregate to the test. MIN() and MAX() work well for BOOLEAN values. If the MAX() is TRUE then at least one of the values is TRUE. If the MIN() is TRUE then all of the values are TRUE. If the MAX() is FALSE then none of the values are TRUE. If the MIN() is FALSE then at least one of the values is FALSE.

So to test if ALL of the values are on site administration the do something like:

select subjid 
     , min(lowcase(EXCAT)='on site administration') as ON_SITE
from have
group by subjid
;

So ON_SITE will be TRUE (have a value of 1) when every observation for that SUBJID was on site administration and FALSE (have a value of zero) otherwise.

Note: If you want to filter individual observations based on attributes derived from groups of observations then you will need to process the data twice. Once to calculate the value you need to select the group you then need to remerge that value back onto all off the observations in the group. Either by using two steps or perhaps taking advantage of PROC SQL's ability to automatically remerge aggregate statistics back onto all observations in the group.

create table ON_SITE as 
select *
from have
group by subjid
having min(lowcase(EXCAT)='on site administration')
;

View solution in original post

Tom · Posted 06-22-2023 01:55 PM

You need to aggregate to the test. MIN() and MAX() work well for BOOLEAN values. If the MAX() is TRUE then at least one of the values is TRUE. If the MIN() is TRUE then all of the values are TRUE. If the MAX() is FALSE then none of the values are TRUE. If the MIN() is FALSE then at least one of the values is FALSE.

So to test if ALL of the values are on site administration the do something like:

select subjid 
     , min(lowcase(EXCAT)='on site administration') as ON_SITE
from have
group by subjid
;

So ON_SITE will be TRUE (have a value of 1) when every observation for that SUBJID was on site administration and FALSE (have a value of zero) otherwise.

Note: If you want to filter individual observations based on attributes derived from groups of observations then you will need to process the data twice. Once to calculate the value you need to select the group you then need to remerge that value back onto all off the observations in the group. Either by using two steps or perhaps taking advantage of PROC SQL's ability to automatically remerge aggregate statistics back onto all observations in the group.

create table ON_SITE as 
select *
from have
group by subjid
having min(lowcase(EXCAT)='on site administration')
;

srmakwana2009 · Posted 06-26-2023 11:01 PM

Thank you so much . it helped a lot

JackieJ_SAS · Posted 06-23-2023 10:23 AM

As a general heuristic using a DATA step, what I would do here is:
1. Create a dataset with subjects that took the first treatment.

2. Create a dataset with subjects that took the second treatment.

3. Merge those datasets together to form a third dataset. When doing that, use IN= dataset options, so you know which subjects came from which dataset.

4. Output from the third dataset only those subjects that appeared in exactly one of the datasets.

These steps could be combined into one DATA step, so something like:

proc sort data=original_data out=one(keep=subject trt) nodupkey;

by subject trt;

run;

data three;

merge one(where=(trt="TRT1") in=in1)

one(where=(trt="TRT2") in=in2);

by subject;

if sum(in1,in2)=1;

keep subject;

run;

srmakwana2009 · Posted 06-26-2023 11:00 PM

Thank you so much

how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Re: how to filter data to check which subject has that particular treatmnet or not?

Click image to register for webinar

Classroom Training Available!