SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Manhort
Obsidian | Level 7

Hello all,

 

I have a large dataset. Individuals can appear on more than 1 row based on an ID. I need to subset the data to have individuals that are given both treatment A and B.

 

Example data:
ID treatment
1 A
1 B
1 A
2 A
2 A
3 B
3 B
3 B
3 B
4 A
4 B
5 A
5 A
6 B


Desired Result:

ID treatment
1 A
1 B
1 A
4 A
4 B

 

 

 

Any help is appreciated. 
Here's the input version.


DATA have;
INPUT id treatment $;
CARDS;
1 A
1 B
1 A
2 A
2 A
3 B
3 B
3 B
3 B
4 A
4 B
5 A
5 A
6 B
;
RUN;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You could ask PROC SQL to do exactly what you just described.

proc sql;
create table want as
select a.* from have a
where exists (select id from have b where a.id=b.id and b.treatment='A')
  and exists (select id from have c where a.id=c.id and c.treatment='B')
;
quit;

Or if there are just two treatments perhaps it is easier to use GROUP BY instead?

proc sql;
create table want2 as
select * from have
group by id
having count(distinct treatment)=2
;
quit;

But if your dataset is really large (and already sorted by ID) then a data step will be faster.

data want3;
  merge have(in=in1)
        have(in=in2 keep=id treatment rename=(treatment=trtA) where=(trtA='A'))
        have(in=in3 keep=id treatment rename=(treatment=trtB) where=(trtB='B'))
 ;
 by id;
 if first.id then selected = (in1 and in2 and in3);
 retain selected ;
 if in1 and selected then output;
 drop trtA trtB selected;
run;

Or perhaps a data step with a double DOW loop?

data want4;
  do until (last.id);
    set have;
    by id;
    A=sum(A,treatment='A');    
    B=sum(B,treatment='B');
  end;
  do until (last.id);
    set have;
    by id;
    if A and B then output;
  end;
  drop A B ;
run;

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

You could ask PROC SQL to do exactly what you just described.

proc sql;
create table want as
select a.* from have a
where exists (select id from have b where a.id=b.id and b.treatment='A')
  and exists (select id from have c where a.id=c.id and c.treatment='B')
;
quit;

Or if there are just two treatments perhaps it is easier to use GROUP BY instead?

proc sql;
create table want2 as
select * from have
group by id
having count(distinct treatment)=2
;
quit;

But if your dataset is really large (and already sorted by ID) then a data step will be faster.

data want3;
  merge have(in=in1)
        have(in=in2 keep=id treatment rename=(treatment=trtA) where=(trtA='A'))
        have(in=in3 keep=id treatment rename=(treatment=trtB) where=(trtB='B'))
 ;
 by id;
 if first.id then selected = (in1 and in2 and in3);
 retain selected ;
 if in1 and selected then output;
 drop trtA trtB selected;
run;

Or perhaps a data step with a double DOW loop?

data want4;
  do until (last.id);
    set have;
    by id;
    A=sum(A,treatment='A');    
    B=sum(B,treatment='B');
  end;
  do until (last.id);
    set have;
    by id;
    if A and B then output;
  end;
  drop A B ;
run;
GeorgeBonanza
Quartz | Level 8
Very interesting approaches and all four are different than the way I would've done it. You learn something new every day. Thanks for sharing.
Manhort
Obsidian | Level 7
Thanks @Tom. I found the second method you gave to be very useful.

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1151 views
  • 2 likes
  • 3 in conversation