Selecting specific subjects without samples from list

Reply
N/A
Posts: 0

Selecting specific subjects without samples from list

Hi,
I have the following data:

ID Specimen
001 02
001 06
001 05
001 03
001 04
002 02
002 05
002 06
003 02
003 03
003 04

I would like to pull out ID# 002 because they don't have specimen '03' and '04'. What is the code I use for that?

Thank you!
jbc
N/A
Posts: 1

Re: Selecting specific subjects without samples from list

data b; set a;
where id=002;
proc print;

other

data b; set a;

if id=002;
proc print;
SAS Super FREQ
Posts: 8,645

Re: Selecting specific subjects without samples from list

I interpreted the question a bit differently. I thought what was asked for was an automated way to output ALL the observations for ID 002, because ID 002 did NOT have specimen 3 or 4. So this is what I came up with. Of course, if they already know which ID does not have specimen 3 or 4, then the where/subsetting if is a good choice.

Basically this program shows 2 approaches in one program.

Approach 1) create a duplicate of the original file with a flag variable added that says whether a particular ID has the specimen that you're looking for; or
Approach 2) split the original file into 2 files based on whether the ID has the specimen or not
[pre]
** first, read the data into a file;
** capture the IDs that are OK ids (have the right specimens);
data All_ID wantthese(keep=ID);
retain wantid;
infile cards;
input ID specimen;
orig_obsno = _n_;
** this is the condition you want to base your selection on;
if specimen in (3,4) then do;
output wantthese;
end;
** All_ID is the SAS dataset of ALL the observations;
output All_ID;
format ID z3. specimen z2.;
return;
cards;
001 02
001 06
001 05
001 03
001 04
002 02
002 05
002 06
003 02
003 03
003 04
;
run;

proc sort data=All_ID;
by id;
run;

** get rid of duplicates;
proc sort data=wantthese out=uniqID nodupkey;
by id;
run;

** merge the 2 files together. ;
** Approach 1: ;
** the All_ID_flagged file will be a copy of the original file with a "flag" added;

** Approach 2: ;
** the have_3_4 file will be all the obs for IDs that HAD specimen 3 or specimen 4;
** the pull_out file will be all the obs for the IDs that did NOT have specimen 3 or 4;

data have_3_4 (keep= ID specimen orig_obsno)
pull_out(keep=ID specimen orig_obsno)
All_ID_flagged(keep=ID specimen flag);
length flag $45;
merge All_ID(in=Allfile) uniqID(in=got_3_4);
by id;
if Allfile = 1 then do;
if got_3_4=1 then do;
flag = 'Yes: ID group has specimen 3 or 4';
output All_ID_flagged have_3_4;
end;
else if got_3_4=0 then do;
flag = 'No: ID group does NOT have specimen 3 or 4';
output All_ID_flagged pull_out;
end;
end;
run;

*** Using Approach 1 file;
*** Dealing with 1 file that now has identifying flags;
proc print data=ALL_ID_flagged;
title 'all the obs with their "flag" status';
run;

proc print data=ALL_ID_flagged;
title 'only the obs with flag has Yes';
where flag contains 'Yes:';
run;

proc print data=ALL_ID_flagged;
title 'only the obs with flag has No';
where flag contains 'No:';
run;

*** Using Approach 2 files;
*** Dealing with the obs that have been split into 2 groups;
proc print data=have_3_4;
title 'all obs for IDs that had specimen 03 and 04 ';
run;

proc print data=pull_out;
title 'Pull out all obs for IDs that did NOT have specimen 03 or 04';
run;

[/pre]
Post a Question
Discussion Stats
  • 2 replies
  • 293 views
  • 0 likes
  • 3 in conversation