Hello SAS Folks,
i have a data set having multiple subj Id , where few subjects have multiple visits and few have only a single visit. Let say, the Day 0 is baseline visit and following visits are post baseline .
i need to select only subjects who have at least one post baseline visit and to eliminate subjects who have no post baseline . But here i have one subject who have post baseline visits but no data is recorded for those post baseline . i also need to eliminate these subjects who have post baseline visit but where no data is recorded.
However if any subject having at least one non missing post baseline data , we have to select its all observation regardless weather it is missing or not.
i have attached the data ( sample) for your reference.
Many thanks in advance.
regards
Ved
_USUBJID _VISNAME _PAGNAME VSBPSYS
01/001 DAY0 PAGE007 130
01/001 WEEK01 PAGE010
01/001 WEEK02 PAGE013
01/001 WEEK04 PAGE016
01/001 WEEK08 PAGE018
01/001 WEEK12 PAGE020
01/004 DAY0 PAGE007 130
01/004 WEEK01 PAGE010 130
01/004 WEEK02 PAGE013 149
01/004 WEEK04 PAGE016 144
02/003 DAY0 PAGE007 122
03/010 DAY0 PAGE007 129
03/010 WEEK01 PAGE010
03/010 WEEK02 PAGE013 135
so as per above , i need subjid 01/004 and 03/010.
I am using following code. its selecting few but not all as desired.
data vs_Rand1(keep=_usubjid _visname vsbpsys x _pagname);
set vs_Rand;
by _usubjid;
if (first._usubjid and last._usubjid) or (not(first._usubjid or last._usubjid) and lag(vsbpsys) eq .)
then x=1;
run;
You can use exists() in SQL:
proc sql;
create table WANT as
select A.*
from HAVE A
where exists(select distinct USUBJID from HAVE where USUBJID=A.USUBJID and VISIT=0)
and exists(select distinct USUBJID from HAVE where USUBJID=A.USUBJID and VISIT > 0 and SOMEDATA is not null);
quit;
If I understood what you mean.
data have;
infile cards truncover;
input (_USUBJID _VISNAME _PAGNAME VSBPSYS) ($);
cards;
01/001 DAY0 PAGE007 130
01/001 WEEK01 PAGE010
01/001 WEEK02 PAGE013
01/001 WEEK04 PAGE016
01/001 WEEK08 PAGE018
01/001 WEEK12 PAGE020
01/004 DAY0 PAGE007 130
01/004 WEEK01 PAGE010 130
01/004 WEEK02 PAGE013 149
01/004 WEEK04 PAGE016 144
02/003 DAY0 PAGE007 122
03/010 DAY0 PAGE007 129
03/010 WEEK01 PAGE010
03/010 WEEK02 PAGE013 135
;
run;
data want;
do until(last._USUBJID );
set have;
by _USUBJID ;
if _VISNAME =: 'WEEK' then do;
has_week=1;
if not missing(VSBPSYS) then has_value=1;
end;
end;
do until(last._USUBJID );
set have;
by _USUBJID ;
if has_week=1 and has_value=1 then output;
end;
drop has_week has_value;
run;
Sorry. I have no time to do that.
It is DOW skill. You can search it at this forum or at support.sas.com
You would get tons of paper or post to talk about it .
Xia Keshan
xia keshan : Thanks for your valuable time. well in addition i would like to ask what if visit values are not 'WEEK' or just to say lets add few more visits say 'EOS' OR 'LOGS' in above data. How to select obs in that case.
many thanks in advance.
regards
ved
That will output matched group with 'EOS' OR 'LOGS' .
If you are certain your baseline is always there and have a VSBPSYS, you can simply count it. Otherwise, it needs small tweak. (Raw input is credited to )
data have;
infile cards truncover;
input (_USUBJID _VISNAME _PAGNAME VSBPSYS) ($);
cards;
01/001 DAY0 PAGE007 130
01/001 WEEK01 PAGE010
01/001 WEEK02 PAGE013
01/001 WEEK04 PAGE016
01/001 WEEK08 PAGE018
01/001 WEEK12 PAGE020
01/004 DAY0 PAGE007 130
01/004 WEEK01 PAGE010 130
01/004 WEEK02 PAGE013 149
01/004 WEEK04 PAGE016 144
02/003 DAY0 PAGE007 122
03/010 DAY0 PAGE007 129
03/010 WEEK01 PAGE010
03/010 WEEK02 PAGE013 135
;
run;
proc sql;
create table want as
select * from have
group by _USUBJID
having count(VSBPSYS) > 1;
quit;
No , some times baseline visit is there but doesn't have the vsbpsys value . no all such cases where we have missing baseline value of vsbpsys , should not be included and must be eliminated from the dataset.( along with its subsequent missing post baseline values).
Then a small tweak would do,
proc sql;
create table want as
select * from have
group by _USUBJID
having sum(_VISNAME='DAY0' AND NOT MISSING(VSBPSYS))=1 AND count(VSBPSYS) > 1;
quit;
since you ask for it here is a simpler code:
data have2 have3;
set have;
by _usubjid;
where nmiss(vbspsys);
if first.usubjid+last.usubjid=2 then output have3;
else output have2;
run;
this will remove all records with missing results.
we don't need to remove missing records. records should be removed only for subjects, who don't have any post baseline visits.
Ok Try this:
proc sort data have;
by id descending res vis page
run;
data have;
set have;
by id descending res vis page;
retain count;
if first.id then count=0;
if not missing(res) and vis^="DAY0" then count=count+1;
if count=0 and vis^="DAY0" then delete; **** remove post baseline records if all are misisng;
run;
data have;
set have;
by id;
if first.id+last.id=2 then delete; *** remove baseline record; if no post baseline;
run;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.