Hi,
I have a dataset consists of patients id, date, disease status. Each patient has multiple observations with different dates. What I want is to only extract the observations WITHIN 30 days WITHIN a patient whenever the disease status is known. Could anybody help me to solve this? Thanks in advance!
Z
Example code:
----------------------------------------------
data have;
infile datalines missover;
input ID date anydtdte10. disease $ ;
datalines;
8657586 03/02/2011
8657586 03/12/2011
8657586 05/10/2011
8600380 08/01/2012 no
8600380 11/19/2012 no
8600380 11/23/2012 no
8600380 12/12/2012
8600380 04/05/2013
9519797 01/02/2013
9519797 02/01/2013
9519797 09/02/2013
9519797 09/12/2013
9519797 09/24/2013 yes
9519797 09/30/2013
9519797 12/03/2013
9519797 02/03/2014
;
run;
proc print data=have;
run;
/* I want this: */
8600380 08/01/2012 no
8600380 11/19/2012 no
8600380 11/23/2012 no
8600380 12/12/2012
9519797 09/02/2013
9519797 09/12/2013
9519797 09/24/2013 yes
9519797 09/30/2013
9519797 12/03/2013
Something like this, although your logic doesn't seem right, I don't see why a couple of your records would come out (you can get the from the code, there are three additional in your output)
proc sql;
create table WANT as
select A.*
from HAVE A
where exists(select distinct ID from HAVE where ID=A.ID and DATE <= A.DATE <= (DATE + 30) and DISEASE is not null);
quit;
Something like this, although your logic doesn't seem right, I don't see why a couple of your records would come out (you can get the from the code, there are three additional in your output)
proc sql;
create table WANT as
select A.*
from HAVE A
where exists(select distinct ID from HAVE where ID=A.ID and DATE <= A.DATE <= (DATE + 30) and DISEASE is not null);
quit;
Hello,
I know that is not the answer because i didn't understood correctly what you meen with 30 days. But......
Maybe it help you
best regards,
/* order the data */
proc sort data=have;
by id date;
run;
/* get the patient ids */
data idS;
set have;
by id date;
format date prev_date date9. days 8.;
retain confirmation 'N' total_days 0;
if first.id then do;
confirmation='N';
total_days=0;
end;
prev_date=lag1(date);
days=intck('DAY',prev_date,date);
if strip(disease) ne '' then confirmation='Y';
if confirmation eq 'Y' then total_days=sum(total_days,days);
*if last.id and confirmation eq 'Y' and total_days ge 30 then output;
*keep id;
run;
data want;
merge ids(in=a) have(in=b);
by id;
if a and b;
run;
This worked, thank you very much! I do want the observations 30 days prior to the known disease status, so i addes "DATE -30' to your code.
Z
Last record for 12/03 is not within 30 days of 09/24. Slightly lengthy version will give you all the rest:
proc sort data=have out=have1;
by id date;
where cmiss(id,disease)<=0;
run;
data have1;
set have1;
by id date;
retain ord;
if first.id then ord=0;
ord=ord+1;
run;
proc transpose data=have1 out=have1t prefix=datex;
by id;
var date;
id ord;
run;
data want;
merge have(in=a) have1t(in=b);
by id;
if a and b;
format date date10.;
run;
data want(drop=i datex: _:);
set want;
array temp{*} datex:;
do i =1 to dim(temp);
if nmiss(temp[i])=0 then do;
if temp[i]-30<=date<=temp[i]+30 then output;
end;
end;
run;
proc sort data=want nodupkey;
by id date disease;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.