Hi,
I have a dataset consists of patients id, date, disease status. Each patient has multiple observations with different dates. What I want is to only extract the observations WITHIN 30 days WITHIN a patient whenever the disease status is known. Could anybody help me to solve this? Thanks in advance!
Z
Example code:
----------------------------------------------
data have;
infile datalines missover;
input ID date anydtdte10. disease $ ;
datalines;
8657586 03/02/2011
8657586 03/12/2011
8657586 05/10/2011
8600380 08/01/2012 no
8600380 11/19/2012 no
8600380 11/23/2012 no
8600380 12/12/2012
8600380 04/05/2013
9519797 01/02/2013
9519797 02/01/2013
9519797 09/02/2013
9519797 09/12/2013
9519797 09/24/2013 yes
9519797 09/30/2013
9519797 12/03/2013
9519797 02/03/2014
;
run;
proc print data=have;
run;
/* I want this: */
8600380 08/01/2012 no
8600380 11/19/2012 no
8600380 11/23/2012 no
8600380 12/12/2012
9519797 09/02/2013
9519797 09/12/2013
9519797 09/24/2013 yes
9519797 09/30/2013
9519797 12/03/2013
Something like this, although your logic doesn't seem right, I don't see why a couple of your records would come out (you can get the from the code, there are three additional in your output)
proc sql;
create table WANT as
select A.*
from HAVE A
where exists(select distinct ID from HAVE where ID=A.ID and DATE <= A.DATE <= (DATE + 30) and DISEASE is not null);
quit;
Something like this, although your logic doesn't seem right, I don't see why a couple of your records would come out (you can get the from the code, there are three additional in your output)
proc sql;
create table WANT as
select A.*
from HAVE A
where exists(select distinct ID from HAVE where ID=A.ID and DATE <= A.DATE <= (DATE + 30) and DISEASE is not null);
quit;
Hello,
I know that is not the answer because i didn't understood correctly what you meen with 30 days. But......
Maybe it help you
best regards,
/* order the data */
proc sort data=have;
by id date;
run;
/* get the patient ids */
data idS;
set have;
by id date;
format date prev_date date9. days 8.;
retain confirmation 'N' total_days 0;
if first.id then do;
confirmation='N';
total_days=0;
end;
prev_date=lag1(date);
days=intck('DAY',prev_date,date);
if strip(disease) ne '' then confirmation='Y';
if confirmation eq 'Y' then total_days=sum(total_days,days);
*if last.id and confirmation eq 'Y' and total_days ge 30 then output;
*keep id;
run;
data want;
merge ids(in=a) have(in=b);
by id;
if a and b;
run;
This worked, thank you very much! I do want the observations 30 days prior to the known disease status, so i addes "DATE -30' to your code.
Z
Last record for 12/03 is not within 30 days of 09/24. Slightly lengthy version will give you all the rest:
proc sort data=have out=have1;
by id date;
where cmiss(id,disease)<=0;
run;
data have1;
set have1;
by id date;
retain ord;
if first.id then ord=0;
ord=ord+1;
run;
proc transpose data=have1 out=have1t prefix=datex;
by id;
var date;
id ord;
run;
data want;
merge have(in=a) have1t(in=b);
by id;
if a and b;
format date date10.;
run;
data want(drop=i datex: _:);
set want;
array temp{*} datex:;
do i =1 to dim(temp);
if nmiss(temp[i])=0 then do;
if temp[i]-30<=date<=temp[i]+30 then output;
end;
end;
run;
proc sort data=want nodupkey;
by id date disease;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.