I'm working on a health utilization project. I need to figure out how to identify patients using a specific algorithm for:
Patients with a specific (diagnosis) DX that has 2 outpatient vistis or more in any rolling 12 month period. I can do it on a set period (e.g. 2014) but my study is over a 4 year period, so it should be any rolling 12 month period over the 4 years.
Hi. In your post you mentioned ...
" ... admitted once in Jan2012 and once in Mar2013...I'm do not want to count. If Pat_ID #2 had Dx1 or Dx2 admitted once in Dec 2013 and again in Jan2014 ... "
but there was no such data in the posted data. I added data to match that in your description and used the SQL method I posted earlier and the results seem to match your request...
* your data with my additions for ID 1 and ID 2;
data x;
input id (dx1 dx2) (:$5.) date :date9.;
format date mmddyy10.;
datalines;
1 49390 5409 01JAN2012
2 2512 49392 19DEC2013
3 49390 72888 03JUN2014
3 49390 3094 26OCT2013
4 2752 49392 06OCT2013
5 49381 72252 26NOV2013
6 49320 7220 28APR2014
7 49390 55329 19DEC2013
8 49390 48882 08JUL2014
9 490 49392 08DEC2013
1 999 99999 07MAR2013
2 888 88888 01JAN2014
;
* add observation number;
data x;
set x;
nobs = _n_;
run;
proc sql;
create table y (drop=lo hi) as
select a.id, a.date as date_check, a.nobs as nobs_check, a.dx1 as dx1_check, a.dx2 as dx2_check,
b.date, b.nobs, b.dx1, b.dx2,
intnx('month',a.date, -12) as lo, intnx('month',a.date,12) as hi,
catx(' ',min(a.date,b.date),max(a.date,b.date)) as minmax
from x as a, x as b
where a.id = b.id and b.date between calculated lo and calculated hi and a.nobs ^= b.nobs
order id, date_check;
quit;
* get rid of duplicates;
proc sort data=y out=y (drop=minmax) nodupkey;
by id minmax;
run; hi mmddyy10.;
run;
RESULTS ...
id=2
nobs_ dx1_ dx2_
Obs check date_check check check nobs date dx1 dx2
1 2 12/19/2013 2512 49392 12 01/01/2014 888 88888
id=3
nobs_ dx1_ dx2_
Obs check date_check check check nobs date dx1 dx2
2 4 10/26/2013 49390 3094 3 06/03/2014 49390 72888
If you have example data that will be helpful.
This is sample data. The years of the whole data set is 2010-2014. So I need to identify sepcific Pat_ID with 2 or more ADMDATEs in a continuous 12 month period of time. For exampme if Pat_ID #1 had a DX1 or DX2 admitted once in Jan2012 and once in Mar2013...I'm do not want to count. If Pat_ID #2 had Dx1 or Dx2 admitted once in Dec 2013 and again in Jan2014, I do want to count. Thanks!
PAT_ID DX2 DX1 ADMDATE
1 | 49390 0 | 5409 | 05APR2014 |
2 | 2512 | 49392 0 | 19DEC2013 |
3 | 49390 0 | 72888 | 03JUN2014 |
3 | 49390 1 | 3094 | 26OCT2013 |
4 | 2752 | 49392 0 | 06OCT2013 |
5 | 49381 0 | 72252 | 26NOV2013 |
6 | 49320 0 | 7220 | 28APR2014 |
7 | 49390 0 | 55329 | 19DEC2013 |
8 | 49390 0 | 48882 | 08JUL2014 |
9 | 490 | 49392 0 | 08DEC2013 |
Hi. In your post you mentioned ...
" ... admitted once in Jan2012 and once in Mar2013...I'm do not want to count. If Pat_ID #2 had Dx1 or Dx2 admitted once in Dec 2013 and again in Jan2014 ... "
but there was no such data in the posted data. I added data to match that in your description and used the SQL method I posted earlier and the results seem to match your request...
* your data with my additions for ID 1 and ID 2;
data x;
input id (dx1 dx2) (:$5.) date :date9.;
format date mmddyy10.;
datalines;
1 49390 5409 01JAN2012
2 2512 49392 19DEC2013
3 49390 72888 03JUN2014
3 49390 3094 26OCT2013
4 2752 49392 06OCT2013
5 49381 72252 26NOV2013
6 49320 7220 28APR2014
7 49390 55329 19DEC2013
8 49390 48882 08JUL2014
9 490 49392 08DEC2013
1 999 99999 07MAR2013
2 888 88888 01JAN2014
;
* add observation number;
data x;
set x;
nobs = _n_;
run;
proc sql;
create table y (drop=lo hi) as
select a.id, a.date as date_check, a.nobs as nobs_check, a.dx1 as dx1_check, a.dx2 as dx2_check,
b.date, b.nobs, b.dx1, b.dx2,
intnx('month',a.date, -12) as lo, intnx('month',a.date,12) as hi,
catx(' ',min(a.date,b.date),max(a.date,b.date)) as minmax
from x as a, x as b
where a.id = b.id and b.date between calculated lo and calculated hi and a.nobs ^= b.nobs
order id, date_check;
quit;
* get rid of duplicates;
proc sort data=y out=y (drop=minmax) nodupkey;
by id minmax;
run; hi mmddyy10.;
run;
RESULTS ...
id=2
nobs_ dx1_ dx2_
Obs check date_check check check nobs date dx1 dx2
1 2 12/19/2013 2512 49392 12 01/01/2014 888 88888
id=3
nobs_ dx1_ dx2_
Obs check date_check check check nobs date dx1 dx2
2 4 10/26/2013 49390 3094 3 06/03/2014 49390 72888
Thank you!!
Hi @MikeZdeb. A while ago you gave the the solution to a question I have about identifying observations in a rolling 12 month period. Can you help me modify this syntax to identify encounters that have 5 ER visits in any rolling 12 month period (regardless of diagnosis 'dx'). I feel like if should be similar to this syntax, but not quite sure how I should work this. All the observations in the table are ER visits, but I just need those that have 5 or more in any rolling 12 month period (table contains 5 years of ER data).
Thank you!!
Hi, I think this works OK. Maybe you can adapt it to your data. It joins the data with itself and looks for dates with an ID that are with one year. I also did a hash solution and got the same results. Of course it is possible that I'm really zero-for-two, but I think it's OK.
Since original post ...
#1 fixed the INTNX function, should be 12 months not 1 year (forgot about that interval b oundary stuff)
#2 fixed the duplicate output (e.g. nobs 7 matched with nobs 10 duplicated with nobs 10 matched with nobs 7) using the variable MINMAX
* make some data, IDs 1 through 50, 1 to 5 observations per ID, random dates;
data x;
do id=1 to 50;
do j=1 to ceil(5*ranuni(99));
date = '01jan2005'd + ceil(2000*ranuni(99));
nobs+1;
output;
end;
end;
keep id nobs date;
format date mmddyy10.;
run;
* join the data set with itself and look for dates within ID within one year;
proc sql;
create table join (drop=lo hi) as
select a.id, a.date as date_check, a.nobs as nobs_check, b.date, b.nobs,
intnx('month',a.date, -12) as lo, intnx('month',a.date,12) as hi,
catx(' ',min(a.nobs,b.nobs),max(a.nobs,b.nobs)) as minmax
from x as a, x as b
where a.id = b.id and b.date between calculated lo and calculated hi and a.nobs ^= b.nobs;
quit;
proc sort data=y out=y (drop=minmax) nodupkey;
by id minmax;
run;
Here's data set x, IDs 1 through 3 ...
Obs id date nobs
1 1 03/20/2005 1
2 1 03/12/2008 2
3 2 04/24/2006 3
4 2 04/02/2010 4
5 2 02/19/2009 5
6 2 06/29/2008 6
7 3 03/27/2008 7
8 3 08/15/2005 8
9 3 09/24/2009 9
10 3 09/01/2007 10
11 3 07/27/2008 11
Here's data set JOIN for the IDs ...
id=2
nobs_
Obs check date_check date nobs
1 5 02/19/2009 06/29/2008 6
id=3
nobs_
Obs check date_check date nobs
2 10 09/01/2007 07/27/2008 11
3 7 03/27/2008 09/01/2007 10
4 7 03/27/2008 07/27/2008 11
FYI ... the hash version with help from Better Hashing in SAS® 9.2 ...
https://support.sas.com/resources/papers/sgf2008/better-hashing-sas92.pdf
and the HTML screws up a statement, the &colon you see below should be a real colon as in :
data y (keep=id nobs date_check nobs_check date minmax);
retain id date nobs 0;
dcl hash h(dataset : 'x', multidata :'y');
h.definekey('id');
h.definedata('date', 'nobs');
h.definedone();
do until(last);
set x end=last;
date_check = date;
nobs_check = nobs;
lo = intnx('month', date_check, -12);
hi = intnx('month', date_check, 12);
h.find();
minmax = catx(' ',min(nobs,nobs_check),max(nobs,nobs_check));
if date ge lo and date le hi and nobs ne nobs_check then output;
h.has_next(result: r);
do while(r ne 0);
h.find_next();
minmax = catx(' ',min(nobs,nobs_check),max(nobs,nobs_check));
if date ge lo and date le hi and nobs ne nobs_check then output;
h.has_next(result: r);
end;
end;
stop;
format date_check mmddyy10.;
run;
proc sort data=y out=y (drop=minmax) nodupkey;
by id minmax;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.