I'm working on a health utilization project. I need to figure out how to identify patients using a specific algorithm for:
Patients with a specific (diagnosis) DX that has 2 outpatient vistis or more in any rolling 12 month period. I can do it on a set period (e.g. 2014) but my study is over a 4 year period, so it should be any rolling 12 month period over the 4 years.
Hi. In your post you mentioned ...
" ... admitted once in Jan2012 and once in Mar2013...I'm do not want to count. If Pat_ID #2 had Dx1 or Dx2 admitted once in Dec 2013 and again in Jan2014 ... "
but there was no such data in the posted data. I added data to match that in your description and used the SQL method I posted earlier and the results seem to match your request...
* your data with my additions for ID 1 and ID 2;
data x;
input id (dx1 dx2) (:$5.) date :date9.;
format date mmddyy10.;
datalines;
1 49390 5409 01JAN2012
2 2512 49392 19DEC2013
3 49390 72888 03JUN2014
3 49390 3094 26OCT2013
4 2752 49392 06OCT2013
5 49381 72252 26NOV2013
6 49320 7220 28APR2014
7 49390 55329 19DEC2013
8 49390 48882 08JUL2014
9 490 49392 08DEC2013
1 999 99999 07MAR2013
2 888 88888 01JAN2014
;
* add observation number;
data x;
set x;
nobs = _n_;
run;
proc sql;
create table y (drop=lo hi) as
select a.id, a.date as date_check, a.nobs as nobs_check, a.dx1 as dx1_check, a.dx2 as dx2_check,
b.date, b.nobs, b.dx1, b.dx2,
intnx('month',a.date, -12) as lo, intnx('month',a.date,12) as hi,
catx(' ',min(a.date,b.date),max(a.date,b.date)) as minmax
from x as a, x as b
where a.id = b.id and b.date between calculated lo and calculated hi and a.nobs ^= b.nobs
order id, date_check;
quit;
* get rid of duplicates;
proc sort data=y out=y (drop=minmax) nodupkey;
by id minmax;
run; hi mmddyy10.;
run;
RESULTS ...
id=2
nobs_ dx1_ dx2_
Obs check date_check check check nobs date dx1 dx2
1 2 12/19/2013 2512 49392 12 01/01/2014 888 88888
id=3
nobs_ dx1_ dx2_
Obs check date_check check check nobs date dx1 dx2
2 4 10/26/2013 49390 3094 3 06/03/2014 49390 72888
If you have example data that will be helpful.
This is sample data. The years of the whole data set is 2010-2014. So I need to identify sepcific Pat_ID with 2 or more ADMDATEs in a continuous 12 month period of time. For exampme if Pat_ID #1 had a DX1 or DX2 admitted once in Jan2012 and once in Mar2013...I'm do not want to count. If Pat_ID #2 had Dx1 or Dx2 admitted once in Dec 2013 and again in Jan2014, I do want to count. Thanks!
PAT_ID DX2 DX1 ADMDATE
1 | 49390 0 | 5409 | 05APR2014 |
2 | 2512 | 49392 0 | 19DEC2013 |
3 | 49390 0 | 72888 | 03JUN2014 |
3 | 49390 1 | 3094 | 26OCT2013 |
4 | 2752 | 49392 0 | 06OCT2013 |
5 | 49381 0 | 72252 | 26NOV2013 |
6 | 49320 0 | 7220 | 28APR2014 |
7 | 49390 0 | 55329 | 19DEC2013 |
8 | 49390 0 | 48882 | 08JUL2014 |
9 | 490 | 49392 0 | 08DEC2013 |
Hi. In your post you mentioned ...
" ... admitted once in Jan2012 and once in Mar2013...I'm do not want to count. If Pat_ID #2 had Dx1 or Dx2 admitted once in Dec 2013 and again in Jan2014 ... "
but there was no such data in the posted data. I added data to match that in your description and used the SQL method I posted earlier and the results seem to match your request...
* your data with my additions for ID 1 and ID 2;
data x;
input id (dx1 dx2) (:$5.) date :date9.;
format date mmddyy10.;
datalines;
1 49390 5409 01JAN2012
2 2512 49392 19DEC2013
3 49390 72888 03JUN2014
3 49390 3094 26OCT2013
4 2752 49392 06OCT2013
5 49381 72252 26NOV2013
6 49320 7220 28APR2014
7 49390 55329 19DEC2013
8 49390 48882 08JUL2014
9 490 49392 08DEC2013
1 999 99999 07MAR2013
2 888 88888 01JAN2014
;
* add observation number;
data x;
set x;
nobs = _n_;
run;
proc sql;
create table y (drop=lo hi) as
select a.id, a.date as date_check, a.nobs as nobs_check, a.dx1 as dx1_check, a.dx2 as dx2_check,
b.date, b.nobs, b.dx1, b.dx2,
intnx('month',a.date, -12) as lo, intnx('month',a.date,12) as hi,
catx(' ',min(a.date,b.date),max(a.date,b.date)) as minmax
from x as a, x as b
where a.id = b.id and b.date between calculated lo and calculated hi and a.nobs ^= b.nobs
order id, date_check;
quit;
* get rid of duplicates;
proc sort data=y out=y (drop=minmax) nodupkey;
by id minmax;
run; hi mmddyy10.;
run;
RESULTS ...
id=2
nobs_ dx1_ dx2_
Obs check date_check check check nobs date dx1 dx2
1 2 12/19/2013 2512 49392 12 01/01/2014 888 88888
id=3
nobs_ dx1_ dx2_
Obs check date_check check check nobs date dx1 dx2
2 4 10/26/2013 49390 3094 3 06/03/2014 49390 72888
Thank you!!
Hi @MikeZdeb. A while ago you gave the the solution to a question I have about identifying observations in a rolling 12 month period. Can you help me modify this syntax to identify encounters that have 5 ER visits in any rolling 12 month period (regardless of diagnosis 'dx'). I feel like if should be similar to this syntax, but not quite sure how I should work this. All the observations in the table are ER visits, but I just need those that have 5 or more in any rolling 12 month period (table contains 5 years of ER data).
Thank you!!
Hi, I think this works OK. Maybe you can adapt it to your data. It joins the data with itself and looks for dates with an ID that are with one year. I also did a hash solution and got the same results. Of course it is possible that I'm really zero-for-two, but I think it's OK.
Since original post ...
#1 fixed the INTNX function, should be 12 months not 1 year (forgot about that interval b oundary stuff)
#2 fixed the duplicate output (e.g. nobs 7 matched with nobs 10 duplicated with nobs 10 matched with nobs 7) using the variable MINMAX
* make some data, IDs 1 through 50, 1 to 5 observations per ID, random dates;
data x;
do id=1 to 50;
do j=1 to ceil(5*ranuni(99));
date = '01jan2005'd + ceil(2000*ranuni(99));
nobs+1;
output;
end;
end;
keep id nobs date;
format date mmddyy10.;
run;
* join the data set with itself and look for dates within ID within one year;
proc sql;
create table join (drop=lo hi) as
select a.id, a.date as date_check, a.nobs as nobs_check, b.date, b.nobs,
intnx('month',a.date, -12) as lo, intnx('month',a.date,12) as hi,
catx(' ',min(a.nobs,b.nobs),max(a.nobs,b.nobs)) as minmax
from x as a, x as b
where a.id = b.id and b.date between calculated lo and calculated hi and a.nobs ^= b.nobs;
quit;
proc sort data=y out=y (drop=minmax) nodupkey;
by id minmax;
run;
Here's data set x, IDs 1 through 3 ...
Obs id date nobs
1 1 03/20/2005 1
2 1 03/12/2008 2
3 2 04/24/2006 3
4 2 04/02/2010 4
5 2 02/19/2009 5
6 2 06/29/2008 6
7 3 03/27/2008 7
8 3 08/15/2005 8
9 3 09/24/2009 9
10 3 09/01/2007 10
11 3 07/27/2008 11
Here's data set JOIN for the IDs ...
id=2
nobs_
Obs check date_check date nobs
1 5 02/19/2009 06/29/2008 6
id=3
nobs_
Obs check date_check date nobs
2 10 09/01/2007 07/27/2008 11
3 7 03/27/2008 09/01/2007 10
4 7 03/27/2008 07/27/2008 11
FYI ... the hash version with help from Better Hashing in SAS® 9.2 ...
https://support.sas.com/resources/papers/sgf2008/better-hashing-sas92.pdf
and the HTML screws up a statement, the &colon you see below should be a real colon as in :
data y (keep=id nobs date_check nobs_check date minmax);
retain id date nobs 0;
dcl hash h(dataset : 'x', multidata :'y');
h.definekey('id');
h.definedata('date', 'nobs');
h.definedone();
do until(last);
set x end=last;
date_check = date;
nobs_check = nobs;
lo = intnx('month', date_check, -12);
hi = intnx('month', date_check, 12);
h.find();
minmax = catx(' ',min(nobs,nobs_check),max(nobs,nobs_check));
if date ge lo and date le hi and nobs ne nobs_check then output;
h.has_next(result: r);
do while(r ne 0);
h.find_next();
minmax = catx(' ',min(nobs,nobs_check),max(nobs,nobs_check));
if date ge lo and date le hi and nobs ne nobs_check then output;
h.has_next(result: r);
end;
end;
stop;
format date_check mmddyy10.;
run;
proc sort data=y out=y (drop=minmax) nodupkey;
by id minmax;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.