Your data structure is confusing.
Shouldn't you have one dataset with the diagnosis dates and a second dataset with the dates you want the test whether or not there were prior diagnoses?
data have;
input ID dx fst_dt :yymmdd. prior_dx inpatient start_dt :yymmdd.;
format fst_dt start_dt yymmdd10.;
cards;
1 1 2017-1-1 1 1 2016-8-1
2 1 2018-1-1 1 0 2016-1-1
2 1 2019-1-1 1 0 2017-1-1
3 1 2017-2-1 0 1 2015-2-1
3 1 2018-3-1 1 0 2018-1-1
3 1 2019-1-2 1 0 2018-12-1
;
data diag ;
set have;
where dx=1;
keep id start_dt inpatient;
run;
data visits;
set have;
keep id fst_dt;
run;
Now the question is to join the two and calculate the count of diagnoses and inpatient diagnoses within the time window. Not sure what measure you are planning for 12 months. Here is code that uses 365 days. Some people standardize to 30 day months so would use 360 days instead. You could also use INTCK() to test for one year or 12 month intervals.
proc sql ;
create table want as
select distinct
a.id
, a.fst_dt
, count(inpatient) as n_dx
, sum( inpatient) as n_ip_dx
, calculated n_dx>1 or calculated n_ip_dx>0 as exclude
from visits a
left join diag b
on a.id = b.id
and (0 < (fst_dt - start_dt) <= 365)
group by a.id, s.fst_dt
;
quit;
So the EXCLUDE flag is what you could use to decide whether or not to include this subject.
Obs ID fst_dt n_dx n_ip_dx exclude
1 1 2017-01-01 1 1 1
2 2 2018-01-01 1 0 0
3 2 2019-01-01 1 0 0
4 3 2017-02-01 2 0 1
5 3 2018-03-01 2 0 1
6 3 2019-01-02 2 0 1
You probably will want to then further reduce this to just the minimum date for each ID.
data cases;
set want;
by id;
where not exclude ;
if first.id;
run;
... View more