The task: For each id, see if the maximum cross-vaccine date range, for vaccinations taken at age 60+, is at least sixty days. If true then all vaccination records for that id get flag=1. Other id's get flag=0.
This program does that. It assumes only that the data are sorted by ID (date order within id is not required):
data have;
input person_id birth_date :mmddyy10. vaccine :$6. vaccine_date :mmddyy10.;
format birth_date vaccine_date mmddyy10.;
datalines;
1 5/1/1940 PCV13 3/1/2013
1 5/1/1940 PCV13 5/8/1982
1 5/1/1940 PPSV23 9/7/1989
1 5/1/1940 PPSV23 4/9/2010
1 5/1/1940 PPSV23 7/20/2016
2 11/6/1950 PCV13 6/1/1968
2 11/6/1950 PCV13 5/2/1986
2 11/6/1950 PPSV23 8/1/2015
3 10/6/1955 PCV13 1/19/2013
3 10/6/1955 PPSV23 2/1/2013
3 10/6/1955 PPSV23 12/20/2015
4 3/12/1990 PCV13 1/6/2005
4 3/12/1990 PPSV23 2/6/2005
run;
data want (drop=v);
set have (in=sixty_years_old
where=(intck('year',birth_date,vaccine_date,'continuous')>=60))
have (in=keepin);
by person_id;
array mindate_ {2} _temporary_;
array maxdate_ {2} _temporary_;
retain flag ;
if first.person_id then do;
call missing(of mindate_{*}, of maxdate_{*});
flag=0;
end;
if sixty_years_old and flag=0 then do;
v=findw('PCV13 PPSV23',trim(vaccine),' ','E'); /*PCV13:v=1, PPSV23:v=2*/
mindate_{v}=min(mindate_{v},vaccine_date);
maxdate_{v}=max(maxdate_{v},vaccine_date);
if range(mindate_{1},maxdate_{2})>=60 or
range(mindate_{2},maxdate_{1})>=60 then
flag=1;
end;
if keepin;
run;
This works because I use SET with a BY id statement, in which the SET statement has two targets: (1) all age 60+ records for a given ID, followed by (2) records for ALL ages for the same id. This allows preliminary examination of the age 60+ records to determine maximum cross-vaccine date range, establishing the FLAG value. The flag value is then retained when rereading and outputting all records for the same id.
The other "trick" here is to determine the vaccine index (1 for PCV13, 2 for PPSV23) using the findw function (the 'E' modifier tells sas to return the position based on word count, not letter count).
Also I use the RANGE function instead of subtracting mindate_{1} from maxdate_{2} (and vice versa), because that would generate notes when only 1 vaccine has been taken. The range function returns a missing value in such cases, but no notes.
... View more