DATA Step, Macro, Functions and more

count days apart with condition on multiple different events per person

Reply
Contributor
Posts: 36

count days apart with condition on multiple different events per person

Hi everyone, I come across this problem and I hope you could help me with:

 

I am trying to count "How many people received both PCV13 and PPSV23 after they turned 60 years of age, and the days between the two vaccine is at least 60 days apart".

 

I don't care which vaccine was taken the first after they turn 60, only the length of the period in between; and a person may have got the same type of vaccine more than once after they turn 60, so I care the max gap- the earliest date of one vaccine and the latest date of the other

 

I create a sample of what the input data looks like:

person_idbirth_datevaccinevaccine_date
15/1/1940PCV133/1/2013
15/1/1940PCV135/8/1982
15/1/1940PPSV239/7/1989
15/1/1940PPSV234/9/2010
15/1/1940PPSV237/20/2016
211/6/1950PCV136/1/1968
211/6/1950PCV135/2/1986
211/6/1950PPSV238/1/2015
310/6/1955PCV131/19/2013
310/6/1955PPSV232/1/2013
310/6/1955PPSV2312/20/2015
43/12/1990PCV131/6/2005
43/12/1990PPSV232/6/2005

 

My thinking:

- first create a flag if the vaccine is received after they turned 60, then exclude those rows that vaccine_date is earlier than their 60 years birth date anniversary, for example, person 4 in the above example will be filtered out.

- But then I haven't figure out how to use INTCK and other conditions to let SAS calculate the max gap (because I don't care which vaccine they receive first, some people receive PCV13 first, some received PPSV23 first, and some received 2 PPSV23, but only 1 is after 60 days they received PCV13 - all kinds of possibilities)

 

My vision of the output is something like:

person_idbirth_datevaccinevaccine_date60Days_Flag
15/1/1940PCV133/1/20131
15/1/1940PCV135/8/19821
15/1/1940PPSV239/7/19891
15/1/1940PPSV234/9/20101
15/1/1940PPSV237/20/20161
211/6/1950PCV136/1/19680
211/6/1950PCV135/2/19860
211/6/1950PPSV238/1/20150
310/6/1955PCV131/19/20131
310/6/1955PPSV232/1/20131
310/6/1955PPSV2312/20/20151
43/12/1990PCV131/6/20050
43/12/1990PPSV232/6/20050

 

Any idea would be greatly appreciated!

PROC Star
Posts: 311

Re: count days apart with condition on multiple different events per person

Posted in reply to LisaYIN9309

How are you calculating age? My initial poke at this says that ID 3 only has one vaccine after 60, so should not be flagged?

Super Contributor
Super Contributor
Posts: 265

Re: count days apart with condition on multiple different events per person

[ Edited ]
Posted in reply to LisaYIN9309

So as long as they are 60, it doesn't matter how many vaccines they have had in total, how many of each kind, or in what order they have had them, all you really care about is did they have one of each type more than 60 days apart?

 

Edit:  I posted an answer but it was wrong. I neglected to account for only vaccines after ago 60. Not just 60 year old people but vaccinated after 60.   Try this now.

 

Data vaccine_records;
  input person_id birth_date:mmddyy10. vaccine:$6.	vaccine_date:mmddyy10.;
  format birth_date vaccine_date yymmdd10.;
  datalines;
1	5/1/1940	PCV13	3/1/2013
1	5/1/1940	PCV13	5/8/1982
1	5/1/1940	PPSV23	9/7/1989
1	5/1/1940	PPSV23	4/9/2010
1	5/1/1940	PPSV23	7/20/2016
2	11/6/1950	PCV13	6/1/1968
2	11/6/1950	PCV13	5/2/1986
2	11/6/1950	PPSV23	8/1/2015
3	10/6/1955	PCV13	1/19/2013
3	10/6/1955	PPSV23	2/1/2013
3	10/6/1955	PPSV23	12/20/2015
4	3/12/1990	PCV13	1/6/2005
4	3/12/1990	PPSV23	2/6/2005
5	5/2/1940	PCV13	3/2/2013
5	5/2/1940	PPSV23	4/1/2011
6	3/12/1940	PCV13	2/6/2005
6	3/12/1940	PPSV23	1/6/2005
;
run;

*get only the records where the recipient was over 60 at the time of vaccination;
proc sql;
	create table validvaccination as 
	select *   
	from vaccine_records
	where floor((vaccine_date - birth_date)/365) > 60;
quit;


*find min and max dates for pcv13;
proc sql;
	create table minmaxpcv13 as 
	select person_id, min(vaccine_date) as minpcv13, max(vaccine_date) as maxpcv13  
	from validvaccination
	where vaccine = 'PCV13'
	group by person_id;
quit;

*find min and max dates for ppsv23;
proc sql;
	create table minmaxppsv23 as 
	select person_id, min(vaccine_date) as minppsv23, max(vaccine_date) as maxppsv23  
	from validvaccination
	where vaccine = 'PPSV23'
	group by person_id;
quit;

*find everybody with a 60 day gap between (minpcv13 and min or maxppsv23) or (maxpcv13 and min or maxppsv23); 
proc sql;
	create table sixtydaygap as 
	select a.person_id 
	from minmaxpcv13 a inner join minmaxppsv23 b on a.person_id = b.person_id
	where abs(a.minpcv13 - b.minppsv23) > 60 or abs(a.minpcv13 - b.maxppsv23) > 60  
	or abs(a.maxpcv13 - b.minppsv23) > 60 or abs(a.maxpcv13 - b.maxppsv23) > 60;
quit;

 

 

 

 That picks up 1 and 5.  One and five are old enough and have a gap.  Two and six don't have a gap and three and four aren't old enough. 

PROC Star
Posts: 311

Re: count days apart with condition on multiple different events per person

Posted in reply to LisaYIN9309

So the issue with the calculation of age aside, which might suggest that I don't fully get what you want, this is one possible approach (see notes in code):

 

 

data have;
    input person_id birth_date :mmddyy10. vaccine $ vaccine_date :mmddyy10.;
    format birth_date vaccine_date mmddyy10.;
    age_at_vaccination = yrdif(birth_date, vaccine_date, 'AGE');

    if age_at_vaccination >= 60; /* Toss out any where person was < 60 at vaccination. */

datalines;
1 5/1/1940 PCV13 3/1/2013
1 5/1/1940 PCV13 5/8/1982
1 5/1/1940 PPSV23 9/7/1989
1 5/1/1940 PPSV23 4/9/2010
1 5/1/1940 PPSV23 7/20/2016
2 11/6/1950 PCV13 6/1/1968
2 11/6/1950 PCV13 5/2/1986
2 11/6/1950 PPSV23 8/1/2015
3 10/6/1955 PCV13 1/19/2013
3 10/6/1955 PPSV23 2/1/2013
3 10/6/1955 PPSV23 12/20/2015
4 3/12/1990 PCV13 1/6/2005
4 3/12/1990 PPSV23 2/6/2005
;

proc sql;
    /* General idea: Join the data with itself by person_id
       where the vaccine is different and the date of the right
       table came after the date on the left table.

       This just shows all cases where a different vaccine came
       later and how many days were between them. */
    SELECT DISTINCT aa.person_id, aa.vaccine, aa.vaccine_date, 
          bb.vaccine AS later_type, 
          bb.vaccine_date AS later_date,
          bb.vaccine_date - aa.vaccine_date AS days
    FROM have AS aa
    LEFT JOIN have AS bb
    ON aa.person_id = bb.person_id AND
        aa.vaccine ne bb.vaccine AND
        bb.vaccine_date > aa.vaccine_date
    ORDER BY aa.person_id, aa.vaccine_date;
quit;

proc sql;
    /* To produce just the requested flag, we can
       merge the above on as a subquery that only produces
       the IDs that match the criterion of 60 days. */
    CREATE TABLE want AS
    SELECT a.*, not missing(b.person_id) AS flag
    FROM have AS a
    LEFT JOIN (SELECT DISTINCT aa.person_id
               FROM have AS aa
               INNER JOIN have AS bb
               ON aa.person_id = bb.person_id AND
                  aa.vaccine ne bb.vaccine AND
                  bb.vaccine_date - aa.vaccine_date > 60) AS b
    ON a.person_id = b.person_id;
quit;
Trusted Advisor
Posts: 1,289

Re: count days apart with condition on multiple different events per person

Posted in reply to LisaYIN9309

You apparently want to examine maximum cross-vaccine date ranges for vaccinations taken at age 60 or more, correct?

 

There is only one vaccination in that age range for id 3 (born 10/6/1955, so 60th birthday is on 10/6/2015).  So why have you assigned flag=1 for that id?   After all, only one vaccine was taken on or after that date.

Trusted Advisor
Posts: 1,289

Re: count days apart with condition on multiple different events per person

Posted in reply to LisaYIN9309

The task: For each id, see if the maximum cross-vaccine date range, for vaccinations taken at age 60+, is at least sixty days.  If true then all vaccination records for that id get flag=1.  Other id's get flag=0.

 

This program does that.  It assumes only that the data are sorted by ID (date order within id is not required):

 

data have;
  input person_id birth_date :mmddyy10. vaccine :$6. vaccine_date :mmddyy10.;
  format birth_date vaccine_date mmddyy10.;
datalines;
1 5/1/1940 PCV13 3/1/2013 
1 5/1/1940 PCV13 5/8/1982 
1 5/1/1940 PPSV23 9/7/1989 
1 5/1/1940 PPSV23 4/9/2010 
1 5/1/1940 PPSV23 7/20/2016 
2 11/6/1950 PCV13 6/1/1968 
2 11/6/1950 PCV13 5/2/1986 
2 11/6/1950 PPSV23 8/1/2015 
3 10/6/1955 PCV13 1/19/2013 
3 10/6/1955 PPSV23 2/1/2013 
3 10/6/1955 PPSV23 12/20/2015 
4 3/12/1990 PCV13 1/6/2005 
4 3/12/1990 PPSV23 2/6/2005 
run;

data want (drop=v);
  set have (in=sixty_years_old
            where=(intck('year',birth_date,vaccine_date,'continuous')>=60))
      have (in=keepin);
  by person_id;

  array mindate_ {2} _temporary_;
  array maxdate_ {2} _temporary_;
  retain flag ;


  if first.person_id then do;
    call missing(of mindate_{*}, of maxdate_{*});
    flag=0;
  end;

  if sixty_years_old and flag=0 then do;
    v=findw('PCV13 PPSV23',trim(vaccine),' ','E'); /*PCV13:v=1, PPSV23:v=2*/
    mindate_{v}=min(mindate_{v},vaccine_date);
    maxdate_{v}=max(maxdate_{v},vaccine_date);
    if range(mindate_{1},maxdate_{2})>=60 or 
    range(mindate_{2},maxdate_{1})>=60 then
    flag=1;
  end;

  if keepin;
run;

 

This works because I use SET with a BY id statement, in which the SET statement has two targets: (1) all age 60+ records for a given ID, followed by (2) records for ALL ages for the same id.  This allows preliminary examination of the age 60+ records to determine maximum cross-vaccine date range, establishing the FLAG value.  The flag value is then retained when rereading and outputting all records for the same id.

 

The other "trick" here is to determine the vaccine index (1 for PCV13, 2 for PPSV23) using the findw function (the 'E' modifier tells sas to return the position based on word count, not letter count).

 

Also I use the RANGE function instead of  subtracting mindate_{1} from maxdate_{2} (and vice versa), because that would generate notes when only 1 vaccine has been taken.  The range function returns a missing value in such cases, but no notes.

Ask a Question
Discussion stats
  • 5 replies
  • 131 views
  • 0 likes
  • 4 in conversation