Pre and post diagnosis date

CathyVI · Posted 07-02-2024 11:24 AM

Hello,

I have a data where some ID have multiple records and some only have one record. I have created a variable for first ischemic stroke and first AFIB [ i don't know how to let this variable be in dates instead of 0,1]. Some records have ischemic stroke and AFIB on a different date while some have both stroke and afib dx on the same date as seen below.

I would like to find records that if date of AFIB comes before the date of Ischemic stroke, That is

If AFIB (dx_date) is less than stroke (d_event) then stroke = “Pre”

If AFIB (dx_date) comes after stroke, then stroke = “Post” ;

Else if both dates are the same then stroke = “same day”.

I tried to code in sas but am not getting it.

data have;
input  ID $ D_EVENT First_stroke  First_afib  dx_date ;
format d_event dx_date date9.;
datalines;	
2411312	06APR2010	.	 1	 06APR2010	     
2411312	06APR2010	1	 .	    .	
2094284	28OCT2016	.	 1      28OCT2016
2094284	19FEB2019	1	 .	    .
2303167	23APR2010	.	 1	 23APR2010
2224656	05JUL2018	.	 1	 05JUL2018	
2224656	20APR2005	1	 .	    .
2188033	02JAN2019	.	1	 02JAN2019	
2188033	23APR2012	1	.	    .		
;;
run;

data want;
do until(last.id);
set have;
length stroke 10.;
format dx_date date9.;
by id;
if first_afib then do;
dx_date=d_event;
end;
end;
if dx_date < d_event then stroke = 'Pre';
    else if dx_date> d_event then stroke='Post';
	else stroke='Same day';
run;

This is what I wanted as output

ID	D_EVENT	First_stroke_date	First_AFIB_date	Stroke
2411312	06APR2010	06APR2010	06APR2010	Same day
2094284	28OCT2016	19FEB2019	28OCT2016	Pre
2303167	23APR2010	23APR2010	23APR2010	Same day
2224656	05JUL2018	20APR2005	05JUL2018	Post

Thank you in advance.

ChrisNZ · Posted 07-03-2024 01:00 AM

1. Your code does not run. Please provide running code => Test it after posting by copying from your post.

2. Unsure what you mean by

i don't know how to let this variable be in dates instead of 0,1]

Maybe you don't need a flag at all? Having a date is enough to identify whether an event took place.

Also unsure why STROKE_DTE is always populated.

3. This seems to do what you want. It's only valid if there are 2 records at most per ID.

data HAVE(index=(ID));
  input ID $ STROKE_DTE STROKE_FLG AFIB_FLG AFIB_DTE ;
  format STROKE_DTE AFIB_DTE date9.;
  informat STROKE_DTE AFIB_DTE date9.;
datalines;  
2411312 06APR2010 . 1 06APR2010       
2411312 06APR2010 1 . . 
2094284 28OCT2016 . 1 28OCT2016
2094284 19FEB2019 1 . .
2303167 23APR2010 . 1 23APR2010
2224656 05JUL2018 . 1 05JUL2018  
2224656 20APR2005 1 . .
2188033 02JAN2019 . 1 02JAN2019  
2188033 23APR2012 1 . .   
run;

data WANT;
  merge HAVE(keep=ID STROKE_DTE STROKE_FLG where=(STROKE_FLG))
        HAVE(keep=ID AFIB_DTE   AFIB_FLG   where=(AFIB_FLG  ))
        ;
  by ID;
  length TIMING $8;
  TIMING = ifc(STROKE_DTE < AFIB_DTE, 'Pre'
          ,ifc(STROKE_DTE > AFIB_DTE, 'Post'
          ,                           'Same day'));
run;

High-Performance SAS Coding - Third Edition

CathyVI · Posted 07-03-2024 02:08 PM

@ChrisNZ Thank you. If i have a recurrent event for each of the disease. I have modified your table by adding additional roles to each record. How do I flag only the first and second recurrent date? I want to keep the first record and the first recurrent of each disease.

For example, I want to keep 2188033 23APR2012 1 . .
2188033 02JAN2019 . 1 02JAN2019

data HAVE(index=(ID));
  input ID $ STROKE_DTE STROKE_FLG AFIB_FLG AFIB_DTE ;
  format STROKE_DTE AFIB_DTE date9.;
  informat STROKE_DTE AFIB_DTE date9.;
datalines;  
2411312 06APR2010 . 1 06APR2010 
2411312 10JUN2010 . 1 10JUN2010      
2411312 06APR2010 1 . . 
2094284 28OCT2016 . 1 28OCT2016
2094284 19FEB2019 1 . .
2303167 23APR2010 . 1 23APR2010
2224656 05JUL2018 . 1 05JUL2018  
2224656 20APR2005 1 . .
2188033 23APR2012 1 . .  
2188033 02JAN2019 . 1 02JAN2019 
2188033 01JUN2019 . 1 01JUN2019
2188033 11JUL2020 . 1 11JUL2020  
run;

ChrisNZ · Posted 07-03-2024 07:33 PM

So you want to keep the first 2 records?

Shouldn't your table be sorted by ID and by STROKE_DTE to ensure these records are always in the expected order?

High-Performance SAS Coding - Third Edition

CathyVI · Posted 07-05-2024 02:25 PM

@ChrisNZ You are right. I am suppose to sort by ID and STROKE_DTE so I can see all order of events.

I have converted the FLG into dates since they are in dates format in my original data. Also I have other variables in the original data that i will like to add so I can find all first diagnosis(dx) and recurrent dx. This is how my current data looks and this is what I want as output.

data HAVE(index=(ID));
  input ID $ EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE ;
  format EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE date9.;
  informat EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE date9.;
datalines;  
2411312 05APR2010 05APR2010 . . . . . . . . .
2411312 06APR2010 . 06APR2010 . . . . . . . .
2411312 07APR2010 . 07APR2010 . . . . . . . .
2411312 08MAY2010 . 08MAY2010 . . . . . . . .
2411312 10JUN2010 . 10JUN2010 . . . . . . . .
2094284 28OCT2016 . . 28OCT2016 . . . . . . .
2094284 19FEB2019 . . . 19FEB2019 . . . . . .
2094284 23MAR2019 . . . 23MAR2019 . . . . . .
2094284 19APR2019 . . . 19APR2019 . . . . . .
2094284 29MAY2019 . . . 29MAY2019 . . . . . .
2094284 10JUL2019 . . . 10JUL2019 . . . . . .
2094284 19SEP2019 . . . 19SEP2019 . . . . . .
2094284 10FEB2020 . . . 10FEB2020 . . . . . .
2303167 23APR2010 . . . . 23APR2010 . . . . .
2224656 01APR2005 01APR2005 . . . . . . . . .
2224656 05JUL2006 . . . . . . 05JUL2006 . . .
2224656 15JUL2006 . . . . . . . 15JUL2006 . .
2224656 20APR2015 . . . . . . . 20APR2015 . .
2224656 29APR2015 . . . . . . . 29APR2015 . .
2188033 02JAN2019 02JAN2019 . . . . . . . . .
2188033 23APR2012 . 23APR2012 . . . . . . . .
2188033 23APR2013 . . . . . . . . 23APR2013 .
2188033 25MAY2013 . . . . . . . . . 25MAY2013
;
run;

OUTPUT: I will like all first diagnosis (dx) and recurrent dx to be in one row per ID.

For example

ID

STROKE_DTE

R_STROKE_DTE 1

R_STROKE_DTE 2

R_STROKE_DTE 3

R_STROKE_DTE 4

AFIB_DTE

R_AFIB_DTE 1

R_AFIB_DTE 2

R_AFIB_DTE 3

R_AFIB_DTE 4

R_AFIB_DTE 5

R_AFIB_DTE 6

R_AFIB_DTE 7

DEMENTIA_DTE

2411312

05APR2010

06APR2010

07APR2010

08MAY2010

10JUN2010

2094284

28OCT2016

19FEB2019

23MAR2019

19APR2019

29MAY2019

10JUL2019

19SEP2019

10FEB2020

ChrisNZ · Posted 07-06-2024 02:14 AM

You're moving the goal posts. 😞

There's no timing in your output, and you're transforming a long data set into a wide one with no obvious logic.

I have no idea what you want. Do you?

High-Performance SAS Coding - Third Edition

Pre and post diagnosis date

Re: Pre and post diagnosis date

Re: Pre and post diagnosis date

Re: Pre and post diagnosis date

Re: Pre and post diagnosis date

Re: Pre and post diagnosis date

Pre and post diagnosis date

Re: Pre and post diagnosis date

Re: Pre and post diagnosis date

Re: Pre and post diagnosis date

Re: Pre and post diagnosis date

Re: Pre and post diagnosis date

Registration is open

SAS Training: Just a Click Away