Hello,
I have a data where some ID have multiple records and some only have one record. I have created a variable for first ischemic stroke and first AFIB [ i don't know how to let this variable be in dates instead of 0,1]. Some records have ischemic stroke and AFIB on a different date while some have both stroke and afib dx on the same date as seen below.
I would like to find records that if date of AFIB comes before the date of Ischemic stroke, That is
If AFIB (dx_date) is less than stroke (d_event) then stroke = “Pre”
If AFIB (dx_date) comes after stroke, then stroke = “Post” ;
Else if both dates are the same then stroke = “same day”.
I tried to code in sas but am not getting it.
data have; input ID $ D_EVENT First_stroke First_afib dx_date ; format d_event dx_date date9.; datalines; 2411312 06APR2010 . 1 06APR2010 2411312 06APR2010 1 . . 2094284 28OCT2016 . 1 28OCT2016 2094284 19FEB2019 1 . . 2303167 23APR2010 . 1 23APR2010 2224656 05JUL2018 . 1 05JUL2018 2224656 20APR2005 1 . . 2188033 02JAN2019 . 1 02JAN2019 2188033 23APR2012 1 . . ;; run; data want; do until(last.id); set have; length stroke 10.; format dx_date date9.; by id; if first_afib then do; dx_date=d_event; end; end; if dx_date < d_event then stroke = 'Pre'; else if dx_date> d_event then stroke='Post'; else stroke='Same day'; run;
This is what I wanted as output
ID | D_EVENT | First_stroke_date | First_AFIB_date | Stroke |
2411312 | 06APR2010 | 06APR2010 | 06APR2010 | Same day |
2094284 | 28OCT2016 | 19FEB2019 | 28OCT2016 | Pre |
2303167 | 23APR2010 | 23APR2010 | 23APR2010 | Same day |
2224656 | 05JUL2018 | 20APR2005 | 05JUL2018 | Post |
Thank you in advance.
1. Your code does not run. Please provide running code => Test it after posting by copying from your post.
2. Unsure what you mean by
i don't know how to let this variable be in dates instead of 0,1]
Maybe you don't need a flag at all? Having a date is enough to identify whether an event took place.
Also unsure why STROKE_DTE
is always populated.
3. This seems to do what you want. It's only valid if there are 2 records at most per ID.
data HAVE(index=(ID));
input ID $ STROKE_DTE STROKE_FLG AFIB_FLG AFIB_DTE ;
format STROKE_DTE AFIB_DTE date9.;
informat STROKE_DTE AFIB_DTE date9.;
datalines;
2411312 06APR2010 . 1 06APR2010
2411312 06APR2010 1 . .
2094284 28OCT2016 . 1 28OCT2016
2094284 19FEB2019 1 . .
2303167 23APR2010 . 1 23APR2010
2224656 05JUL2018 . 1 05JUL2018
2224656 20APR2005 1 . .
2188033 02JAN2019 . 1 02JAN2019
2188033 23APR2012 1 . .
run;
data WANT;
merge HAVE(keep=ID STROKE_DTE STROKE_FLG where=(STROKE_FLG))
HAVE(keep=ID AFIB_DTE AFIB_FLG where=(AFIB_FLG ))
;
by ID;
length TIMING $8;
TIMING = ifc(STROKE_DTE < AFIB_DTE, 'Pre'
,ifc(STROKE_DTE > AFIB_DTE, 'Post'
, 'Same day'));
run;
@ChrisNZ Thank you. If i have a recurrent event for each of the disease. I have modified your table by adding additional roles to each record. How do I flag only the first and second recurrent date? I want to keep the first record and the first recurrent of each disease.
For example, I want to keep 2188033 23APR2012 1 . .
2188033 02JAN2019 . 1 02JAN2019
data HAVE(index=(ID));
input ID $ STROKE_DTE STROKE_FLG AFIB_FLG AFIB_DTE ;
format STROKE_DTE AFIB_DTE date9.;
informat STROKE_DTE AFIB_DTE date9.;
datalines;
2411312 06APR2010 . 1 06APR2010
2411312 10JUN2010 . 1 10JUN2010
2411312 06APR2010 1 . .
2094284 28OCT2016 . 1 28OCT2016
2094284 19FEB2019 1 . .
2303167 23APR2010 . 1 23APR2010
2224656 05JUL2018 . 1 05JUL2018
2224656 20APR2005 1 . .
2188033 23APR2012 1 . .
2188033 02JAN2019 . 1 02JAN2019
2188033 01JUN2019 . 1 01JUN2019
2188033 11JUL2020 . 1 11JUL2020
run;
So you want to keep the first 2 records?
Shouldn't your table be sorted by ID and by STROKE_DTE to ensure these records are always in the expected order?
@ChrisNZ You are right. I am suppose to sort by ID and STROKE_DTE so I can see all order of events.
I have converted the FLG into dates since they are in dates format in my original data. Also I have other variables in the original data that i will like to add so I can find all first diagnosis(dx) and recurrent dx. This is how my current data looks and this is what I want as output.
data HAVE(index=(ID));
input ID $ EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE ;
format EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE date9.;
informat EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE date9.;
datalines;
2411312 05APR2010 05APR2010 . . . . . . . . .
2411312 06APR2010 . 06APR2010 . . . . . . . .
2411312 07APR2010 . 07APR2010 . . . . . . . .
2411312 08MAY2010 . 08MAY2010 . . . . . . . .
2411312 10JUN2010 . 10JUN2010 . . . . . . . .
2094284 28OCT2016 . . 28OCT2016 . . . . . . .
2094284 19FEB2019 . . . 19FEB2019 . . . . . .
2094284 23MAR2019 . . . 23MAR2019 . . . . . .
2094284 19APR2019 . . . 19APR2019 . . . . . .
2094284 29MAY2019 . . . 29MAY2019 . . . . . .
2094284 10JUL2019 . . . 10JUL2019 . . . . . .
2094284 19SEP2019 . . . 19SEP2019 . . . . . .
2094284 10FEB2020 . . . 10FEB2020 . . . . . .
2303167 23APR2010 . . . . 23APR2010 . . . . .
2224656 01APR2005 01APR2005 . . . . . . . . .
2224656 05JUL2006 . . . . . . 05JUL2006 . . .
2224656 15JUL2006 . . . . . . . 15JUL2006 . .
2224656 20APR2015 . . . . . . . 20APR2015 . .
2224656 29APR2015 . . . . . . . 29APR2015 . .
2188033 02JAN2019 02JAN2019 . . . . . . . . .
2188033 23APR2012 . 23APR2012 . . . . . . . .
2188033 23APR2013 . . . . . . . . 23APR2013 .
2188033 25MAY2013 . . . . . . . . . 25MAY2013
;
run;
OUTPUT: I will like all first diagnosis (dx) and recurrent dx to be in one row per ID.
For example
ID | STROKE_DTE | R_STROKE_DTE 1 | R_STROKE_DTE 2 | R_STROKE_DTE 3 | R_STROKE_DTE 4 | AFIB_DTE | R_AFIB_DTE 1 | R_AFIB_DTE 2 | R_AFIB_DTE 3 | R_AFIB_DTE 4 | R_AFIB_DTE 5 | R_AFIB_DTE 6 | R_AFIB_DTE 7 | DEMENTIA_DTE |
2411312 | 05APR2010 | 06APR2010 | 07APR2010 | 08MAY2010 | 10JUN2010 |
|
|
|
|
|
|
|
|
|
2094284 |
|
|
|
|
| 28OCT2016 | 19FEB2019 | 23MAR2019 | 19APR2019 | 29MAY2019 | 10JUL2019 | 19SEP2019 | 10FEB2020 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
You're moving the goal posts. 😞
There's no timing in your output, and you're transforming a long data set into a wide one with no obvious logic.
I have no idea what you want. Do you?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.