Hello,
I have a data where some ID have multiple records and some only have one record. I have created a variable for first ischemic stroke and first AFIB [ i don't know how to let this variable be in dates instead of 0,1]. Some records have ischemic stroke and AFIB on a different date while some have both stroke and afib dx on the same date as seen below.
I would like to find records that if date of AFIB comes before the date of Ischemic stroke, That is
If AFIB (dx_date) is less than stroke (d_event) then stroke = “Pre”
If AFIB (dx_date) comes after stroke, then stroke = “Post” ;
Else if both dates are the same then stroke = “same day”.
I tried to code in sas but am not getting it.
data have; input ID $ D_EVENT First_stroke First_afib dx_date ; format d_event dx_date date9.; datalines; 2411312 06APR2010 . 1 06APR2010 2411312 06APR2010 1 . . 2094284 28OCT2016 . 1 28OCT2016 2094284 19FEB2019 1 . . 2303167 23APR2010 . 1 23APR2010 2224656 05JUL2018 . 1 05JUL2018 2224656 20APR2005 1 . . 2188033 02JAN2019 . 1 02JAN2019 2188033 23APR2012 1 . . ;; run; data want; do until(last.id); set have; length stroke 10.; format dx_date date9.; by id; if first_afib then do; dx_date=d_event; end; end; if dx_date < d_event then stroke = 'Pre'; else if dx_date> d_event then stroke='Post'; else stroke='Same day'; run;
This is what I wanted as output
ID | D_EVENT | First_stroke_date | First_AFIB_date | Stroke |
2411312 | 06APR2010 | 06APR2010 | 06APR2010 | Same day |
2094284 | 28OCT2016 | 19FEB2019 | 28OCT2016 | Pre |
2303167 | 23APR2010 | 23APR2010 | 23APR2010 | Same day |
2224656 | 05JUL2018 | 20APR2005 | 05JUL2018 | Post |
Thank you in advance.
1. Your code does not run. Please provide running code => Test it after posting by copying from your post.
2. Unsure what you mean by
i don't know how to let this variable be in dates instead of 0,1]
Maybe you don't need a flag at all? Having a date is enough to identify whether an event took place.
Also unsure why STROKE_DTE
is always populated.
3. This seems to do what you want. It's only valid if there are 2 records at most per ID.
data HAVE(index=(ID));
input ID $ STROKE_DTE STROKE_FLG AFIB_FLG AFIB_DTE ;
format STROKE_DTE AFIB_DTE date9.;
informat STROKE_DTE AFIB_DTE date9.;
datalines;
2411312 06APR2010 . 1 06APR2010
2411312 06APR2010 1 . .
2094284 28OCT2016 . 1 28OCT2016
2094284 19FEB2019 1 . .
2303167 23APR2010 . 1 23APR2010
2224656 05JUL2018 . 1 05JUL2018
2224656 20APR2005 1 . .
2188033 02JAN2019 . 1 02JAN2019
2188033 23APR2012 1 . .
run;
data WANT;
merge HAVE(keep=ID STROKE_DTE STROKE_FLG where=(STROKE_FLG))
HAVE(keep=ID AFIB_DTE AFIB_FLG where=(AFIB_FLG ))
;
by ID;
length TIMING $8;
TIMING = ifc(STROKE_DTE < AFIB_DTE, 'Pre'
,ifc(STROKE_DTE > AFIB_DTE, 'Post'
, 'Same day'));
run;
@ChrisNZ Thank you. If i have a recurrent event for each of the disease. I have modified your table by adding additional roles to each record. How do I flag only the first and second recurrent date? I want to keep the first record and the first recurrent of each disease.
For example, I want to keep 2188033 23APR2012 1 . .
2188033 02JAN2019 . 1 02JAN2019
data HAVE(index=(ID));
input ID $ STROKE_DTE STROKE_FLG AFIB_FLG AFIB_DTE ;
format STROKE_DTE AFIB_DTE date9.;
informat STROKE_DTE AFIB_DTE date9.;
datalines;
2411312 06APR2010 . 1 06APR2010
2411312 10JUN2010 . 1 10JUN2010
2411312 06APR2010 1 . .
2094284 28OCT2016 . 1 28OCT2016
2094284 19FEB2019 1 . .
2303167 23APR2010 . 1 23APR2010
2224656 05JUL2018 . 1 05JUL2018
2224656 20APR2005 1 . .
2188033 23APR2012 1 . .
2188033 02JAN2019 . 1 02JAN2019
2188033 01JUN2019 . 1 01JUN2019
2188033 11JUL2020 . 1 11JUL2020
run;
So you want to keep the first 2 records?
Shouldn't your table be sorted by ID and by STROKE_DTE to ensure these records are always in the expected order?
@ChrisNZ You are right. I am suppose to sort by ID and STROKE_DTE so I can see all order of events.
I have converted the FLG into dates since they are in dates format in my original data. Also I have other variables in the original data that i will like to add so I can find all first diagnosis(dx) and recurrent dx. This is how my current data looks and this is what I want as output.
data HAVE(index=(ID));
input ID $ EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE ;
format EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE date9.;
informat EVENT_DTE STROKE_DTE R_STROKE_DTE AFIB_DTE R_AFIB_DTE DEMENTIA_DTE R_DEMENTIA_DTE PFO_DTE R_PFO_DTE SEIZURE_DTE R_SEIZURE_DTE date9.;
datalines;
2411312 05APR2010 05APR2010 . . . . . . . . .
2411312 06APR2010 . 06APR2010 . . . . . . . .
2411312 07APR2010 . 07APR2010 . . . . . . . .
2411312 08MAY2010 . 08MAY2010 . . . . . . . .
2411312 10JUN2010 . 10JUN2010 . . . . . . . .
2094284 28OCT2016 . . 28OCT2016 . . . . . . .
2094284 19FEB2019 . . . 19FEB2019 . . . . . .
2094284 23MAR2019 . . . 23MAR2019 . . . . . .
2094284 19APR2019 . . . 19APR2019 . . . . . .
2094284 29MAY2019 . . . 29MAY2019 . . . . . .
2094284 10JUL2019 . . . 10JUL2019 . . . . . .
2094284 19SEP2019 . . . 19SEP2019 . . . . . .
2094284 10FEB2020 . . . 10FEB2020 . . . . . .
2303167 23APR2010 . . . . 23APR2010 . . . . .
2224656 01APR2005 01APR2005 . . . . . . . . .
2224656 05JUL2006 . . . . . . 05JUL2006 . . .
2224656 15JUL2006 . . . . . . . 15JUL2006 . .
2224656 20APR2015 . . . . . . . 20APR2015 . .
2224656 29APR2015 . . . . . . . 29APR2015 . .
2188033 02JAN2019 02JAN2019 . . . . . . . . .
2188033 23APR2012 . 23APR2012 . . . . . . . .
2188033 23APR2013 . . . . . . . . 23APR2013 .
2188033 25MAY2013 . . . . . . . . . 25MAY2013
;
run;
OUTPUT: I will like all first diagnosis (dx) and recurrent dx to be in one row per ID.
For example
ID | STROKE_DTE | R_STROKE_DTE 1 | R_STROKE_DTE 2 | R_STROKE_DTE 3 | R_STROKE_DTE 4 | AFIB_DTE | R_AFIB_DTE 1 | R_AFIB_DTE 2 | R_AFIB_DTE 3 | R_AFIB_DTE 4 | R_AFIB_DTE 5 | R_AFIB_DTE 6 | R_AFIB_DTE 7 | DEMENTIA_DTE |
2411312 | 05APR2010 | 06APR2010 | 07APR2010 | 08MAY2010 | 10JUN2010 |
|
|
|
|
|
|
|
|
|
2094284 |
|
|
|
|
| 28OCT2016 | 19FEB2019 | 23MAR2019 | 19APR2019 | 29MAY2019 | 10JUL2019 | 19SEP2019 | 10FEB2020 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
You're moving the goal posts. 😞
There's no timing in your output, and you're transforming a long data set into a wide one with no obvious logic.
I have no idea what you want. Do you?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.