Hello,
I'm relatively new to SAS and having some trouble with coding for a variable that I want. I have a dataset for patients, with each line representing an admission. I have created a new variable that indicates an admission when they had a surgery, coined "Index Admission". I am looking to find out if they have been readmitted after that date, specifically within 30-days or 1-year (I imagine the code will be similar for both instances).
My current dataset looks like :
data patients; input patient admission adm_date index; datalines; 1 1 01/12/2013 1 1 2 06/24/2013 0 2 1 02/10/2013 0 2 2 01/05/2014 1 2 3 03/06/2014 0 3 1 02/11/2011 1 3 2 01/12/2012 0 4 1 03/21/2010 0 4 2 04/06/2010 0 4 3 09/05/2015 1 4 4 09/11/2016 0 ;
And I hope to have output that adds a variable for readmission in 1-year:
data patients; input patient admission adm_date index readm_1y; datalines; 1 1 01/12/2013 1 0 1 2 06/24/2013 0 1 1 3 08/12/2013 0 1 2 1 02/10/2013 0 0 2 2 01/05/2014 1 0 2 3 03/06/2014 0 1 3 1 02/11/2011 1 0 3 2 01/12/2012 0 1 4 1 03/21/2010 0 0 4 2 04/06/2010 0 0 4 3 09/05/2015 1 0 4 4 09/11/2016 0 1 ;
Please let me know if my question isn't clear. Thank you in advance!
proc sort data = patients;
by patient admission;
run;
data patients2(drop = index_date);
format index_date e8601da.;
set patients;
by patient admission;
retain index_date;
if first.patient then index_date=.; /*set index_date to missing when you encounter new patient ID*/
if index=1 then index_date=adm_date; /*carry forward index_date to other patient ID records within same ID*/
if index=0 and adm_date <= index_date + 365 then readm_1y=1;
*if readmitted within 1 year then set readm_1y to 1, else set it to 0;
else readm_1y=0;
proc print;
run;
I do not believe your last row is correct in your desired output. Readmission within 1 year of 09/05/2015 is 09/05/2016. 09/11/2016 would be outside of that window.
Does each patient have only a single index admission?
If so then that is easy. Make a dataset with PATIENT and INDEX_DATE and merge it with the existing dataset and then subtract the INDEX_DATE from the ADM_DATE and see if how many days difference there is.
If you want to treat every admission as a new patient/index_date pair then you will need to use SQL join instead so that you can do a many to many match.
You have honored the "form" of providing sample data in a DATA step. Thank you for that.
But the code you provide does not honor the actual purpose of this common request - namely to have a data step that actually works. Yours does not. Here is the log it produces.
1 data patients;
2 input patient admission adm_date index;
3 datalines;
NOTE: Invalid data for adm_date in line 4 9-18.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
4 1 1 01/12/2013 1
patient=1 admission=1 adm_date=. index=1 _ERROR_=1 _N_=1
NOTE: Invalid data for adm_date in line 5 9-18.
5 1 2 06/24/2013 0
patient=1 admission=2 adm_date=. index=0 _ERROR_=1 _N_=2
NOTE: Invalid data for adm_date in line 6 9-18.
6 2 1 02/10/2013 0
patient=2 admission=1 adm_date=. index=0 _ERROR_=1 _N_=3
NOTE: Invalid data for adm_date in line 7 9-18.
7 2 2 01/05/2014 1
patient=2 admission=2 adm_date=. index=1 _ERROR_=1 _N_=4
NOTE: Invalid data for adm_date in line 8 9-18.
8 2 3 03/06/2014 0
patient=2 admission=3 adm_date=. index=0 _ERROR_=1 _N_=5
NOTE: Invalid data for adm_date in line 9 9-18.
9 3 1 02/11/2011 1
patient=3 admission=1 adm_date=. index=1 _ERROR_=1 _N_=6
NOTE: Invalid data for adm_date in line 10 9-18.
10 3 2 01/12/2012 0
patient=3 admission=2 adm_date=. index=0 _ERROR_=1 _N_=7
NOTE: Invalid data for adm_date in line 11 9-18.
11 4 1 03/21/2010 0
patient=4 admission=1 adm_date=. index=0 _ERROR_=1 _N_=8
NOTE: Invalid data for adm_date in line 12 8-17.
12 4 2 04/06/2010 0
patient=4 admission=2 adm_date=. index=0 _ERROR_=1 _N_=9
NOTE: Invalid data for adm_date in line 13 8-17.
13 4 3 09/05/2015 1
patient=4 admission=3 adm_date=. index=1 _ERROR_=1 _N_=10
NOTE: Invalid data for adm_date in line 14 8-17.
14 4 4 09/11/2016 0
patient=4 admission=4 adm_date=. index=0 _ERROR_=1 _N_=11
NOTE: The data set WORK.PATIENTS has 11 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.01 seconds
The resulting data set has missing values for each date, making it impossible to generate intervals between admissions.
Of course, most of us can make the needed corrections, but it's a good idea to make sure the sample data code does what it is supposed to do. Could you do the correction for us?
Help us help you.
data patients;
format adm_date e8601da.;
input patient admission adm_date mmddyy10. index;
datalines;
1 1 01/12/2013 1
1 2 06/24/2013 0
1 3 08/12/2013 0
2 1 02/10/2013 0
2 2 01/05/2014 1
2 3 03/06/2014 0
3 1 02/11/2011 1
3 2 01/12/2012 0
4 1 03/21/2010 0
4 2 04/06/2010 0
4 3 09/05/2015 1
4 4 09/11/2016 0
;
run;
proc sort data=patients;
by patient adm_date;
run;
data want;
set patients;
by patient;
previous = _n_ - 1;
if first.patient then readm_1y=0;
else do;
set patients(rename=(adm_date = _adm_date) keep=adm_date) point=previous;
diff = adm_date - _adm_date;
readm_1y = 30<=diff<=365;
end;
drop _adm_date;
run;
Could you let me know if this meets your requirements?
You might want to consider generating cutoff dates for READ_1Y and READ_30D. Then just compare ADM_DATE to each cutoff. The code below uses the sample DATA step provided by @Mazi:
data patients;
format adm_date e8601da.;
input patient admission adm_date mmddyy10. index;
datalines;
1 1 01/12/2013 1
1 2 06/24/2013 0
1 3 08/12/2013 0
2 1 02/10/2013 0
2 2 01/05/2014 1
2 3 03/06/2014 0
3 1 02/11/2011 1
3 2 01/12/2012 0
4 1 03/21/2010 0
4 2 04/06/2010 0
4 3 09/05/2015 1
4 4 09/11/2016 0
;
run;
data want (drop=cutoff_:);
set patients;
by patient ;
retain cutoff_date_1y cutoff_date_30d;
if first.patient then call missing(of cutoff_:);
if index=1 then do;
cutoff_date_1y=intnx('year',adm_date,1,'sameday');
cutoff_date_30d=adm_date+30;
end;
readm_1y=(index=0 and adm_date<=cutoff_date_1y);
readm_30d=(index=0 and adm_date<=cutoff_date_30d);
run;
proc sort data = patients;
by patient admission;
run;
data patients2(drop = index_date);
format index_date e8601da.;
set patients;
by patient admission;
retain index_date;
if first.patient then index_date=.; /*set index_date to missing when you encounter new patient ID*/
if index=1 then index_date=adm_date; /*carry forward index_date to other patient ID records within same ID*/
if index=0 and adm_date <= index_date + 365 then readm_1y=1;
*if readmitted within 1 year then set readm_1y to 1, else set it to 0;
else readm_1y=0;
proc print;
run;
I do not believe your last row is correct in your desired output. Readmission within 1 year of 09/05/2015 is 09/05/2016. 09/11/2016 would be outside of that window.
/*Assuming I understood what you mean.*/
data patients;
format adm_date e8601da.;
input patient admission adm_date mmddyy10. index;
datalines;
1 1 01/12/2013 1
1 2 06/24/2013 0
1 3 08/12/2013 0
2 1 02/10/2013 0
2 2 01/05/2014 1
2 3 03/06/2014 0
3 1 02/11/2011 1
3 2 01/12/2012 0
4 1 03/21/2010 0
4 2 04/06/2010 0
4 3 09/05/2015 1
4 4 09/11/2016 0
;
run;
data want ;
set patients;
by patient ;
readm_1y=0;
retain temp_date ;
if first.patient then call missing(temp_date);
if index=1 then temp_date=adm_date;
else do;
if not missing(temp_date) and intck('year',temp_date,adm_date,'c')=0 then readm_1y=1;
end;
drop temp_date;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.