Bumble_15
Fluorite | Level 6

Hello,

I'm relatively new to SAS and having some trouble coding a variable that I want. I have a dataset of patients, with each line representing an admission. I have created a new variable, "Index Admission", that flags the admission during which the patient had surgery. I want to find out whether each patient was readmitted after that date, specifically within 30 days or within 1 year (I imagine the code will be similar for both cases).

 

My current dataset looks like this:

data patients;
   input patient admission adm_date index;
datalines;
1  1    01/12/2013   1
1  2    06/24/2013   0
2  1    02/10/2013   0
2  2    01/05/2014   1
2  3    03/06/2014   0
3  1    02/11/2011   1
3  2    01/12/2012   0
4  1    03/21/2010   0
4  2   04/06/2010    0
4  3   09/05/2015    1
4  4   09/11/2016    0
;

And I hope to have output that adds a variable for readmission in 1-year:

data patients;
   input patient admission adm_date index readm_1y;
datalines;
1  1    01/12/2013   1   0
1  2    06/24/2013   0   1
1  3    08/12/2013   0   1
2  1    02/10/2013   0   0
2  2    01/05/2014   1   0
2  3    03/06/2014   0   1
3  1    02/11/2011   1   0
3  2    01/12/2012   0   1
4  1    03/21/2010   0   0
4  2   04/06/2010    0   0
4  3   09/05/2015    1   0
4  4   09/11/2016    0   1
;

Please let me know if my question isn't clear. Thank you in advance!

 

 

Accepted Solution
tarheel13
Rhodochrosite | Level 12
proc sort data = patients;
   by patient admission;
run;

data patients2(drop = index_date);
   set patients;
   by patient admission;
   retain index_date;
   /* reset index_date to missing at the start of each patient */
   if first.patient then index_date = .;
   /* store the index admission date; RETAIN carries it forward to the patient's later rows */
   if index = 1 then index_date = adm_date;
   /* flag non-index admissions that fall on or before index_date + 365 days */
   if index = 0 and adm_date <= index_date + 365 then readm_1y = 1;
   else readm_1y = 0;
run;

proc print data = patients2;
run;

I do not believe the last row of your desired output is correct. The 1-year window after 09/05/2015 ends on 09/05/2016, so 09/11/2016 falls outside it.

Tom
Super User

Does each patient have only a single index admission?

If so, then that is easy. Make a dataset with PATIENT and INDEX_DATE, merge it with the existing dataset, then subtract INDEX_DATE from ADM_DATE to see how many days' difference there is.
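A minimal sketch of the merge approach described above. It assumes one index admission per patient and that ADM_DATE has been read as a numeric SAS date. The names index_dates, want, days_from_index, and readm_30d are illustrative, and counting only admissions strictly after the index date is an assumption rather than something stated in the thread.

```sas
/* Sort once so both the subset and the merge are in PATIENT order. */
proc sort data=patients;
   by patient adm_date;
run;

/* One row per patient: the date of the index admission (assumed unique). */
data index_dates;
   set patients(where=(index=1));
   index_date = adm_date;
   keep patient index_date;
run;

/* Attach INDEX_DATE to every admission and measure the gap in days. */
data want;
   merge patients index_dates;
   by patient;
   format index_date e8601da.;
   days_from_index = adm_date - index_date;
   /* Boolean expressions return 1 or 0; a missing gap evaluates to 0. */
   readm_30d = (1 <= days_from_index <= 30);
   readm_1y  = (1 <= days_from_index <= 365);
run;
```

Admissions before the index date get a negative days_from_index and are therefore flagged 0 on both variables.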

If you want to treat every admission as a new patient/index_date pair, then you will need to use an SQL join instead so that you can do a many-to-many match.
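A hedged sketch of the many-to-many variant, written as a self-join in PROC SQL. Here every admission is treated as a possible starting point and paired with each later admission of the same patient within 365 days. The table name readmits and the columns start_date, readm_date, and days_between are illustrative, and ADM_DATE is again assumed to be a numeric SAS date.

```sas
proc sql;
   create table readmits as
   select a.patient,
          a.adm_date as start_date format=e8601da.,
          b.adm_date as readm_date format=e8601da.,
          b.adm_date - a.adm_date as days_between
   from patients as a
        inner join patients as b
        on a.patient = b.patient
   /* keep only later admissions that fall within 365 days */
   where b.adm_date > a.adm_date
     and b.adm_date - a.adm_date <= 365
   order by a.patient, start_date, readm_date;
quit;
```

Each output row is one admission/readmission pair, so a patient with several close admissions can appear more than once.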

mkeintz
PROC Star

@Bumble_15 

You have honored the "form" of providing sample data in a DATA step.  Thank you for that.

 

But the code you provide does not honor the actual purpose of this common request - namely to have a data step that actually works.  Yours does not.  Here is the log it produces.

 

1    data patients;
2       input patient admission adm_date index;
3    datalines;

NOTE: Invalid data for adm_date in line 4 9-18.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
4          1  1    01/12/2013   1
patient=1 admission=1 adm_date=. index=1 _ERROR_=1 _N_=1
NOTE: Invalid data for adm_date in line 5 9-18.
5          1  2    06/24/2013   0
patient=1 admission=2 adm_date=. index=0 _ERROR_=1 _N_=2
NOTE: Invalid data for adm_date in line 6 9-18.
6          2  1    02/10/2013   0
patient=2 admission=1 adm_date=. index=0 _ERROR_=1 _N_=3
NOTE: Invalid data for adm_date in line 7 9-18.
7          2  2    01/05/2014   1
patient=2 admission=2 adm_date=. index=1 _ERROR_=1 _N_=4
NOTE: Invalid data for adm_date in line 8 9-18.
8          2  3    03/06/2014   0
patient=2 admission=3 adm_date=. index=0 _ERROR_=1 _N_=5
NOTE: Invalid data for adm_date in line 9 9-18.
9          3  1    02/11/2011   1
patient=3 admission=1 adm_date=. index=1 _ERROR_=1 _N_=6
NOTE: Invalid data for adm_date in line 10 9-18.
10         3  2    01/12/2012   0
patient=3 admission=2 adm_date=. index=0 _ERROR_=1 _N_=7
NOTE: Invalid data for adm_date in line 11 9-18.
11         4  1    03/21/2010   0
patient=4 admission=1 adm_date=. index=0 _ERROR_=1 _N_=8
NOTE: Invalid data for adm_date in line 12 8-17.
12         4  2   04/06/2010    0
patient=4 admission=2 adm_date=. index=0 _ERROR_=1 _N_=9
NOTE: Invalid data for adm_date in line 13 8-17.
13         4  3   09/05/2015    1
patient=4 admission=3 adm_date=. index=1 _ERROR_=1 _N_=10
NOTE: Invalid data for adm_date in line 14 8-17.
14         4  4   09/11/2016    0
patient=4 admission=4 adm_date=. index=0 _ERROR_=1 _N_=11
NOTE: The data set WORK.PATIENTS has 11 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.01 seconds

The resulting data set has missing values for each date, making it impossible to generate intervals between admissions.

 

Of course, most of us can make the needed corrections, but it's a good idea to make sure the sample data code does what it is supposed to do.  Could you do the correction for us?

 

Help us help you.
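For reference, one possible correction is sketched below. It assumes the dates are in MM/DD/YYYY form, as the sample suggests. The colon modifier applies the MMDDYY10. informat under list input, so the uneven spacing in the original datalines does not matter.

```sas
data patients;
   /* :mmddyy10. reads the next blank-delimited token as a date */
   input patient admission adm_date :mmddyy10. index;
   format adm_date mmddyy10.;
datalines;
1  1    01/12/2013   1
1  2    06/24/2013   0
2  1    02/10/2013   0
2  2    01/05/2014   1
2  3    03/06/2014   0
3  1    02/11/2011   1
3  2    01/12/2012   0
4  1    03/21/2010   0
4  2   04/06/2010    0
4  3   09/05/2015    1
4  4   09/11/2016    0
;
```

With ADM_DATE stored as a numeric SAS date, intervals between admissions can be computed by simple subtraction.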

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Mazi
Pyrite | Level 9
data patients;
	format adm_date e8601da.;
	input patient admission adm_date mmddyy10. index;
	datalines;
1 1 01/12/2013 1
1 2 06/24/2013 0
1 3 08/12/2013 0
2 1 02/10/2013 0
2 2 01/05/2014 1
2 3 03/06/2014 0
3 1 02/11/2011 1
3 2 01/12/2012 0
4 1 03/21/2010 0
4 2 04/06/2010 0
4 3 09/05/2015 1
4 4 09/11/2016 0
;
run;

proc sort data=patients;
	by patient adm_date;
run;

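data want;
	set patients;
	by patient;
	/* row number of the previous observation, used for direct access below */
	previous = _n_ - 1;
	if first.patient then readm_1y=0;
	else do;
		/* re-read the previous row to get the same patient's prior admission date */
		set patients(rename=(adm_date = _adm_date) keep=adm_date) point=previous;
		/* days since the previous admission, not since the index admission */
		diff = adm_date - _adm_date;
		/* 1 when the gap is between 30 and 365 days, inclusive */
		readm_1y = 30<=diff<=365;
	end;
	drop _adm_date;
run;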

Could you let me know if this meets your requirements?

mkeintz
PROC Star

You might want to consider generating cutoff dates for READM_1Y and READM_30D. Then just compare ADM_DATE to each cutoff. The code below uses the sample DATA step provided by @Mazi:

 

data patients;
	format adm_date e8601da.;
	input patient admission adm_date mmddyy10. index;
	datalines;
1 1 01/12/2013 1
1 2 06/24/2013 0
1 3 08/12/2013 0
2 1 02/10/2013 0
2 2 01/05/2014 1
2 3 03/06/2014 0
3 1 02/11/2011 1
3 2 01/12/2012 0
4 1 03/21/2010 0
4 2 04/06/2010 0
4 3 09/05/2015 1
4 4 09/11/2016 0
;
run;

data want (drop=cutoff_:);
  set patients;
  by patient ;

  retain cutoff_date_1y cutoff_date_30d; 
  if first.patient then call missing(of cutoff_:);

  if index=1 then do;
    cutoff_date_1y=intnx('year',adm_date,1,'sameday');
    cutoff_date_30d=adm_date+30;
  end;

  readm_1y=(index=0 and adm_date<=cutoff_date_1y);
  readm_30d=(index=0 and adm_date<=cutoff_date_30d);
run;
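The choice of cutoff matters near leap years. As a small illustration (the variable names cutoff_365 and cutoff_year are made up for this sketch), the step below compares adding 365 days with INTNX 'sameday' alignment for patient 4's index date:

```sas
data _null_;
   index_date  = '05SEP2015'd;
   /* fixed 365-day window */
   cutoff_365  = index_date + 365;
   /* same calendar day one year later */
   cutoff_year = intnx('year', index_date, 1, 'sameday');
   put cutoff_365= date9. cutoff_year= date9.;
run;
```

Because 29 February 2016 falls inside the window, the 365-day cutoff lands on 04SEP2016 while the 'sameday' cutoff lands on 05SEP2016. Either way, the 09/11/2016 admission is outside the 1-year window.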
Bumble_15
Fluorite | Level 6
Thank you so much! This worked well 🙂
Ksharp
Super User
/*Assuming I understood what you mean.*/
data patients;
	format adm_date e8601da.;
	input patient admission adm_date mmddyy10. index;
	datalines;
1 1 01/12/2013 1
1 2 06/24/2013 0
1 3 08/12/2013 0
2 1 02/10/2013 0
2 2 01/05/2014 1
2 3 03/06/2014 0
3 1 02/11/2011 1
3 2 01/12/2012 0
4 1 03/21/2010 0
4 2 04/06/2010 0
4 3 09/05/2015 1
4 4 09/11/2016 0
;
run;

data want;
  set patients;
  by patient;
  readm_1y = 0;
  retain temp_date;
  if first.patient then call missing(temp_date);
  if index = 1 then temp_date = adm_date;
  else do;
    if not missing(temp_date) and intck('year', temp_date, adm_date, 'c') = 0 then readm_1y = 1;
  end;
  drop temp_date;
run;
