Hello SAS community!!!
I hope you are all doing well during these unprecedented times wherever in the world you are. And you and your families are safe and sound!!
Now to the question:
I have a dataset with multiple years of admissions and discharges. I want to be able to run a data step where I am calculating caredays dependent on the admission and discharge date. I have five years of data. So I am repeating the data statement dependent on the date variables. I cant figure out how to change the step into either a macro or array (or maybe it needs both) where I can change the dates once and it loops the data step depending on the time frames I have entered. Below is a sample of the datelines and the data step that I have, that works I just want to simplify it.
Here is the data step to make the table. This is of course a dummy table:
data have; length Hosp $1. ID $3. service $2. ; input Hosp ID service admission :mmddyy10. discharge :mmddyy10.; format admission mmddyy10. discharge mmddyy10.; datalines; A 123 IN 01/01/2018 01/05/2018 A 123 ER 04/01/2019 04/03/2019 A 123 OU 06/05/2017 06/30/2017 A 123 OU 08/01/2018 08/29/2019 A 123 ER 09/01/2019 09/05/2019 A 123 IN 10/01/2017 10/05/2017 A 456 IN 01/05/2015 01/15/2015 A 456 OU 01/25/2016 01/30/2016 A 456 IN 02/05/2018 02/15/2018 A 456 ER 02/16/2018 02/18/2019 ; run;
Following is the data step that I would like to simplify. You will notice that I have made flags for each years (see variable srv15, srv16 etc.) and then I use those in an if then do statement. Further I restrict each do loop with the admission and discharge criteria and I also calculate the care days for each year. You will notice that the do loops repeat by the years change. I tried running a macro and also an array the issue I am having is how to refer the date values.
data have2; set have; if (admission<='31dec2015'd and discharge >='01jan2015'd) then Srv15='1'; else Srv15='0'; if (admission<='31dec2016'd and discharge >='01jan2016'd ) then Srv16='1'; else Srv16='0'; if (admission<='31dec2017'd and discharge >='01jan2017'd ) then Srv17='1'; else Srv17='0'; if (admission<='31dec2018'd and discharge >='01jan2018'd ) then Srv18='1'; else Srv18='0'; if (admission<='31dec2019'd and discharge >='01jan2019'd ) then Srv19='1'; else Srv19='0'; if Srv15='1' then do; if admission <'01jan2015'd then adcajul2015='01jan2015'd; else adcajul2015=admission; if discharge >'31dec2015'd then adcrjul2015='31dec2015'd; else if discharge =< '31dec2015'd then adcrjul2015= discharge; caredays15 = adcrjul2015 - adcajul2015; end; if Srv16='1' then do; if admission <'01jan2016'd then adcajul2016='01jan2016'd; else adcajul2016=admission; if discharge >'31dec2016'd then adcrjul2016='31dec2016'd; else if discharge =< '31dec2016'd then adcrjul2016= discharge; caredays16 = adcrjul2016 - adcajul2016; end; if Srv17='1' then do; if admission <'01jan2017'd then adcajul2017='01jan2017'd; else adcajul2017=admission; if discharge >'31dec2017'd then adcrjul2017='31dec2017'd; else if discharge =< '31dec2017'd then adcrjul2017= discharge; caredays17 = adcrjul2017 - adcajul2017; end; if Srv18='1' then do; if admission <'01jan2018'd then adcajul2018='01jan2018'd; else adcajul2018=admission; if discharge >'31dec2018'd then adcrjul2018='31dec2018'd; else if discharge =< '31dec2018'd then adcrjul2018= discharge; caredays18 = adcrjul2018 - adcajul2018; end; if Srv19='1' then do; if admission <'01jan2019'd then adcajul2019='01jan2019'd; else adcajul2019=admission; if discharge >'31dec2019'd then adcrjul2019='31dec2019'd; else if discharge =< '31dec2019'd then adcrjul2019= discharge; caredays19 = adcrjul2019 - adcajul2019; end; format adcajul2015 adcrjul2015 adcajul2016 adcrjul2016 adcajul2017 adcrjul2017 adcajul2018 adcrjul2018 adcajul2019 adcrjul2019 mmddyy10.; run;
Your help would be great!
Stay Safe!
Thanks!!!
I think this will give you what you're looking for. Please double-check that I have duplicated your logic. I did modify your variable names so that they consistently have the 4-digit year in them. You should just have to modify the start_yr and end_yr macro variables if your data set contains different years.
%let start_yr=2015;
%let end_yr=2019;
data have2;
set have;
array Srv{&start_yr.:&end_yr.} $1 Srv&start_yr.-Srv&end_yr.;
array adca{&start_yr.:&end_yr.} adcajul&start_yr.-adcajul&end_yr.;
array adcr{&start_yr.:&end_yr.} adcrjul&start_yr.-adcrjul&end_yr.;
array cdays{&start_yr.:&end_yr.} caredays&start_yr.-caredays&end_yr.;
format adca: adcr: mmddyys10.;
do i = lbound(srv) to hbound(srv);
if year(admission) = i and year(discharge) = i then srv(i) = '1';
else srv(i) = '0';
if srv{i} = '1' then do;
if year(admission) < i then adca{i} = mdy(1,1,i);
else adca{i} = admission;
if year(discharge) > i then adcr{i} = mdy(12,31,i);
else adcr{i} = discharge;
cdays{i} = adcr{i} - adca{i};
end;
end;
drop i;
run;
You could use arrays and loops and macros (who is Marco?), but really, there is no point for such a short program. You'd add complexity for no good reason.
You can make it more compact (and legible) though:
if (admission<='31dec2015'd and discharge >='01jan2015'd) then Srv15='1'; else Srv15='0';
if (admission<='31dec2016'd and discharge >='01jan2016'd ) then Srv16='1'; else Srv16='0';
if (admission<='31dec2017'd and discharge >='01jan2017'd ) then Srv17='1'; else Srv17='0';
if (admission<='31dec2018'd and discharge >='01jan2018'd ) then Srv18='1'; else Srv18='0';
if (admission<='31dec2019'd and discharge >='01jan2019'd ) then Srv19='1'; else Srv19='0';
if Srv15='1' then do;
if admission <'01jan2015'd then adcajul2015='01jan2015'd;
else adcajul2015=admission;
if discharge >'31dec2015'd then adcrjul2015='31dec2015'd;
else if discharge =< '31dec2015'd then adcrjul2015= discharge;
caredays15 = adcrjul2015 - adcajul2015;
end;
can be written as:
SRV15 = (admission<='31dec2015'd and discharge >='01jan2015'd) ;
SRV16 = (admission<='31dec2016'd and discharge >='01jan2016'd) ;
SRV17 = (admission<='31dec2017'd and discharge >='01jan2017'd) ;
SRV18 = (admission<='31dec2018'd and discharge >='01jan2018'd) ;
SRV19 = (admission<='31dec2019'd and discharge >='01jan2019'd) ;
if SRV15 then do;
ADCAJUL2015 = max(ADCAJUL2015, '01jan2015'd);
ADCRJUL2015 = min(ADCRJUL2015 ,'31dec2015'd);
CAREDAYS15 = ADCRJUL2015 - ADCAJUL2015;
end;
Not sure that I fully understand what your desired result is. Does any of below two options give you what you're after? And if not what would you need differently?
data have;
length Hosp $1. ID $3. service $2.;
input Hosp ID service admission :mmddyy10. discharge :mmddyy10.;
format admission mmddyy10. discharge mmddyy10.;
datalines;
A 123 IN 01/01/2018 01/05/2018
A 123 ER 04/01/2019 04/03/2019
A 123 OU 06/05/2017 06/30/2017
A 123 OU 08/01/2018 08/29/2019
A 123 ER 09/01/2019 09/05/2019
A 123 IN 10/01/2017 10/05/2017
A 456 IN 01/05/2015 01/15/2015
A 456 OU 01/25/2016 01/30/2016
A 456 IN 02/05/2018 02/15/2018
A 456 ER 02/16/2018 02/18/2019
;
data caredays_total;
set have;
caredays=discharge -admission +1;
run;
proc print;
run;
data caredays_byYear(drop=_:);
set have;
_year_cnt=intck('year',admission,discharge);
do _i=0 to _year_cnt;
year=put(intnx('year',admission,_i,'b'),year4.);
if _year_cnt=0 then
caredays=discharge -admission +1;
else if _i<_year_cnt then
caredays=intnx('year',admission,0,'e') -admission +1;
else
caredays=discharge -intnx('year',discharge,0,'b') +1;
output;
end;
run;
proc print;
run;
@Patrick : calculating the caredays can be done easier:
data want;
set have;
do year = year(admission) to year(discharge);
caredays = min(discharge,mdy(12,31,year)) - max(admission,mdy(1,1,year)) + 1;
output;
end;
run;
@Kurt_Bremser Nice one. I "felt" already while writing the code that there must be a simpler approach but didn't have a better idea at the time.
So, in the end, you want to know the care days per patient and year?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.