Hello,
I have a complicated sas code question (well complicated for myself, but I am sure not for the folks here, as the sas community has helped me so many times in the past).
I have two dataset, one data set has admission and discharge visits by clients (data have1 below) and the other data set has a record of services by visit type provided each day (data have 2 below) between the admission and discharge. I want to join the two datasets into one where the count of visits by visit type that occurred within the admission and discharge date is added to the joined table and if no services provided then the output for that row is 0 (see table below for what I am looking for).
You will note that some services occurred outside the admission and discharge time period, as such I do not want to count them, and some services didn't occur at all as such we have 0.
Suggestions on how to accomplish this? either via SAS datasets or even proc SQL steps?
data have1;
informat client 2. admit_date mmddyy10. discharge_date mmddyy10.;
input client admit_date discharge_date;
format admit_date discharge_date mmddyy10.;
datalines;
1 04122021 04202021
1 05012021 05132021
1 06142021 06202021
2 04152021 04252021
2 04292021 05162021
3 04172021 04252021
4 04182021 04192021
4 04222021 04242021
;
run;
data have2;
informat client 2. date mmddyy10. visit_type $1.;
input client date visit_type;
format date mmddyy10.;
datalines;
1 04122021 1
1 04132021 1
1 04142021 1
1 04152021 2
1 04162021 2
1 04182021 1
1 06142021 1
1 06182021 1
1 06202021 1
2 04172021 2
2 04182021 2
2 06162021 1
3 04222021 1
;
run;
Table of what I am looking for:
client | admission date | discharge date | visit type 1 count | visit type 2 count |
1 | 4/12/2021 | 4/20/2021 | 4 | 4 |
1 | 5/1/2021 | 5/13/2021 | 0 | 0 |
1 | 6/14/2021 | 6/20/2021 | 3 | 0 |
2 | 4/15/2021 | 4/25/2021 | 0 | 2 |
2 | 4/29/2021 | 5/16/2021 | 0 | 0 |
3 | 4/17/2021 | 4/25/2021 | 1 | 0 |
4 | 4/18/2021 | 4/19/2021 | 0 | 0 |
4 | 4/22/2021 | 4/24/2021 | 0 | 0 |
Thank you in advance!
Below one way to go. Thank you for providing sample data that didn't require any rework.
data have1;
informat client 2. admit_date mmddyy10. discharge_date mmddyy10.;
input client admit_date discharge_date;
format admit_date discharge_date mmddyy10.;
datalines;
1 04122021 04202021
1 05012021 05132021
1 06142021 06202021
2 04152021 04252021
2 04292021 05162021
3 04172021 04252021
4 04182021 04192021
4 04222021 04242021
;
data have2;
informat client 2. date mmddyy10. visit_type $1.;
input client date visit_type;
format date mmddyy10.;
datalines;
1 04122021 1
1 04132021 1
1 04142021 1
1 04152021 2
1 04162021 2
1 04182021 1
1 06142021 1
1 06182021 1
1 06202021 1
2 04172021 2
2 04182021 2
2 06162021 1
3 04222021 1
;
proc sql;
select
h1.*,
sum
(
case
when h2.visit_type='1' then 1
else 0
end
) as visit_type_1_cnt,
sum
(
case
when h2.visit_type='2' then 1
else 0
end
) as visit_type_2_cnt
from
have1 h1
left join
have2 h2
on h1.client=h2.client and h2.date between h1.admit_date and h1.discharge_date
group by h1.client, h1.admit_date, h1.discharge_date
;
quit;
P.S: I've added the discharge_date to the Group By clause just in case you've got DQ issues with overlapping date ranges.
Tôi là Thiep Nguyen, CEO công ty Tín Hưng chuyên lắp đặt và thi công kho lạnh giá tốt nhất. https://kholanhtinhung.com/
data have1;
informat client 2. admit_date mmddyy10. discharge_date mmddyy10.;
input client admit_date discharge_date;
format admit_date discharge_date mmddyy10.;
datalines;
1 04122021 04202021
1 05012021 05132021
1 06142021 06202021
2 04152021 04252021
2 04292021 05162021
3 04172021 04252021
4 04182021 04192021
4 04222021 04242021
;
data have2;
informat client 2. date mmddyy10. visit_type $1.;
input client date visit_type;
format date mmddyy10.;
datalines;
1 04122021 1
1 04132021 1
1 04142021 1
1 04152021 2
1 04162021 2
1 04182021 1
1 06142021 1
1 06182021 1
1 06202021 1
2 04172021 2
2 04182021 2
2 06162021 1
3 04222021 1
;
data want(drop = date visit_type);
if _N_ = 1 then do;
dcl hash h(dataset : "have2", multidata : "Y");
h.definekey("client");
h.definedata("date", "visit_type");
h.definedone();
end;
set have1;
if 0 then set have2;
visittype1count = 0;
visittype2count = 0;
do while (h.do_over() = 0);
if admit_date <= date <= discharge_date & visit_type = "1" then visittype1count + 1;
else if admit_date <= date <= discharge_date & visit_type = "2" then visittype2count + 1;
end;
run;
Result:
client admit_date discharge_date visittype1count visittype2count 1 04/12/2021 04/20/2021 4 2 1 05/01/2021 05/13/2021 0 0 1 06/14/2021 06/20/2021 3 0 2 04/15/2021 04/25/2021 0 2 2 04/29/2021 05/16/2021 0 0 3 04/17/2021 04/25/2021 1 0 4 04/18/2021 04/19/2021 0 0 4 04/22/2021 04/24/2021 0 0
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.