I want to know how many clients in each clinic stayed at a different clinic and the cost of the stay at the different clinic. I am not considering order of the stays. I do not want to count a client if they have a different stay in the same clinic (see client L).
Thank you for your help!
data have;
input
client$ Span_Begin :DATE9. Span_End :DATE9. clinic$ stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A 17Nov2022 04Jan2023 8 1 1373
B 01Jul2022 29Jul2022 12 1 2746
B 21Nov2022 04Apr2023 12 2 1373
C 01Jul2022 07Jul2022 8 1 2403
D 01Jul2022 12Jul2022 10 1 5479
E 01Jul2022 23Mar2023 9 1 159000
F 22Nov2022 19Mar2023 8 1 6522
G 26Aug2022 15May2023 12 1 687
H 01Oct2022 10Apr2023 6 1 5479
I 30Dec2022 19Jan2023 5 1 8675
J 01Jul2022 01Nov2022 3 1 4566
K 11Nov2022 10Jan2023 3 2 10044
L 01Jul2022 15Jul2022 3 3 9588
L 26Jul2022 17Aug2022 12 1 2060
L 23Sep2022 19Feb2023 12 1 5149
O 20Sep2022 20Feb2023 5 1 5022
P 01Jul2022 02Aug2022 5 1 913
Q 06Jul2022 01Sep2022 8 1 343
R 01Jul2022 01Sep2022 1 1 457
S 01Jul2022 24Oct2022 2 1 10957
T 01Jul2022 04Dec2022 2 1 1826
U 28Aug2022 18Sep2022 8 1 6179
V 19Dec2022 31May2023 7 2 10501
W 08Jun2023 18Jun2023 7 3 5022
X 01Jul2022 10Jul2022 8 1 3433
Y 04Aug2022 17Nov2022 10 2 7305
Z 13Oct2022 07Nov2022 8 1 2403
Z 09Feb2023 30Jun2023 12 2 5149
AA 08Sep2022 04Oct2022 8 1 2746
BB 01Jul2022 26Apr2023 2 1 10501
CC 09Aug2022 19Jun2023 12 1 1373
DD 01Jul2022 21Aug2022 10 1 9588
DD 17Oct2022 13Jun2023 11 2 5935
EE 01Jul2022 15Jul2022 5 1 3196
EE 26Jul2022 17Aug2022 5 2 7305
EE 23Sep2022 19Feb2023 4 3 13697
;;;
run;
Here are the results from the above example:
Clinic | Count | Cost |
3 | 1 | $9,588 |
4 | 1 | $13,697 |
5 | 1 | $10,501 |
8 | 1 | $2,403 |
10 | 1 | $9,588 |
11 | 1 | $5,935 |
12 | 2 | $12,358 |
Grand Total | 8 | $64,070 |
data have;
input client$ Span_Begin :DATE9. Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A 17Nov2022 04Jan2023 8 1 1373
B 01Jul2022 29Jul2022 12 1 2746
B 21Nov2022 04Apr2023 12 2 1373
C 01Jul2022 07Jul2022 8 1 2403
D 01Jul2022 12Jul2022 10 1 5479
E 01Jul2022 23Mar2023 9 1 159000
F 22Nov2022 19Mar2023 8 1 6522
G 26Aug2022 15May2023 12 1 687
H 01Oct2022 10Apr2023 6 1 5479
I 30Dec2022 19Jan2023 5 1 8675
J 01Jul2022 01Nov2022 3 1 4566
K 11Nov2022 10Jan2023 3 2 10044
L 01Jul2022 15Jul2022 3 3 9588
L 26Jul2022 17Aug2022 12 1 2060
L 23Sep2022 19Feb2023 12 1 5149
O 20Sep2022 20Feb2023 5 1 5022
P 01Jul2022 02Aug2022 5 1 913
Q 06Jul2022 01Sep2022 8 1 343
R 01Jul2022 01Sep2022 1 1 457
S 01Jul2022 24Oct2022 2 1 10957
T 01Jul2022 04Dec2022 2 1 1826
U 28Aug2022 18Sep2022 8 1 6179
V 19Dec2022 31May2023 7 2 10501
W 08Jun2023 18Jun2023 7 3 5022
X 01Jul2022 10Jul2022 8 1 3433
Y 04Aug2022 17Nov2022 10 2 7305
Z 13Oct2022 07Nov2022 8 1 2403
Z 09Feb2023 30Jun2023 12 2 5149
AA 08Sep2022 04Oct2022 8 1 2746
BB 01Jul2022 26Apr2023 2 1 10501
CC 09Aug2022 19Jun2023 12 1 1373
DD 01Jul2022 21Aug2022 10 1 9588
DD 17Oct2022 13Jun2023 11 2 5935
EE 01Jul2022 15Jul2022 5 1 3196
EE 26Jul2022 17Aug2022 5 2 7305
EE 23Sep2022 19Feb2023 4 3 13697
;;;
run;
proc sql;
create table temp as
select distinct clinic,client from have;
create table temp2 as
select a.clinic,a.client,b.clinic as _clinic,b.cost as cost
from temp as a,have as b
where a.client=b.client and a.clinic ne b.clinic;
create table want1 as
select clinic,count(distinct client) as count,sum(cost) as sum_of_cost
from temp2
group by clinic;
create table temp3 as
select distinct clinic,_clinic from temp2;
quit;
data want2;
do until(last.clinic);
set temp3;
by clinic;
length other_clinic $ 200;
other_clinic=catx(',',other_clinic,_clinic);
end;
keep clinic other_clinic;
run;
data final_want;
merge want1 want2;
by clinic;
run;
Thank you for providing working code to create dataset HAVE.
You "want to know how many clients in each clinic stayed at a different clinic and the cost of the stay at the different clinic."
Your desired result for clinic 3 shows a COUNT of 1 and a COST of 9,588.
Now the only client for clinic 3 that has stays at other clinics client L, which explains why you have a COUNT of 1. But, with the following data for L
Client | Span_Begin | Span_End | Clinic | Stay | Cost |
L |
07/01/2022 |
07/15/2022 | 1 | 3 | 9,588 |
L | 07/26/2022 | 08/17/2022 | 12 | 1 | 2,060 |
L | 09/23/2022 | 02/19/2023 | 12 | 1 | 5,149 |
why does your sample desired result have COST of 9,588? Shouldn't it be 7,209 (2,060 + 5,149 for two visits to clinic 12)?
Yes, you are correct! I apologize. These should be the results:
Clinic | Count | Other Clinic | Sum of Cost |
3 | 1 | 12 | 7209 |
4 | 1 | 5 | 10501 |
5 | 1 | 4 | 13697 |
8 | 1 | 12 | 5149 |
10 | 1 | 11 | 5935 |
11 | 1 | 10 | 9588 |
12 | 2 | 3,8 | 11991 |
data have;
input client$ Span_Begin :DATE9. Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A 17Nov2022 04Jan2023 8 1 1373
B 01Jul2022 29Jul2022 12 1 2746
B 21Nov2022 04Apr2023 12 2 1373
C 01Jul2022 07Jul2022 8 1 2403
D 01Jul2022 12Jul2022 10 1 5479
E 01Jul2022 23Mar2023 9 1 159000
F 22Nov2022 19Mar2023 8 1 6522
G 26Aug2022 15May2023 12 1 687
H 01Oct2022 10Apr2023 6 1 5479
I 30Dec2022 19Jan2023 5 1 8675
J 01Jul2022 01Nov2022 3 1 4566
K 11Nov2022 10Jan2023 3 2 10044
L 01Jul2022 15Jul2022 3 3 9588
L 26Jul2022 17Aug2022 12 1 2060
L 23Sep2022 19Feb2023 12 1 5149
O 20Sep2022 20Feb2023 5 1 5022
P 01Jul2022 02Aug2022 5 1 913
Q 06Jul2022 01Sep2022 8 1 343
R 01Jul2022 01Sep2022 1 1 457
S 01Jul2022 24Oct2022 2 1 10957
T 01Jul2022 04Dec2022 2 1 1826
U 28Aug2022 18Sep2022 8 1 6179
V 19Dec2022 31May2023 7 2 10501
W 08Jun2023 18Jun2023 7 3 5022
X 01Jul2022 10Jul2022 8 1 3433
Y 04Aug2022 17Nov2022 10 2 7305
Z 13Oct2022 07Nov2022 8 1 2403
Z 09Feb2023 30Jun2023 12 2 5149
AA 08Sep2022 04Oct2022 8 1 2746
BB 01Jul2022 26Apr2023 2 1 10501
CC 09Aug2022 19Jun2023 12 1 1373
DD 01Jul2022 21Aug2022 10 1 9588
DD 17Oct2022 13Jun2023 11 2 5935
EE 01Jul2022 15Jul2022 5 1 3196
EE 26Jul2022 17Aug2022 5 2 7305
EE 23Sep2022 19Feb2023 4 3 13697
;;;
run;
proc sql;
create table temp as
select distinct clinic,client from have;
create table temp2 as
select a.clinic,a.client,b.clinic as _clinic,b.cost as cost
from temp as a,have as b
where a.client=b.client and a.clinic ne b.clinic;
create table want1 as
select clinic,count(distinct client) as count,sum(cost) as sum_of_cost
from temp2
group by clinic;
create table temp3 as
select distinct clinic,_clinic from temp2;
quit;
data want2;
do until(last.clinic);
set temp3;
by clinic;
length other_clinic $ 200;
other_clinic=catx(',',other_clinic,_clinic);
end;
keep clinic other_clinic;
run;
data final_want;
merge want1 want2;
by clinic;
run;
Thank you Ksharp, this gave me exactly what I needed! Now I am trying to make a table that counts stays in a different clinic that considers the chronological order of the stays. I also want to add a readmissions column where readmissions= the count of movements/placements into a different clinic. I was able to modify your code to give me the correct client count for this but my readmissions and cost are being double counted for clients who have 2 back-to-back stays in the same clinic, followed by a stay in a different clinic. For example, client EE's readmission into clinic 4 is being counted twice where I only want it to be counted once as they only moved into a different clinic one time (from clinic 5 to clinic 4). Is there a way to fix this so it doesn't duplicate these readmissions and cost?
data have;
input client$ Span_Begin :DATE9. Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A 17Nov2022 04Jan2023 8 1 1373
B 01Jul2022 29Jul2022 12 1 2746
B 21Nov2022 04Apr2023 12 2 1373
C 01Jul2022 07Jul2022 8 1 2403
C 12Jul2022 25Jul2022 6 2 343
C 26Aug2022 15May2023 10 3 687
D 01Jul2022 12Jul2022 10 1 5479
E 01Jul2022 23Mar2023 9 1 159000
F 22Nov2022 19Mar2023 8 1 6522
G 26Aug2022 15May2023 12 1 687
H 01Oct2022 10Apr2023 6 1 5479
I 30Dec2022 19Jan2023 5 1 8675
J 01Jul2022 01Nov2022 3 1 4566
K 11Nov2022 10Jan2023 3 2 10044
L 01Jul2022 15Jul2022 3 3 9588
L 26Jul2022 17Aug2022 12 1 2060
L 23Sep2022 19Feb2023 12 1 5149
O 20Sep2022 20Feb2023 5 1 5022
P 01Jul2022 02Aug2022 5 1 913
Q 06Jul2022 01Sep2022 8 1 343
R 01Jul2022 01Sep2022 1 1 457
S 01Jul2022 24Oct2022 2 1 10957
T 01Jul2022 04Dec2022 2 1 1826
U 28Aug2022 18Sep2022 8 1 6179
V 19Dec2022 31May2023 7 2 10501
W 08Jun2023 18Jun2023 7 3 5022
X 01Jul2022 10Jul2022 8 1 3433
Y 04Aug2022 17Nov2022 10 2 7305
Z 13Oct2022 07Nov2022 8 1 2403
Z 09Feb2023 30Jun2023 12 2 5149
AA 08Sep2022 04Oct2022 8 1 2746
BB 01Jul2022 26Apr2023 2 1 10501
CC 09Aug2022 19Jun2023 12 1 1373
DD 01Jul2022 21Aug2022 10 1 9588
DD 17Oct2022 13Jun2023 11 2 5935
EE 01Jul2022 15Jul2022 5 1 3196
EE 26Jul2022 17Aug2022 5 2 7305
EE 23Sep2022 19Feb2023 4 3 13697
;;;
run;
proc sql;
create table temp as
select distinct Clinic,Client, span_begin from have;
create table temp2 as
select a.Clinic,a.Client,b.Clinic as _Clinic,b.Cost as Cost
from temp as a,have as b
where a.Client=b.Client and a.Clinic ne _Clinic and a.Span_Begin lt b.Span_Begin ;
create table want1 as
select Clinic,count(distinct Client) as count, count(client)as readmissions, sum(Cost) as sum_of_Cost
from temp2
group by Clinic;
create table temp3 as
select distinct Clinic,_Clinic from temp2;
quit;
data want2;
do until(last.Clinic);
set temp3;
by Clinic;
length other_Clinic $ 200;
other_Clinic=catx(',',other_Clinic,_Clinic);
end;
keep Clinic other_Clinic;
run;
data final_want;
merge want1 want2;
by Clinic;
run;
proc print data=final_want noobs;
run;
Just add DISTINCT keyword before SQL,and use this dataset to calculate these readmissions and cost.
create table temp2 as select distinct a.Clinic,a.Client,b.Clinic as _Clinic,b.Cost as Cost from temp as a,have as b where a.Client=b.Client and a.Clinic ne _Clinic and a.Span_Begin lt b.Span_Begin ;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.