I want to know how many clients in each clinic stayed at a different clinic and the cost of the stay at the different clinic. I am not considering order of the stays. I do not want to count a client if they have a different stay in the same clinic (see client L).
Thank you for your help!
data have;
input
client$ Span_Begin :DATE9. Span_End :DATE9. clinic$ stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A 17Nov2022 04Jan2023 8 1 1373
B 01Jul2022 29Jul2022 12 1 2746
B 21Nov2022 04Apr2023 12 2 1373
C 01Jul2022 07Jul2022 8 1 2403
D 01Jul2022 12Jul2022 10 1 5479
E 01Jul2022 23Mar2023 9 1 159000
F 22Nov2022 19Mar2023 8 1 6522
G 26Aug2022 15May2023 12 1 687
H 01Oct2022 10Apr2023 6 1 5479
I 30Dec2022 19Jan2023 5 1 8675
J 01Jul2022 01Nov2022 3 1 4566
K 11Nov2022 10Jan2023 3 2 10044
L 01Jul2022 15Jul2022 3 3 9588
L 26Jul2022 17Aug2022 12 1 2060
L 23Sep2022 19Feb2023 12 1 5149
O 20Sep2022 20Feb2023 5 1 5022
P 01Jul2022 02Aug2022 5 1 913
Q 06Jul2022 01Sep2022 8 1 343
R 01Jul2022 01Sep2022 1 1 457
S 01Jul2022 24Oct2022 2 1 10957
T 01Jul2022 04Dec2022 2 1 1826
U 28Aug2022 18Sep2022 8 1 6179
V 19Dec2022 31May2023 7 2 10501
W 08Jun2023 18Jun2023 7 3 5022
X 01Jul2022 10Jul2022 8 1 3433
Y 04Aug2022 17Nov2022 10 2 7305
Z 13Oct2022 07Nov2022 8 1 2403
Z 09Feb2023 30Jun2023 12 2 5149
AA 08Sep2022 04Oct2022 8 1 2746
BB 01Jul2022 26Apr2023 2 1 10501
CC 09Aug2022 19Jun2023 12 1 1373
DD 01Jul2022 21Aug2022 10 1 9588
DD 17Oct2022 13Jun2023 11 2 5935
EE 01Jul2022 15Jul2022 5 1 3196
EE 26Jul2022 17Aug2022 5 2 7305
EE 23Sep2022 19Feb2023 4 3 13697
;;;
run;
Here are the results from the above example:
Clinic | Count | Cost |
3 | 1 | $9,588 |
4 | 1 | $13,697 |
5 | 1 | $10,501 |
8 | 1 | $2,403 |
10 | 1 | $9,588 |
11 | 1 | $5,935 |
12 | 2 | $12,358 |
Grand Total | 8 | $64,070 |
data have;
input client$ Span_Begin :DATE9. Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A 17Nov2022 04Jan2023 8 1 1373
B 01Jul2022 29Jul2022 12 1 2746
B 21Nov2022 04Apr2023 12 2 1373
C 01Jul2022 07Jul2022 8 1 2403
D 01Jul2022 12Jul2022 10 1 5479
E 01Jul2022 23Mar2023 9 1 159000
F 22Nov2022 19Mar2023 8 1 6522
G 26Aug2022 15May2023 12 1 687
H 01Oct2022 10Apr2023 6 1 5479
I 30Dec2022 19Jan2023 5 1 8675
J 01Jul2022 01Nov2022 3 1 4566
K 11Nov2022 10Jan2023 3 2 10044
L 01Jul2022 15Jul2022 3 3 9588
L 26Jul2022 17Aug2022 12 1 2060
L 23Sep2022 19Feb2023 12 1 5149
O 20Sep2022 20Feb2023 5 1 5022
P 01Jul2022 02Aug2022 5 1 913
Q 06Jul2022 01Sep2022 8 1 343
R 01Jul2022 01Sep2022 1 1 457
S 01Jul2022 24Oct2022 2 1 10957
T 01Jul2022 04Dec2022 2 1 1826
U 28Aug2022 18Sep2022 8 1 6179
V 19Dec2022 31May2023 7 2 10501
W 08Jun2023 18Jun2023 7 3 5022
X 01Jul2022 10Jul2022 8 1 3433
Y 04Aug2022 17Nov2022 10 2 7305
Z 13Oct2022 07Nov2022 8 1 2403
Z 09Feb2023 30Jun2023 12 2 5149
AA 08Sep2022 04Oct2022 8 1 2746
BB 01Jul2022 26Apr2023 2 1 10501
CC 09Aug2022 19Jun2023 12 1 1373
DD 01Jul2022 21Aug2022 10 1 9588
DD 17Oct2022 13Jun2023 11 2 5935
EE 01Jul2022 15Jul2022 5 1 3196
EE 26Jul2022 17Aug2022 5 2 7305
EE 23Sep2022 19Feb2023 4 3 13697
;;;
run;
proc sql;
create table temp as
select distinct clinic,client from have;
create table temp2 as
select a.clinic,a.client,b.clinic as _clinic,b.cost as cost
from temp as a,have as b
where a.client=b.client and a.clinic ne b.clinic;
create table want1 as
select clinic,count(distinct client) as count,sum(cost) as sum_of_cost
from temp2
group by clinic;
create table temp3 as
select distinct clinic,_clinic from temp2;
quit;
data want2;
do until(last.clinic);
set temp3;
by clinic;
length other_clinic $ 200;
other_clinic=catx(',',other_clinic,_clinic);
end;
keep clinic other_clinic;
run;
data final_want;
merge want1 want2;
by clinic;
run;
Thank you for providing working code to create dataset HAVE.
You "want to know how many clients in each clinic stayed at a different clinic and the cost of the stay at the different clinic."
Your desired result for clinic 3 shows a COUNT of 1 and a COST of 9,588.
Now the only client for clinic 3 that has stays at other clinics client L, which explains why you have a COUNT of 1. But, with the following data for L
Client | Span_Begin | Span_End | Clinic | Stay | Cost |
L |
07/01/2022 |
07/15/2022 | 1 | 3 | 9,588 |
L | 07/26/2022 | 08/17/2022 | 12 | 1 | 2,060 |
L | 09/23/2022 | 02/19/2023 | 12 | 1 | 5,149 |
why does your sample desired result have COST of 9,588? Shouldn't it be 7,209 (2,060 + 5,149 for two visits to clinic 12)?
Yes, you are correct! I apologize. These should be the results:
Clinic | Count | Other Clinic | Sum of Cost |
3 | 1 | 12 | 7209 |
4 | 1 | 5 | 10501 |
5 | 1 | 4 | 13697 |
8 | 1 | 12 | 5149 |
10 | 1 | 11 | 5935 |
11 | 1 | 10 | 9588 |
12 | 2 | 3,8 | 11991 |
data have;
input client$ Span_Begin :DATE9. Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A 17Nov2022 04Jan2023 8 1 1373
B 01Jul2022 29Jul2022 12 1 2746
B 21Nov2022 04Apr2023 12 2 1373
C 01Jul2022 07Jul2022 8 1 2403
D 01Jul2022 12Jul2022 10 1 5479
E 01Jul2022 23Mar2023 9 1 159000
F 22Nov2022 19Mar2023 8 1 6522
G 26Aug2022 15May2023 12 1 687
H 01Oct2022 10Apr2023 6 1 5479
I 30Dec2022 19Jan2023 5 1 8675
J 01Jul2022 01Nov2022 3 1 4566
K 11Nov2022 10Jan2023 3 2 10044
L 01Jul2022 15Jul2022 3 3 9588
L 26Jul2022 17Aug2022 12 1 2060
L 23Sep2022 19Feb2023 12 1 5149
O 20Sep2022 20Feb2023 5 1 5022
P 01Jul2022 02Aug2022 5 1 913
Q 06Jul2022 01Sep2022 8 1 343
R 01Jul2022 01Sep2022 1 1 457
S 01Jul2022 24Oct2022 2 1 10957
T 01Jul2022 04Dec2022 2 1 1826
U 28Aug2022 18Sep2022 8 1 6179
V 19Dec2022 31May2023 7 2 10501
W 08Jun2023 18Jun2023 7 3 5022
X 01Jul2022 10Jul2022 8 1 3433
Y 04Aug2022 17Nov2022 10 2 7305
Z 13Oct2022 07Nov2022 8 1 2403
Z 09Feb2023 30Jun2023 12 2 5149
AA 08Sep2022 04Oct2022 8 1 2746
BB 01Jul2022 26Apr2023 2 1 10501
CC 09Aug2022 19Jun2023 12 1 1373
DD 01Jul2022 21Aug2022 10 1 9588
DD 17Oct2022 13Jun2023 11 2 5935
EE 01Jul2022 15Jul2022 5 1 3196
EE 26Jul2022 17Aug2022 5 2 7305
EE 23Sep2022 19Feb2023 4 3 13697
;;;
run;
proc sql;
create table temp as
select distinct clinic,client from have;
create table temp2 as
select a.clinic,a.client,b.clinic as _clinic,b.cost as cost
from temp as a,have as b
where a.client=b.client and a.clinic ne b.clinic;
create table want1 as
select clinic,count(distinct client) as count,sum(cost) as sum_of_cost
from temp2
group by clinic;
create table temp3 as
select distinct clinic,_clinic from temp2;
quit;
data want2;
do until(last.clinic);
set temp3;
by clinic;
length other_clinic $ 200;
other_clinic=catx(',',other_clinic,_clinic);
end;
keep clinic other_clinic;
run;
data final_want;
merge want1 want2;
by clinic;
run;
Thank you Ksharp, this gave me exactly what I needed! Now I am trying to make a table that counts stays in a different clinic that considers the chronological order of the stays. I also want to add a readmissions column where readmissions= the count of movements/placements into a different clinic. I was able to modify your code to give me the correct client count for this but my readmissions and cost are being double counted for clients who have 2 back-to-back stays in the same clinic, followed by a stay in a different clinic. For example, client EE's readmission into clinic 4 is being counted twice where I only want it to be counted once as they only moved into a different clinic one time (from clinic 5 to clinic 4). Is there a way to fix this so it doesn't duplicate these readmissions and cost?
data have;
input client$ Span_Begin :DATE9. Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A 17Nov2022 04Jan2023 8 1 1373
B 01Jul2022 29Jul2022 12 1 2746
B 21Nov2022 04Apr2023 12 2 1373
C 01Jul2022 07Jul2022 8 1 2403
C 12Jul2022 25Jul2022 6 2 343
C 26Aug2022 15May2023 10 3 687
D 01Jul2022 12Jul2022 10 1 5479
E 01Jul2022 23Mar2023 9 1 159000
F 22Nov2022 19Mar2023 8 1 6522
G 26Aug2022 15May2023 12 1 687
H 01Oct2022 10Apr2023 6 1 5479
I 30Dec2022 19Jan2023 5 1 8675
J 01Jul2022 01Nov2022 3 1 4566
K 11Nov2022 10Jan2023 3 2 10044
L 01Jul2022 15Jul2022 3 3 9588
L 26Jul2022 17Aug2022 12 1 2060
L 23Sep2022 19Feb2023 12 1 5149
O 20Sep2022 20Feb2023 5 1 5022
P 01Jul2022 02Aug2022 5 1 913
Q 06Jul2022 01Sep2022 8 1 343
R 01Jul2022 01Sep2022 1 1 457
S 01Jul2022 24Oct2022 2 1 10957
T 01Jul2022 04Dec2022 2 1 1826
U 28Aug2022 18Sep2022 8 1 6179
V 19Dec2022 31May2023 7 2 10501
W 08Jun2023 18Jun2023 7 3 5022
X 01Jul2022 10Jul2022 8 1 3433
Y 04Aug2022 17Nov2022 10 2 7305
Z 13Oct2022 07Nov2022 8 1 2403
Z 09Feb2023 30Jun2023 12 2 5149
AA 08Sep2022 04Oct2022 8 1 2746
BB 01Jul2022 26Apr2023 2 1 10501
CC 09Aug2022 19Jun2023 12 1 1373
DD 01Jul2022 21Aug2022 10 1 9588
DD 17Oct2022 13Jun2023 11 2 5935
EE 01Jul2022 15Jul2022 5 1 3196
EE 26Jul2022 17Aug2022 5 2 7305
EE 23Sep2022 19Feb2023 4 3 13697
;;;
run;
proc sql;
create table temp as
select distinct Clinic,Client, span_begin from have;
create table temp2 as
select a.Clinic,a.Client,b.Clinic as _Clinic,b.Cost as Cost
from temp as a,have as b
where a.Client=b.Client and a.Clinic ne _Clinic and a.Span_Begin lt b.Span_Begin ;
create table want1 as
select Clinic,count(distinct Client) as count, count(client)as readmissions, sum(Cost) as sum_of_Cost
from temp2
group by Clinic;
create table temp3 as
select distinct Clinic,_Clinic from temp2;
quit;
data want2;
do until(last.Clinic);
set temp3;
by Clinic;
length other_Clinic $ 200;
other_Clinic=catx(',',other_Clinic,_Clinic);
end;
keep Clinic other_Clinic;
run;
data final_want;
merge want1 want2;
by Clinic;
run;
proc print data=final_want noobs;
run;
Just add DISTINCT keyword before SQL,and use this dataset to calculate these readmissions and cost.
create table temp2 as select distinct a.Clinic,a.Client,b.Clinic as _Clinic,b.Cost as Cost from temp as a,have as b where a.Client=b.Client and a.Clinic ne _Clinic and a.Span_Begin lt b.Span_Begin ;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.