BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Whitlea
Obsidian | Level 7

 I want to know how many clients in each clinic stayed at a different clinic and the cost of the stay at the different clinic.  I am not considering order of the stays. I do not want to count a client if they have a different stay in the same clinic (see client L).

Thank you for your help!

data have;
input
client$	Span_Begin :DATE9.	Span_End :DATE9. clinic$ stay cost;
 

format Span_Begin MMDDYY10. Span_End MMDDYY10. ;

datalines;
A	17Nov2022	04Jan2023	8	1	1373
B	01Jul2022	29Jul2022	12	1	2746
B	21Nov2022	04Apr2023	12	2	1373
C	01Jul2022	07Jul2022	8	1	2403
D	01Jul2022	12Jul2022	10	1	5479
E	01Jul2022	23Mar2023	9	1	159000
F	22Nov2022	19Mar2023	8	1	6522
G	26Aug2022	15May2023	12	1	687
H	01Oct2022	10Apr2023	6	1	5479
I	30Dec2022	19Jan2023	5	1	8675
J	01Jul2022	01Nov2022	3	1	4566
K	11Nov2022	10Jan2023	3	2	10044
L	01Jul2022	15Jul2022	3	3	9588
L	26Jul2022	17Aug2022	12	1	2060
L	23Sep2022	19Feb2023	12	1	5149
O	20Sep2022	20Feb2023	5	1	5022
P	01Jul2022	02Aug2022	5	1	913
Q	06Jul2022	01Sep2022	8	1	343
R	01Jul2022	01Sep2022	1	1	457
S	01Jul2022	24Oct2022	2	1	10957
T	01Jul2022	04Dec2022	2	1	1826
U	28Aug2022	18Sep2022	8	1	6179
V	19Dec2022	31May2023	7	2	10501
W	08Jun2023	18Jun2023	7	3	5022
X	01Jul2022	10Jul2022	8	1	3433
Y	04Aug2022	17Nov2022	10	2	7305
Z	13Oct2022	07Nov2022	8	1	2403
Z	09Feb2023	30Jun2023	12	2	5149
AA	08Sep2022	04Oct2022	8	1	2746
BB	01Jul2022	26Apr2023	2	1	10501
CC	09Aug2022	19Jun2023	12	1	1373
DD	01Jul2022	21Aug2022	10	1	9588
DD	17Oct2022	13Jun2023	11	2	5935
EE	01Jul2022	15Jul2022	5	1	3196
EE	26Jul2022	17Aug2022	5	2	7305
EE	23Sep2022	19Feb2023	4	3	13697

;;;
run;

 

Here are the results from the above example:

 

 

Clinic  Count  Cost
3 1 $9,588
4 1 $13,697
5 1 $10,501
8 1 $2,403
10 1 $9,588
11 1 $5,935
12 2 $12,358
Grand Total 8 $64,070

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

 

 

data have;
input client$	Span_Begin :DATE9.	Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A	17Nov2022	04Jan2023	8	1	1373
B	01Jul2022	29Jul2022	12	1	2746
B	21Nov2022	04Apr2023	12	2	1373
C	01Jul2022	07Jul2022	8	1	2403
D	01Jul2022	12Jul2022	10	1	5479
E	01Jul2022	23Mar2023	9	1	159000
F	22Nov2022	19Mar2023	8	1	6522
G	26Aug2022	15May2023	12	1	687
H	01Oct2022	10Apr2023	6	1	5479
I	30Dec2022	19Jan2023	5	1	8675
J	01Jul2022	01Nov2022	3	1	4566
K	11Nov2022	10Jan2023	3	2	10044
L	01Jul2022	15Jul2022	3	3	9588
L	26Jul2022	17Aug2022	12	1	2060
L	23Sep2022	19Feb2023	12	1	5149
O	20Sep2022	20Feb2023	5	1	5022
P	01Jul2022	02Aug2022	5	1	913
Q	06Jul2022	01Sep2022	8	1	343
R	01Jul2022	01Sep2022	1	1	457
S	01Jul2022	24Oct2022	2	1	10957
T	01Jul2022	04Dec2022	2	1	1826
U	28Aug2022	18Sep2022	8	1	6179
V	19Dec2022	31May2023	7	2	10501
W	08Jun2023	18Jun2023	7	3	5022
X	01Jul2022	10Jul2022	8	1	3433
Y	04Aug2022	17Nov2022	10	2	7305
Z	13Oct2022	07Nov2022	8	1	2403
Z	09Feb2023	30Jun2023	12	2	5149
AA	08Sep2022	04Oct2022	8	1	2746
BB	01Jul2022	26Apr2023	2	1	10501
CC	09Aug2022	19Jun2023	12	1	1373
DD	01Jul2022	21Aug2022	10	1	9588
DD	17Oct2022	13Jun2023	11	2	5935
EE	01Jul2022	15Jul2022	5	1	3196
EE	26Jul2022	17Aug2022	5	2	7305
EE	23Sep2022	19Feb2023	4	3	13697
;;;
run;

proc sql;
create table temp as
select distinct clinic,client from have;

create table temp2 as
select a.clinic,a.client,b.clinic as _clinic,b.cost as cost
 from temp as a,have as b
  where a.client=b.client and a.clinic ne b.clinic;

create table want1 as
select clinic,count(distinct client) as count,sum(cost) as sum_of_cost
 from temp2
  group by clinic;

create table temp3 as
select distinct clinic,_clinic from temp2;
quit;

data want2;
do until(last.clinic);
 set temp3;
 by clinic;
 length other_clinic $ 200;
 other_clinic=catx(',',other_clinic,_clinic);
end;
keep clinic other_clinic;
run;


data final_want;
 merge want1 want2;
 by clinic;
run;

View solution in original post

5 REPLIES 5
mkeintz
PROC Star

Thank you for providing working code to create dataset HAVE.

 

You "want to know how many clients in each clinic stayed at a different clinic and the cost of the stay at the different clinic."

 

Your desired result for clinic 3 shows a COUNT of 1 and a COST of 9,588.

 

Now the only client for clinic 3 that has stays at other clinics client L, which explains why you have a COUNT of 1.  But, with the following data for L

Client Span_Begin Span_End Clinic Stay Cost
L

07/01/2022

07/15/2022 1 3 9,588
L 07/26/2022 08/17/2022 12 1 2,060
L 09/23/2022 02/19/2023 12 1 5,149

 

why does your sample desired result have COST of 9,588?  Shouldn't it be 7,209 (2,060 + 5,149 for two visits to clinic 12)?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Whitlea
Obsidian | Level 7

Yes, you are correct! I apologize. These should be the results:

Clinic  Count Other Clinic Sum of Cost
3 1 12 7209
4 1 5 10501
5 1 4 13697
8 1 12 5149
10 1 11 5935
11 1 10 9588
12 2 3,8 11991
Ksharp
Super User

 

 

data have;
input client$	Span_Begin :DATE9.	Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A	17Nov2022	04Jan2023	8	1	1373
B	01Jul2022	29Jul2022	12	1	2746
B	21Nov2022	04Apr2023	12	2	1373
C	01Jul2022	07Jul2022	8	1	2403
D	01Jul2022	12Jul2022	10	1	5479
E	01Jul2022	23Mar2023	9	1	159000
F	22Nov2022	19Mar2023	8	1	6522
G	26Aug2022	15May2023	12	1	687
H	01Oct2022	10Apr2023	6	1	5479
I	30Dec2022	19Jan2023	5	1	8675
J	01Jul2022	01Nov2022	3	1	4566
K	11Nov2022	10Jan2023	3	2	10044
L	01Jul2022	15Jul2022	3	3	9588
L	26Jul2022	17Aug2022	12	1	2060
L	23Sep2022	19Feb2023	12	1	5149
O	20Sep2022	20Feb2023	5	1	5022
P	01Jul2022	02Aug2022	5	1	913
Q	06Jul2022	01Sep2022	8	1	343
R	01Jul2022	01Sep2022	1	1	457
S	01Jul2022	24Oct2022	2	1	10957
T	01Jul2022	04Dec2022	2	1	1826
U	28Aug2022	18Sep2022	8	1	6179
V	19Dec2022	31May2023	7	2	10501
W	08Jun2023	18Jun2023	7	3	5022
X	01Jul2022	10Jul2022	8	1	3433
Y	04Aug2022	17Nov2022	10	2	7305
Z	13Oct2022	07Nov2022	8	1	2403
Z	09Feb2023	30Jun2023	12	2	5149
AA	08Sep2022	04Oct2022	8	1	2746
BB	01Jul2022	26Apr2023	2	1	10501
CC	09Aug2022	19Jun2023	12	1	1373
DD	01Jul2022	21Aug2022	10	1	9588
DD	17Oct2022	13Jun2023	11	2	5935
EE	01Jul2022	15Jul2022	5	1	3196
EE	26Jul2022	17Aug2022	5	2	7305
EE	23Sep2022	19Feb2023	4	3	13697
;;;
run;

proc sql;
create table temp as
select distinct clinic,client from have;

create table temp2 as
select a.clinic,a.client,b.clinic as _clinic,b.cost as cost
 from temp as a,have as b
  where a.client=b.client and a.clinic ne b.clinic;

create table want1 as
select clinic,count(distinct client) as count,sum(cost) as sum_of_cost
 from temp2
  group by clinic;

create table temp3 as
select distinct clinic,_clinic from temp2;
quit;

data want2;
do until(last.clinic);
 set temp3;
 by clinic;
 length other_clinic $ 200;
 other_clinic=catx(',',other_clinic,_clinic);
end;
keep clinic other_clinic;
run;


data final_want;
 merge want1 want2;
 by clinic;
run;
Whitlea
Obsidian | Level 7

Thank you Ksharp, this gave me exactly what I needed!  Now I am trying to make a table that counts stays in a different clinic that considers the chronological order of the stays. I also want to add a readmissions column where readmissions= the count of movements/placements into a different clinic.  I was able to modify your code to give me the correct client count for this but my readmissions and cost are being double counted for clients who have 2 back-to-back stays in the same clinic, followed by a stay in a different clinic. For example, client EE's readmission into clinic 4 is being counted twice where I only want it to be counted once as they only moved into a different clinic one time (from clinic 5 to clinic 4).  Is there a way to fix this so it doesn't duplicate these readmissions and cost?  

data have;
input client$	Span_Begin :DATE9.	Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A	17Nov2022	04Jan2023	8	1	1373
B	01Jul2022	29Jul2022	12	1	2746
B	21Nov2022	04Apr2023	12	2	1373
C	01Jul2022	07Jul2022	8	1	2403
C   12Jul2022	25Jul2022   6   2   343
C   26Aug2022	15May2023   10  3   687
D	01Jul2022	12Jul2022	10	1	5479
E	01Jul2022	23Mar2023	9	1	159000
F	22Nov2022	19Mar2023	8	1	6522
G	26Aug2022	15May2023	12	1	687
H	01Oct2022	10Apr2023	6	1	5479
I	30Dec2022	19Jan2023	5	1	8675
J	01Jul2022	01Nov2022	3	1	4566
K	11Nov2022	10Jan2023	3	2	10044
L	01Jul2022	15Jul2022	3	3	9588
L	26Jul2022	17Aug2022	12	1	2060
L	23Sep2022	19Feb2023	12	1	5149
O	20Sep2022	20Feb2023	5	1	5022
P	01Jul2022	02Aug2022	5	1	913
Q	06Jul2022	01Sep2022	8	1	343
R	01Jul2022	01Sep2022	1	1	457
S	01Jul2022	24Oct2022	2	1	10957
T	01Jul2022	04Dec2022	2	1	1826
U	28Aug2022	18Sep2022	8	1	6179
V	19Dec2022	31May2023	7	2	10501
W	08Jun2023	18Jun2023	7	3	5022
X	01Jul2022	10Jul2022	8	1	3433
Y	04Aug2022	17Nov2022	10	2	7305
Z	13Oct2022	07Nov2022	8	1	2403
Z	09Feb2023	30Jun2023	12	2	5149
AA	08Sep2022	04Oct2022	8	1	2746
BB	01Jul2022	26Apr2023	2	1	10501
CC	09Aug2022	19Jun2023	12	1	1373
DD	01Jul2022	21Aug2022	10	1	9588
DD	17Oct2022	13Jun2023	11	2	5935
EE	01Jul2022	15Jul2022	5	1	3196
EE	26Jul2022	17Aug2022	5	2	7305
EE	23Sep2022	19Feb2023	4	3	13697
;;;
run;
proc sql;
create table temp as
select distinct Clinic,Client, span_begin from have;

create table temp2 as
select a.Clinic,a.Client,b.Clinic as _Clinic,b.Cost as Cost
 from temp as a,have as b
  where a.Client=b.Client and a.Clinic ne _Clinic and a.Span_Begin lt b.Span_Begin  ;

create table want1 as
select Clinic,count(distinct Client) as count, count(client)as readmissions, sum(Cost) as sum_of_Cost
 from temp2
  group by Clinic;

create table temp3 as
select distinct Clinic,_Clinic from temp2;
quit;

data want2;
do until(last.Clinic);
 set temp3;
 by Clinic;
 length other_Clinic $ 200;
 other_Clinic=catx(',',other_Clinic,_Clinic);
end;
keep Clinic other_Clinic;
run;

data final_want;
 merge want1 want2;
 by Clinic;
run;

proc print data=final_want noobs;
run;
Ksharp
Super User

Just add DISTINCT keyword before SQL,and use this dataset to calculate  these readmissions and cost.

create table temp2 as
select distinct a.Clinic,a.Client,b.Clinic as _Clinic,b.Cost as Cost
 from temp as a,have as b
  where a.Client=b.Client and a.Clinic ne _Clinic and a.Span_Begin lt b.Span_Begin  ;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1137 views
  • 2 likes
  • 3 in conversation