BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Whitlea
Obsidian | Level 7

 I want to know how many clients in each clinic stayed at a different clinic and the cost of the stay at the different clinic.  I am not considering order of the stays. I do not want to count a client if they have a different stay in the same clinic (see client L).

Thank you for your help!

data have;
input
client$	Span_Begin :DATE9.	Span_End :DATE9. clinic$ stay cost;
 

format Span_Begin MMDDYY10. Span_End MMDDYY10. ;

datalines;
A	17Nov2022	04Jan2023	8	1	1373
B	01Jul2022	29Jul2022	12	1	2746
B	21Nov2022	04Apr2023	12	2	1373
C	01Jul2022	07Jul2022	8	1	2403
D	01Jul2022	12Jul2022	10	1	5479
E	01Jul2022	23Mar2023	9	1	159000
F	22Nov2022	19Mar2023	8	1	6522
G	26Aug2022	15May2023	12	1	687
H	01Oct2022	10Apr2023	6	1	5479
I	30Dec2022	19Jan2023	5	1	8675
J	01Jul2022	01Nov2022	3	1	4566
K	11Nov2022	10Jan2023	3	2	10044
L	01Jul2022	15Jul2022	3	3	9588
L	26Jul2022	17Aug2022	12	1	2060
L	23Sep2022	19Feb2023	12	1	5149
O	20Sep2022	20Feb2023	5	1	5022
P	01Jul2022	02Aug2022	5	1	913
Q	06Jul2022	01Sep2022	8	1	343
R	01Jul2022	01Sep2022	1	1	457
S	01Jul2022	24Oct2022	2	1	10957
T	01Jul2022	04Dec2022	2	1	1826
U	28Aug2022	18Sep2022	8	1	6179
V	19Dec2022	31May2023	7	2	10501
W	08Jun2023	18Jun2023	7	3	5022
X	01Jul2022	10Jul2022	8	1	3433
Y	04Aug2022	17Nov2022	10	2	7305
Z	13Oct2022	07Nov2022	8	1	2403
Z	09Feb2023	30Jun2023	12	2	5149
AA	08Sep2022	04Oct2022	8	1	2746
BB	01Jul2022	26Apr2023	2	1	10501
CC	09Aug2022	19Jun2023	12	1	1373
DD	01Jul2022	21Aug2022	10	1	9588
DD	17Oct2022	13Jun2023	11	2	5935
EE	01Jul2022	15Jul2022	5	1	3196
EE	26Jul2022	17Aug2022	5	2	7305
EE	23Sep2022	19Feb2023	4	3	13697

;;;
run;

 

Here are the results from the above example:

 

 

Clinic  Count  Cost
3 1 $9,588
4 1 $13,697
5 1 $10,501
8 1 $2,403
10 1 $9,588
11 1 $5,935
12 2 $12,358
Grand Total 8 $64,070

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

 

 

data have;
input client$	Span_Begin :DATE9.	Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A	17Nov2022	04Jan2023	8	1	1373
B	01Jul2022	29Jul2022	12	1	2746
B	21Nov2022	04Apr2023	12	2	1373
C	01Jul2022	07Jul2022	8	1	2403
D	01Jul2022	12Jul2022	10	1	5479
E	01Jul2022	23Mar2023	9	1	159000
F	22Nov2022	19Mar2023	8	1	6522
G	26Aug2022	15May2023	12	1	687
H	01Oct2022	10Apr2023	6	1	5479
I	30Dec2022	19Jan2023	5	1	8675
J	01Jul2022	01Nov2022	3	1	4566
K	11Nov2022	10Jan2023	3	2	10044
L	01Jul2022	15Jul2022	3	3	9588
L	26Jul2022	17Aug2022	12	1	2060
L	23Sep2022	19Feb2023	12	1	5149
O	20Sep2022	20Feb2023	5	1	5022
P	01Jul2022	02Aug2022	5	1	913
Q	06Jul2022	01Sep2022	8	1	343
R	01Jul2022	01Sep2022	1	1	457
S	01Jul2022	24Oct2022	2	1	10957
T	01Jul2022	04Dec2022	2	1	1826
U	28Aug2022	18Sep2022	8	1	6179
V	19Dec2022	31May2023	7	2	10501
W	08Jun2023	18Jun2023	7	3	5022
X	01Jul2022	10Jul2022	8	1	3433
Y	04Aug2022	17Nov2022	10	2	7305
Z	13Oct2022	07Nov2022	8	1	2403
Z	09Feb2023	30Jun2023	12	2	5149
AA	08Sep2022	04Oct2022	8	1	2746
BB	01Jul2022	26Apr2023	2	1	10501
CC	09Aug2022	19Jun2023	12	1	1373
DD	01Jul2022	21Aug2022	10	1	9588
DD	17Oct2022	13Jun2023	11	2	5935
EE	01Jul2022	15Jul2022	5	1	3196
EE	26Jul2022	17Aug2022	5	2	7305
EE	23Sep2022	19Feb2023	4	3	13697
;;;
run;

proc sql;
create table temp as
select distinct clinic,client from have;

create table temp2 as
select a.clinic,a.client,b.clinic as _clinic,b.cost as cost
 from temp as a,have as b
  where a.client=b.client and a.clinic ne b.clinic;

create table want1 as
select clinic,count(distinct client) as count,sum(cost) as sum_of_cost
 from temp2
  group by clinic;

create table temp3 as
select distinct clinic,_clinic from temp2;
quit;

data want2;
do until(last.clinic);
 set temp3;
 by clinic;
 length other_clinic $ 200;
 other_clinic=catx(',',other_clinic,_clinic);
end;
keep clinic other_clinic;
run;


data final_want;
 merge want1 want2;
 by clinic;
run;

View solution in original post

5 REPLIES 5
mkeintz
PROC Star

Thank you for providing working code to create dataset HAVE.

 

You "want to know how many clients in each clinic stayed at a different clinic and the cost of the stay at the different clinic."

 

Your desired result for clinic 3 shows a COUNT of 1 and a COST of 9,588.

 

Now the only client for clinic 3 that has stays at other clinics client L, which explains why you have a COUNT of 1.  But, with the following data for L

Client Span_Begin Span_End Clinic Stay Cost
L

07/01/2022

07/15/2022 1 3 9,588
L 07/26/2022 08/17/2022 12 1 2,060
L 09/23/2022 02/19/2023 12 1 5,149

 

why does your sample desired result have COST of 9,588?  Shouldn't it be 7,209 (2,060 + 5,149 for two visits to clinic 12)?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Whitlea
Obsidian | Level 7

Yes, you are correct! I apologize. These should be the results:

Clinic  Count Other Clinic Sum of Cost
3 1 12 7209
4 1 5 10501
5 1 4 13697
8 1 12 5149
10 1 11 5935
11 1 10 9588
12 2 3,8 11991
Ksharp
Super User

 

 

data have;
input client$	Span_Begin :DATE9.	Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A	17Nov2022	04Jan2023	8	1	1373
B	01Jul2022	29Jul2022	12	1	2746
B	21Nov2022	04Apr2023	12	2	1373
C	01Jul2022	07Jul2022	8	1	2403
D	01Jul2022	12Jul2022	10	1	5479
E	01Jul2022	23Mar2023	9	1	159000
F	22Nov2022	19Mar2023	8	1	6522
G	26Aug2022	15May2023	12	1	687
H	01Oct2022	10Apr2023	6	1	5479
I	30Dec2022	19Jan2023	5	1	8675
J	01Jul2022	01Nov2022	3	1	4566
K	11Nov2022	10Jan2023	3	2	10044
L	01Jul2022	15Jul2022	3	3	9588
L	26Jul2022	17Aug2022	12	1	2060
L	23Sep2022	19Feb2023	12	1	5149
O	20Sep2022	20Feb2023	5	1	5022
P	01Jul2022	02Aug2022	5	1	913
Q	06Jul2022	01Sep2022	8	1	343
R	01Jul2022	01Sep2022	1	1	457
S	01Jul2022	24Oct2022	2	1	10957
T	01Jul2022	04Dec2022	2	1	1826
U	28Aug2022	18Sep2022	8	1	6179
V	19Dec2022	31May2023	7	2	10501
W	08Jun2023	18Jun2023	7	3	5022
X	01Jul2022	10Jul2022	8	1	3433
Y	04Aug2022	17Nov2022	10	2	7305
Z	13Oct2022	07Nov2022	8	1	2403
Z	09Feb2023	30Jun2023	12	2	5149
AA	08Sep2022	04Oct2022	8	1	2746
BB	01Jul2022	26Apr2023	2	1	10501
CC	09Aug2022	19Jun2023	12	1	1373
DD	01Jul2022	21Aug2022	10	1	9588
DD	17Oct2022	13Jun2023	11	2	5935
EE	01Jul2022	15Jul2022	5	1	3196
EE	26Jul2022	17Aug2022	5	2	7305
EE	23Sep2022	19Feb2023	4	3	13697
;;;
run;

proc sql;
create table temp as
select distinct clinic,client from have;

create table temp2 as
select a.clinic,a.client,b.clinic as _clinic,b.cost as cost
 from temp as a,have as b
  where a.client=b.client and a.clinic ne b.clinic;

create table want1 as
select clinic,count(distinct client) as count,sum(cost) as sum_of_cost
 from temp2
  group by clinic;

create table temp3 as
select distinct clinic,_clinic from temp2;
quit;

data want2;
do until(last.clinic);
 set temp3;
 by clinic;
 length other_clinic $ 200;
 other_clinic=catx(',',other_clinic,_clinic);
end;
keep clinic other_clinic;
run;


data final_want;
 merge want1 want2;
 by clinic;
run;
Whitlea
Obsidian | Level 7

Thank you Ksharp, this gave me exactly what I needed!  Now I am trying to make a table that counts stays in a different clinic that considers the chronological order of the stays. I also want to add a readmissions column where readmissions= the count of movements/placements into a different clinic.  I was able to modify your code to give me the correct client count for this but my readmissions and cost are being double counted for clients who have 2 back-to-back stays in the same clinic, followed by a stay in a different clinic. For example, client EE's readmission into clinic 4 is being counted twice where I only want it to be counted once as they only moved into a different clinic one time (from clinic 5 to clinic 4).  Is there a way to fix this so it doesn't duplicate these readmissions and cost?  

data have;
input client$	Span_Begin :DATE9.	Span_End :DATE9. clinic stay cost;
format Span_Begin MMDDYY10. Span_End MMDDYY10. ;
datalines;
A	17Nov2022	04Jan2023	8	1	1373
B	01Jul2022	29Jul2022	12	1	2746
B	21Nov2022	04Apr2023	12	2	1373
C	01Jul2022	07Jul2022	8	1	2403
C   12Jul2022	25Jul2022   6   2   343
C   26Aug2022	15May2023   10  3   687
D	01Jul2022	12Jul2022	10	1	5479
E	01Jul2022	23Mar2023	9	1	159000
F	22Nov2022	19Mar2023	8	1	6522
G	26Aug2022	15May2023	12	1	687
H	01Oct2022	10Apr2023	6	1	5479
I	30Dec2022	19Jan2023	5	1	8675
J	01Jul2022	01Nov2022	3	1	4566
K	11Nov2022	10Jan2023	3	2	10044
L	01Jul2022	15Jul2022	3	3	9588
L	26Jul2022	17Aug2022	12	1	2060
L	23Sep2022	19Feb2023	12	1	5149
O	20Sep2022	20Feb2023	5	1	5022
P	01Jul2022	02Aug2022	5	1	913
Q	06Jul2022	01Sep2022	8	1	343
R	01Jul2022	01Sep2022	1	1	457
S	01Jul2022	24Oct2022	2	1	10957
T	01Jul2022	04Dec2022	2	1	1826
U	28Aug2022	18Sep2022	8	1	6179
V	19Dec2022	31May2023	7	2	10501
W	08Jun2023	18Jun2023	7	3	5022
X	01Jul2022	10Jul2022	8	1	3433
Y	04Aug2022	17Nov2022	10	2	7305
Z	13Oct2022	07Nov2022	8	1	2403
Z	09Feb2023	30Jun2023	12	2	5149
AA	08Sep2022	04Oct2022	8	1	2746
BB	01Jul2022	26Apr2023	2	1	10501
CC	09Aug2022	19Jun2023	12	1	1373
DD	01Jul2022	21Aug2022	10	1	9588
DD	17Oct2022	13Jun2023	11	2	5935
EE	01Jul2022	15Jul2022	5	1	3196
EE	26Jul2022	17Aug2022	5	2	7305
EE	23Sep2022	19Feb2023	4	3	13697
;;;
run;
proc sql;
create table temp as
select distinct Clinic,Client, span_begin from have;

create table temp2 as
select a.Clinic,a.Client,b.Clinic as _Clinic,b.Cost as Cost
 from temp as a,have as b
  where a.Client=b.Client and a.Clinic ne _Clinic and a.Span_Begin lt b.Span_Begin  ;

create table want1 as
select Clinic,count(distinct Client) as count, count(client)as readmissions, sum(Cost) as sum_of_Cost
 from temp2
  group by Clinic;

create table temp3 as
select distinct Clinic,_Clinic from temp2;
quit;

data want2;
do until(last.Clinic);
 set temp3;
 by Clinic;
 length other_Clinic $ 200;
 other_Clinic=catx(',',other_Clinic,_Clinic);
end;
keep Clinic other_Clinic;
run;

data final_want;
 merge want1 want2;
 by Clinic;
run;

proc print data=final_want noobs;
run;
Ksharp
Super User

Just add DISTINCT keyword before SQL,and use this dataset to calculate  these readmissions and cost.

create table temp2 as
select distinct a.Clinic,a.Client,b.Clinic as _Clinic,b.Cost as Cost
 from temp as a,have as b
  where a.Client=b.Client and a.Clinic ne _Clinic and a.Span_Begin lt b.Span_Begin  ;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 928 views
  • 2 likes
  • 3 in conversation