Solved: Proc Tabulate-Percent and sum column with percent for only certain var...

Whitlea · Posted 03-28-2024 03:21 PM

I have a proc tabulate that gives me the sum I want but I would like to add a percent column for each clinic. I only want the percentage of clients for gender and Hispanic/non-Hispanic, for each clinic. To clarify, the denominator for these percentages is the total number of clients for that clinic. I would also really like to have the dollar and percent formats of the bottom example if it's possible. I have the formats defined in another datastep but the preloadfmt is not using those defined formats. I have tried several different things and am unable to get the results I want.

This is my code followed by my result.

proc tabulate data=Total2 format=8. missing ;  
class Clinic blankrow /preloadfmt order=data;
var Total__Days Total_count Total_Stays Total_cost avg__Days avg_cost min_age max_age median_age M_count F_count yes_count No_count ;
keylabel n=' ' sum=' ' mean=' ' std=' ' pctn=' ' pctsum=' ' ;
tables   Total_count='Clients'
blankrow=' '*[s=[foreground=white]] 
M_count="Males" 
F_Count="Females" 
blankrow=''*[s=[foreground=white]] 
yes_count='Hispanic'
No_count='Non-Hispanic'
blankrow=''*[s=[foreground=white]] 
min_age='Min Age' max_age='Max Age' median_age='Median Age'
blankrow=''*[s=[foreground=white]] 
Total__Days='Total   Days' 
Total_Stays='Numberof Stays' 
blankrow=' '*[s=[foreground=white]] 
Total_cost='Cost' 
avg__Days='Average Days' 
avg_cost='Average Cost Per Client' 
, Clinic="" all='Totals'  / rts=50 row=float misstext=' ' ;

run;

This is what I get

This is what I want:

Ksharp · Posted 04-01-2024 11:54 PM

/*
You are creating a very complicated report,
that is not suited for PROC REPORT or PROC TABULATE.
I would use PROC SQL for it.
*/

Data Have;
input  Clinic$	Gender$	ID$	Distinct_Count Cost Hispanic$ SFY$ Clinic_Type$ Stays Days Age;
datalines;
A	M	1	1	457	NO	2023	1	1	62	16
A	M	2	1	2739	NO	2023	1	1	340     17
A	M	3	1	3652	NO	2023	1	1	7 	12
A	M	4	1	5479	YE	2023	1	1	164	15
A	M	5	1	3196	NO	2023	1	1	37      14
A	M	6	1	6392	NO	2023	1	1	253     13
A	M	7	1	9588	NO	2023	1	1	51      17
A	M	8	1	8218	NO	2023	1	1	140     16
A	M	9	1	10501	NO	2023	1	1	116     15
B	F	10	1	10957	NO	2023	1	1	115     12
B	M	11	1	1826	NO	2023	1	1	156     13
B	F	12	1	10501	NO	2023	1	1	299     14
B	F	13	1	5935	NO	2023	1	1	19      15
B	F	14	1	8675	NO	2023	1	1	86      17
B	M	15	1	3196	NO	2023	1	1	179     16
B	F	16	1	6848	YE	2023	1	1	251     17
B	M	17	1	9131	NO	2023	1	1	112     17
B	F	18	1	11871	YE	2023	1	1	85      17
B	F	19	1	457	NO	2023	1	1	153     17
B	F	20	1	3196	NO	2023	1	1	37      17
;
run;



proc format;
value $gender
'M'=' Male'
'F'='Female'
;
value $eth
'YE'=' Hispanic'
'NO'='Non-Hispanic'
;
run;
proc sql;
create table report as
select 1 as id1,'GENDER' as a length=40,Clinic,
       1 as id2,put(Gender,$gender.) as b length=40,
	   1 as id3,count(*) as c format=comma20.,
	   1 as id4,count(*)/(select count(*) from have where Clinic=a.Clinic) as d format=percent8.2,
	   1 as id5,sum(cost) as e format=dollar20.,
	   1 as id6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.2
 from have as a
  group by Clinic,Gender
union all
select 2 as id1,'ETHNICITY' as a,Clinic,
       2 as id2,put(Hispanic,$eth.) as b,
	   2 as id3,count(*) as c,
	   2 as id4,count(*)/(select count(*) from have where Clinic=a.Clinic) as d format=percent8.2,
	   2 as id5,sum(cost) as e format=dollar20.,
	   2 as id6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.2
 from have as a
  group by Clinic,Hispanic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,'  Min Age' as b,
	   3 as id3,min(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from have 
  group by Clinic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,' Max Age' as b,
	   3 as id3,max(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from have 
  group by Clinic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,'Avg Age' as b,
	   3 as id3,mean(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from have 
  group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'    Avg cost per client' as b,
	   6 as id3,. as c,
	   6 as id4,.,
	   6 as id5,mean(cost),
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'   Avg Days per client' as b,
	   6 as id3,mean(days) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'  Total Stays' as b,
	   6 as id3,sum(stays) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,' Total Days' as b,
	   6 as id3,sum(days) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'Total' as b,
	   6 as id3,count(*) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
;quit;
options missing=' ';
proc report data=report nowd;
column id1 a b clinic,(c d e f);
define id1/group noprint;
define a/group noprint ' ';
define b/group ' ';
define c/analysis 'Clients';
define d/analysis '% of Clients';
define e/analysis 'Cost';
define f/analysis '% of Cost';

define clinic/across '';
compute before a/style={just=l color=black};
if id1=1 then len=0;
 else len=10;
x=' ';
line x $varying10. len;
line a $40.;
endcomp;
run;

View solution in original post

ballardw · Posted 03-28-2024 05:51 PM

I STRONGLY suggest that you provide example data. Preferably in the form of working data step code.

The way you have your code with apparently separate values for Male and Female (and indeed each row) means that the tools Proc Tabulate uses to do percentages are likely not going to work at all.

Typically I would expect to see a variable like SEX with values of Male and Female, then a percentage of the count for the variable SEX can be done.

You have included the option PRELOADFMT. So we would need to know the definition of the format (Proc Format code) and which variable(s) are using which format(s).

You have also cutout the Totals column and don't show what percentages should show there.

The way tabulate builds tables when you cross dimensions then the dimension result has to be valid for all the rows in a single table. So your choice is going to involve single table or choice of report procedure.

The ODS destination may also have an impact in creating desired appearance.

Whitlea · Posted 04-01-2024 04:18 PM

After reading the responses I have decided that I need to use proc report as proc tabulate will not give me all the statistics I want. I hope you don't mind, I am changing a few things from my original post on this. Please let me know if you suggest starting a new discussion rather than continuing in this discussion. My current Proc Report code is creating columns where I want rows. I think I might need to summarize or reconstruct my data in order to get the results I want but I'm not sure how to go about that. I also wanted to add in percent of clients (percent of total clients for that particular clinic) and percent of cost columns (percent of total cost for that particular clinic). Below is my sample data, proc report and a new example of what I would like my results to look like.

Data Have;
input 

Clinic$	Gender$	ID$	Distinct_Count Cost Hispanic$ SFY$ Clinic_Type$ Stays Days Age;

datalines;
A	M	1	1	457	NO	2023	1	1	62	16
A	M	2	1	2739	NO	2023	1	1	340     17
A	M	3	1	3652	NO	2023	1	1	7 	12
A	M	4	1	5479	YE	2023	1	1	164	15
A	M	5	1	3196	NO	2023	1	1	37      14
A	M	6	1	6392	NO	2023	1	1	253     13
A	M	7	1	9588	NO	2023	1	1	51      17
A	M	8	1	8218	NO	2023	1	1	140     16
A	M	9	1	10501	NO	2023	1	1	116     15
B	F	10	1	10957	NO	2023	1	1	115     12
B	M	11	1	1826	NO	2023	1	1	156     13
B	F	12	1	10501	NO	2023	1	1	299     14
B	F	13	1	5935	NO	2023	1	1	19      15
B	F	14	1	8675	NO	2023	1	1	86      17
B	M	15	1	3196	NO	2023	1	1	179     16
B	F	16	1	6848	YE	2023	1	1	251     17
B	M	17	1	9131	NO	2023	1	1	112     17
B	F	18	1	11871	YE	2023	1	1	85      17
B	F	19	1	457	NO	2023	1	1	153     17
B	F	20	1	3196	NO	2023	1	1	37      17


;
run;

proc report data=have;
	
	columns  Age Gender Hispanic Clinic, (Distinct_Count Cost) ;
	define Cost / analysis sum "Expendeture" format=dollar12.;
    define Distinct_Count / analysis sum "Clients" format=comma10.;
	define Gender / group "Gender" style=header;
	define Hispanic / group "Ethnic" style=header;
	define Clinic / across "Table" ;
	define Age / min max mean  style=header;

run;

Ksharp · Posted 04-01-2024 11:54 PM

/*
You are creating a very complicated report,
that is not suited for PROC REPORT or PROC TABULATE.
I would use PROC SQL for it.
*/

Data Have;
input  Clinic$	Gender$	ID$	Distinct_Count Cost Hispanic$ SFY$ Clinic_Type$ Stays Days Age;
datalines;
A	M	1	1	457	NO	2023	1	1	62	16
A	M	2	1	2739	NO	2023	1	1	340     17
A	M	3	1	3652	NO	2023	1	1	7 	12
A	M	4	1	5479	YE	2023	1	1	164	15
A	M	5	1	3196	NO	2023	1	1	37      14
A	M	6	1	6392	NO	2023	1	1	253     13
A	M	7	1	9588	NO	2023	1	1	51      17
A	M	8	1	8218	NO	2023	1	1	140     16
A	M	9	1	10501	NO	2023	1	1	116     15
B	F	10	1	10957	NO	2023	1	1	115     12
B	M	11	1	1826	NO	2023	1	1	156     13
B	F	12	1	10501	NO	2023	1	1	299     14
B	F	13	1	5935	NO	2023	1	1	19      15
B	F	14	1	8675	NO	2023	1	1	86      17
B	M	15	1	3196	NO	2023	1	1	179     16
B	F	16	1	6848	YE	2023	1	1	251     17
B	M	17	1	9131	NO	2023	1	1	112     17
B	F	18	1	11871	YE	2023	1	1	85      17
B	F	19	1	457	NO	2023	1	1	153     17
B	F	20	1	3196	NO	2023	1	1	37      17
;
run;



proc format;
value $gender
'M'=' Male'
'F'='Female'
;
value $eth
'YE'=' Hispanic'
'NO'='Non-Hispanic'
;
run;
proc sql;
create table report as
select 1 as id1,'GENDER' as a length=40,Clinic,
       1 as id2,put(Gender,$gender.) as b length=40,
	   1 as id3,count(*) as c format=comma20.,
	   1 as id4,count(*)/(select count(*) from have where Clinic=a.Clinic) as d format=percent8.2,
	   1 as id5,sum(cost) as e format=dollar20.,
	   1 as id6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.2
 from have as a
  group by Clinic,Gender
union all
select 2 as id1,'ETHNICITY' as a,Clinic,
       2 as id2,put(Hispanic,$eth.) as b,
	   2 as id3,count(*) as c,
	   2 as id4,count(*)/(select count(*) from have where Clinic=a.Clinic) as d format=percent8.2,
	   2 as id5,sum(cost) as e format=dollar20.,
	   2 as id6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.2
 from have as a
  group by Clinic,Hispanic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,'  Min Age' as b,
	   3 as id3,min(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from have 
  group by Clinic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,' Max Age' as b,
	   3 as id3,max(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from have 
  group by Clinic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,'Avg Age' as b,
	   3 as id3,mean(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from have 
  group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'    Avg cost per client' as b,
	   6 as id3,. as c,
	   6 as id4,.,
	   6 as id5,mean(cost),
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'   Avg Days per client' as b,
	   6 as id3,mean(days) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'  Total Stays' as b,
	   6 as id3,sum(stays) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,' Total Days' as b,
	   6 as id3,sum(days) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'Total' as b,
	   6 as id3,count(*) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
;quit;
options missing=' ';
proc report data=report nowd;
column id1 a b clinic,(c d e f);
define id1/group noprint;
define a/group noprint ' ';
define b/group ' ';
define c/analysis 'Clients';
define d/analysis '% of Clients';
define e/analysis 'Cost';
define f/analysis '% of Cost';

define clinic/across '';
compute before a/style={just=l color=black};
if id1=1 then len=0;
 else len=10;
x=' ';
line x $varying10. len;
line a $40.;
endcomp;
run;

Whitlea · Posted 04-02-2024 10:46 AM

Thank you! Wow this looks great... and as you said, complicated! Some of these clients have multiple stays I have some duplicates when I apply this to my actual data. I attempted to replace the count(*) with a count (distinct) but that did not fix the duplicates. How would I go about replacing the counts of rows to counting the distinct IDs or counting my distinct_count variable?

Thank you!!

Patrick · Posted 04-02-2024 11:49 PM

@Whitlea wrote:
Thank you! Wow this looks great... and as you said, complicated! Some of these clients have multiple stays I have some duplicates when I apply this to my actual data. I attempted to replace the count(*) with a count (distinct) but that did not fix the duplicates. How would I go about replacing the counts of rows to counting the distinct IDs or counting my distinct_count variable?

Thank you!!

Ideally de-dup your data before you feed it to @Ksharp 's code. If that's not possible then I suggest you provide representative sample data with such duplicates and then show us the issue and what you would need instead.

Different people call different things "duplicates" which is another reason why providing such representative sample data will help to avoid misunderstandings.

Ksharp · Posted 04-03-2024 02:56 AM

As Patrick said, Post some your real data and the output you are looking for, so I can test what you are meaning.

Whitlea · Posted 04-03-2024 11:06 AM

Here is a sample that includes duplicates with ID's 8 and 12 being the duplicate clients. A client can have multiple stays in the same or different clinic. I want distinct count of clients for each clinic. So If a client has more than one stay at the same clinic, I only want them counted once for that clinic. But I would like the Cost to include the cost of all stays regardless of duplicates and I would like the total spans to include all stays including duplicates. I was able to achieve unduplicated clients with adding a where span=1 statement but unfortunately that filtered the cost down to those single stays which is not what I wanted.

Data Have;
input 

Clinic$	Gender$	ID$	Distinct_Count Cost Hispanic$ SFY$ Clinic_Type$ Stays Days Age;

datalines;
A	M	1	1	457	NO	2023	1	1	62	16
A	M	2	1	2739	NO	2023	1	1	340     17
A	M	3	1	3652	NO	2023	1	1	7 	12
A	M	4	1	5479	YE	2023	1	1	164	15
A	M	5	1	3196	NO	2023	1	1	37      14
A	M	6	1	6392	NO	2023	1	1	253     13
A	M	7	1	9588	NO	2023	1	1	51      17
A	M	8	1	821 	NO	2023	1	1	40      15
A	M	8	0	105 	NO	2023	1	2	116     15
B	F	10	1	10957	NO	2023	1	1	115     12
B	M	11	1	1826	NO	2023	1	1	156     13
B	F	12	1	10501	NO	2023	1	1	299     14
B	F	12	0	5935	NO	2023	1	2	19      14
B	F	12	0	8675	NO	2023	1	3	86      14
B	M	15	1	3196	NO	2023	1	1	179     16
B	F	16	1	6848	YE	2023	1	1	251     17
B	M	17	1	9131	NO	2023	1	1	112     17
B	F	18	1	11871	YE	2023	1	1	85      17
B	F	19	1	457	NO	2023	1	1	153     17
B	F	20	1	3196	NO	2023	1	1	37      17


;
run;

Ksharp · Posted 04-03-2024 09:54 PM

But you did not post the output you want yet ?

So you just want to count distinct ID ? and keep duplicated COST STAYS ?

And I don't see any variable named "span=1".

Data Have;
input 
Clinic$	Gender$	ID$	Distinct_Count Cost Hispanic$ SFY$ Clinic_Type$ Stays Days Age;
datalines;
A	M	1	1	457	NO	2023	1	1	62	16
A	M	2	1	2739	NO	2023	1	1	340     17
A	M	3	1	3652	NO	2023	1	1	7 	12
A	M	4	1	5479	YE	2023	1	1	164	15
A	M	5	1	3196	NO	2023	1	1	37      14
A	M	6	1	6392	NO	2023	1	1	253     13
A	M	7	1	9588	NO	2023	1	1	51      17
A	M	8	1	821 	NO	2023	1	1	40      15
A	M	8	0	105 	NO	2023	1	2	116     15
B	F	10	1	10957	NO	2023	1	1	115     12
B	M	11	1	1826	NO	2023	1	1	156     13
B	F	12	1	10501	NO	2023	1	1	299     14
B	F	12	0	5935	NO	2023	1	2	19      14
B	F	12	0	8675	NO	2023	1	3	86      14
B	M	15	1	3196	NO	2023	1	1	179     16
B	F	16	1	6848	YE	2023	1	1	251     17
B	M	17	1	9131	NO	2023	1	1	112     17
B	F	18	1	11871	YE	2023	1	1	85      17
B	F	19	1	457	NO	2023	1	1	153     17
B	F	20	1	3196	NO	2023	1	1	37      17
;
run;



proc format;
value $gender
'M'=' Male'
'F'='Female'
;
value $eth
'YE'=' Hispanic'
'NO'='Non-Hispanic'
;
run;
proc sql;
create table report as
select 1 as id1,'GENDER' as a length=40,Clinic,
       1 as id2,put(Gender,$gender.) as b length=40,
	   1 as id3,count(distinct id) as c format=comma20.,
	   1 as id4,count(distinct id)/(select count(distinct id) from have where Clinic=a.Clinic) as d format=percent8.2,
	   1 as id5,sum(cost) as e format=dollar20.,
	   1 as id6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.2
 from have as a
  group by Clinic,Gender
union all
select 2 as id1,'ETHNICITY' as a,Clinic,
       2 as id2,put(Hispanic,$eth.) as b,
	   2 as id3,count(distinct id) as c,
	   2 as id4,count(distinct id)/(select count(distinct id) from have where Clinic=a.Clinic) as d format=percent8.2,
	   2 as id5,sum(cost) as e format=dollar20.,
	   2 as id6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.2
 from have as a
  group by Clinic,Hispanic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,'  Min Age' as b,
	   3 as id3,min(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from have 
  group by Clinic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,' Max Age' as b,
	   3 as id3,max(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from have 
  group by Clinic
union all
select 3 as id1,'AGE' as a,Clinic,
       3 as id2,'Avg Age' as b,
	   3 as id3,mean(age) as c,
	   3 as id4,.,
	   3 as id5,.,
	   3 as id6,.
 from (select distinct Clinic,id,age from have )
  group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'    Avg cost per client' as b,
	   6 as id3,. as c,
	   6 as id4,.,
	   6 as id5,mean(cost),
	   6 as id6,.
 from (select Clinic,id,sum(cost) as cost from have group by  Clinic,id) 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'   Avg Days per client' as b,
	   6 as id3,mean(days) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from (select Clinic,id,sum(days) as days from have group by  Clinic,id)  
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'  Total Stays' as b,
	   6 as id3,sum(stays) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,' Total Days' as b,
	   6 as id3,sum(days) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
union all
select 6 as id1,'09'x as a,Clinic,
       6 as id2,'Total' as b,
	   6 as id3,count(distinct id) as c,
	   6 as id4,.,
	   6 as id5,.,
	   6 as id6,.
 from have 
   group by Clinic
;quit;
options missing=' ';
proc report data=report nowd;
column id1 a b clinic,(c d e f);
define id1/group noprint;
define a/group noprint ' ';
define b/group ' ';
define c/analysis 'Clients';
define d/analysis '% of Clients';
define e/analysis 'Cost';
define f/analysis '% of Cost';

define clinic/across '';
compute before a/style={just=l color=black};
if id1=1 then len=0;
 else len=10;
x=' ';
line x $varying10. len;
line a $40.;
endcomp;
run;

Whitlea · Posted 04-17-2024 07:05 PM

For the first 2 select statements, is it possible to get the distinct count of clients where cost >0 but keep the sum of cost for all clients, even those with a cost<0? I was able to adjust for this in other parts of the proc sql but not in the first 2 select statements. When adding in a where cost >0, it will give me the client count I want but eliminates $ (negative $) I want to include in the cost sum. Thanks!

Ksharp · Posted 04-17-2024 11:21 PM

Of course.you can.


proc sql;
create table report as
select 1 as id1,'GENDER' as a length=40,Clinic,
       1 as id2,put(Gender,$gender.) as b length=40,
	   1 as id3,(select count(distinct id) from have where Clinic=a.Clinic and Gender=a.Gender and cost>0) as c format=comma20.,
	   1 as id4,calculated c/(select count(distinct id) from have where Clinic=a.Clinic and cost>0) as d format=percent8.2,
	   1 as id5,sum(cost) as e format=dollar20.,
	   1 as id6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.2
 from have as a
  group by Clinic,Gender
union all
select 2 as id1,'ETHNICITY' as a,Clinic,
       2 as id2,put(Hispanic,$eth.) as b,
	   2 as id3,(select count(distinct id) from have where Clinic=a.Clinic and Hispanic=a.Hispanic and cost>0) as c,
	   2 as id4,calculated c/(select count(distinct id) from have where Clinic=a.Clinic and cost>0) as d format=percent8.2,
	   2 as id5,sum(cost) as e format=dollar20.,
	   2 as id6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.2
 from have as a
  group by Clinic,Hispanic
;
quit;

mkeintz · Posted 03-28-2024 08:22 PM

You effectively want to convert two columns into four columns. In PROC TABULATE this means you have to have a single expression for each new column (or possible a single expression, crossed by CLINIC, for both new columns.

The problem is that tabulate will apply that expression to every row - but that's not what you want. You want it for only the first 3 non-empty rows.

I think you have to examine PROC REPORT. Those with experience with proc report can better assist in developing the needed code.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

Re: Proc Tabulate-Percent and sum column with percent for only certain variables Sas9.4

SAS Innovate 2025: Call for Content

Classroom Training Available!