BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
GBL__
Quartz | Level 8

Hello,

 

I am wondering if there is a way to NOT summarize certain statistics in the PROC TABULATE output?  In the below code, I would like to have N, SUM, and PCTSUM summarized, but not the AVG and MAX columns.  Is this possible with PROC TABULATE?

 

TITLE "WANT: Grand Totals for 'Count', 'Total', and '% of Total' ONLY'" ;

PROC TABULATE DATA=sashelp.cars MISSING ;
    VAR msrp weight mpg_city mpg_highway horsepower ;
    CLASS make / MISSING ;
    TABLE make=" " ALL="Grand Totals" ,
            N="Count"*f=comma11.
            msrp*(SUM="Total"*f=dollar18.2 PCTSUM="% of Total"*f=percentage_format.)
            weight*(MAX="Max"*f=10.)
            mpg_city*(MEAN="Average" MAX="Max")*f=10.
            mpg_highway*(MEAN="Average" MAX="Max")*f=10.
            horsepower*(MEAN="Average"*f=5.)
        / BOX={LABEL="make" STYLE=[verticalalign=bottom]} ;
    FORMAT make $upcase. ;
RUN ;

TITLE ;

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
GBL__
Quartz | Level 8

Thank you for the reply, @ballardw , I appreciated your feedback and suggestions.

 

I ended up going a slightly different route and created a dataset via PROC SQL with the statistics I needed, then used PROC REPORT to 'recreate' the PROC TABULATE output I desired.  This also had the added bonus of allowing me to have the final output ordered in descending pct sum order, as well.

 

Here is what I ended up with:

 

PROC SQL NOPRINT ;
    SELECT SUM(msrp) format=18.2 INTO :total_msrp TRIMMED
    FROM sashelp.cars ;
QUIT ;


PROC SQL ;
    CREATE TABLE work.make_classification AS
    SELECT
        make ,
        COUNT(*) AS COUNT FORMAT=comma11. ,
        SUM(msrp) AS MSRP_SUM FORMAT=dollar18.2 ,
        DIVIDE(CALCULATED MSRP_SUM, &total_msrp.) AS MSRP_PCTSUM FORMAT=percent12.3 ,
        MAX(weight) AS W_MAX FORMAT=10. ,
        AVG(mpg_city) AS MPGC_AVG FORMAT=10. ,
        MAX(mpg_city) AS MPGC_MAX FORMAT=10. ,
        AVG(mpg_highway) AS MPGH_AVG FORMAT=10. ,
        MAX(mpg_highway) AS MPGH_MAX FORMAT=10. ,
        AVG(horsepower) AS HP_AVG FORMAT=5.
    FROM
        sashelp.cars
    GROUP BY
        make
    ORDER BY
        MSRP_PCTSUM DESC ;
QUIT ;


PROC REPORT DATA=work.make_classification MISSING ;
    COLUMNS make COUNT ("msrp" MSRP_SUM MSRP_PCTSUM) ("weight" W_MAX) ("mpg_city" MPGC_AVG MPGC_MAX)
        ("mpg_highway" MPGH_AVG MPGH_MAX) ("horsepower" HP_AVG) ;

    DEFINE make / ORDER ORDER=DATA ;
    DEFINE COUNT / SUM "Count" ;
    DEFINE MSRP_SUM / SUM "Total" ;
    DEFINE MSRP_PCTSUM / SUM "% of Total" ;
    DEFINE W_MAX / DISPLAY "Max" ;
    DEFINE MPGC_AVG / DISPLAY "Average" ;
    DEFINE MPGC_MAX / DISPLAY "Max" ;
    DEFINE MPGH_AVG / DISPLAY "Average" ;
    DEFINE MPGH_MAX / DISPLAY "Max" ;
    DEFINE HP_AVG / DISPLAY "Average" ;

    RBREAK AFTER / SUMMARIZE ;

    COMPUTE AFTER ;
        make = "Totals" ;
    ENDCOMP ;
RUN ;

 

Thanks again, ballardw!

View solution in original post

2 REPLIES 2
ballardw
Super User

Here is one way using a smaller set with fewer variables and statistics that isn't horrendous for a simple table. Note that this uses the output of Proc Tabulate which can be ugly with complex tables such as nesting class variables in row and columns.

The output set has the values that you wan suppressed set to missing and then the output is created with Proc Print.

Since the output variable names in the Tabulate output are quite different you will want to assign labels to make sense in the output. I show just one.

 

Proc tabulate data=sashelp.class out=tabout;
   class sex;
   var height weight;
   table sex all='Grand total',
         n
         height*(mean sum pctsum)
         weight*max
   ;

run;

data newtab;
   length sex $ 3;
   set tabout;
   if missing(sex) then do;
   /* in this example this is the total row*/
      sex='All';
      call missing (height_mean,weight_max);
   end;
   /* assign meaningful labels*/
   label height_mean='Average Height';
run;

proc print data=newtab noobs label;
   var  sex n height_: weight_: ;
run;

If you have not seen the : list indicator that means "use all the variable names that start with" the text that comes before the colon. In this case there are 3 variables whose names start with Height_ because there are three statistics for height requested and names will have the statistic added after an underscore. You could drop some of the variables if you want in the data step before the Proc Print.

 

Proc report with a compute block in a summary may work but I don't do as well with that proc.

 

If you want to see what I meant about the the complexity with more class variables in both dimensions look at the output data set from this:

Proc tabulate data=sashelp.class out=tabout2;
   class sex age;
   var height weight;
   table sex all='Grand total',
        age*(
         n
         height*(mean sum pctsum)
         weight*max
         )
   ;
run;

At which point you need a bit more complex logic and manipulation to get similar output.

GBL__
Quartz | Level 8

Thank you for the reply, @ballardw , I appreciated your feedback and suggestions.

 

I ended up going a slightly different route and created a dataset via PROC SQL with the statistics I needed, then used PROC REPORT to 'recreate' the PROC TABULATE output I desired.  This also had the added bonus of allowing me to have the final output ordered in descending pct sum order, as well.

 

Here is what I ended up with:

 

PROC SQL NOPRINT ;
    SELECT SUM(msrp) format=18.2 INTO :total_msrp TRIMMED
    FROM sashelp.cars ;
QUIT ;


PROC SQL ;
    CREATE TABLE work.make_classification AS
    SELECT
        make ,
        COUNT(*) AS COUNT FORMAT=comma11. ,
        SUM(msrp) AS MSRP_SUM FORMAT=dollar18.2 ,
        DIVIDE(CALCULATED MSRP_SUM, &total_msrp.) AS MSRP_PCTSUM FORMAT=percent12.3 ,
        MAX(weight) AS W_MAX FORMAT=10. ,
        AVG(mpg_city) AS MPGC_AVG FORMAT=10. ,
        MAX(mpg_city) AS MPGC_MAX FORMAT=10. ,
        AVG(mpg_highway) AS MPGH_AVG FORMAT=10. ,
        MAX(mpg_highway) AS MPGH_MAX FORMAT=10. ,
        AVG(horsepower) AS HP_AVG FORMAT=5.
    FROM
        sashelp.cars
    GROUP BY
        make
    ORDER BY
        MSRP_PCTSUM DESC ;
QUIT ;


PROC REPORT DATA=work.make_classification MISSING ;
    COLUMNS make COUNT ("msrp" MSRP_SUM MSRP_PCTSUM) ("weight" W_MAX) ("mpg_city" MPGC_AVG MPGC_MAX)
        ("mpg_highway" MPGH_AVG MPGH_MAX) ("horsepower" HP_AVG) ;

    DEFINE make / ORDER ORDER=DATA ;
    DEFINE COUNT / SUM "Count" ;
    DEFINE MSRP_SUM / SUM "Total" ;
    DEFINE MSRP_PCTSUM / SUM "% of Total" ;
    DEFINE W_MAX / DISPLAY "Max" ;
    DEFINE MPGC_AVG / DISPLAY "Average" ;
    DEFINE MPGC_MAX / DISPLAY "Max" ;
    DEFINE MPGH_AVG / DISPLAY "Average" ;
    DEFINE MPGH_MAX / DISPLAY "Max" ;
    DEFINE HP_AVG / DISPLAY "Average" ;

    RBREAK AFTER / SUMMARIZE ;

    COMPUTE AFTER ;
        make = "Totals" ;
    ENDCOMP ;
RUN ;

 

Thanks again, ballardw!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 556 views
  • 0 likes
  • 2 in conversation