DATA Step, Macro, Functions and more

PROC Tabulate: displaying counts and percentages with multiple class variables

Accepted Solution Solved
Reply
Contributor
Posts: 42
Accepted Solution

PROC Tabulate: displaying counts and percentages with multiple class variables

I'm using SAS 9.4.  I ran PROC Tabulate with just counts, and it worked great.  Below is the syntax I used and portion of the output:

 

proc tabulate data=noncompl.deliverhmveverelig format=8.;
class fplgroup region agegrp race firsthmpeligmonth
  spoken_language gender / MISSING;
table fplgroup='FPL Group'*(agegrp='Age Group' race='Race'
  firsthmpeligmonth='First HMP Elig Month'
  spoken_language='Language' gender='Gender')*n='', region='Region' all='Total';
run;

 

SAS Output

The SAS System

  Region Total
1: UP/NW/NE 2: W/EC/E 3: SC/SW/SE 4: DET
FPL Group Age Group   5117 21747 14426 42839 84129
1: 0-35% 19-34  
35-49   3586 15179 10168 28251 57184
50-64   4579 15298 10081 27934 57892
Race   11988 35230 22963 36807 106988
1-Non-Migrant White, not of Hispanic Origin  
2-Non-Migrant Black, not of Hispanic Origin   128 8713 5655 44500 58996
3-Non-Migrant American Indian or Alaskan Native   142 400 238 472 1252

 

But my boss wants me to add row and column percentages, but I'm having trouble figuring out where to add the necessary code.  I've tried several variations of the following:

 

proc tabulate data=noncompl.deliverhmveverelig format=8.;

 

class fplgroup region agegrp race firsthmpeligmonth

  spoken_language gender / MISSING;

 

table fplgroup='FPL Group'*(agegrp='Age Group' race='Race'

  firsthmpeligmonth='First HMP Elig Month'

  spoken_language='Language' gender='Gender')*n='', region='Region' all='Total',

 

              pctn<agegrp all>='Percent of row total'

              pctn<race all>='Percent of row total'

              pctn<firsthmpeligmonth all>='Percent of row total'

              pctn<spoken_language all>='Percent of row total'

              pctn<gender all>='Percent of row total'

 

              pctn<region all>='Percent of column total'

              pctn='Percent of total'; 

  

run;

 

But everything I've tried gives me a variation of the following errors:

 

ERROR: There are multiple statistics associated with a single table cell in the following
       nesting : fplgroup * PctN * N * region.
ERROR: There are multiple statistics associated with a single table cell in the following
       nesting : fplgroup * PctN * N * All.


Accepted Solutions
Solution
‎12-19-2016 10:49 AM
SAS Super FREQ
Posts: 8,864

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

Hi:

  Using SASHELP.HEART, here's a visual on the difference between COLPCTN and using a custom denominator to calculate a percent.

cynthia

 

tabulate_custom_percents.png

View solution in original post


All Replies
Super User
Posts: 11,343

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

Posted in reply to Wolverine

Please see the code commented below:

proc tabulate data=noncompl.deliverhmveverelig format=8.;
 
   class fplgroup region agegrp race firsthmpeligmonth
  spoken_language gender / MISSING;
 
   table fplgroup='FPL Group'*(agegrp='Age Group' race='Race'
            firsthmpeligmonth='First HMP Elig Month'
               spoken_language='Language' gender='Gender')*n='',   /*<= this comma says this
                                                                    is PAGE dimeension*/

               region='Region' all='Total',       /* <= this comma says Region is in the ROW*/
 
              pctn<agegrp all>='Percent of row total'
              pctn<race all>='Percent of row total'
              pctn<firsthmpeligmonth all>='Percent of row total'
              pctn<spoken_language all>='Percent of row total'
              pctn<gender all>='Percent of row total'
 
              pctn<region all>='Percent of column total'
              pctn='Percent of total';         /*<= column dimension*/
run;
  
run;

Since you are requesting an N statistic in the Page (?) then compared with code below then you have the multiple statistic cross error.

 

I suspect that you really did not want the N in the first bit and possibly not 3 dimensions. Perhaps something more like:

proc tabulate data=noncompl.deliverhmveverelig format=8.;
 
   class fplgroup region agegrp race firsthmpeligmonth
  spoken_language gender / MISSING;
 
   table fplgroup='FPL Group'*(agegrp='Age Group' race='Race'
            firsthmpeligmonth='First HMP Elig Month'
               spoken_language='Language' gender='Gender')*n='',   

              ( region='Region' all='Total' ) *
              (n
 
              pctn<agegrp all>='Percent of row total'
              pctn<race all>='Percent of row total'
              pctn<firsthmpeligmonth all>='Percent of row total'
              pctn<spoken_language all>='Percent of row total'
              pctn<gender all>='Percent of row total'
 
              pctn<region all>='Percent of column total'
              pctn='Percent of total');        
run;
  

Without data and a desired result it is harder to be more precise in suggestions.

 

Also you may want to investigate the RowPctN and ColPct depending onwhat is desired.

SAS Super FREQ
Posts: 8,864

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

Posted in reply to Wolverine

Hi:

  My suggestion is that you start using the automatic percent statistics into the table one by one instead of doing them in bulk and with custom denominators.

 

  In your ORIGINAL code, the table structure was defined like this:

table

   fplgroup='FPL Group'*(agegrp='Age Group' race='Race'
  firsthmpeligmonth='First HMP Elig Month'
  spoken_language='Language' gender='Gender')*n='',

   region='Region' all='Total';

 

where all the RED is the ROW dimension (the variable headers on the left side of the table) and the GREEN is the COLUMN dimension (the variable headers at the top of the table). With only 1 comma in the TABLE statement, you have a 2 dimensional table -- with only a ROW and a COLUMN dimension.

 

But in your changed TABLE statement , you have completely changed the structure of the table:

table

  fplgroup='FPL Group'*(agegrp='Age Group' race='Race'

  firsthmpeligmonth='First HMP Elig Month'

  spoken_language='Language' gender='Gender')*n='',

 

       region='Region' all='Total',

 

              pctn<agegrp all>='Percent of row total'

              pctn<race all>='Percent of row total'

              pctn<firsthmpeligmonth all>='Percent of row total'

              pctn<spoken_language all>='Percent of row total'

              pctn<gender all>='Percent of row total'

 

              pctn<region all>='Percent of column total'

              pctn='Percent of total'

 

With 2 commas in the TABLE statement, you now have a 3 dimensional table with a PAGE dimension added to the table structure. Notice how your percents are shown in green, that's because when you added them after a comma in the TABLE statement, you told TABULATE to move them all into the COLUMN dimension, the REGION, ALL are being told to move into the ROW dimension and the entire list of other variables are now in the PAGE dimension (when originally, they were in the ROW dimension) -- so your new table statement, in addition to being broken, will be a completely different structure than what you originally posted.

 

You can only request a statistic in one dimension with TABULATE. Since you have the percents requested in the COLUMN dimension, you have to take the *n=' ' out of the PAGE dimension so that there is only one dimension (column) with a request for statistics. However, I doubt that the resulting tables will be what you want -- structure wise, at least.

 

  As an example, run this code using SASHELP.HEART to get an idea of how the dimensions work. Notice that if I want CHOL_STATUS to ALWAYS be in the columns, with percents, then CHOL_STATUS and the statistics need to be specified after the last comma in the TABLE statement.

 

cynthia

 

proc tabulate data=sashelp.heart;
  title '1) column dimension only';
  class chol_status bp_status smoking_status sex;
   
  table chol_status*(n pctn);
run;
  
proc tabulate data=sashelp.heart;
  title '2) row and column dimension only';
  class chol_status bp_status smoking_status sex;
   
  table bp_status*sex all, 
        chol_status*(n pctn) all*(n pctn);
run;
  
proc tabulate data=sashelp.heart;
  title '3) page, row and column dimension only';
  class chol_status bp_status smoking_status sex;
   
  table smoking_status all,
        bp_status*sex all, 
        chol_status*(n pctn) all*(n pctn);
run;
Contributor
Posts: 42

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

Posted in reply to Cynthia_sas

Thank you Cynthia -- that was very helpful!  I took a portion of your example syntax and removed the interaction between bp_status and sex.  I also added in smoking_status.  This creates the output I'm looking for:

 

proc tabulate data=sashelp.heart;
  title '2) row and column dimension only';
  class chol_status bp_status smoking_status sex;
   
  table bp_status sex smoking_status all,
        chol_status*(n pctn) all*(n pctn);
run;

 

However, when I tried to apply these changes to my own syntax, there is an extra wrinkle because I have an extra level (the categories under FPL Group).  I also added in a format statement to get the percentages to display correctly.  The resulting output is very close to what I want, but the percentages don't make sense.  I'm not sure what they're supposed to represent.

 

Here is the latest version of my syntax and partial output:

 

proc format;
        picture pctfmt (round) other='009.99%';
run;

proc tabulate data=noncompl.deliverhmveverelig format=8.;
 
   class fplgroup region agegrp race firsthmpeligmonth
  spoken_language gender / MISSING;
 
   table fplgroup='FPL Group'*(agegrp='Age Group' race='Race'
            firsthmpeligmonth='First HMP Elig Month'
               spoken_language='Language' gender='Gender')all,   
    Region*(n pctn='%'*f=pctfmt.) all*(n pctn='%'*f=pctfmt.);
 
run;

 

SAS Output

  region All
1: UP/NW/NE 2: W/EC/E 3: SC/SW/SE 4: DET
N % N % N % N % N %
FPL Group Age Group 5117 1.33% 21747 5.66% 14426 3.75% 42839 11.15% 84129 21.89%
1: 0-35% 19-34
35-49 3586 0.93% 15179 3.95% 10168 2.65% 28251 7.35% 57184 14.88%
50-64 4579 1.19% 15298 3.98% 10081 2.62% 27934 7.27% 57892 15.07%
Race 11988 3.12% 35230 9.17% 22963 5.98% 36807 9.58% 106988 27.84%
1-Non-Migrant White, not of Hispanic Origin
2-Non-Migrant Black, not of Hispanic Origin 128 0.03% 8713 2.27% 5655 1.47% 44500 11.58% 58996 15.35%
3-Non-Migrant American Indian or Alaskan Native 142 0.04% 400 0.10% 238 0.06% 472 0.12% 1252 0.33%
4-Non-Migrant Race not included elsewhere 9 0.00% 123 0.03% 122 0.03% 683 0.18% 937 0.24%
5-Non-Migrant Unknown 733 0.19% 4999 1.30% 3699 0.96% 11730 3.05% 21161 5.51%
6-Non-Migrant Hispanic 146 0.04% 2135 0.56% 1317 0.34% 2321 0.60% 5919 1.54%
8-Non-Migrant Native Hawaiian 1 0.00% 6 0.00% 6 0.00% 19 0.00% 32 0.01%
9-Non-Migrant Pacific Islander 2 0.00% 6 0.00% 9 0.00% 27 0.01% 44 0.01%
E-Migrant Unknown . . 1 0.00% . . . . 1 0.00%
F-Migrant Hispanic . . 5 0.00% 5 0.00% . . 10 0.00%
K-Non-Migrant Asian Indian 7 0.00% 53 0.01% 67 0.02% 349 0.09% 476 0.12%
L-Non-Migrant Chinese 4 0.00% 19 0.00% 42 0.01% 81 0.02% 146 0.04%
M-Non-Migrant Filipino 2 0.00% 14 0.00% 21 0.01% 53 0.01% 90 0.02%
R-Non-Migrant Guamanian or Chamorro . . 4 0.00% . . 3 0.00% 7 0.00%
S-Non-Migrant Japanese 1 0.00% 1 0.00% 6 0.00% 8 0.00% 16 0.00%
T-Non-Migrant Korean 9 0.00% 22 0.01% 45 0.01% 60 0.02% 136 0.04%
V-Non-Migrant Vietnamese 5 0.00% 42 0.01% 45 0.01% 97 0.03% 189 0.05%
W-Non-Migrant Samoan 1 0.00% 1 0.00% . . 2 0.00% 4 0.00%
Z-Non-Migrant Unspecified 104 0.03% 450 0.12% 435 0.11% 1812 0.47% 2801 0.73%
Contributor
Posts: 42

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

Posted in reply to Wolverine

I just met with my boss and there has been a change.  The research group now wants totals based on each Region/FPL group.  So for example, there are 13282 cases in Region 1, FPL Group 1.  Out of those, there are 5576 women and 7706 men.  5576+7706=13282.

 

Likewise, the percentage for women is 5576/13282=42% and for men 7706/13282=58%.

SAS Super FREQ
Posts: 8,864

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

Posted in reply to Wolverine
I am in meetings, so can't really go into code now. but here are some papers and doc examples to take a look at
https://support.sas.com/resources/papers/proceedings13/134-2013.pdf
http://support.sas.com/documentation/cdl/en/proc/69850/HTML/default/viewer.htm#n1m5z9bzbggfson1wgywh...
cynthia
Super User
Posts: 11,343

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

[ Edited ]
Posted in reply to Wolverine

How about providing a small number of records similar to your noncompl.deliverhmveverelig set. You don't need to have as many levels of every variabl but there should be at least two levels of each included variable. You can get intructions on how to turn an existing SAS data set into data step code that you can post here as text so we can replicate your data from here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

Then we can test code against some data..

 

If you want to post examples of complex data like this you might try turning on the listing destination and saving the text from the OUTPUT window as a text file and attaching that file. Unfortunately pasting html output for large tables overflows the display ability of the forum. 

 

ods listing;

 

proc tabulate ...

run;

ods listing close;

If you now have an OUTPUT (not results) save the output to a txt or lst file.

Solution
‎12-19-2016 10:49 AM
SAS Super FREQ
Posts: 8,864

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

Hi:

  Using SASHELP.HEART, here's a visual on the difference between COLPCTN and using a custom denominator to calculate a percent.

cynthia

 

tabulate_custom_percents.png

Contributor
Posts: 42

Re: PROC Tabulate: displaying counts and percentages with multiple class variables

Posted in reply to Cynthia_sas

Thanks again Cynthia -- it works now!  Here is the final version of the code:

 

proc tabulate data=noncompl.deliverhmveverelig format=8.;
 
   class fplgroup region agegrp race firsthmpeligmonth
  spoken_language gender / MISSING;
 
   table fplgroup='FPL Group'*(agegrp='Age Group' race='Race'
            firsthmpeligmonth='First HMP Elig Month'
               spoken_language='Language' gender='Gender' all='FPL-Region Column Total') all,   
    Region*(n='FPL-Region N' pctn <agegrp race firsthmpeligmonth spoken_language gender all>='FPL-Region %'*f=pctfmt.)
        all*(n='FPL-Region Row Total' pctn <agegrp race firsthmpeligmonth spoken_language gender all>='FPL-Region Row %'*f=pctfmt.);

 
run;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 732 views
  • 0 likes
  • 3 in conversation