Calcite | Level 5

## Tabulate and percentages

The input for the code below is a dataset which contains one record per month. Also on the record is a variable indicating the quarter and year.

So, there is one record for every month, 3 records for a quarter, and 12 records for a year.

The code below is executed 3 times, once for month, once for quarter, once for year.

The idea is to have the denominator for each time period be accurate (1,3,or 12) so the average/month is correct. (Of course, in the case of a month execution, the total net_book_bal and the avg/month net_book_bal will be the same because it is divided by one).

Can someone please explain if I should be using pctn or pctsum, or colpct or rowpct. Nothing seems to work correctly. (Once again, the denominator should be the frequency (or the number of records) having the same value for a month, quarter, or year (1,3,12).

proc tabulate data=monthlevel;
class &period / descending ;
var glchgoff glrecovery glncl net_book_bal acct;
tables
net_book_bal=''*sum='\$M Outstanding (Total)' *f=dollar21.2
net_book_bal=''*colpctn<&period.> ='\$M Outstanding/month)*f=dollar21.2
,
&period.=' '
/ box="Measure" rtspace=25 row=float;
title1 "&titl1.";
run;
3 REPLIES 3
SAS Super FREQ

## Re: Tabulate and percentages

Hi:
All the automatic percent statistics use the denominator as described in the doc, but generally, it breaks down like this:
PCTN = divides by grand total N
COLPCTN = divides by total of the COL N
ROWPCTN = divides by total of the ROW N

PCTSUM = divides by the grand total of the variable in the equation...so for example, PCTSUM * VAL would divide particular cell's VAL by the SUM of ALL the VAL for the table.
COLPCTSUM = divides by the total for the column SUM of the variable being used
ROWPCTSUM = divides by the total for the row SUM of the variable being used

If you want to generate the MEAN statistic, (which is the sum of a variable divided by the non-missing count of obs), then you would use the MEAN statistic in your COLUMN statement. See the code below. The data is simple enough that you can check the divisions with a calculator. (Even if you are eventually going to remove the N statistic from the final table, I find it is useful to see the N statistic as I'm finalizing my table.)

I'm not sure what statistic you actually want, but if you look at the test data and then check the math, you should be able to figure out which statistic you should use. In some cases, with PROC TABULATE, you can provide a custom denominator in your TABLE statement -- however, I'm not sure you actually need this capability, since I'm not sure that you really want a percent statistic.

cynthia
[pre]
data makedata;
do month = 1 to 12 by 1;
if month in (1,2,3) then q=1;
else if month in (4,5,6) then q=2;
else if month in (7,8,9) then q=3;
else q=4;
val = month * q * 10;
if val lt 100 then val = val * 10;
output;
end;
label q='Quarter'
month='Month';
run;

ods listing;
proc print data=makedata;
title 'what does data look like';
run;

ods listing close;
ods html file='pctn_pctsum.html' style=sasweb;

proc tabulate data=makedata;
title 'MEAN and N statistics';
class month q;
var val;
table n (sum*val mean*val),
Q all/box='Quarter';
table n (sum*val mean*val),
month all/box='Month';
run;

proc tabulate data=makedata;
title 'PCTN PCTSUM';
class month q;
var val;
table (n pctn) (sum*val pctsum*val),
Q all/box='Quarter';
table (n pctn) (sum*val pctsum*val),
month all/box='Month';
run;

proc tabulate data=makedata;
title 'COLPCTN COLPCTSUM';
class month q;
var val;
table (n colpctn) (sum*val colpctsum*val),
Q all/box='Quarter';
table (n colpctn) (sum*val colpctsum*val),
month all/box='Month';
run;

proc tabulate data=makedata;
title 'ROWPCTN ROWPCTSUM (for this table, same as PCTN and PCTSUM)';
class month q;
var val;
table (n rowpctn) (sum*val rowpctsum*val),
Q all/box='Quarter';
table (n rowpctn) (sum*val rowpctsum*val),
month all/box='Month';
run;

ods html close;
[/pre]
Calcite | Level 5

## Re: Tabulate and percentages

Cynthia,

Thanks for your help. I'm going to have to take some time to study all the information you provided. I believe mean might be the way to go.

My example is very very straightfoward. I have 24 records in this file representing 2 years of data. (1 record per month, 3 records per quarter, and 12 records per year)

I am producing 3 tables, one which has one column per month, one which has one column per quarter, and one which has one column per year.

The denominator for the month table is 1, since the average net_book_bal for that month is equal to the net_book_bal divided by one.

The denominator for the quarter table is 3 because every quarter has 3 records. The total net_book_bal for the three records in every quarter is summed and divided by 3.

Finally, the denominator for the yearly table is 12. Every year has 12 records. um all of the net_book_bal's in a given year and divide by 12.
SAS Super FREQ

## Re: Tabulate and percentages

Hi:
I agree. I think you want the MEAN statistic. My data is very simple and only represents one variable (VAL) and 1 year of 12 months. You should be able to figure out by looking at the numbers from PROC PRINT whether the quarter and month numbers from MEAN are what you would expect. Then you could either dummy up another year of data and run the test program again or try the working code pointed to your data.

I'd recommend keeping it simple at first -- make sure you get the stats you want for one variable and then expand your table statement to include more variables/statistics.

cynthia
Discussion stats
• 3 replies
• 1938 views
• 0 likes
• 2 in conversation