Help using Base SAS procedures

proc freq: 100% with format percent10.1

Reply
Regular Contributor
Regular Contributor
Posts: 156

proc freq: 100% with format percent10.1

Dear all,

please take a look at the resulting table work.cylinders_make of this code:

proc freq data=sashelp.cars;

     format percent commax10.1;

     by make;

     tables cylinders / out=work.cylinders_make;

run;

And have a look at make = Chevrolet. There are 3 oberservations for it. If you add all three you get slightly less than 100 :-( This is due to the format on the variable percent.

Does anybody know if I get get only one number after the decimal point for the variable percent in the proc freq without this rounding problem? In the proc means statement ther's the option maxdec. But it is not valid for prof freq.

Best wishes

Eva

Super User
Super User
Posts: 7,083

Re: proc freq: 100% with format percent10.1

Not sure what the question is?  Are you trying to invent a new form of mathematics where 100*7/27 does not round to 25.9 ?

Regular Contributor
Regular Contributor
Posts: 156

Re: proc freq: 100% with format percent10.1

No, look at Chevrolet:

25,9 + 48,1 + 25,9 = 99,9 (not 100)

Without the format statement in the above code I get for Chevrolet

25,92 + 48,14 + 25,92 = 100


Regular Contributor
Posts: 237

Re: proc freq: 100% with format percent10.1

This is what my comment below was about.  If you want it to add to 100, then you have to take it out an extra decimal place or two (sometimes even three or four).  When rounding, your decimal place will either go up or stay the same.

For example,

29.8% + 70.2% = 100.0%.  Yea!  This is what we want!

If we round to whole integers, the 29 will change to a 30, but the 70 will remain the same.  Thus, 30% + 70% = 100%. Yea!  We still got what we want!

Now, let's try a different example where both numbers will go up.

29.5% + 70.5% = 100.0%.  Yea!  This is again what we want!

Now, let's round to whole integers again.  Now the 29 becomes a 30, and the 70 becomes a 71.  So, 30% + 71% = 101%.  Oh no!  This is not true!  (Was that your thought?)  It is still true because you now estimated instead showing an exact result.

Maybe estimation is where your question arises.  When you estimate, you will get close, but not always exact.  Rounding is a way of estimating.  No one wants to see a long stretch of decimal places, so we round to make the display of values look prettier, and we understand that there will be a little variation.

So, are you okay with it not showing up as exactly 100%, or do you have people viewing your results that do not understand rounding?

Regular Contributor
Regular Contributor
Posts: 156

Re: proc freq: 100% with format percent10.1

Posted in reply to djbateman

I guess the basic problem here is that the format is executed after the calculation of proc freq. And in proc freq the variable percent comes with two numbers after the decimal point.

So if proc freq could be persuaded using only one number after the decimal point I wouldn't need the format and therefore no rounding would occur.

Super User
Super User
Posts: 7,083

Re: proc freq: 100% with format percent10.1

No. The problem is that you are representing the percentages using discrete mathematics.  When you round to one decimal place the chances of the rounded values not adding to exact 100% is greater than when you using 2 or 3 decimal places.  But even the full 17 or 18 digit precision that SAS uses to store floating point numbers could still lead to situations where the sum of the stored values does not add to 100.

Just add a footnote to your report for anyone that has trouble this this concept.

NOTE: Percentages might not sum to 100 due to rounding.

Regular Contributor
Posts: 237

Re: proc freq: 100% with format percent10.1

I see, so PROC FREQ is reporting 25.92 when it should be reporting 25.93 (since the true result is 25.9259259...).  Sorry for my little rampage above.  However, if it properly rounded and then summed up the percents, you would still not get 100% because 25.9259259 would round to 25.93 (you have two of these values) and 48.1481481 would round to 48.15.  Thus, 25.93 + 48.15 + 25.93 = 100.01.  This is not what SAS output is showing.  It looks like SAS is not doing individual percents and then adding up those percents.  Rather, it is taking the cumulative frequencies divided by the total.  See if the table below helps at all!

FrequencyCumulative FrequencyPercentCumulative Percent (by Hand)Cumulative Percent (SAS Method)
7725.9325.93 (7/27)25.93 (7/27)
132048.1574.08 (7/27 + 13/27)74.07 (20/27)
72725.93100.01 (7/27 + 13/27 + 7/27)100.00 (27/27)
Regular Contributor
Regular Contributor
Posts: 156

Re: proc freq: 100% with format percent10.1

Posted in reply to djbateman

Well, I see. What do you think about the maxdec option in the proc means statement in this light?

Regular Contributor
Posts: 237

Re: proc freq: 100% with format percent10.1

By the way, 25,92 + 48,14 + 25,92 DOES NOT EQUAL 100.

25,92 + 48,14 + 25,92 = 98.98.  Still not exactly 100, but closer than 99.9.

Regular Contributor
Posts: 237

Re: proc freq: 100% with format percent10.1

I don't think your question should be an issue.  When rounding, you will not always add up to 100%.  If you truly want to get 100% by adding up your displayed values, you would have to take it out an extra decimal place or two.  But most people would recognize that there is a rounding situation going on there.  No one will add it up and say, "Wait a second!  These values only account for 99.9% of the data.  Where is the other 0.1%?"

This reminds me of the Modern Family quote: "Success is 1% inspiration, 98% perspiration, and 2% attention to detail."

Regular Contributor
Regular Contributor
Posts: 156

Re: proc freq: 100% with format percent10.1

Posted in reply to djbateman

Dear djbateman,

I thought the same yet I got exactly that reaction: it doesn't sum up to 100 :-(

Regards

Eva

SAS Super FREQ
Posts: 8,869

Re: proc freq: 100% with format percent10.1

Hi:

  Your FORMAT statement is being ignored by PROC FREQ -- so that is a non-issue. The only way to alter a format with PROC FREQ is to change the TABLE template. (23349 - Modify the default format displayed with PROC FREQ)

   You can prove this to yourself, that the format is being ignored. Here's Chevrolet both with and without the format...same results. Next, if I go out 5 decimal places (with TABULATE) and "add up" the individual numbers on my calculator, I get 100.00001; however, as Tom pointed out, SAS does not "add up" the percent numbers. Never. The formula for calculating a percent is 100*number in cell divided by denominator = percent -- so as he said: 100*7 / 27 = 25.92593 and the "grand total" of 27 is calculated internally as 100*27 / 27 which will always be 100 -- your "individual cell percents" are not "added up" by SAS. So, your 27 cars are not 99.99 percent of the total, the 27 cars are 100% of the total, no matter what the "individual percents" add up to.

  Here's some code comparing FREQ and TABULATE and showing the same individual percents and the same grand total percent. Note that 25.92593 IS rounded by TABULATE to 25.93.  However, 100*13 / 27 = 48.148148148, which SAS is rounding to 48.1 -- if you want some definitive answer on the rounding algorithm and why that number got rounded to 48.1, by both FREQ and TABULATE instead of 48.2, you'd have to open a track with Tech Support. To me it is a moot point, since the 27 total cars are 100% and not 99.99%.

cynthia

proc sort data=sashelp.cars out=cars;
  by make;
  where make in ('Chevrolet');
run;

proc freq data=cars;
  title '1) PROC FREQ without fmt statement';
  by make;
  tables cylinders;
run;
 
proc freq data=cars;
title '2) PROC FREQ with fmt statement';
    format percent commax10.1;
     by make;
     tables cylinders;
run;


proc tabulate data=cars;
title '3) PROC TABULATE default fmt';
by make;
class cylinders;
table cylinders all,
      n pctn;
run;
  
proc tabulate data=cars;
title '4) PROC TABULATE 10.1 fmt';
by make;
class cylinders;
table cylinders all,
      n*f=commax10. pctn*f=commax10.1;
run;

proc tabulate data=cars;
title '5) PROC TABULATE diff format 5 dec places';
by make;
class cylinders;
table cylinders all,
      n*f=commax10. pctn*f=commax14.5;
run;

Super Contributor
Posts: 358

Re: proc freq: 100% with format percent10.1

I think you may also have a problem between rounding the value and using a format.  When you use a format of 9.2 for the number no rounding takes place and the value on the output is truncated.  This can cause problems when manually summing a column and it does not total 100%.

Rounding the value into a variable and then printing it using the same format will introduce a rounding error and the sum may still not be 100%.

I usually generate the sum as a data line before printing so that I can insure that it will total 100% even if the individual values will not sum correctly, then print a footnote regarding rounding on the report.

Ask a Question
Discussion stats
  • 12 replies
  • 2220 views
  • 4 likes
  • 5 in conversation