BookmarkSubscribeRSS Feed
Cuneyt
Obsidian | Level 7

I would like to build a table that shows counts and percentages. The below code displays the correct percentages with "009.9%", but not with "009%". How can I get the correct percentages displayed with no decimal places?

data have;
input a b @@;
datalines;
1	1	1	1	0	1	1	0	0	0	0	1
1	1	0	1	0	0	0	1	1	0	0	0
1	1	0	1	0	0	0	1	1	0	0	0;
run;
proc format;                           
   picture mypct (round) low-high='009%';   
run; 
proc tabulate data=have format=comma.;
	class a b;
	table a *(n='Number' pctn<a>='Col Pct'*f=mypct. ) all, b ;
run;
11 REPLIES 11
ballardw
Super User

When I run your tabulate code with NO format for the Pctn using this code:

proc tabulate data=have format=comma.;
	class a b;
	table a *(n='Number' pctn<a>='Col Pct' ) all, b ;
run;

the result looks like:

  b
0 1
a   5 6
0 Number
Col Pct 63 60
1 Number 3 4
Col Pct 38 40
All N 8 10

With your format:

  b
0 1
a   5 6
0 Number
Col Pct 63% 60%
1 Number 3 4
Col Pct 38% 40%
All N 8 10

So, what do you want it to look like? I don't see any decimals. What would the "correct" percentages be?

 

BTW, you should not have the semicolon ending the datalines on a line with data.

 

 


@Cuneyt wrote:

I would like to build a table that shows counts and percentages. The below code displays the correct percentages with "009.9%", but not with "009%". How can I get the correct percentages displayed with no decimal places?

data have;
input a b @@;
datalines;
1	1	1	1	0	1	1	0	0	0	0	1
1	1	0	1	0	0	0	1	1	0	0	0
1	1	0	1	0	0	0	1	1	0	0	0;
run;
proc format;                           
   picture mypct (round) low-high='009%';   
run; 
proc tabulate data=have format=comma.;
	class a b;
	table a *(n='Number' pctn<a>='Col Pct'*f=mypct. ) all, b ;
run;

 

Cuneyt
Obsidian | Level 7

Sorry for my unclear question. When I use "009.9%", the first column's percentages are 62.5% and 37.5%, which DO add up to 100%.

  b
0 1
a   5 6
0 Number
Col Pct 62.5% 60.0%
1 Number 3 4
Col Pct 37.5% 40.0%
All N 8 10

 

 

When I use "009%", the first column's percentages are 63% and 38%, which do NOT add up to 100%. How I can fix this problem?

  b
0 1
a   5 6
0 Number
Col Pct 63% 60%
1 Number 3 4
Col Pct 38% 40%
All N 8 10
Tom
Super User Tom
Super User

Note that you can also easily create a situation where the total is 99% instead of 100% when you have more than two categories.

 

So what kind of FIX did you have in mind?  How would you decide which value to represent differently?

 

What most people do is just add a footnote.  Like this table I found on the internet.

https://www.researchgate.net/figure/Note-All-percentages-may-not-total-100-due-to-rounding-ACE-and-S...

 

Cuneyt
Obsidian | Level 7
Thanks, Tom.
ballardw
Super User

Not actually a response to the question of what you think the result should look like.

 

ANY result involving fixing the number of digits to display can have similar just shifted results. You can have a format that displays 5 decimal places and when you total the displayed range could end up with 99.99999 or 100.00001. It is the nature of real numbers.

YOU have to tell us what you expect a displayed value to look like but be aware that it will be hard to write a set of generic rules that will cover every single case.

 


@Cuneyt wrote:

Sorry for my unclear question. When I use "009.9%", the first column's percentages are 62.5% and 37.5%, which DO add up to 100%.

  b
0 1
a   5 6
0 Number
Col Pct 62.5% 60.0%
1 Number 3 4
Col Pct 37.5% 40.0%
All N 8 10

 

 

When I use "009%", the first column's percentages are 63% and 38%, which do NOT add up to 100%. How I can fix this problem?

  b
0 1
a   5 6
0 Number
Col Pct 63% 60%
1 Number 3 4
Col Pct 38% 40%
All N 8 10

 

mkeintz
PROC Star

I presume your complaint is that rounding to whole percentages results in a set of column percentages summing to 101%

 

So your original percentages (37.5% and 62.5%) were both "rounded" upwards, to 38% and 63% respectively.  I don't see a way past that rounding behavior for cells in proc tabulate.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Cuneyt
Obsidian | Level 7
Thank you. I think you are right.
donricardo
SAS Employee

Hi:

The format you've created should do the trick!   the issue(s) you are running into is that 1) the raw data appears tab-delimited; 2) you should not place a semi-colon on the same line as the raw data, but immediately after on a separate line; and 3) no RUN statement necessary when reading raw data.   Note my change on the INFILE line.

donricardo_0-1685645770279.png

donricardo_1-1685645817586.png

Hope this helps!

Cuneyt
Obsidian | Level 7

Thanks for your response. I am now running the below code. In the second line, "datalines" is not colored blue. Is it not recognized by SAS?

 

Cuneyt_0-1685647911853.png

When I run this code, SAS does not read the data. I get the following error:

Cuneyt_1-1685648104776.png

 

Tom
Super User Tom
Super User

DO NOT PUT TABS IN YOUR IN-LINE DATA LINES.

 

If you use the normal old SAS Display Manger and submit the code SAS will convert the tabs into spaces.

 

(Actually don't but tabs anywhere in your SAS code, they just cause trouble).

 

If you want to use tab as the delimiter then store the data into a file and use an INFILE statement to point to the file instead of using DATALINES.

 

FreelanceReinh
Jade | Level 19

Hello @Cuneyt,

 

For the special case of only two categories (with percentages adding up to 100%) an extension of your MYPCT. format built around the ROUNDE function should work:

data roundepct;
retain fmtname 'roundepct';
length label $9;
do start=0.5 to 99.5;
  label=cats(rounde(start),'%');
  output;
end;
hlo='OF';
label='[mypct4.]';
output;
run;

proc format cntlin=roundepct;
run;

Then use f=roundepct. in the PROC TABULATE step.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1600 views
  • 0 likes
  • 6 in conversation