I would like to build a table that shows counts and percentages. The below code displays the correct percentages with "009.9%", but not with "009%". How can I get the correct percentages displayed with no decimal places?
data have;
input a b @@;
datalines;
1 1 1 1 0 1 1 0 0 0 0 1
1 1 0 1 0 0 0 1 1 0 0 0
1 1 0 1 0 0 0 1 1 0 0 0;
run;
proc format;
picture mypct (round) low-high='009%';
run;
proc tabulate data=have format=comma.;
class a b;
table a *(n='Number' pctn<a>='Col Pct'*f=mypct. ) all, b ;
run;
When I run your tabulate code with NO format for the Pctn using this code:
proc tabulate data=have format=comma.; class a b; table a *(n='Number' pctn<a>='Col Pct' ) all, b ; run;
the result looks like:
With your format:
b | |||
---|---|---|---|
0 | 1 | ||
a | 5 | 6 | |
0 | Number | ||
Col Pct | 63% | 60% | |
1 | Number | 3 | 4 |
Col Pct | 38% | 40% | |
All | N | 8 | 10 |
So, what do you want it to look like? I don't see any decimals. What would the "correct" percentages be?
BTW, you should not have the semicolon ending the datalines on a line with data.
@Cuneyt wrote:
I would like to build a table that shows counts and percentages. The below code displays the correct percentages with "009.9%", but not with "009%". How can I get the correct percentages displayed with no decimal places?
data have; input a b @@; datalines; 1 1 1 1 0 1 1 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 0 0 1 1 0 1 0 0 0 1 1 0 0 0; run; proc format; picture mypct (round) low-high='009%'; run; proc tabulate data=have format=comma.; class a b; table a *(n='Number' pctn<a>='Col Pct'*f=mypct. ) all, b ; run;
Sorry for my unclear question. When I use "009.9%", the first column's percentages are 62.5% and 37.5%, which DO add up to 100%.
b | |||
---|---|---|---|
0 | 1 | ||
a | 5 | 6 | |
0 | Number | ||
Col Pct | 62.5% | 60.0% | |
1 | Number | 3 | 4 |
Col Pct | 37.5% | 40.0% | |
All | N | 8 | 10 |
When I use "009%", the first column's percentages are 63% and 38%, which do NOT add up to 100%. How I can fix this problem?
b | |||
---|---|---|---|
0 | 1 | ||
a | 5 | 6 | |
0 | Number | ||
Col Pct | 63% | 60% | |
1 | Number | 3 | 4 |
Col Pct | 38% | 40% | |
All | N | 8 | 10 |
Note that you can also easily create a situation where the total is 99% instead of 100% when you have more than two categories.
So what kind of FIX did you have in mind? How would you decide which value to represent differently?
What most people do is just add a footnote. Like this table I found on the internet.
Not actually a response to the question of what you think the result should look like.
ANY result involving fixing the number of digits to display can have similar just shifted results. You can have a format that displays 5 decimal places and when you total the displayed range could end up with 99.99999 or 100.00001. It is the nature of real numbers.
YOU have to tell us what you expect a displayed value to look like but be aware that it will be hard to write a set of generic rules that will cover every single case.
@Cuneyt wrote:
Sorry for my unclear question. When I use "009.9%", the first column's percentages are 62.5% and 37.5%, which DO add up to 100%.
b 0 1 a 5 6 0 Number Col Pct 62.5% 60.0% 1 Number 3 4 Col Pct 37.5% 40.0% All N 8 10
When I use "009%", the first column's percentages are 63% and 38%, which do NOT add up to 100%. How I can fix this problem?
b 0 1 a 5 6 0 Number Col Pct 63% 60% 1 Number 3 4 Col Pct 38% 40% All N 8 10
I presume your complaint is that rounding to whole percentages results in a set of column percentages summing to 101%
So your original percentages (37.5% and 62.5%) were both "rounded" upwards, to 38% and 63% respectively. I don't see a way past that rounding behavior for cells in proc tabulate.
Hi:
The format you've created should do the trick! the issue(s) you are running into is that 1) the raw data appears tab-delimited; 2) you should not place a semi-colon on the same line as the raw data, but immediately after on a separate line; and 3) no RUN statement necessary when reading raw data. Note my change on the INFILE line.
Hope this helps!
Thanks for your response. I am now running the below code. In the second line, "datalines" is not colored blue. Is it not recognized by SAS?
When I run this code, SAS does not read the data. I get the following error:
DO NOT PUT TABS IN YOUR IN-LINE DATA LINES.
If you use the normal old SAS Display Manger and submit the code SAS will convert the tabs into spaces.
(Actually don't but tabs anywhere in your SAS code, they just cause trouble).
If you want to use tab as the delimiter then store the data into a file and use an INFILE statement to point to the file instead of using DATALINES.
Hello @Cuneyt,
For the special case of only two categories (with percentages adding up to 100%) an extension of your MYPCT. format built around the ROUNDE function should work:
data roundepct;
retain fmtname 'roundepct';
length label $9;
do start=0.5 to 99.5;
label=cats(rounde(start),'%');
output;
end;
hlo='OF';
label='[mypct4.]';
output;
run;
proc format cntlin=roundepct;
run;
Then use f=roundepct. in the PROC TABULATE step.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.