I am trying to build a dataset that shows the cumulative % of "bads" across 10 different buckets using PROC FREQ. I'm running into a problem with buckets that contain zero "bads": SAS outputs what I've labeled "Table 1" below, whereas what I need is "Table 2." The difference is that I need SAS to output the 80% - 90% bucket with a count of 0 and carry forward the Cumulative Bad % from the bucket above it. Are there any options in PROC FREQ to do this? If not, what is the best alternative approach? Thanks!
TABLE 1:
Percentile | Bads | Cum. Bad % |
0 - 10% | 5 | 20.8% |
10% - 20% | 6 | 45.8% |
20% - 30% | 2 | 54.2% |
30% - 40% | 1 | 58.3% |
40% - 50% | 3 | 70.8% |
50% - 60% | 1 | 75.0% |
60% - 70% | 2 | 83.3% |
70% - 80% | 2 | 91.7% |
90% - 100% | 2 | 100.0% |
TABLE 2:
Percentile | Bads | Cum. Bad % |
0 - 10% | 5 | 20.8% |
10% - 20% | 6 | 45.8% |
20% - 30% | 2 | 54.2% |
30% - 40% | 1 | 58.3% |
40% - 50% | 3 | 70.8% |
50% - 60% | 1 | 75.0% |
60% - 70% | 2 | 83.3% |
70% - 80% | 2 | 91.7% |
80% - 90% | 0 | 91.7% |
90% - 100% | 2 | 100.0% |
Take a look at the following SAS-L thread: http://listserv.uga.edu/cgi-bin/wa?A2=ind1107e&L=sas-l&D=0&P=3180
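Here is another complete example with data. It uses the CLASSDATA= option of PROC SUMMARY to force every rank group into the summary, then passes the counts to PROC FREQ with the ZEROS option on the WEIGHT statement.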
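/* Rank Smoking into 10 groups (0-9); ties can leave some groups empty. */
proc rank data=sashelp.heart groups=10 out=smoke;
   var Smoking;
run;

/* CLASSDATA data set: one row for each rank group that must appear. */
data class;
   if 0 then set smoke;   /* copy variable attributes without reading data */
   do smoking = 0 to 9;
      output;
   end;
   stop;
run;

/* Count observations per group; CLASSDATA= adds empty groups with _FREQ_ = 0. */
proc summary data=smoke nway classdata=class;
   class smoking;
   output out=summary;
run;

/* ZEROS keeps the zero-count groups in the frequency table. */
proc freq data=summary;
   tables smoking;
   weight _freq_ / zeros;
run;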
Values of Smoking Were Replaced by Ranks

                                    Cumulative    Cumulative
Smoking    Frequency     Percent     Frequency       Percent
------------------------------------------------------------
      0            0        0.00             0          0.00
      1            0        0.00             0          0.00
      2         2501       48.35          2501         48.35
      3            0        0.00          2501         48.35
      4          113        2.18          2614         50.53
      5          466        9.01          3080         59.54
      6          576       11.13          3656         70.67
      7          921       17.80          4577         88.48
      8          125        2.42          4702         90.90
      9          471        9.10          5173        100.00
Easy enough. Using the preloadfmt / completetypes options in PROC Summary and then running that dataset through PROC Freq seems to get me exactly where I need. Thanks for the link!
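For anyone landing here later, the PRELOADFMT / COMPLETETYPES route mentioned above might look roughly like the sketch below. It has not been run against the original data, and the names are all assumptions made for illustration: a data set work.scored with a decile variable coded 0-9 and a 0/1 flag called bad, plus a format decfmt. that lists all ten buckets.

```sas
/* Hypothetical format: lists every bucket, including ones with no bads. */
proc format;
   value decfmt
      0 = '0 - 10%'
      1 = '10% - 20%'
      2 = '20% - 30%'
      3 = '30% - 40%'
      4 = '40% - 50%'
      5 = '50% - 60%'
      6 = '60% - 70%'
      7 = '70% - 80%'
      8 = '80% - 90%'
      9 = '90% - 100%';
run;

/* Count the bads per bucket. PRELOADFMT with COMPLETETYPES creates a row   */
/* for every formatted value, so an empty bucket is output with _FREQ_ = 0. */
/* work.scored, decile, and bad are assumed names, not from the thread.     */
proc summary data=work.scored nway completetypes;
   where bad = 1;
   class decile / preloadfmt;
   format decile decfmt.;
   output out=work.bad_counts;
run;

/* ZEROS keeps the zero-count bucket; OUTCUM adds the cumulative frequency  */
/* and cumulative percent to the output data set.                           */
proc freq data=work.bad_counts;
   tables decile / out=work.cum_bads outcum;
   weight _freq_ / zeros;
run;
```

The design choice compared with the CLASSDATA= example is only where the full list of buckets comes from: here it lives in the format rather than in a separate data set, which is convenient if the bucket labels are needed for reporting anyway.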