I have a working script (below) that keeps zero counts in PROC FREQ for one variable. It is based on the solution to an earlier question on this forum.
data test;
input person $1. first_digit;
datalines;
A 1
A 2
A 3
A 1
B 2
B 3
B 4
B 5
C 1
C 6
C 7
;
run;
/* Make sure zero counts are kept*/
proc format;
value digits
1='1'
2='2'
3='3'
4='4'
5='5'
6='6'
7='7'
8='8'
9='9';
quit;
proc summary data=test nway completetypes;
class first_digit / preloadfmt order=data missing;
FORMAT first_digit digits.;
output out=first_digit_counts;
by person;
run;
PROC FREQ DATA=first_digit_counts order=data NOPRINT;
by person;
TABLES first_digit / MISSING OUT=distribution
(RENAME=(PERCENT=OBSERVED));
weight _freq_ / zeros;
FORMAT first_digit digits.;
RUN;
Now I need to apply the same script, but the possible values are 100 to 999 instead of 1 to 9. I want to avoid typing out all 900 values in the PROC FORMAT step (it can easily be done with the help of Excel, but it clutters my code). I already have a table at my disposal that holds all possible values of first_3digit (100-999) as a numeric variable. Is there a more efficient way to do the PROC FORMAT step?
Some sample data on which it should work:
data test2;
input person $1. first_3digit;
datalines;
A 101
A 253
A 336
A 101
B 245
B 245
B 405
B 504
C 101
C 666
C 789
;
run;
One way is to build the format definition in a data step and read it into PROC FORMAT with the CNTLIN= option. The control data set needs the variables fmtname, start, and label:
data fmt;
retain fmtname "digits";
do start=100 to 999;
label=put(start, 3.);
output;
end;
run;
proc format library=work cntlin=fmt;
run;
data test2;
input person $1. first_3digit;
datalines;
A 101
A 253
A 336
A 101
B 245
B 245
B 405
B 504
C 101
C 666
C 789
;
proc summary data=test2 nway completetypes;
class first_3digit / preloadfmt order=data missing;
FORMAT first_3digit digits.;
output out=first_digit_counts;
run;
PROC FREQ DATA=first_digit_counts order=data NOPRINT;
TABLES first_3digit / MISSING OUT=distribution
(RENAME=(PERCENT=OBSERVED));
weight _freq_ / zeros;
FORMAT first_3digit digits.;
RUN;
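Since the question mentions an existing table of all possible values and groups the counts by person, here is a minimal sketch of a variant that covers both. It assumes the table is called all_values (a hypothetical name, not given in the question), that it holds each value from 100 to 999 exactly once in the numeric variable first_3digit, and that test2 has already been created as above. If the table contains duplicates, PROC FORMAT will complain about overlapping ranges, so de-duplicate it first (for example with PROC SORT NODUPKEY).

```sas
/* Assumption: work.all_values (hypothetical name) has one row per possible
   value, 100 to 999, in the numeric variable first_3digit. */
data fmt_from_table;
  set all_values (keep=first_3digit rename=(first_3digit=start));
  retain fmtname "digits" type "N";  /* numeric format named DIGITS. */
  label = put(start, 3.);            /* label is the value itself as text */
run;

proc format library=work cntlin=fmt_from_table;
run;

/* BY-group processing needs the input sorted by person */
proc sort data=test2;
  by person;
run;

/* Same steps as above, with BY PERSON added back in */
proc summary data=test2 nway completetypes;
  by person;
  class first_3digit / preloadfmt order=data missing;
  format first_3digit digits.;
  output out=first_digit_counts;
run;

proc freq data=first_digit_counts order=data noprint;
  by person;
  tables first_3digit / missing out=distribution (rename=(percent=observed));
  weight _freq_ / zeros;
  format first_3digit digits.;
run;
```

With this layout the distribution data set should contain one row per person and value, including the values that person never had, in the same way the original 1 to 9 script does.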