I have a working script (below) that keeps zero counts in PROC FREQ output for one variable. It is based on the solution to an earlier question on this forum.
data test;
input person $1. first_digit;
datalines;
A 1
A 2
A 3
A 1
B 2
B 3
B 4
B 5
C 1
C 6
C 7
;
run;
/* Make sure zero counts are kept*/
proc format;
value digits
1='1'
2='2'
3='3'
4='4'
5='5'
6='6'
7='7'
8='8'
9='9';
run;
proc summary data=test nway completetypes;
class first_digit / preloadfmt order=data missing;
FORMAT first_digit digits.;
output out=first_digit_counts;
by person;
run;
PROC FREQ DATA=first_digit_counts order=data NOPRINT;
by person;
TABLES first_digit / MISSING OUT=distribution
(RENAME=(PERCENT=OBSERVED));
weight _freq_ / zeros;
FORMAT first_digit digits.;
RUN;
Now I need to apply the same script, but the possible values are 100 to 999 instead of 1 to 9. I want to avoid typing out every number in the PROC FORMAT step (this is easy to do with Excel, but it clutters my code). I already have a table at my disposal that holds all possible values of first_3digit (100-999) as a numeric variable. Is there a more efficient way to write the PROC FORMAT step?
Some sample data on which it should work:
data test2;
input person $1. first_3digit;
datalines;
A 101
A 253
A 336
A 101
B 245
B 245
B 405
B 504
C 101
C 666
C 789
;
run;
One way is to build the format in a DATA step and read it into PROC FORMAT with the CNTLIN= option, like this:
data fmt;
retain fmtname "digits";
do start=100 to 999;
label=put(start, 3.);
output;
end;
run;
proc format library=work cntlin=fmt;
run;
data test2;
input person $1. first_3digit;
datalines;
A 101
A 253
A 336
A 101
B 245
B 245
B 405
B 504
C 101
C 666
C 789
;
proc summary data=test2 nway completetypes;
class first_3digit / preloadfmt order=data missing;
FORMAT first_3digit digits.;
output out=first_digit_counts;
run;
PROC FREQ DATA=first_digit_counts order=data NOPRINT;
TABLES first_3digit / MISSING OUT=distribution
(RENAME=(PERCENT=OBSERVED));
weight _freq_ / zeros;
FORMAT first_3digit digits.;
RUN;
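Since the question mentions an existing table that already lists every possible value, the CNTLIN data set could also be built from that table instead of a DO loop. The following is only a sketch under assumptions: the table name all_values is hypothetical (the question does not name it), and it is assumed to contain the numeric variable first_3digit. Duplicate values are removed first because overlapping ranges in a CNTLIN data set cause PROC FORMAT to fail.

```sas
/* Assumption: WORK.ALL_VALUES (hypothetical name) has one numeric */
/* variable, first_3digit, holding the possible values 100-999.    */
proc sort data=all_values (keep=first_3digit)
          out=fmt_values nodupkey;
    by first_3digit;
run;

/* Build the control data set expected by CNTLIN=:   */
/* FMTNAME, TYPE, START and LABEL are the key fields. */
data fmt_from_table;
    set fmt_values (rename=(first_3digit=start));
    retain fmtname "digits" type "N";   /* numeric format named DIGITS. */
    label = put(start, 3.);
run;

proc format library=work cntlin=fmt_from_table;
run;
```

The PROC SUMMARY and PROC FREQ steps can then be used unchanged, since the resulting format is still called digits.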