## PROC FREQ - Include Zero Counts

I have a set of 145 numeric variables whose counts I want to compare period to period.  For example, I might want to compare January of this year to January of last year or January of this year to, say, April of last year.  Easy enough to get the counts using "one way" tables in PROC FREQ, but when displayed side by side, the rows don't necessarily line up.  If a particular period doesn't have any instances in a given range, then PROC FREQ will omit that range.

Here's an example:

```sas
PROC FORMAT;
    VALUE TV_MON_20YM
        0          = '0 months'
        1-2        = '1 to 2 months'
        3-4        = '3 to 4 months'
        5-6        = '5 to 6 months'
        7-8        = '7 to 8 months'
        9-10       = '9 to 10 months'
        11-12      = '11 to 12 months'
        13 - 99999 = '> 12 months'
    ;
RUN;

PROC FREQ DATA=Comp_Lib.&Comp_File;
    TABLES TV_PS_PSA008 / MISSING;
    FORMAT TV_PS_PSA008 TV_MON_20YM.;
RUN;
```

The most common problem we experience is that there will be no instances of '0 months' for one of the periods being compared. The rows then don't line up period to period.

Is there a way that I can have SAS print a '0 months' row even when there are no occurrences that have a value of zero?  The SPARSE option in PROC FREQ appears to be geared toward two way tables.  The PRELOADFMT option appears to also be geared toward two way tables.  I just have simple one-way tables.  If someone could point me in the right direction, that would be most helpful.

Thank you,

Jim

Accepted Solutions
Solution
‎01-17-2017 09:29 PM

## Re: PROC FREQ - Include Zero Counts

Posted in reply to jimbarbour

This is how I would do it.

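```sas
/* NOTSORTED keeps the ranges in the order they are written */
PROC FORMAT;
    VALUE TV_MON_20YM (notsorted)
        0          = '0 months'
        1-2        = '1 to 2 months'
        3-4        = '3 to 4 months'
        5-6        = '5 to 6 months'
        7-8        = '7 to 8 months'
        9-10       = '9 to 10 months'
        11-12      = '11 to 12 months'
        13 - 99999 = '> 12 months'
    ;
RUN;

/* Test data: a single observation, so most categories have no rows */
data compfile;
    TV_PS_PSA008 = 3;
run;

/* COMPLETETYPES with PRELOADFMT emits every format category,
   with _FREQ_ = 0 for categories that do not occur in the data */
proc summary data=compfile nway completetypes;
    class TV_PS_PSA008 / preloadfmt order=data missing;
    format TV_PS_PSA008 TV_MON_20YM.;
    output out=counts;
run;

proc print;
run;

/* WEIGHT ... / ZEROS makes PROC FREQ display the zero-count levels */
PROC FREQ DATA=counts order=data;
    TABLES TV_PS_PSA008 / MISSING;
    weight _freq_ / zeros;
    FORMAT TV_PS_PSA008 TV_MON_20YM.;
RUN;
```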

All Replies

## Re: PROC FREQ - Include Zero Counts

Posted in reply to jimbarbour

jimbarbour wrote:

The SPARSE option in PROC FREQ appears to be geared toward two way tables.  The PRELOADFMT option appears to also be geared toward two way tables.

PRELOADFMT would definitely solve your issue; SPARSE may not be helpful. Given your description of comparing things over time and periods, I'm not sure how you have a one-way table. You need to post some more sample data that reflects your problem. This isn't enough to illustrate it beyond the standard advice to use PRELOADFMT.
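
As a rough illustration of the "standard use" of PRELOADFMT mentioned above: PRELOADFMT is a CLASS statement option in procedures such as PROC SUMMARY and PROC TABULATE, so one way to read the suggestion is a one-way PROC TABULATE table. The sketch below is not from the thread; the format `mon_demo` and the data set `demo` are hypothetical names invented for the example, and the format is shortened to three ranges.

```sas
/* Hypothetical three-range format, shortened from the one in the question */
proc format;
    value mon_demo (notsorted)
        0         = '0 months'
        1-12      = '1 to 12 months'
        13-high   = '> 12 months'
    ;
run;

/* Hypothetical data that contains no zero values at all */
data demo;
    input months @@;
    datalines;
3 5 14 20
;
run;

/* PRELOADFMT loads every format category as a class level;
   PRINTMISS prints the levels that have no observations, and
   MISSTEXT= controls what is shown in their cells */
proc tabulate data=demo;
    class months / preloadfmt order=data;
    format months mon_demo.;
    table months, n / printmiss misstext='0';
run;
```

The intent is that the '0 months' row appears even though `demo` has no zero values. This only covers a single variable, so it is a sketch of the option rather than a replacement for the accepted solution.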

## Re: PROC FREQ - Include Zero Counts

Posted in reply to data_null__

OK, @data_null__, that works. That gives me the results I need including the zero counts.  It runs pretty fast with a couple of variables, but really really slow with more.  I may have done something wrong there; not sure. If the slowness I'm noticing is just part and parcel of having 145 variables and about 800,000 - 1,000,000 records, then I can turn it into a macro or something and just put through a few variables at a time.

Jim

## Re: PROC FREQ - Include Zero Counts

Posted in reply to jimbarbour

@Reeza, I'm just producing two sets of one-way frequency counts, one for the current period, one for a prior period.  The one-way counts are then laid side-by-side for presentation purposes and Excel macros highlight any differences.

The solution that @data_null__ proposed is working, albeit slowly, so I won't post more detail at this juncture.

Thanks for your input,

Jim

## Re: PROC FREQ - Include Zero Counts

Posted in reply to jimbarbour
Ways 1;

## Re: PROC FREQ - Include Zero Counts

Posted in reply to data_null__

data_null__ wrote:
Ways 1;

Sorry, you lost me @data_null__

Do what?

Jim

## Re: PROC FREQ - Include Zero Counts

Posted in reply to jimbarbour

jimbarbour wrote:

data_null__ wrote:
Ways 1;

Sorry, you lost me @data_null__

Do what?

Jim

OK @jimbarbour, WAYS 1 is the statement you need to get PROC SUMMARY to produce just the one-way tables.

http://support.sas.com/documentation/cdl/en/proc/69850/HTML/default/viewer.htm#n1affq2dctdc8un1eokb5...

Here is an example program that will handle up to 32767 class variables.  The class variables can be either character or numeric, because the MLF option on the CLASS statement converts all class variables to character so they can be arrayed.  A second data step then normalizes the class variables into _NAME_ and _VALUE_ and drops the original class variables.  This is a more manageable data set, in my opinion.  The output can be passed to PROC FREQ with a BY statement if you like.

```sas
PROC FORMAT;
    VALUE TV_MON_20YM (notsorted)
        0          = '0 months'
        1-2        = '1 to 2 months'
        3-4        = '3 to 4 months'
        5-6        = '5 to 6 months'
        7-8        = '7 to 8 months'
        9-10       = '9 to 10 months'
        11-12      = '11 to 12 months'
        13 - 99999 = '> 12 months'
    ;
RUN;

/* Test data: three class variables, one nonmissing row and one missing row */
data compfile;
    do TV_PS_PSA008 = 3, .;
        TV_PS_PSA006 = TV_PS_PSA008;
        TV_PS_PSA010 = TV_PS_PSA008;
        output;
    end;
run;

/* WAYS 1 limits the output to the one-way tables; CHARTYPE makes _TYPE_
   a character string of 0s and 1s, one position per class variable */
proc summary data=compfile completetypes chartype;
    class TV_PS_PSA: / preloadfmt order=data missing mlf;
    format TV_PS_PSA: TV_MON_20YM.;
    ways 1;
    output out=counts;
run;

/* Normalize to one row per variable name and formatted value */
data counts;
    length _order_ 8 _name_ $32 _value_ $64;
    set counts;
    array tv[*] TV_PS_PSA:;
    drop tv:;
    _i_     = indexc(_type_, '1');   /* position of the active class variable */
    _order_ = length(_type_) - _i_;
    _name_  = vname(tv[_i_]);
    _value_ = tv[_i_];
run;

proc print;
run;

proc freq data=counts order=data;
    by _order_ _name_;
    tables _value_;
    weight _freq_ / zeros;
run;
```

## Re: PROC FREQ - Include Zero Counts

Posted in reply to jimbarbour

The slowness is probably from generating the results. Try turning on the NOPRINT option or the listing output.
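
A minimal sketch of the NOPRINT suggestion, assuming the counts are wanted in a data set rather than on screen. The data set `demo_counts` and its variables are hypothetical stand-ins for a precomputed counts table; NOPRINT on the PROC FREQ statement and OUT= on the TABLES statement are standard options. Whether the zero-weight level is carried into the OUT= data set is worth confirming on your own SAS release.

```sas
/* Hypothetical demo data: one row per category with a precomputed count */
data demo_counts;
    length category $16;
    input category $ count;
    datalines;
zero 0
low 5
high 2
;
run;

/* NOPRINT suppresses the displayed tables; OUT= saves the one-way
   frequencies to a data set instead */
proc freq data=demo_counts order=data noprint;
    tables category / out=freq_out;
    weight count / zeros;
run;

/* Inspect the saved counts (small here, so printing is cheap) */
proc print data=freq_out noobs;
run;
```

With many variables, most of the display cost comes from rendering every table, so writing to a data set and printing only what is needed is usually the cheaper route.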

## Re: PROC FREQ - Include Zero Counts

@Reeza,  Ah.  Good idea.  I will try that.

Thank you,

Jim

## Re: PROC FREQ - Include Zero Counts

What appears to have been happening in the PROC SUMMARY, before I added the WAYS 1 statement at @data_null__'s suggestion, was that, in the output data set, each variable's format categories were repeated for each format category of the preceding variable, which were in turn repeated for each category of the variable preceding that, and so on.  I wound up with something that looks suspiciously like a Cartesian product.

With just 5 variables with 20ish format categories each, I wound up with more than 11,000,000 combinations.  It's hardly a wonder that it wouldn't work with 145 variables.
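
The arithmetic behind that blow-up can be sketched as follows, assuming NWAY with COMPLETETYPES (as in the accepted solution), so that the output has one row per combination of levels. The level count of 26 is an assumption chosen for illustration: it is the smallest equal level count per variable that exceeds 11,000,000 rows for 5 variables, and it is not a figure taken from the thread.

```sas
/* Row counts for n_vars class variables with n_levels levels each
   (n_levels = 26 is an assumed value, see the note above) */
data _null_;
    n_vars     = 5;
    n_levels   = 26;
    nway_rows  = n_levels ** n_vars;   /* full cross: levels multiply */
    ways1_rows = n_vars * n_levels;    /* one-way only: levels add    */
    put nway_rows= comma14. ways1_rows=;
run;
```

Under those assumptions the full cross has 26^5 = 11,881,376 rows, while WAYS 1 needs only 5 x 26 = 130, which is why the one-way request scales to 145 variables and the full cross does not.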

Jim

☑ This topic is solved.
