## PROC FREQ - Include Zero Counts

I have a set of 145 numeric variables whose counts I want to compare period to period.  For example, I might want to compare January of this year to January of last year or January of this year to, say, April of last year.  Easy enough to get the counts using "one way" tables in PROC FREQ, but when displayed side by side, the rows don't necessarily line up.  If a particular period doesn't have any instances in a given range, then PROC FREQ will omit that range.

Here's an example:

``````
PROC FORMAT;
    VALUE TV_MON_20YM
        0          = '0 months'
        1-2        = '1 to 2 months'
        3-4        = '3 to 4 months'
        5-6        = '5 to 6 months'
        7-8        = '7 to 8 months'
        9-10       = '9 to 10 months'
        11-12      = '11 to 12 months'
        13 - 99999 = '> 12 months'
    ;
RUN;

PROC FREQ DATA=Comp_Lib.&Comp_File;
    TABLES TV_PS_PSA008 / MISSING;
    FORMAT TV_PS_PSA008 TV_MON_20YM.;
RUN;
``````

The most common problem we experience is that there will be no instances of '0 months' for one of the periods being compared. The rows then don't line up period to period.

Is there a way that I can have SAS print a '0 months' row even when there are no occurrences that have a value of zero?  The SPARSE option in PROC FREQ appears to be geared toward two-way tables.  The PRELOADFMT option also appears to be geared toward two-way tables.  I just have simple one-way tables.  If someone could point me in the right direction, that would be most helpful.

Thank you,

Jim

Accepted Solutions

## Re: PROC FREQ - Include Zero Counts

This is how I would do it.

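``````
PROC FORMAT;
    /* NOTSORTED keeps the categories in the order they are listed here */
    VALUE TV_MON_20YM (notsorted)
        0          = '0 months'
        1-2        = '1 to 2 months'
        3-4        = '3 to 4 months'
        5-6        = '5 to 6 months'
        7-8        = '7 to 8 months'
        9-10       = '9 to 10 months'
        11-12      = '11 to 12 months'
        13 - 99999 = '> 12 months'
    ;
RUN;

/* Small test data set: one observation, so most categories are empty */
data compfile;
    TV_PS_PSA008 = 3;
run;

/* COMPLETETYPES with PRELOADFMT creates a row for every format category,
   with _FREQ_ = 0 for categories that have no observations */
proc summary data=compfile nway completetypes;
    class TV_PS_PSA008 / preloadfmt order=data missing;
    format TV_PS_PSA008 TV_MON_20YM.;
    output out=counts;
run;

proc print;
run;

/* Weight by the precomputed counts; ZEROS keeps the zero-count rows */
PROC FREQ DATA=counts order=data;
    TABLES TV_PS_PSA008 / MISSING;
    weight _freq_ / zeros;
    FORMAT TV_PS_PSA008 TV_MON_20YM.;
RUN;
``````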

10 REPLIES

## Re: PROC FREQ - Include Zero Counts

@jimbarbour wrote:

The SPARSE option in PROC FREQ appears to be geared toward two way tables.  The PRELOADFMT option appears to also be geared toward two way tables.

PRELOADFMT would definitely solve your issue; SPARSE may not be helpful. Given your description of comparing things over time and periods, I'm not sure how you have a one-way table. You need to post more sample data that reflects your problem; this isn't enough to illustrate it beyond the standard use of PRELOADFMT.

## Re: PROC FREQ - Include Zero Counts

OK, @data_null__, that works. That gives me the results I need including the zero counts.  It runs pretty fast with a couple of variables, but really really slow with more.  I may have done something wrong there; not sure. If the slowness I'm noticing is just part and parcel of having 145 variables and about 800,000 - 1,000,000 records, then I can turn it into a macro or something and just put through a few variables at a time.

Jim

## Re: PROC FREQ - Include Zero Counts

@Reeza, I'm just producing two sets of one-way frequency counts, one for the current period, one for a prior period.  The one-way counts are then laid side-by-side for presentation purposes and Excel macros highlight any differences.

The solution that @data_null__ proposed is working, albeit slowly, so I won't post more detail at this juncture.

Jim

Ways 1;


## Re: PROC FREQ - Include Zero Counts

@data_null__ wrote:
Ways 1;


Sorry, you lost me @data_null__

Do what?

Jim

## Re: PROC FREQ - Include Zero Counts

@jimbarbour wrote:

@data_null__ wrote:
Ways 1;


Sorry, you lost me @data_null__

Do what?

Jim

OK @jimbarbour, WAYS 1 is what you need to get PROC SUMMARY to produce just the one-way tables.

http://support.sas.com/documentation/cdl/en/proc/69850/HTML/default/viewer.htm#n1affq2dctdc8un1eokb5...

Here is an example program that will handle up to 32,767 class variables.  The class variables can be either character or numeric, because the MLF option on the CLASS statement converts all class variables to character so they can be put in an array.  A second DATA step then normalizes the class variables into _NAME_ and _VALUE_ and drops the original class variables.  This is a more manageable data set, in my opinion.  The output can be passed to PROC FREQ with a BY statement if you like.

``````
PROC FORMAT;
    VALUE TV_MON_20YM (notsorted)
        0          = '0 months'
        1-2        = '1 to 2 months'
        3-4        = '3 to 4 months'
        5-6        = '5 to 6 months'
        7-8        = '7 to 8 months'
        9-10       = '9 to 10 months'
        11-12      = '11 to 12 months'
        13 - 99999 = '> 12 months'
    ;
RUN;

/* Test data: three class variables, one real value and one missing value */
data compfile;
    do TV_PS_PSA008 = 3, .;
        TV_PS_PSA006 = TV_PS_PSA008;
        TV_PS_PSA010 = TV_PS_PSA008;
        output;
    end;
run;

/* WAYS 1 requests only the one-way tables; CHARTYPE makes _TYPE_ a
   character string of 0s and 1s, one position per class variable */
proc summary data=compfile completetypes chartype;
    class TV_PS_PSA: / preloadfmt order=data missing mlf;
    format TV_PS_PSA: TV_MON_20YM.;
    ways 1;
    output out=counts;
run;

/* Normalize: one row per variable name and formatted value */
data counts;
    length _order_ 8 _name_ $32 _value_ $64;
    set counts;
    array tv[*] TV_PS_PSA:;
    drop tv:;
    _I_     = indexc(_type_, '1');     /* position of the active class variable */
    _order_ = length(_type_) - _i_;    /* matches the row order of the output */
    _name_  = vname(tv[_i_]);
    _value_ = tv[_i_];
run;

proc print;
run;

proc freq data=counts order=data;
    by _order_ _name_;
    tables _value_;
    weight _freq_ / zeros;
run;
``````

## Re: PROC FREQ - Include Zero Counts

The slowness probably comes from generating the printed results. Try turning on the NOPRINT option, or switch to listing output.
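A minimal sketch of both suggestions, assuming the `counts` data set and the `TV_MON_20YM.` format from the earlier posts. The output data set name `freq_out` is hypothetical, and the second option assumes HTML is the destination currently open. Because NOPRINT suppresses the displayed table, the counts are captured with OUT= instead; whether the zero-count rows carry through to that data set is worth checking on your own data.

```sas
/* Option 1: suppress displayed output and keep the counts in a data set */
proc freq data=counts order=data noprint;
    tables TV_PS_PSA008 / missing out=freq_out;  /* freq_out is a hypothetical name */
    weight _freq_ / zeros;
    format TV_PS_PSA008 TV_MON_20YM.;
run;

/* Option 2: keep the printed tables but route them to plain LISTING output */
ods html close;   /* assumes HTML is the open destination */
ods listing;
```

With 145 variables, the time spent rendering tables can easily exceed the time spent counting, so either option is cheap to try before restructuring the job.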

## Re: PROC FREQ - Include Zero Counts

@Reeza,  Ah.  Good idea.  I will try that.

Thank you,

Jim

## Re: PROC FREQ - Include Zero Counts

What appears to have been happening in PROC SUMMARY, before I added the WAYS 1 statement at @data_null__'s suggestion, was that in the output data set each variable's format categories were repeated for each category of the preceding variable, which were in turn repeated for each category of the variable preceding that, and so on. The result looks suspiciously like a Cartesian product.

With just 5 variables with 20ish format categories each, I wound up with more than 11,000,000 combinations.  It's hardly a wonder that it wouldn't work with 145 variables.
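As a rough illustration of why the full crossing blows up, the sketch below compares the number of output rows for a complete crossing of all class variables with the number for WAYS 1. The values of 5 variables and 20 categories are round numbers taken from the estimate above, not the real format sizes, which evidently were somewhat larger; the variable names are made up for the example.

```sas
data _null_;
    n_vars = 5;    /* number of CLASS variables (illustrative) */
    n_cats = 20;   /* assumed number of format categories per variable */

    full_cross_rows = n_cats ** n_vars;  /* every combination of categories */
    one_way_rows    = n_vars * n_cats;   /* one block of categories per variable */

    put full_cross_rows= comma15. one_way_rows= comma15.;
run;
```

With these round numbers the full crossing has 3,200,000 rows, while the one-way request has 100. The first figure grows exponentially with each added variable and the second only linearly.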

Jim
