
PROC FREQ - Include Zero Counts

I have a set of 145 numeric variables whose counts I want to compare period to period.  For example, I might want to compare January of this year to January of last year or January of this year to, say, April of last year.  Easy enough to get the counts using "one way" tables in PROC FREQ, but when displayed side by side, the rows don't necessarily line up.  If a particular period doesn't have any instances in a given range, then PROC FREQ will omit that range.

 

Here's an example:

PROC FORMAT;
    VALUE TV_MON_20YM 
	0 = '0 months'
	1-2 = '1 to 2 months'
	3-4 = '3 to 4 months'
	5-6 = '5 to 6 months'
	7-8 = '7 to 8 months'
	9-10 = '9 to 10 months'
	11-12 = '11 to 12 months'
	13 - 99999 = '> 12 months'
	;

RUN;

PROC	FREQ	DATA=Comp_Lib.&Comp_File;
	TABLES	TV_PS_PSA008	/	MISSING;
	FORMAT	TV_PS_PSA008	TV_MON_20YM.;
RUN;

The most common problem we experience is that there will be no instances of '0 months' for one of the periods being compared. The rows then don't line up period to period.  

 

Is there a way to have SAS print a '0 months' row even when there are no occurrences that have a value of zero?  The SPARSE option in PROC FREQ appears to be geared toward two-way tables.  The PRELOADFMT option appears to also be geared toward two-way tables.  I just have simple one-way tables.  If someone could point me in the right direction, that would be most helpful.

 

Thank you,

 

Jim


Accepted Solutions
Solution
01-17-2017 09:29 PM

Re: PROC FREQ - Include Zero Counts

This is how I would do it.

 

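/* NOTSORTED keeps the ranges in the order written; ORDER=DATA with PRELOADFMT relies on that order below */
PROC FORMAT;
   VALUE TV_MON_20YM(notsorted)
      0 = '0 months'
      1-2 = '1 to 2 months'
      3-4 = '3 to 4 months'
      5-6 = '5 to 6 months'
      7-8 = '7 to 8 months'
      9-10 = '9 to 10 months'
      11-12 = '11 to 12 months'
      13 - 99999 = '> 12 months'
   ;
   RUN;
/* One-row test data set: only the '3 to 4 months' range occurs */
data compfile;
   TV_PS_PSA008=3;
   run;
/* PRELOADFMT with COMPLETETYPES outputs every format range, with _FREQ_=0 for ranges absent from the data */
proc summary data=compfile nway completetypes;
   class TV_PS_PSA008 / preloadfmt order=data missing;
   FORMAT TV_PS_PSA008 TV_MON_20YM.;
   output out=counts;
   run;
proc print;
   run;
/* Weight by _FREQ_; the ZEROS option keeps the zero-count rows in the table */
PROC FREQ DATA=counts order=data;
   TABLES TV_PS_PSA008 / MISSING;
   weight _freq_ / zeros;
   FORMAT TV_PS_PSA008 TV_MON_20YM.;
   RUN;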



All Replies

Re: PROC FREQ - Include Zero Counts


jimbarbour wrote:

 The SPARSE option in PROC FREQ appears to be geared toward two way tables.  The PRELOADFMT option appears to also be geared toward two way tables.  


PRELOADFMT would definitely solve your issue; SPARSE may not be helpful. Given your description of comparing things over time and periods, I'm not sure how you have a one-way table. You need to post some more sample data that reflects your problem; this isn't enough to illustrate it beyond the standard advice to use PRELOADFMT.
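
For reference, here is a minimal sketch of what the "standard" PRELOADFMT approach can look like for a one-way count, using PROC TABULATE rather than PROC FREQ. The format is the one from the original post; the one-row COMPFILE data set is a made-up test case, not the poster's data, and MISSTEXT='0' is only there as a safeguard in case an empty cell would otherwise print as missing.

```sas
proc format;
   value tv_mon_20ym (notsorted)
      0          = '0 months'
      1-2        = '1 to 2 months'
      3-4        = '3 to 4 months'
      5-6        = '5 to 6 months'
      7-8        = '7 to 8 months'
      9-10       = '9 to 10 months'
      11-12      = '11 to 12 months'
      13 - 99999 = '> 12 months'
   ;
run;

/* Hypothetical test data: a single observation in the '3 to 4 months' range */
data compfile;
   TV_PS_PSA008 = 3;
run;

/* PRELOADFMT loads every format range as a class level;            */
/* PRINTMISS asks TABULATE to print levels that have no observations */
proc tabulate data=compfile;
   class TV_PS_PSA008 / preloadfmt order=data missing;
   table TV_PS_PSA008, n / printmiss misstext='0';
   format TV_PS_PSA008 tv_mon_20ym.;
run;
```

PRELOADFMT only takes effect in PROC TABULATE when it is paired with PRINTMISS, ORDER=DATA, or EXCLUSIVE, which is why both the CLASS and TABLE statements carry options here.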

 

 


Re: PROC FREQ - Include Zero Counts

OK, @data_null__, that works. That gives me the results I need including the zero counts.  It runs pretty fast with a couple of variables, but really really slow with more.  I may have done something wrong there; not sure. If the slowness I'm noticing is just part and parcel of having 145 variables and about 800,000 - 1,000,000 records, then I can turn it into a macro or something and just put through a few variables at a time.

 

Jim


Re: PROC FREQ - Include Zero Counts

@Reeza, I'm just producing two sets of one-way frequency counts, one for the current period, one for a prior period.  The one-way counts are then laid side-by-side for presentation purposes and Excel macros highlight any differences.

 

The solution that @data_null__ proposed is working, albeit slowly, so I won't post more detail at this juncture.

 

Thanks for your input,

 

Jim


Re: PROC FREQ - Include Zero Counts

Ways 1;


Re: PROC FREQ - Include Zero Counts


data_null__ wrote:
Ways 1;


Sorry, you lost me @data_null__

 

Do what?

 

Jim


Re: PROC FREQ - Include Zero Counts

OK @jimbarbour, WAYS 1 is the statement you need to get PROC SUMMARY to produce only the one-way tables.

 

http://support.sas.com/documentation/cdl/en/proc/69850/HTML/default/viewer.htm#n1affq2dctdc8un1eokb5...

 

Here is an example program that will handle up to 32767 class variables.  The class variables can be either character or numeric, because the MLF option on the CLASS statement converts all class variables to character so they can be put in an array.  A second DATA step then normalizes the class variables into _NAME_ and _VALUE_ and drops the class variables.  This is a more manageable data set, in my opinion.  The output can be passed to PROC FREQ with a BY statement if you like.

 

PROC FORMAT;
   VALUE TV_MON_20YM(notsorted)
      0 = '0 months'
      1-2 = '1 to 2 months'
      3-4 = '3 to 4 months'
      5-6 = '5 to 6 months'
      7-8 = '7 to 8 months'
      9-10 = '9 to 10 months'
      11-12 = '11 to 12 months'
      13 - 99999 = '> 12 months'
   ;
   RUN;
data compfile;
   do TV_PS_PSA008=3,.;
      TV_PS_PSA006=TV_PS_PSA008;
      TV_PS_PSA010=TV_PS_PSA008;
      output;
      end;
   run;
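/* WAYS 1 limits the output to the one-way types; CHARTYPE makes _TYPE_ a character string of 0s and 1s */
proc summary data=compfile completetypes chartype;
   class TV_PS_PSA: / preloadfmt order=data missing mlf;
   FORMAT TV_PS_PSA: TV_MON_20YM.;
   ways 1;
   output out=counts;
   run;
/* Normalize to one row per variable name and formatted value */
data counts;
   length _order_ 8 _name_ $32 _value_ $64;
   set counts;
   array tv[*] TV_PS_PSA:;
   drop tv:;
   _I_ = indexc(_type_,'1');       /* position of the single 1 = index of the class variable */
   _order_ = length(_type_)-_i_;   /* 0 for the last class variable; follows the ascending _TYPE_ order of the rows */
   _name_  = vname(tv[_i_]);
   _value_ = tv[_i_];
   run;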
proc print;
   run;
proc freq data=counts order=data;
   by _order_ _name_;
   tables _value_;
   weight _freq_ / zeros;
   run;
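
To make the _TYPE_ arithmetic in the second DATA step easier to follow, here is a small stand-alone sketch. It assumes three class variables, so that CHARTYPE produces a three-character _TYPE_, and it hard-codes the value '010' (the one-way type for the second class variable) purely for illustration.

```sas
/* Illustration only: _TYPE_ is hard-coded rather than read from PROC SUMMARY output */
data _null_;
   _type_  = '010';                      /* one-way type for the 2nd of 3 class variables */
   _i_     = indexc(_type_, '1');        /* position of the '1' in the string */
   _order_ = length(_type_) - _i_;       /* distance from the right-hand end  */
   put _i_= _order_=;                    /* writes _i_=2 _order_=1 to the log */
run;
```

Because PROC SUMMARY writes its output in ascending _TYPE_ order ('001' before '010' before '100'), _ORDER_ increases in the same order as the rows arrive, which is what lets the BY statement in the final PROC FREQ run without a sort.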



Re: PROC FREQ - Include Zero Counts

The slowness probably comes from generating the printed results. Try turning on the NOPRINT option, or switch to listing output.
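
A rough sketch of both suggestions follows. The two-row COUNTS data set is a made-up stand-in for the normalized data set built earlier in the thread, FREQ_OUT is a hypothetical output name, and the ODS statements assume that HTML is the destination currently open. It is worth confirming on real data that the zero-count rows appear in FREQ_OUT.

```sas
/* Hypothetical stand-in for the normalized COUNTS data set */
data counts;
   length _order_ 8 _name_ $32 _value_ $64 _freq_ 8;
   _order_ = 0;
   _name_  = 'TV_PS_PSA008';
   _value_ = '0 months';      _freq_ = 0; output;
   _value_ = '3 to 4 months'; _freq_ = 1; output;
run;

/* Option 1: suppress printed tables and write the counts to a data set */
proc freq data=counts order=data noprint;
   by _order_ _name_;
   tables _value_ / out=freq_out;
   weight _freq_ / zeros;
run;

/* Option 2: keep the printed tables but route them to the plain-text listing */
ods html close;
ods listing;
proc freq data=counts order=data;
   by _order_ _name_;
   tables _value_;
   weight _freq_ / zeros;
run;
```

Since the PROC SUMMARY output already holds the counts, the data set route may be all that is needed when the results are headed for Excel anyway.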


Re: PROC FREQ - Include Zero Counts

@Reeza,  Ah.  Good idea.  I will try that.

 

Thank you,

 

Jim


Re: PROC FREQ - Include Zero Counts

@Reeza,

 

What appears to have been happening in the PROC SUMMARY, before I added the WAYS 1 statement at @data_null__'s suggestion, was that, in the output data set, each variable's format categories were repeated for each format category of the preceding variable, which were in turn repeated for each category of the variable preceding that, and so on, winding up with something that looks suspiciously like a Cartesian product.

 

With just 5 variables with 20ish format categories each, I wound up with more than 11,000,000 combinations.  It's hardly a wonder that it wouldn't work with 145 variables.

 

Jim
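
A back-of-the-envelope check of the blow-up described above, as a sketch. It assumes NWAY with COMPLETETYPES, so that the output holds one row per combination of categories, and it assumes 26 categories per variable, a figure chosen only because it is "20ish" and lands above 11,000,000; the real formats will differ.

```sas
/* Illustrative arithmetic only; C = 26 is an assumed category count */
data _null_;
   k = 5;                    /* number of class variables */
   c = 26;                   /* assumed categories per variable */
   nway_rows  = c ** k;      /* full cross-product: 26**5 = 11,881,376 */
   ways1_rows = c * k;       /* one-way types only: 26*5 = 130 */
   put nway_rows= comma15. ways1_rows=;
run;
```

The cross-product grows as a product of the category counts, while WAYS 1 grows as their sum, which is why the one-way version stays small even with 145 variables.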


[Screenshot: Cartesian product of formats]
