BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MrBoo
Calcite | Level 5

Hi Everyone. I use this site extensively, so thank you in advance for all of the questions you have answered already. This is my first occasion to write a question of my own.

 

In attempting to create columns that will return missing/zero counts in PROC TABULATE, I have run across the technique of substituting a PRELOADFMT dataset where the analysis set returns zero counts but does not display the column. I have had problems with this technique because my analysis set generates variable values for specific dates that are tabulated based on when the report is run. My reports include work accomplished last week, last month, and last year; all terminating last Saturday. ACTIONDT is the variable stating when the action took place in the dataset.

 

I use a loop to assign each date the work was accomplished to the appropriate time period for each date column used in the reporting. That is, I dynamically generate date variables' values based on existing data. I require the values as formatted for reporting purposes as I could not determine how to assign it thusly to a value labels. Here is how I do it:

 

data SASUSER.STEP002 ;
	set SASUSER.STEP001 ;

/** ROW_VALUE = Set value for tabulation                                    **/
/** ACTIONS   = Initialize variable with a value to be set to undefined     **/
ROW_VALUE = 1 ;
ACTIONS = 0 ;
STARTDATE = intnx('year',today(),0,'b') ;   /*\ start date for analysis is first day of the year \*/
ENDDATE = intnx('week',today(),-1,'e') ;    /*\ end date is the end of last week                 \*/
LOOPDATE = STARTDATE ;			    /*\ loop set to start date for project analysis      \*/
LOOPEND = intck('day',STARTDATE,ENDDATE) ;  /*\ used to set the number of loops                  \*/
WEEKNM = week(STARTDATE) ;                  /*\ returns number of week within year               \*/
MNTHNM = month(STARTDATE) ;                 /*\ returns number of month wihtin year              \*/
WEEKDT = '31Dec2999'd ;                     /*\ set to create noticeable outlier if not recoded  \*/
 
format STARTDATE ENDDATE LOOPDATE WEEKDT ACTIONDT mmddyys10. ; 

/*****************************************************************************/
/** LOOP FOR DATE CODING PURPOSES. Loop runs from StartDate to EndDate by   **/
/** using INTCK value for loop terminus. 									**/
/*****************************************************************************/
data SASUSER.STEP002 ;
	set SASUSER.STEP002 ;

DO I = 0 TO LOOPEND BY 1 ; /*\ Loop advances days by one from Start to End Date \*/
IF ACTIONDT = ( LOOPDATE + I ) THEN WEEKNM = week( ACTIONDT ) ; /*\ Generates week number based on Action Out Date   \*/
IF ACTIONDT = ( LOOPDATE + I ) THEN WEEKDT = intnx('week',ACTIONDT,0,'b') ; /*\ Generates date of beginning of week    \*/
IF ACTIONDT = ( LOOPDATE + I ) THEN MNTHNM = month(ACTIONDT ) ;   /*\ Generates month number based on Action Out Date  \*/
END ;
run ;

 

The problem I have is in  creating a PRELOADFMT substitute dataset. I cannot generate the same dates as in the analysis dataset because I lack the dates in the substitute set upon which the PROC TABULATE variables are based.

 

DO WEEKDT = value1, value2, value3, etc... ;

 

I do not have all of the values, which I think is key to my problem.

 

I have tried to create them dynamically based on just today's date and starting at the beginning of last year, but I cannot determine how to assign the values to a variable in a DO LOOP as part of the creation of the substitute dataset (here referred to as WORK.ALTERNATE).

 

 

proc tabulate MISSING CLASSDATA=WORK.ALTERNATE ;
title2 "Last Week" ;
var 	ROW_VALUE ;
class	USER ACTIONS FORM_NUMBER WEEKDT / preloadfmt order=data ;
table 	( USER ALL = 'Total' ) , 
		( WEEKDT * FORM_NUMBER = ' ' ) * 
		( ACTIONS = ' ' ALL = 'Total' ) 
		( ALL = 'Grand Total' )
		/ printmiss ;
where	intnx('week',today(),-1,'b') <= WEEKDT <= intnx('week',today(),-1,'e') ;
footnote1 "The week runs Sunday through Saturday." ;
run ;

 

 

Obviously, I am looking for suggestions on how to tackle this problem. To state it succinctly: I need to create a dataset that recreates all categorical values possible in the analysis dataset. This is being done to return zero counts so that dropped columns might be displayed in PROC TABULATE.

 

I appreciate all suggestions.

 

For reference purposes, I am using SAS EG 7.11.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

I think there may be two bits you are missing.

One is I don't see where you create a FORMAT and second you do not assign a format to the variable for the preloadfmt option to use.

 

I suspect you are wanting to use some of the result for from your STEP002 data set to create a CNTLIN data set for use by Proc Format.

 

The cntlin data set would have a character variable named FMTNAME to have the name of the format for each value,  a character variable named START with low end of a range of values, character variable named END with the high value of a range, a character variable named LABEL with the text to display and a character variable named TYPE to indicate numeric, value of N or character, value of C. There would be one record for each range of values.

Since you also want an "other" to represent a default value you would also need a record with the same fmtname value, the Label text you want a character variable named HLO to have the value "O" (capital letter o)

 

See if this gives you any help. It shows creating a daily format from a range of values in a data set and applying that format.

data have;
    do date='14Mar2017'd to '27Jan2018'd; 
      value= round(100*rand('uniform'),0.1);
      /* create a few missing dates to see if the preloadfmt works*/
      if mod(date,24) ne 0 then output;
      else put date= date9. " is missinging";
    end;
   format date date9.;
run;


proc sql;
   create table temp as
   select min(date) as mindate, max(date) as maxdate
   from have
   ;
quit;

data cntlinset;
   set temp;
   fmtname='MYDATE';
   TYPE='n';
   do dv = intnx('week',mindate,0,'B') to intnx('week',maxdate,0,'e');
      start=dv;
      label = put(dv,date9.);
      output;
   end;
   START=.;
   LABEL='31DEC2999';
   HLO='O';
   output;
   drop mindate maxdate;
run;
/* the cntlout is so you can see more about what could be in a
   cntlin data set*/
proc format library=work cntlin=cntlinset cntlout=work.cntlout;
run;

proc tabulate data=have;
   class date/preloadfmt;
   format date mydate.;
   var value;
   table date,
         value=sum
         /printmiss
   ;
run;

View solution in original post

8 REPLIES 8
ballardw
Super User

I think there may be two bits you are missing.

One is I don't see where you create a FORMAT and second you do not assign a format to the variable for the preloadfmt option to use.

 

I suspect you are wanting to use some of the result for from your STEP002 data set to create a CNTLIN data set for use by Proc Format.

 

The cntlin data set would have a character variable named FMTNAME to have the name of the format for each value,  a character variable named START with low end of a range of values, character variable named END with the high value of a range, a character variable named LABEL with the text to display and a character variable named TYPE to indicate numeric, value of N or character, value of C. There would be one record for each range of values.

Since you also want an "other" to represent a default value you would also need a record with the same fmtname value, the Label text you want a character variable named HLO to have the value "O" (capital letter o)

 

See if this gives you any help. It shows creating a daily format from a range of values in a data set and applying that format.

data have;
    do date='14Mar2017'd to '27Jan2018'd; 
      value= round(100*rand('uniform'),0.1);
      /* create a few missing dates to see if the preloadfmt works*/
      if mod(date,24) ne 0 then output;
      else put date= date9. " is missinging";
    end;
   format date date9.;
run;


proc sql;
   create table temp as
   select min(date) as mindate, max(date) as maxdate
   from have
   ;
quit;

data cntlinset;
   set temp;
   fmtname='MYDATE';
   TYPE='n';
   do dv = intnx('week',mindate,0,'B') to intnx('week',maxdate,0,'e');
      start=dv;
      label = put(dv,date9.);
      output;
   end;
   START=.;
   LABEL='31DEC2999';
   HLO='O';
   output;
   drop mindate maxdate;
run;
/* the cntlout is so you can see more about what could be in a
   cntlin data set*/
proc format library=work cntlin=cntlinset cntlout=work.cntlout;
run;

proc tabulate data=have;
   class date/preloadfmt;
   format date mydate.;
   var value;
   table date,
         value=sum
         /printmiss
   ;
run;
MrBoo
Calcite | Level 5
I will check this out. Thank you for your input.
I am trying to replicate what Mr. Cerussi is doing here:
http://support.sas.com/resources/papers/proceedings11/087-2011.pdf
ballardw
Super User

@MrBoo wrote:
I will check this out. Thank you for your input.
I am trying to replicate what Mr. Cerussi is doing here:
http://support.sas.com/resources/papers/proceedings11/087-2011.pdf

Please note that he variables the paper is using with classdata are not dates, time or datetimes and have a "natural" limited set of values.

Then the associated format used does not change. Preloadfmt with date, time and datetime values are going to be extremely problematic because the typical SAS date, time and datetime formats have extremely large number of possible displayed values. Note that times and datetimes can contain and differentiate between small fractions of second. Dates formats are defined for dates between the years 1581 and 20000 (yes 4 zeroes). So Preloadfmt with those is going create some conflicts.

 

Without data or an example of what the output should look like I am only addressing the preloadfmt part. There are interactions with classdata.

MrBoo
Calcite | Level 5

To respond to an earlier point, yes, some of the formatting was left out of the  PROC TABULATE statement where I was testing other approaches. I apologize for any confusion this may have caused.

 

I think Mr. Cerussi's approach should be applicable as I'm reducing to a finite number the instances of date and time. My loops take the date in which an action occurs and place them into a week or month bucket as appropriate. I am trying to replicate those buckets, but because I am dynamically generating their values, I am having a difficult time replicating this in the PRELOADFMT dataset.

 

Below are illustrations of the output.

PROC TABULATE Dropping Columns Week.pngPROC TABULATE Dropping Columns.png 

 

 

ballardw
Super User

@MrBoo wrote:

To respond to an earlier point, yes, some of the formatting was left out of the  PROC TABULATE statement where I was testing other approaches. I apologize for any confusion this may have caused.

 

I think Mr. Cerussi's approach should be applicable as I'm reducing to a finite number the instances of date and time. My loops take the date in which an action occurs and place them into a week or month bucket as appropriate. I am trying to replicate those buckets, but because I am dynamically generating their values, I am having a difficult time replicating this in the PRELOADFMT dataset.

 

Below are illustrations of the output.

 


I say again, there is not such thing as a Preloadfmt dataset. Preloadfmt refers to "preload a format". A variable would either have a permanent format assigned to use the Preloadfmt or there would be an explicit format in the proc tabulate code. But there is no data set associated with the option.

There is a CLASSDATA dataset option. And Classdata and Preloadfmt interact. How depends on the exact contents of the classdata, if using Exclusive option or not, the format and any order options,

 

 

Also which buckets are you attempting to dynamically create? With the pictures shown I am not at all sure.

Again, actual example data in the form of a data step might help to get to something resembling working code.

 

Your tabulate example shows 4 variables with the Preloadfmt option.

The only format WEEKDT in your SASUSER.Step002 data set has a format of mmddyys10. assigned. Preloadfmt attempting to use that format (I suspect it wouldn't actually) would try to create approximately (20000 - 1581) *365 values.

There is also no description or content for the Classdata data set named Work.alternate.

 

Your variables USER Action and Form_numbr likely can use a Preloadfmt.

 

This might be of some interest for what I think part of your date related issue might be:

This creates a format that shows the first day of the week in mm/dd/yyyy format for any given date value.

Warnings: the library location in the OPTIONS CMPLIB= statement must be available when the format is compiled AND used. I placed the results in the temporary work library as I don't need them hanging around. So if you try to use this as a permanent format without placing everything in permanent library with the format in a library in your current FMTSEARCH path you'll have problems.

 

Coupled with the where statement that tabulate might work BUT you have to have a format statement associating the variable and the format where you want it used.

 

 

proc fcmp outlib=work.funcs.fmts;
   function FirstDayWeek(date) $;
      return(Put(intnx('week',date,0,'B'),mmddyys10.));
   endsub;
run;

options cmplib=(work.funcs);

proc format library=work;
value FirstDay
low-high = [firstdayweek() ]
;
run;

data work.junk;
  x=today();
  put x firstday.;
  format x firstday.;
run;
MrBoo
Calcite | Level 5

This explains much of my problem: As you observed, I had interpreted PRELOADFMT as a dataset masquerading as a format.

 

I left out the CLASSDATA qualifier so that the code would match the output I provided.

 

I also left out the display of multiple weeks or months of data for the sake of reading it when displayed here. Obviously, other periods would expand the table horizontally.

 

Please pardon my ignorance. In having addressed the problem of reports including data based on today's date and running specified periods of time, I have introduced problems in reporting by columns with zero counts being dropped from the table. I am not quite grasping how PRELOADFMT or CLASSDATA quite work to solve this problem though my efforts have attempted to use these tools.

 

But I am in the process of trying out your suggestions. It will likely be next week before I have a chance to respond. Thank you very much for your assistance.

 

ballardw
Super User

The online help in the Learning to Use SAS Base SAS Sample programs has this example that shows use of PRELOADFMT and CLASSDATA separately and together which may help identify which you want.

 

proc format; 
   value regfmt(notsorted) 
      3='North' 
      4='South' 
      1='East' 
      2='West' 
      5='Texas' 
      6='California'; 
 
   value habfmt(notsorted) 
      2='Rural' 
      1='Urban' 
      3='Burbs'; 
 
   value sexfmt(notsorted) 
      2='Male' 
      1='Female'; 
run; 
 
 
  /*-------------------------------------------------------------*/ 
  /*-- Now we will concoct some data                           --*/ 
  /*-------------------------------------------------------------*/ 
data one; 
  retain seed1 543 seed2 1234; 
  drop j ind; 
  label y='Number of years of schooling'; 
  format region regfmt. sex sexfmt. habitat habfmt.; 
  do region = 1 to 4; 
     if (region = 1) then 
        do habitat = 1 to 3; 
           do sex = 1 to 2; 
              call ranuni( seed1, ind ); 
              ind = int( ind * 20 ) + 2; 
              do j= 1 to ind; 
                 call rannor( seed2, y ); 
                 y = int( y * 3 + 13 ); 
                 output; 
              end; 
           end; 
        end; 
     else 
        do habitat = 1 to 2; 
           do sex = 1 to 2; 
              call ranuni( seed1, ind ); 
              ind = int( ind * 20 ) + 2; 
              do j= 1 to ind; 
                 call rannor( seed2, y ); 
                 y = int( y * 3 + 13 ); 
                 output; 
              end; 
           end; 
        end; 
  end; 
  drop seed1 seed2; 
run; 
 
  /*-------------------------------------------------------------*/ 
  /*-- We are really only interested in seeing all combinations--*/ 
  /*-- for the the South and the West                          --*/ 
  /*-------------------------------------------------------------*/ 
data clsdat; 
   input region sex habitat; 
   format region regfmt. sex sexfmt. habitat habfmt.; 
   cards; 
2 1 3 
2 1 2 
2 1 1 
2 2 3 
2 2 2 
2 2 1 
4 1 3 
4 1 2 
4 1 1 
4 2 3 
4 2 2 
4 2 1 
run; 
 
 
  /*-------------------------------------------------------------*/ 
  /*-- Standard Tabulate of the data                           --*/ 
  /*-------------------------------------------------------------*/ 
  title2 'Standard V6 type Tabulate'; 
  proc tabulate data=one; 
    class region sex habitat; 
    var y; 
 
    table region*sex, habitat*y*(min max median); 
  run; 
 
  /*-------------------------------------------------------------*/ 
  /*-- First let us try out preload format trick               --*/ 
  /*-------------------------------------------------------------*/ 
  title2 'Using PreLoadFmt and PrintMiss'; 
  proc tabulate data=one; 
    class region sex habitat / PreLoadFmt; 
    var y; 
 
    table region*sex, habitat*y*(min max median) / PrintMiss; 
  run; 
 
  /*-------------------------------------------------------------*/ 
  /*-- Now let's preload our class data                        --*/ 
  /*-------------------------------------------------------------*/ 
  title2 'Using ClassData alone'; 
  proc tabulate data=one ClassData=clsdat; 
    class region sex habitat; 
    var y; 
 
    table region*sex, habitat*y*(min max median); 
  run; 
 
  /*-------------------------------------------------------------*/ 
  /*-- Let's add Exclusive to screen-out some obs              --*/ 
  /*-------------------------------------------------------------*/ 
  title2 'Using ClassData with Exclusive'; 
  proc tabulate data=one ClassData=clsdat Exclusive; 
    class region sex habitat; 
    var y; 
 
    table region*sex, habitat*y*(min max median); 
  run; 
 
  /*-------------------------------------------------------------*/ 
  /*-- what happens when we use Order=data now?                --*/ 
  /*-------------------------------------------------------------*/ 
  title2 'Using ClassData with Order=data'; 
  proc tabulate data=one ClassData=clsdat Exclusive; 
    class region sex habitat / order=data; 
    var y; 
 
    table region*sex, habitat*y*(min max median); 
  run; 
 
  /*-------------------------------------------------------------*/ 
  /*-- Using both PreLoadFmt and ClassData, which ordering     --*/ 
  /*-- wins?                                                   --*/ 
  /*-------------------------------------------------------------*/ 
  title2 'Using ClassData and PreLoadFmt with Order=data'; 
  proc tabulate data=one ClassData=clsdat Exclusive; 
    class region sex habitat / PreLoadFmt Order=data; 
    var y; 
 
    table region*sex, habitat*y*(min max median); 
  run; 
  title;
 
  /*-------------------------------------------------------------*/ 
  /*-- All done, let us take a look.                           --*/ 
  /*-------------------------------------------------------------*/ 
MrBoo
Calcite | Level 5
Thank you. Now that I have had a chance to review and revise my program, with your responses, I have been able to accomplish what I had set out to do. Thank you very much!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1955 views
  • 0 likes
  • 2 in conversation