BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SivaKizildag
Fluorite | Level 6

I make a PROC TABULATE giving the number of occurrences per region (REGION), per sector (MACTIV) and per month. So, as seen in the table below, there are 113 occurrences in region 01, in the secondary activities, in april 2009.

 

01.jpg

 

But, I need the labels per row to be repeated, because, I export this PROC TABULATE to an Excel file using ODS EXCEL FILE and use anoter Excel file takes the needed values with an INDEX-MATCH. This INDEX-MATCH looks up for the month by column (no problem) but also the REGION and the MACTIV per row. So, in order to have the row labels to be repeated, I use the same PROC TABULATE, adding the BY REGION statement.

 

       

    PROC TABULATE DATA = data;

    option missing=0;

    BY REGION;

    CLASS REGION MACTIV DATE / MLF MISSING ;

    TABLE (MACTIV *REGION), DATE  / printmiss nocellmerge ;

    RUN;

 

 

 

This allows me to have one table for each region, and having both REGION and MACTIV labels to be repeated. Therefore, I can use the INDEX-MATCH.

 

02.jpg

 

But, this time, the problem is that the tables are not same. If, there is a region that has no occurrences for some months or sectors, these rows & columns won’t appear. I didn't have this problem in the first PROC TABULATE, thanks to the PRINTMISS option. But, afer adding BY REGION statement, the tables are not standardized. Below, you can see the tables for region 04 and 06 that are very different :

 

03.JPG

 

My question is; how can I have standardized tables (same rows and columns, as in regular PROC TABULATE) and also repeated label names (as in the PROB TABULATE with the BY statement)?

 

Thank you very much,

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

There are a couple of ways to force levels of a variable that do not appear in some combination of data to be in proc tabulate output.

One way is to have a format that has all of the levels desired and use the PRELOADFMT option on the class statement for the variable and then have option PRINTMISS in the table options. Maybe

        PROC TABULATE DATA = data;
        option missing=0;
        BY REGION;
        CLASS REGION MACTIV / MLF MISSING preloadfmt;
        class date;
        TABLE (MACTIV *REGION), DATE  / printmiss nocellmerge ;
        RUN;

You did not indicate which variable actually uses a multilable format. Generally I segregate those from other variables in a different class statement to make it clear which variable, if any, is actually using the multilabel format. The order of code creating the MLF plus options like order can affect the appearance of output so isolating such can help id sources of appearance issues.

 

HINT: do not use Preloadfmt with date values in general as that would be calling for every single possible date and SAS won't do that (years 1581 to 20,000 would be a big table).

 

Another approach is to have a CLASSDATA data set that contains all of the combinations of variables that you want to appear in the table. This data set would then be an option on the Proc statement such as Classdata=mycombinationdataset.

Which likely should be a permanent data set so you don't have to recreate it every time.

View solution in original post

6 REPLIES 6
ballardw
Super User

There are a couple of ways to force levels of a variable that do not appear in some combination of data to be in proc tabulate output.

One way is to have a format that has all of the levels desired and use the PRELOADFMT option on the class statement for the variable and then have option PRINTMISS in the table options. Maybe

        PROC TABULATE DATA = data;
        option missing=0;
        BY REGION;
        CLASS REGION MACTIV / MLF MISSING preloadfmt;
        class date;
        TABLE (MACTIV *REGION), DATE  / printmiss nocellmerge ;
        RUN;

You did not indicate which variable actually uses a multilable format. Generally I segregate those from other variables in a different class statement to make it clear which variable, if any, is actually using the multilabel format. The order of code creating the MLF plus options like order can affect the appearance of output so isolating such can help id sources of appearance issues.

 

HINT: do not use Preloadfmt with date values in general as that would be calling for every single possible date and SAS won't do that (years 1581 to 20,000 would be a big table).

 

Another approach is to have a CLASSDATA data set that contains all of the combinations of variables that you want to appear in the table. This data set would then be an option on the Proc statement such as Classdata=mycombinationdataset.

Which likely should be a permanent data set so you don't have to recreate it every time.

SivaKizildag
Fluorite | Level 6

Thank you very much!

 

The PRELOADFMT really made a difference for the rows. Using the code you entered, I was able to force SAS to put all the possible labels for the rows.

 

The problem is, I cannot do the same for the columns (dates); I still only have May 2018.

 

04.JPG

 

So, I tried using CLASSDATA. I defined a Classdata using the dates of my input datafile, My_data:

 

Data level;
Set My_data;
keep DATE REGION MACTIV;
Run;

 Then used the PROC TABULATE : 

 

PROC TABULATE DATA = My_data

MISSING classdata=level;
option missing=0;
by REGION;
CLASS REGION MACTIV DATE / MLF MISSING PRELOADFMT; FORMAT MACTIV $MACTIV_TAB. ;
TABLE ( MACTIV*REGION), (DATE) / printmiss nocellmerge ; RUN;

 

This gives me a humongous table with all the possible combinations by row and by column, like this: 

 

05.JPG

 

So I think I am close to the solution, but this is not it. I thought, maybe I should only keep the DATE varaible, in the CLASSDATA,but in that case, the PROC TABULATE says it does not find the two other variables, when I do like this:

 

Data level;
Set My_data;
keep DATE ;
Run;

 

Thank you very much again! 

ballardw
Super User

Did you try moving the DATE variable to a different Class statement without the Preloadfmt option?

 

I believe I did say you don't want to use Preloadfmt with dates in general. You should get a note in the log like:

WARNING: The format for variable Date cannot be preloaded. A finite set of formatted values
         cannot be produced from the format.  The format is not recognized, is a SAS format,
         calls a function, or contains a nested format in its definition. Preload will have no
         effect.

Classdata sets have to have all of the values of the variables in involved. So if you are using formats to create groups from date values then your class data would require all the days, not just an example date.

And I've never messed around with mixing Preloadfmt and Classdata so you may be getting an odd interaction.

But without actual data or the complete code used I don't know what else to look at.

 

 

SivaKizildag
Fluorite | Level 6

Thank you very much again for your helpfulness. 

 

To answer your question, the MLF statement concerns the MACTIV variable. So, when I try two separate CLASS statements, as you suggested, it still doesn't change much:

 

PROC TABULATE DATA = data;
option missing=0;
by REGION;
CLASS  MACTIV  / MLF MISSING preloadfmt ; FORMAT MACTIV $MACTIV. ; 
CLASS REGION DATE/ MISSING  ;
TABLE (
(MACTIV)*(REGION)
), (DATE='')   / printmiss nocellmerge ; RUN;

 

 

Speaking of having the data; maybe it will be easier, with an example from sashelp.cars.

 

I attached the code and here is what it does.

 

The first PROC TABULATE has the advantage of having repeating labels, which is needed for the INDEX-MATCH. But, its flaw is that SAS only gives the non missing values. 

 

data cars;
set sashelp.cars;  
run;

proc sort data=cars;
by make;
run;

/*Doesn't give all attributes; I would like a table with 3 continents by column 
(Europe, Asia, USA) and every car type (Sedan, SUV, Wagon, Sports...)*/
PROC TABULATE DATA = cars;
option missing=0;
by make;
CLASS make type Type Origin / mlf MISSING  ; 
TABLE (
(type*make)
), (Origin='')   / printmiss nocellmerge ; RUN;

 

So, in order to have all the 3 continents by colum, and every type of car (Sedan, SUV, Wagon, Sports...), I use CLASSDATA, as you suggested:

 

Data level; 
set cars;
keep make type Type Origin;
Run;

PROC TABULATE DATA = cars MISSING classdata=level;
option missing=0;
by make;
CLASS make type Type Origin / mlf MISSING  ; 
TABLE (
(make*type)
), (Origin='')   / printmiss nocellmerge ; RUN;

But this gives the table that I call humongous. Isn't there a midway having:

1. all the columns like in the last table

2. only the concerned MAKEs, that is the first 6 rows for Acura
3. repeated labels 

 

ballardw
Super User

Using the Sashelp.cars data, draw a table in a word processor of what you expect the output to look like.

Label the columns and rows with the variables/values. Then post that.

 

 

 

 

 

SivaKizildag
Fluorite | Level 6

This is the output I would like to have (made with collage, so the data is irrelevant 😄 😞

07.png

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1829 views
  • 1 like
  • 2 in conversation