BookmarkSubscribeRSS Feed
data_null__
Jade | Level 19

I can undstand that PROC SUMMARY cannot preload a nested format, as the nested format may not have a finite number of levels.  eg.  Other=[F8.2].

Often I have a nested format in a value label format that is another value label format.  (don't ask just go with it)

Has anyone thought up an easy way to un-nest this into a new format.  Using CNTLOUT you can ID the nested levels HLO=F and LABEL=the format name in square brackets.

12 REPLIES 12
Astounding
PROC Star

I'm not sure you can do that, but there may be a workaround.  It might help to know a little more about how you are using the format.  For example, if it is with a CLASS variable you might be able to apply the format ahead of time:

data temp / view=temp;

set my_original_data;

new_var = put(old_var, my_nested_format.);

run;

proc summary data=temp;

class new_var;

This would change the order of the output categories.  But it's not clear if that is an issue or not (and there might be another workaround for that using ID variables).

Good luck.

data_null__
Jade | Level 19

Yes I want to PRELOADFMT the nested format.  But that is not allowed.  Your example tosses the whole preloadfmt bit. 

Astounding
PROC Star

Guilty as charged.  I was hoping for an easy solution.

If your formats define values only (not ranges), it shouldn't be too difficult to combine them into a single format.  (If either one contains ranges, this becomes a nasty problem.)  Create a CNTLOUT data set for each format, and eliminate any reference to the nested format in the "outer" CNTLOUT data set.  Then merge the two CNTLOUT data sets BY START.  If the nested format was the OTHER= definition for the "outer" format, take the outer format's LABEL when there is a match.  Use the result as a CNTLIN= data set.  That gives you a single format you can preload.

Does this sound about right?

FriedEgg
SAS Employee

Is this a reasonable assumption of your example problem?

proc format;

value mage

  0 - 2 = 'Infant'

  3 - 4 = 'Toddler'

  5 - 10 = 'Young Child'

  11 - 15 = 'Tween'

  16 - 19 = 'Teenager'

  other = 'Non Minor'

;

value mlfage (multilabel)

  0 - 17 = 'Minor'

  0 - 19 = [mage]

  10 - 19 = 'Denarian'

  18 - 29 = 'Young Adult'

  20 - 29 = 'Vicenarian'

  30 - 39 = 'Tricenarian'

  30 - 49 = 'Adult'

  50 - 59 = 'Quadragenarian'

  50 - 64 = 'Middle-Aged Adults'

  60 - 69 = 'Sexagenarian'

  65 - high = 'Senior'

  70 - 79 = 'Septuagenarian'

  80 - 89 = 'Octogenarian'

  90 - 99 = 'Nonagenarian'

  other = '***ERROR***'

;

run;

proc sql;

create view class as select sex,height,weight,case(rand('TABLE',.33,.33,.33)) when 1 then age*2 when 2 then age when 3 then age/2 end as age from sashelp.class order by sex;

quit;

proc summary data=class print n mean max min range std nonobs nway fw=8;

by sex;

var height weight;

class age /mlf order=data;

format age mlfage.;

run;

Preloadfmt I do not think is necessary for proc summary when using multilabel formats, use mlf option instead, unlike with proc report.  This produces the expected results above.  This however will not work if the nested format is also multilabel.

Astounding
PROC Star

The discussion seems to have died out a bit here.  Here's a set of assumptions, plus the program format that you can work with.

The assumptions are at least within the realm of reason:

  - The "outer" format defines values only, not ranges.

  - The "outer" format uses the "inner" format for its OTHER= definition.

  - The "inner" format defines values only, not ranges, except that it can define an OTHER= category.

I know those are restrictive, and may not fit what you need to do.  But if the assumptions are valid, the code is pretty straightforward.

proc format on the "outer" format, CNTLOUT=outer (where=(HLO ne 'O'));

proc format on the "inner" format, CNTLOUT=inner;

If necessary (I forget the order of CNTLOUT= data sets):

proc sort data=outer; by start; run;

proc sort data=inner; by start; run;

data combine;

   set inner outer;

   by start;

   if last.start;

   fmtname='Combined';

run;

proc format cntlin=combine;

run;

My recollection is that if you use a multilabel format, you get different results.  The same observation can be counted twice, and contribute to the count of two separate cells of your report.

Good luck.

FriedEgg
SAS Employee

Astounding wrote:

My recollection is that if you use a multilabel format, you get different results.  The same observation can be counted twice, and contribute to the count of two separate cells of your report.

Yes, that is the intentional purpose for it's use.  Judging by what you said in you explanation of the problem however it is a differnt question than the one I answered...

data_null__
Jade | Level 19

Astounding wrote:

The discussion seems to have died out a bit here.  Here's a set of assumptions, plus the program format that you can work with.


Yes sorry,  I've enjoyed the discussion and there are some good ideas.  The problem may be to difficult for the value it returns.  I was thinking of the simplest case where both inner and outer are simple one to one value labels.   I may return to this at a later time.

Tom
Super User Tom
Super User

Sounds like you need to build up the list of values by traversing the tree defined by the nested structure of your formats.  I have done this with macro loops of SQL queries for both path analysis of social contacts for criminology research and program dependencies for system development.  I also had some success using the NETDRAW procedure from SAS/OR to do the analysis.

Astounding
PROC Star

There's definitely a question or two about what needs to be solved.

If this is a one-time effort that has to process ranges, and should not use multilabel formats, and can stand to have human intervention applied, here's a pragmatic approach.  Combine the two CNTLOUT= data sets, but eliminate from the INNER definition just the records that identify exactly the same range (match on BOTH START and END).  Then try to use the result as a CNTLIN= data set and let SAS give you an error message for the ranges that overlap.  Modify the CNTLIN= data set to eliminate the overlap and try again.  Repeat until the errors stop.

Here's an example of a difficult situation when ranges come into play.  The outer format definition includes:

15 - 25 = 'Small'

28 - 32 = 'Medium'

And the inner format definition includes:

20 - 30 = 'Young'

So the inner range has to change to the equivalent of:

25 <- < 28 = 'Young'

There's no easy, automated solution that I see.  Tip of the hat to anyone who can do it.

data_null__
Jade | Level 19

Not sure I understand your point the only time the inner for is called is when the range of the outter includes =[inner.].  Like I said I probably don't undrestand.

Astounding
PROC Star

Sorry, I can make that clearer.  Here's a more complete example:

proc format;

value outer 15 - 25 = 'Small'    28 - 32 = 'Medium'  Other = [inner.};

value inner 20 - 30 = 'Young';

run;

Now a few sample mappings would be:

25 = Small

29 = Medium

27 = Young

The full inner range of 20 - 30 cannot be used.  Because inner is the OTHER= definition for outer, it should only be applied to values that outer has not defined.  In this case, that would limit the application of inner to values greater than 25 and less than 28.

Tom
Super User Tom
Super User

Data _null_ -

   Here a really simple example of pulling two nested formats and merging into one.

    I have manually assigned the ORDER variable in this case.  That is where graph analysis would allow automation and circular reference detection.

- Tom

proc format ;

  value outer 1='One(outer)' 3='Three(outer)' other=[inner.];

  value inner 1='One(inner)' 2='Two(inner)' ;

run;

proc format noprint lib=work cntlout=test1;

select inner outer;

run;

data test2;

  set test1;

  if fmtname='OUTER' then order=2;

  else order=1;

  if hlo='OF' then delete;

run;

proc sort ;

  by start end order;

run;

data test3;

  update test2 (obs=0) test2(keep=fmtname start end label);

  by start end;

  fmtname='NEW';

run;

proc format cntlin=test3 lib=work; run;

proc format lib=work fmtlib;

  select inner outer new;

run;

----------------------------------------------------------------------------

|       FORMAT NAME: INNER    LENGTH:   10   NUMBER OF VALUES:    2        |

|   MIN LENGTH:   1  MAX LENGTH:  40  DEFAULT LENGTH  10  FUZZ: STD        |

|--------------------------------------------------------------------------|

|START           |END             |LABEL  (VER. 8.2     08FEB2012:14:18:17)|

|----------------+----------------+----------------------------------------|

|               1|               1|One(inner)                              |

|               2|               2|Two(inner)                              |

----------------------------------------------------------------------------

----------------------------------------------------------------------------

|       FORMAT NAME: NEW      LENGTH:   12   NUMBER OF VALUES:    3        |

|   MIN LENGTH:   1  MAX LENGTH:  40  DEFAULT LENGTH  12  FUZZ: STD        |

|--------------------------------------------------------------------------|

|START           |END             |LABEL  (VER. 8.2     08FEB2012:14:18:17)|

|----------------+----------------+----------------------------------------|

|               1|               1|One(outer)                              |

|               2|               2|Two(inner)                              |

|               3|               3|Three(outer)                            |

----------------------------------------------------------------------------

----------------------------------------------------------------------------

|       FORMAT NAME: OUTER    LENGTH:   40   NUMBER OF VALUES:    3        |

|   MIN LENGTH:   1  MAX LENGTH:  40  DEFAULT LENGTH  40  FUZZ: STD        |

|--------------------------------------------------------------------------|

|START           |END             |LABEL  (VER. 8.2     08FEB2012:14:18:17)|

|----------------+----------------+----------------------------------------|

|               1|               1|One(outer)                              |

|               3|               3|Three(outer)                            |

|**OTHER**       |**OTHER**       |[INNER40.]                              |

----------------------------------------------------------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1616 views
  • 0 likes
  • 4 in conversation