PROC SUMMARY and MULTILABEL formats.

Respected Advisor
Posts: 3,799

I was trying to see how many two-way tables I could get out of PROC SUMMARY, and I noticed a performance issue with multilabel formats and the MLF option. I think my test must be flawed, but I can't figure out a way to get the same performance without MLF, even when the formats don't actually have multiple labels and just use the option, as in my example below.


data sample;
   array d[1500];
   do id = 1 to 100;
      trt = rantbl(8,.3,.3);
      do i = 1 to dim(d);
         d[i] = rantbl(8,.1,.05,.3,.4);
         end;
      output;
      end;
   run;

NOTE: The data set WORK.SAMPLE has 100 observations and 1503 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      user cpu time       0.02 seconds
      system cpu time     0.01 seconds
     

proc format;
   value dgrp(notsorted) 2='Group A' 1='Group C' 5='Group D' 4='Group E' 3='Group B';
   value trt(notsorted)  1='Placebo' 2='Active 1' 3='Active 2';
   run;

NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
     

proc summary data=sample chartype completetypes missing;
   class d: trt / preloadfmt order=data;
   format d: dgrp. trt trt.;
   types trt (d:)*trt;
   output out=_null_;* / levels ways;
   run;

NOTE: Multiple concurrent threads will be used to summarize data.
NOTE: There were 100 observations read from the data set WORK.SAMPLE.
NOTE: PROCEDURE SUMMARY used (Total process time):
      real time                         30.20 seconds
      user cpu time                     29.84 seconds
      system cpu time                   0.36 seconds
      memory                            186330.71k
      OS Memory                         207000.00k
      Timestamp                         01/13/2014 08:58:08 AM
      Page Faults                       1
      Page Reclaims                     0
      Page Swaps                        0
      Voluntary Context Switches        136
      Involuntary Context Switches      234
      Block Input Operations            1
      Block Output Operations           0
     

proc format;
   value dgrp(notsorted multilabel) 2='Group A' 1='Group C' 5='Group D' 4='Group E' 3='Group B';
   value trt(notsorted multilabel)  1='Placebo' 2='Active 1' 3='Active 2' /*1,2,3='Total'*/;
   run;

NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
     

proc summary data=sample chartype completetypes missing;
   class d: trt / preloadfmt mlf order=data;
   format d: dgrp. trt trt.;
   types trt (d:)*trt;
   output out=_null_;* / levels ways;
   run;

NOTE: Multiple concurrent threads will be used to summarize data.
NOTE: There were 100 observations read from the data set WORK.SAMPLE.
NOTE: PROCEDURE SUMMARY used (Total process time):
      real time                         3.42 seconds
      user cpu time                     4.14 seconds
      system cpu time                   0.50 seconds
      memory                            257941.45k
      OS Memory                         278520.00k
      Timestamp                         01/13/2014 08:58:11 AM
      Page Faults                       0
      Page Reclaims                     0
      Page Swaps                        0
      Voluntary Context Switches        157
      Involuntary Context Switches      43
      Block Input Operations            0
      Block Output Operations           0
     

Respected Advisor
Posts: 3,799

Re: PROC SUMMARY and MULTILABEL formats.

Posted in reply to data_null__

I was thinking that someone might comment.  I guess I should have made it a question.

Super User
Posts: 7,039

Re: PROC SUMMARY and MULTILABEL formats.

Posted in reply to data_null__

What is the question? The second summary appears to use less time and more memory. The extra memory might be attributable to the MLF option. The shorter run time might just be disk caching of your input data set.

Respected Advisor
Posts: 3,799

Re: PROC SUMMARY and MULTILABEL formats.

It's not disk caching.

As best I can tell, the performance difference is related to the use of MLF. I just think it is interesting and wondered if anyone had noticed it before, or whether someone from SAS had an explanation. It does take a lot of variables before the performance difference becomes noticeable.
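For anyone who wants to see where the difference starts to show, here is a rough sketch that wraps the two steps from the first post in a macro so the number of variables can be varied. The macro name %mlf_timing, its NVARS parameter, and the format names dgrpm and trtm are made up for this illustration; the multilabel formats get separate names only so both versions can exist at once. Timings are read from the log notes, so FULLSTIMER is switched on.

```sas
/* Sketch only: %mlf_timing, NVARS, dgrpm and trtm are hypothetical names. */
options fullstimer;   /* ask for the detailed resource notes in the log */

proc format;
   /* Same labels as the original post, without and with MULTILABEL */
   value dgrp  (notsorted)            2='Group A' 1='Group C' 5='Group D' 4='Group E' 3='Group B';
   value trt   (notsorted)            1='Placebo' 2='Active 1' 3='Active 2';
   value dgrpm (notsorted multilabel) 2='Group A' 1='Group C' 5='Group D' 4='Group E' 3='Group B';
   value trtm  (notsorted multilabel) 1='Placebo' 2='Active 1' 3='Active 2';
run;

%macro mlf_timing(nvars=100);
   /* Same sample data as the original post, with a variable count of &nvars */
   data sample;
      array d[&nvars];
      do id = 1 to 100;
         trt = rantbl(8,.3,.3);
         do i = 1 to dim(d);
            d[i] = rantbl(8,.1,.05,.3,.4);
         end;
         output;
      end;
   run;

   /* Run 1: ordinary formats, no MLF */
   proc summary data=sample chartype completetypes missing;
      class d: trt / preloadfmt order=data;
      format d: dgrp. trt trt.;
      types trt (d:)*trt;
      output out=_null_;
   run;

   /* Run 2: MULTILABEL formats plus the MLF option */
   proc summary data=sample chartype completetypes missing;
      class d: trt / preloadfmt mlf order=data;
      format d: dgrpm. trt trtm.;
      types trt (d:)*trt;
      output out=_null_;
   run;
%mend mlf_timing;

/* Compare the PROCEDURE SUMMARY notes in the log for each size */
%mlf_timing(nvars=100)
%mlf_timing(nvars=500)
%mlf_timing(nvars=1500)
```

Each call regenerates WORK.SAMPLE, so the two PROC SUMMARY steps within a call always read the same data; only the format definitions and the MLF option differ.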

SAS Super FREQ
Posts: 8,864

Re: PROC SUMMARY and MULTILABEL formats.

Posted in reply to data_null__


Hi:

  I didn't comment because you're not using MULTILABEL and MLF the way they are intended. Your user-defined format does not have any "overlapping" values, so I'm not sure what your test shows. The usual way to use MLF would be if you had 5 years in the data (1998 - 2002) and you wanted the year categories to be every year "by itself", then 1998-1999 as one category and 2000-2002 as a separate category. So essentially, you are "double counting" each year:

1998
1999
2000
2001
2002
1998 and 1999
2000 through 2002
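A minimal sketch of that kind of format, for illustration only: the data set SALES, the variables REP and AMOUNT, and the format name YRGRP are all made up here and are not from the test above.

```sas
/* Hypothetical data: three rows per year for 1998-2002 */
data sales;
   do year = 1998 to 2002;
      do rep = 1 to 3;
         amount = 100 * rep;
         output;
      end;
   end;
run;

proc format;
   /* Overlapping ranges are allowed because of MULTILABEL */
   value yrgrp (multilabel notsorted)
      1998        = '1998'
      1999        = '1999'
      2000        = '2000'
      2001        = '2001'
      2002        = '2002'
      1998 - 1999 = '1998 and 1999'
      2000 - 2002 = '2000 through 2002';
run;

proc summary data=sales nway;
   class year / mlf;          /* MLF tells the procedure to honor every matching label */
   format year yrgrp.;
   var amount;
   output out=by_year sum=total;
run;

proc print data=by_year noobs;
run;
```

With MLF in effect, each observation contributes to its own year row and to one of the two grouped rows, so BY_YEAR should contain seven rows rather than five.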

  You might have the same results, you might not. As we say in the Advanced Programming class -- Your Mileage May Vary.

cynthia

Respected Advisor
Posts: 3,799

Re: PROC SUMMARY and MULTILABEL formats.

Posted in reply to Cynthia_sas

I removed the "true" MULTILABEL from my example to make the two steps equivalent in the output they produce: the same number of observations from the same number of variables and crossings.

PROC FORMAT and PROC SUMMARY don't seem to care if you used MULTILABEL as it was intended or not.
