BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Swarn30
Calcite | Level 5
I have to use proc summary multiple times in the program. so i wrote this macro. The macro is acting funny. Weight_var when ran simultaneously with weights given and no weights given, the macro is giving different results every time. Sometimes its right, sometimes its wrong. I know I have a lot of DO loops in this code, wanted to understand what is going wrong at times, maybe i am missing something fundamentally or is it too many loops confusing the code. I am writing two different macros where weight is present and where weight is not present for the program. but wanted to understand what is going wrong in this code at times for basics understanding. Thank you for the help.


%MACRO PROC_SUMMARY_CALS(INPUT_FILE, CLASS_VARS, WEIGHT_VAR, VARS, OPERATION_VARS, OUTPUT_FILE);
 
    PROC SUMMARY DATA = &INPUT_FILE. NWAY MISSING;
    CLASS &CLASS_VARS.;
    %IF &WEIGHT_VAR. NE . %THEN %DO;  /* ACTIVATE WEIGHT ONLY IF WEIGHT_VAR IS NOT EMPTY */
        WEIGHT &WEIGHT_VAR.;
    %END;
    VAR &VARS.;
%LET NUM_VARS = %SYSFUNC(COUNTW(&VARS.));    /* COUNTING THE NUMBER OF VARIABLES IN THE LIST */
    %LET NUM_OPS = %SYSFUNC(COUNTW(&OPERATION_VARS.)); /* COUNTING NUMBER OF OPERATIONS */
    OUTPUT OUT= &OUTPUT_FILE.
        %DO I = 1 %TO &NUM_OPS.;  /* LOOP FOR READING OPERATIONS LIST LIKE MEAN, SUM, ETC. ALL STATS TYPES */
            %LET OPERATION_ACTIVE = %SCAN(&OPERATION_VARS., &I);
            %DO J = 1 %TO &NUM_VARS.;  /* LOOP FOR READING VARIABLES TO PERFORM OPERATIONS ON */
                %LET VAR_ACTIVE = %SCAN(&VARS., &J);
%IF %BQUOTE(&WEIGHT_VAR.) NE %THEN %DO;
                &OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._VW
%END;
%ELSE %DO;
&OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._EW;
%END;
%END;
        %END;
    ;
    RUN;
 
%MEND PROC_SUMMARY_CALS;

%PROC_SUMMARY_CALS(INPUT_FILE=, CLASS_VARS=, WEIGHT_VAR=, VARS=, OPERATION_VARS=, OUTPUT_FILE=);
 
1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

I don't see anything that is obviously wrong, other than the previously mentioned extra semicolon.

 

I would write this differently, but it's certainly fine to use nested %DO loops.  You should add a %LOCAL statement, just to make sure you're not accidentally using global macro variables.

 

But I'm still not understanding the problem statement.  It shouldn't be 'sometimes right and sometimes wrong.' 

 

Here is a version of your code with a %LOCAL statement, and reformatted the  indenting a bit:

%MACRO PROC_SUMMARY_CALS(INPUT_FILE, CLASS_VARS, WEIGHT_VAR, VARS, OPERATION_VARS, OUTPUT_FILE);
    %local NUM_VARS NUM_OPS I J OPERATION_ACTIVE VAR_ACTIVE ;
   
    PROC SUMMARY DATA = &INPUT_FILE. NWAY MISSING;

    CLASS &CLASS_VARS.;
    %IF &WEIGHT_VAR. NE . %THEN %DO;  /* ACTIVATE WEIGHT ONLY IF WEIGHT_VAR IS NOT EMPTY -- this is not working like you intend, but it is probably not the problem*/
        WEIGHT &WEIGHT_VAR.;
    %END;

    VAR &VARS.;
  
    %LET NUM_VARS = %SYSFUNC(COUNTW(&VARS.));    /* COUNTING THE NUMBER OF VARIABLES IN THE LIST */
    %LET NUM_OPS = %SYSFUNC(COUNTW(&OPERATION_VARS.)); /* COUNTING NUMBER OF OPERATIONS */

    OUTPUT OUT= &OUTPUT_FILE.
        %DO I = 1 %TO &NUM_OPS.;  /* LOOP FOR READING OPERATIONS LIST LIKE MEAN, SUM, ETC. ALL STATS TYPES */
            %LET OPERATION_ACTIVE = %SCAN(&OPERATION_VARS., &I);
            %DO J = 1 %TO &NUM_VARS.;  /* LOOP FOR READING VARIABLES TO PERFORM OPERATIONS ON */
                %LET VAR_ACTIVE = %SCAN(&VARS., &J);
                %IF %BQUOTE(&WEIGHT_VAR.) NE %THEN %DO;
                    &OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._VW
                %END;
                %ELSE %DO;
                    &OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._EW
                %END;
            %END;
        %END;
    ;
    RUN;
 
%MEND PROC_SUMMARY_CALS;

I would suggest turning on the MPRINT system option, so you can see the PROC SUMMARY step that is generated when you run the macro.  That should help you figure out what is going wrong.

Another thought is to run the macro against sashelp.class or sashelp.shoes, and see if you can replicate the problem.  If you can get a surprising result from data that we all have, we can test it and try to explain.

I called the above macro like:


%PROC_SUMMARY_CALS(INPUT_FILE=sashelp.class
                 ,CLASS_VARS=sex
                 ,WEIGHT_VAR=age
                 ,VARS=height
                 ,OPERATION_VARS=sum
                 ,OUTPUT_FILE=want)

And it looks like it's generating the step you want:

130  %PROC_SUMMARY_CALS(INPUT_FILE=sashelp.class
131                   ,CLASS_VARS=sex
132                   ,WEIGHT_VAR=age
133                   ,VARS=height
134                   ,OPERATION_VARS=sum
135                   ,OUTPUT_FILE=want)
MPRINT(PROC_SUMMARY_CALS):   PROC SUMMARY DATA = sashelp.class NWAY MISSING;
MPRINT(PROC_SUMMARY_CALS):   CLASS sex;
MPRINT(PROC_SUMMARY_CALS):   WEIGHT age;
MPRINT(PROC_SUMMARY_CALS):   VAR height;
MPRINT(PROC_SUMMARY_CALS):   OUTPUT OUT= want sum(height) = sum_height_VW ;
MPRINT(PROC_SUMMARY_CALS):   RUN;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.WANT has 2 observations and 4 variables.

I tried a couple more calls, but they looked good to me too.

The Boston Area SAS Users Group (BASUG) is hosting our in person SAS Blowout on Oct 18!
This full-day event in Cambridge, Mass features four presenters from SAS, presenting on a range of SAS 9 programming topics. Pre-registration by Oct 15 is required.
Full details and registration info at https://www.basug.org/events.

View solution in original post

9 REPLIES 9
PeterClemmensen
Tourmaline | Level 20

Before diving into your code. Is is really necessary to write a macro? How many times do you use Proc Summary? And do you run Proc Summary on multiple data sets or against different data in the same data set?

Swarn30
Calcite | Level 5

Not really necessary but i have to use proc summary 15 to 20 times in the code repeatedly. It was becoming confusing so I thought of writing a single macro for proc summary so that code looks more tidy and readable. i have to present codes to the stakeholders, more readable it is, better it is. 

/*  Value Weighted proc summary */ 
%PROC_SUMMARY_CALS(INPUT_FILE=, CLASS_VARS=, WEIGHT_VAR=, VARS=, OPERATION_VARS=, OUTPUT_FILE=); 

/* Equal weighted proc summary */
%PROC_SUMMARY_CALS(INPUT_FILE=, CLASS_VARS=, WEIGHT_VAR=, VARS=, OPERATION_VARS=, OUTPUT_FILE=); 

/* MERGING BOTH THE DATASETS WITH FILE DATASETS */
%MERGE_TIME_VARYING(INPUT_FILE=, FILE_2=, SORT_VARS=, OUTPUT_FILE=);
%MERGE_TIME_VARYING(INPUT_FILE=FILE_066, FILE_2=, SORT_VARS=, OUTPUT_FILE=);

ballardw
Super User

What does "simultaneously" in this statement mean?: "Weight_var when ran simultaneously with weights given and no weights given" .

With Proc summary you either have a weight variable or do not so I do not see how "simultaneously" come into this. And I would definitely expect different results with and without weights.

 

You have logic errors related to building your your output statement. Because you have a semicolon at the end of

&OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._EW;

The first statistic requested ends the output list and throws errors.

 

I might suggest using the AUTONAME option and then having a separate step to create Rename syntax for Proc Datasteps if you need different variable names. If you have longish variable names you have nothing that will prevent creating variable names longer than 32 characters which will cause all sorts of issues.

 

Swarn30
Calcite | Level 5

sorry for using the word simulataneously. It is the below mentioned code, when i am running this one, at times it giving EW and VW values both and sometimes it is just giving VW values not EW values. the code is getting executed but the output is mostly coming VW values, hence my confusion of what I am doing wrong logically. 
Thanks for ; .. I will implement it and check the results again, m taking care that summary variables names dont go beyond 32.
/* *********Value weighted calculations****************** */
%PROC_SUMMARY_CALS(INPUT_FILE=FILE_066, CLASS_VARS=var1, WEIGHT_VAR=weight_asset, VARS=constructed_var, OPERATION_VARS=MEAN, OUTPUT_FILE=FILE_09990); 
/* **********Equal Weighted calculations ***************** */
%PROC_SUMMARY_CALS(INPUT_FILE=FILE_066, CLASS_VARS=var1, WEIGHT_VAR=, VARS=constructed_var, OPERATION_VARS=MEAN, OUTPUT_FILE=FILE_09933); 

/* MERGING BOTH THE DATASETS  */
%MERGE_TIME_VARYING(INPUT_FILE=FILE_09990, FILE_2=FILE_09933, SORT_VARS=SECID YEAR MONTH, OUTPUT_FILE=TEMP_FILE_6);

Tom
Super User Tom
Super User

The calls to your macro

%PROC_SUMMARY_CALS
(INPUT_FILE=FILE_066
,CLASS_VARS=var1
,WEIGHT_VAR=weight_asset
,VARS=constructed_var
,OPERATION_VARS=MEAN
,OUTPUT_FILE=FILE_09990
); 

are almost as long as the PROC SUMMARY code itself. 

PROC SUMMARY DATA = FILE_066 NWAY MISSING;
  CLASS var1;
  WEIGHT weight_asset;
  VAR constructed_var;
  OUTPUT OUT= FILE_09990 MEAN = MEAN_constructed_var_VW ;
RUN;

Not sure that it is actually helping to make the code simpler.  You have added a level of obfuscation for no real benefit.

Swarn30
Calcite | Level 5
I totally agree with you. the code I am working on have gone beyond 3000 lines with proc summary used around 20 times in it. m a research assistant in uni so what i do is just write the macro in one sheet and put all the macro calls in one another sheet and forward it to get checked..kinda working for me. that is why i was trying to make just one macro for weights present or not, so that stakeholders have to go just thru one macro code. if u see an alternative approach which will be better please hit me up with it.
Tom
Super User Tom
Super User

As I said the proc summary code is not any longer than the macro calls.  So adding the macro just makes the code HARDER TO UNDERSTAND but does not make it ANY simpler.

 

If you want help making the whole program simpler then you need to explain the whole program.

Quentin
Super User

I don't see anything that is obviously wrong, other than the previously mentioned extra semicolon.

 

I would write this differently, but it's certainly fine to use nested %DO loops.  You should add a %LOCAL statement, just to make sure you're not accidentally using global macro variables.

 

But I'm still not understanding the problem statement.  It shouldn't be 'sometimes right and sometimes wrong.' 

 

Here is a version of your code with a %LOCAL statement, and reformatted the  indenting a bit:

%MACRO PROC_SUMMARY_CALS(INPUT_FILE, CLASS_VARS, WEIGHT_VAR, VARS, OPERATION_VARS, OUTPUT_FILE);
    %local NUM_VARS NUM_OPS I J OPERATION_ACTIVE VAR_ACTIVE ;
   
    PROC SUMMARY DATA = &INPUT_FILE. NWAY MISSING;

    CLASS &CLASS_VARS.;
    %IF &WEIGHT_VAR. NE . %THEN %DO;  /* ACTIVATE WEIGHT ONLY IF WEIGHT_VAR IS NOT EMPTY -- this is not working like you intend, but it is probably not the problem*/
        WEIGHT &WEIGHT_VAR.;
    %END;

    VAR &VARS.;
  
    %LET NUM_VARS = %SYSFUNC(COUNTW(&VARS.));    /* COUNTING THE NUMBER OF VARIABLES IN THE LIST */
    %LET NUM_OPS = %SYSFUNC(COUNTW(&OPERATION_VARS.)); /* COUNTING NUMBER OF OPERATIONS */

    OUTPUT OUT= &OUTPUT_FILE.
        %DO I = 1 %TO &NUM_OPS.;  /* LOOP FOR READING OPERATIONS LIST LIKE MEAN, SUM, ETC. ALL STATS TYPES */
            %LET OPERATION_ACTIVE = %SCAN(&OPERATION_VARS., &I);
            %DO J = 1 %TO &NUM_VARS.;  /* LOOP FOR READING VARIABLES TO PERFORM OPERATIONS ON */
                %LET VAR_ACTIVE = %SCAN(&VARS., &J);
                %IF %BQUOTE(&WEIGHT_VAR.) NE %THEN %DO;
                    &OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._VW
                %END;
                %ELSE %DO;
                    &OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._EW
                %END;
            %END;
        %END;
    ;
    RUN;
 
%MEND PROC_SUMMARY_CALS;

I would suggest turning on the MPRINT system option, so you can see the PROC SUMMARY step that is generated when you run the macro.  That should help you figure out what is going wrong.

Another thought is to run the macro against sashelp.class or sashelp.shoes, and see if you can replicate the problem.  If you can get a surprising result from data that we all have, we can test it and try to explain.

I called the above macro like:


%PROC_SUMMARY_CALS(INPUT_FILE=sashelp.class
                 ,CLASS_VARS=sex
                 ,WEIGHT_VAR=age
                 ,VARS=height
                 ,OPERATION_VARS=sum
                 ,OUTPUT_FILE=want)

And it looks like it's generating the step you want:

130  %PROC_SUMMARY_CALS(INPUT_FILE=sashelp.class
131                   ,CLASS_VARS=sex
132                   ,WEIGHT_VAR=age
133                   ,VARS=height
134                   ,OPERATION_VARS=sum
135                   ,OUTPUT_FILE=want)
MPRINT(PROC_SUMMARY_CALS):   PROC SUMMARY DATA = sashelp.class NWAY MISSING;
MPRINT(PROC_SUMMARY_CALS):   CLASS sex;
MPRINT(PROC_SUMMARY_CALS):   WEIGHT age;
MPRINT(PROC_SUMMARY_CALS):   VAR height;
MPRINT(PROC_SUMMARY_CALS):   OUTPUT OUT= want sum(height) = sum_height_VW ;
MPRINT(PROC_SUMMARY_CALS):   RUN;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.WANT has 2 observations and 4 variables.

I tried a couple more calls, but they looked good to me too.

The Boston Area SAS Users Group (BASUG) is hosting our in person SAS Blowout on Oct 18!
This full-day event in Cambridge, Mass features four presenters from SAS, presenting on a range of SAS 9 programming topics. Pre-registration by Oct 15 is required.
Full details and registration info at https://www.basug.org/events.
Swarn30
Calcite | Level 5
Thank you very much, got it know.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 927 views
  • 0 likes
  • 5 in conversation