- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
%MACRO PROC_SUMMARY_CALS(INPUT_FILE, CLASS_VARS, WEIGHT_VAR, VARS, OPERATION_VARS, OUTPUT_FILE);
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I don't see anything that is obviously wrong, other than the previously mentioned extra semicolon.
I would write this differently, but it's certainly fine to use nested %DO loops. You should add a %LOCAL statement, just to make sure you're not accidentally using global macro variables.
But I'm still not understanding the problem statement. It shouldn't be 'sometimes right and sometimes wrong.'
Here is a version of your code with a %LOCAL statement, and reformatted the indenting a bit:
%MACRO PROC_SUMMARY_CALS(INPUT_FILE, CLASS_VARS, WEIGHT_VAR, VARS, OPERATION_VARS, OUTPUT_FILE);
%local NUM_VARS NUM_OPS I J OPERATION_ACTIVE VAR_ACTIVE ;
PROC SUMMARY DATA = &INPUT_FILE. NWAY MISSING;
CLASS &CLASS_VARS.;
%IF &WEIGHT_VAR. NE . %THEN %DO; /* ACTIVATE WEIGHT ONLY IF WEIGHT_VAR IS NOT EMPTY -- this is not working like you intend, but it is probably not the problem*/
WEIGHT &WEIGHT_VAR.;
%END;
VAR &VARS.;
%LET NUM_VARS = %SYSFUNC(COUNTW(&VARS.)); /* COUNTING THE NUMBER OF VARIABLES IN THE LIST */
%LET NUM_OPS = %SYSFUNC(COUNTW(&OPERATION_VARS.)); /* COUNTING NUMBER OF OPERATIONS */
OUTPUT OUT= &OUTPUT_FILE.
%DO I = 1 %TO &NUM_OPS.; /* LOOP FOR READING OPERATIONS LIST LIKE MEAN, SUM, ETC. ALL STATS TYPES */
%LET OPERATION_ACTIVE = %SCAN(&OPERATION_VARS., &I);
%DO J = 1 %TO &NUM_VARS.; /* LOOP FOR READING VARIABLES TO PERFORM OPERATIONS ON */
%LET VAR_ACTIVE = %SCAN(&VARS., &J);
%IF %BQUOTE(&WEIGHT_VAR.) NE %THEN %DO;
&OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._VW
%END;
%ELSE %DO;
&OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._EW
%END;
%END;
%END;
;
RUN;
%MEND PROC_SUMMARY_CALS;
I would suggest turning on the MPRINT system option, so you can see the PROC SUMMARY step that is generated when you run the macro. That should help you figure out what is going wrong.
Another thought is to run the macro against sashelp.class or sashelp.shoes, and see if you can replicate the problem. If you can get a surprising result from data that we all have, we can test it and try to explain.
I called the above macro like:
%PROC_SUMMARY_CALS(INPUT_FILE=sashelp.class
,CLASS_VARS=sex
,WEIGHT_VAR=age
,VARS=height
,OPERATION_VARS=sum
,OUTPUT_FILE=want)
And it looks like it's generating the step you want:
130 %PROC_SUMMARY_CALS(INPUT_FILE=sashelp.class 131 ,CLASS_VARS=sex 132 ,WEIGHT_VAR=age 133 ,VARS=height 134 ,OPERATION_VARS=sum 135 ,OUTPUT_FILE=want) MPRINT(PROC_SUMMARY_CALS): PROC SUMMARY DATA = sashelp.class NWAY MISSING; MPRINT(PROC_SUMMARY_CALS): CLASS sex; MPRINT(PROC_SUMMARY_CALS): WEIGHT age; MPRINT(PROC_SUMMARY_CALS): VAR height; MPRINT(PROC_SUMMARY_CALS): OUTPUT OUT= want sum(height) = sum_height_VW ; MPRINT(PROC_SUMMARY_CALS): RUN; NOTE: There were 19 observations read from the data set SASHELP.CLASS. NOTE: The data set WORK.WANT has 2 observations and 4 variables.
I tried a couple more calls, but they looked good to me too.
Next up: Troy Martin Hughes presents Calling Open-Source Python Functions within SAS PROC FCMP: A Google Maps API Geocoding Adventure on Wednesday April 23.
Register now at https://www.basug.org/events.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Before diving into your code. Is is really necessary to write a macro? How many times do you use Proc Summary? And do you run Proc Summary on multiple data sets or against different data in the same data set?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Not really necessary but i have to use proc summary 15 to 20 times in the code repeatedly. It was becoming confusing so I thought of writing a single macro for proc summary so that code looks more tidy and readable. i have to present codes to the stakeholders, more readable it is, better it is.
/* Value Weighted proc summary */
%PROC_SUMMARY_CALS(INPUT_FILE=, CLASS_VARS=, WEIGHT_VAR=, VARS=, OPERATION_VARS=, OUTPUT_FILE=);
/* Equal weighted proc summary */
%PROC_SUMMARY_CALS(INPUT_FILE=, CLASS_VARS=, WEIGHT_VAR=, VARS=, OPERATION_VARS=, OUTPUT_FILE=);
/* MERGING BOTH THE DATASETS WITH FILE DATASETS */
%MERGE_TIME_VARYING(INPUT_FILE=, FILE_2=, SORT_VARS=, OUTPUT_FILE=);
%MERGE_TIME_VARYING(INPUT_FILE=FILE_066, FILE_2=, SORT_VARS=, OUTPUT_FILE=);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
What does "simultaneously" in this statement mean?: "Weight_var when ran simultaneously with weights given and no weights given" .
With Proc summary you either have a weight variable or do not so I do not see how "simultaneously" come into this. And I would definitely expect different results with and without weights.
You have logic errors related to building your your output statement. Because you have a semicolon at the end of
&OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._EW;
The first statistic requested ends the output list and throws errors.
I might suggest using the AUTONAME option and then having a separate step to create Rename syntax for Proc Datasteps if you need different variable names. If you have longish variable names you have nothing that will prevent creating variable names longer than 32 characters which will cause all sorts of issues.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
sorry for using the word simulataneously. It is the below mentioned code, when i am running this one, at times it giving EW and VW values both and sometimes it is just giving VW values not EW values. the code is getting executed but the output is mostly coming VW values, hence my confusion of what I am doing wrong logically.
Thanks for ; .. I will implement it and check the results again, m taking care that summary variables names dont go beyond 32.
/* *********Value weighted calculations****************** */
%PROC_SUMMARY_CALS(INPUT_FILE=FILE_066, CLASS_VARS=var1, WEIGHT_VAR=weight_asset, VARS=constructed_var, OPERATION_VARS=MEAN, OUTPUT_FILE=FILE_09990);
/* **********Equal Weighted calculations ***************** */
%PROC_SUMMARY_CALS(INPUT_FILE=FILE_066, CLASS_VARS=var1, WEIGHT_VAR=, VARS=constructed_var, OPERATION_VARS=MEAN, OUTPUT_FILE=FILE_09933);
/* MERGING BOTH THE DATASETS */
%MERGE_TIME_VARYING(INPUT_FILE=FILE_09990, FILE_2=FILE_09933, SORT_VARS=SECID YEAR MONTH, OUTPUT_FILE=TEMP_FILE_6);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The calls to your macro
%PROC_SUMMARY_CALS
(INPUT_FILE=FILE_066
,CLASS_VARS=var1
,WEIGHT_VAR=weight_asset
,VARS=constructed_var
,OPERATION_VARS=MEAN
,OUTPUT_FILE=FILE_09990
);
are almost as long as the PROC SUMMARY code itself.
PROC SUMMARY DATA = FILE_066 NWAY MISSING;
CLASS var1;
WEIGHT weight_asset;
VAR constructed_var;
OUTPUT OUT= FILE_09990 MEAN = MEAN_constructed_var_VW ;
RUN;
Not sure that it is actually helping to make the code simpler. You have added a level of obfuscation for no real benefit.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
As I said the proc summary code is not any longer than the macro calls. So adding the macro just makes the code HARDER TO UNDERSTAND but does not make it ANY simpler.
If you want help making the whole program simpler then you need to explain the whole program.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I don't see anything that is obviously wrong, other than the previously mentioned extra semicolon.
I would write this differently, but it's certainly fine to use nested %DO loops. You should add a %LOCAL statement, just to make sure you're not accidentally using global macro variables.
But I'm still not understanding the problem statement. It shouldn't be 'sometimes right and sometimes wrong.'
Here is a version of your code with a %LOCAL statement, and reformatted the indenting a bit:
%MACRO PROC_SUMMARY_CALS(INPUT_FILE, CLASS_VARS, WEIGHT_VAR, VARS, OPERATION_VARS, OUTPUT_FILE);
%local NUM_VARS NUM_OPS I J OPERATION_ACTIVE VAR_ACTIVE ;
PROC SUMMARY DATA = &INPUT_FILE. NWAY MISSING;
CLASS &CLASS_VARS.;
%IF &WEIGHT_VAR. NE . %THEN %DO; /* ACTIVATE WEIGHT ONLY IF WEIGHT_VAR IS NOT EMPTY -- this is not working like you intend, but it is probably not the problem*/
WEIGHT &WEIGHT_VAR.;
%END;
VAR &VARS.;
%LET NUM_VARS = %SYSFUNC(COUNTW(&VARS.)); /* COUNTING THE NUMBER OF VARIABLES IN THE LIST */
%LET NUM_OPS = %SYSFUNC(COUNTW(&OPERATION_VARS.)); /* COUNTING NUMBER OF OPERATIONS */
OUTPUT OUT= &OUTPUT_FILE.
%DO I = 1 %TO &NUM_OPS.; /* LOOP FOR READING OPERATIONS LIST LIKE MEAN, SUM, ETC. ALL STATS TYPES */
%LET OPERATION_ACTIVE = %SCAN(&OPERATION_VARS., &I);
%DO J = 1 %TO &NUM_VARS.; /* LOOP FOR READING VARIABLES TO PERFORM OPERATIONS ON */
%LET VAR_ACTIVE = %SCAN(&VARS., &J);
%IF %BQUOTE(&WEIGHT_VAR.) NE %THEN %DO;
&OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._VW
%END;
%ELSE %DO;
&OPERATION_ACTIVE.(&VAR_ACTIVE.) = &OPERATION_ACTIVE._&VAR_ACTIVE._EW
%END;
%END;
%END;
;
RUN;
%MEND PROC_SUMMARY_CALS;
I would suggest turning on the MPRINT system option, so you can see the PROC SUMMARY step that is generated when you run the macro. That should help you figure out what is going wrong.
Another thought is to run the macro against sashelp.class or sashelp.shoes, and see if you can replicate the problem. If you can get a surprising result from data that we all have, we can test it and try to explain.
I called the above macro like:
%PROC_SUMMARY_CALS(INPUT_FILE=sashelp.class
,CLASS_VARS=sex
,WEIGHT_VAR=age
,VARS=height
,OPERATION_VARS=sum
,OUTPUT_FILE=want)
And it looks like it's generating the step you want:
130 %PROC_SUMMARY_CALS(INPUT_FILE=sashelp.class 131 ,CLASS_VARS=sex 132 ,WEIGHT_VAR=age 133 ,VARS=height 134 ,OPERATION_VARS=sum 135 ,OUTPUT_FILE=want) MPRINT(PROC_SUMMARY_CALS): PROC SUMMARY DATA = sashelp.class NWAY MISSING; MPRINT(PROC_SUMMARY_CALS): CLASS sex; MPRINT(PROC_SUMMARY_CALS): WEIGHT age; MPRINT(PROC_SUMMARY_CALS): VAR height; MPRINT(PROC_SUMMARY_CALS): OUTPUT OUT= want sum(height) = sum_height_VW ; MPRINT(PROC_SUMMARY_CALS): RUN; NOTE: There were 19 observations read from the data set SASHELP.CLASS. NOTE: The data set WORK.WANT has 2 observations and 4 variables.
I tried a couple more calls, but they looked good to me too.
Next up: Troy Martin Hughes presents Calling Open-Source Python Functions within SAS PROC FCMP: A Google Maps API Geocoding Adventure on Wednesday April 23.
Register now at https://www.basug.org/events.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content