05-19-2016 03:16 PM
I am trying to write a macro to get out three variables WAITCAT30A , WAITCAT40A, WAITCAT50A..... WAITCAT30B, WAITCAT30B, WAITCAT30B.... and so on.
I am trying to use a do loop to generate all these variables.
I can't remember how to define macro variables with trailing letters behind it. I remember it was taught in MACRO I.
05-19-2016 03:56 PM - edited 05-19-2016 03:57 PM
Your problem isn't very clear but I suspect you are talking about using leading ampersands to create your macro variables dynamically something like:
%let Varname = WAITCAT30A; %let &&&VarName = something; %put &Varname = &&&VarName;
05-19-2016 03:57 PM
I'm not quite sure what "get out" means here, but this will create global macro variables using the names you describe:
%macro all_names (prefix=WAITCAT, suffix=);
%do i = 30 %to 50 %by 10;
Yes, you can pull off using a single parameter for suffixes if you are clever: suffix_list=A B C
But that takes more skill and a longer macro. Is it necessary?
05-19-2016 04:01 PM - edited 05-19-2016 05:44 PM
I wrote a general macro that does that (I use it all the time). Here is how to use it:
*Create list of variables; %let list_of_variables = %metric_list(WAITCAT, 30 40 50, A B C D E, sep=); *Print list; %put list_of_variables = &list_of_variables.; *The output that you will get is: WAITCAT30A WAITCAT30B WAITCAT30C WAITCAT30D WAITCAT30E... WAITCAT40A WAITCAT40B WAITCAT40C WAITCAT40D WAITCAT40E... WAITCAT50A WAITCAT50B WAITCAT50C WAITCAT50D WAITCAT50E... ;
Is that what you were expecting to get?
If you meant to get only once set at a time (e.g. A, B, C, etc.) then this is another way to use it:
*Create list of variables; %let list_of_variables_A = %metric_list(WAITCAT, 30 40 50, A, sep=); %let list_of_variables_B = %metric_list(WAITCAT, 30 40 50, B, sep=); %let list_of_variables_C = %metric_list(WAITCAT, 30 40 50, C, sep=); *Print list; %put list_of_variables_A = &list_of_variables_A.; %put list_of_variables_B = &list_of_variables_B.; %put list_of_variables_C = &list_of_variables_C.; *The output that you will get is: list_of_variables_A = WAITCAT30A WAITCAT40A WAITCAT50A list_of_variables_B = WAITCAT30B WAITCAT40B WAITCAT50B list_of_variables_C = WAITCAT30C WAITCAT40C WAITCAT50C ;
Does this answer your question?
If so, here is the code of the actual macro:
%macro metric_list (part0, part1, part2, part3, part4, part5, part6, part7, part8, part9, sep=_); %local result result_temp; %local level; %local ix1 vx1; %local ix2 vx2; %let result = ; %do level = 0 %to 9; %if %length(&result.)=0 %then %do; %let result = &&&part&level.; %end; %else %if %length(&&&part&level.)~=0 %then %do; %let result_temp = ; %let ix1 = 1; %let vx1 = %scan(&result., &ix1., %str( )); %do %while(%length(&vx1.)); %let ix2 = 1; %let vx2 = %scan(&&&part&level., &ix2., %str( )); %do %while(%length(&vx2.)); %let result_temp = &result_temp. &vx1.&sep.&vx2.; %let ix2 = %eval(&ix2. + 1); %let vx2 = %scan(&&&part&level., &ix2., %str( )); %end; %let ix1 = %eval(&ix1. + 1); %let vx1 = %scan(&result., &ix1., %str( )); %end; %let result = &result_temp.; %end; %end; &result. %mend metric_list; %put TEST01: %metric_list(A, B, C, D, E, F, G, H, I, J); %put TEST02: %metric_list(A, B, C, D, E, F, G, H, I, J, sep=); %put TEST03: %metric_list(X, 0 1, 0 1, 0 1, 0 1, 0 1, 0 1, 0 1, 0 1, 0 1, sep=); %put TEST04: %metric_list(revenue volume profit, period1 period2 period3 period4, prodA prodB prodC); %put TEST05: %metric_list();
05-19-2016 04:17 PM
Can you use variable lists at all to help simplify that?
05-20-2016 04:42 AM
Totally agree with @Reeza, its a large block of code, but for what purpose. You seem to be trying to fight the system with it.
Firstly, why would you want to create lots of variables? Its far easier to work with normalised data (observations rather than variables), so setup your dataset so you have:
Then use by group processing. This is both more effcient, saves vast amounts of coding, and is easier to maintain.
Even if for some unknown reason you really have no choice but to create all those variables, then use arrays, and do loop over any array. Again, simpler than messing around with macro code. Its all in Base SAS.
05-20-2016 11:51 AM - edited 05-20-2016 06:37 PM
@RW9 You make good points. For me it depends on the use case.
On one hand, if this is something that one only needs to do just once, then one could write an ad-hoc piece of code (much smaller) to get the job done. Certainly true!
On the other hand, if this need comes up all the time (just with different metrics, periods, products, regions, etc.) then this macro (which is not that long) can become very helpful. For me it has been... specially when one starts reaching the hundreds or thousands of variables of different sorts.
On the question about what is easier to maintain, I tend to look at the long run. In my experience it has been easier to maintain the macro calls than maintaining the ad-hoc code that I had before it. Because the macro is parametric, one just needs to change the inputs, that's it! I rarely have to make changes to the macro itself (because it is very general.)
On the question about why have a lot of variables, if we are talking about reporting or analysis in aggregate, I completely agree that its better to keep the datasets "normalized" (in a broad sense of the word.) If we are talking about modeling, then one needs to generate a "customer signature" with all the variables in transposed form (e.g. one records per customers, lots of columns for all the different metrics.) This is the case for reggresion, classiffication, clustering, and other similar tasks. Here is where I use the macro.
Arrays are very helpful for many applications, but they dont set up the name combinations exactly as requested in this problem. If one creates the variable list, one could use an array to reference it (I am not disputing that.)
Bottom line, I beleive that there is rarely a single right answer when it comes to design/architecture. In depends on the size/scale, complexity, degree of change, things that are more likely to change, etc. I'm not sure what would be Lei's preference but thought this may be helpful, at least for some folks as it was for me. Thanks for the comment. I hope this contributes to the discussion.
05-24-2016 08:13 AM - edited 05-24-2016 08:17 AM
The key point in your statement there however is:
"...specially when one starts reaching the hundreds or thousands of variables of different sorts."
The question arises why are you dealing with a) data with thousands of variables, and b) why write code to handle data structure changes. Both can be avoided by changes to the structure of the data so that you are using the functionality of SAS - fixed data structure, many observations (which also most other formats use - databases, CDISC etc.), rather than many columns which may change, and fewer observations - which is a good indicator that someone is used to/uses Excel.
To give an example. Lab data, this is where we have many parameters, which could have text or numeric results, and units. I could define a dataset which looks like this:
WBC UNIT RESULT RBC UNIT RESULT... for each test. This may work, but if I drop RBC or WBC then I need to change my code, or at least put code in to develop variable lists and such like. Now consider the simple data change to (CDISC):
PARAM UNIT RESULT_C RESULT_N
WBC xyz xyz 123
on so on for each parameter. You see in the second instance the structure of the data does not change regardless of if I drop RBC or add tests. For reporting or aggregates etc, I can use by group processing, which means I don't need to knwo what unique obs there is. Its all automatic and very simple base SAS programming.
05-24-2016 09:02 AM
You need to use a period to indicate the end of the macro variable name.
So if you had a macro variable named VARNAME with the value WAITCAT30 and you want to generate the string WAITCAT30A you would code &varname.A.
Note that this means that when you actually want a period you need to type it twice since the first one will be interpretted as ending the macro variable reference. So if you had macro variable named LIBREF with the value WORK and you wanted to generate the dataset name WORK.MYDATA you would code &libref..MYDATA.