Hi,
I am particularly new to SAS so this question may seem quite basic for some of you...
I am using SAS 9.4 now. I have a number of variables that are named in the same structure (e.g. Year_1990, Year_1991, Year_1992, etc.), and my goal is to multiply the data within each of these variables (columns) by another set of year-specific data (for which I have created a list of macro variables (&M1990, &M1991, &M1992, etc.)).
Since I have dozens of these variables I would like to achieve this step with a do-loop. But I am not quite sure how to set up an index system to do the calculation (something that can end up with things like Year_i = Year_i * &Mi). An example of the index construction would be particularly helpful.
Thanks!
Generally speaking, you don't do arithmetic with macro variables. Now if the macro variable is an integer and you're adding another integer, maybe that's fine, I can make an exception for that, but ... really ... multiplication ought to be done in a data step with data step variables.
So wherever you got those macro variables from, make them into datastep variables when they are created. Don't put them in macro variables at all. Then, when you have the data in data step variables, arrays will allow you to handle all the looping without any issues.
Something like
data a;
    array years year_1990-year_1992;
    array m m1990-m1992;
    array result result1990-result1992;
    if _n_=1 then set datasetnamewithmvariables;
    set datasetnamewithyearvariables;
    do i=1 to dim(m);
        result(i)=years(i)*m(i);
    end;
    drop i;
run;
The macro variables essentially come from a smaller data set with different dimensions (in fact it has only 2 variables with one denoting the years and the other denoting the corresponding values for m(i) in the macro variables created through proc sql later).
I also tried to append the transposed small dataset to the large one but as they don't share the same row designations of any kind, each of the "m variables" (although they are not macro variables in this case) only has one observation at the very last row newly created through this append operation. I am wondering if there is a way to "fill" the empty rows for "m variables" and use direct multiplications for corresponding "year variables" and "m variables"...
Thanks for your suggestions!
My code above does exactly what you are asking for.
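If the multipliers currently sit in a two-column data set (one row per year), PROC TRANSPOSE is one way to get them into the single-row layout that the code above reads with `if _n_=1 then set`. This is only a sketch: the data set name `small`, the variable names `year` and `value`, and the numbers are assumptions for illustration.

```sas
/* Hypothetical two-column lookup table: one row per year */
data small;
    input year value;
    datalines;
1990 1.10
1991 1.25
1992 0.95
;
run;

/* Turn the rows into one observation with variables m1990, m1991, m1992 */
proc transpose data=small
               out=datasetnamewithmvariables(drop=_name_)
               prefix=m;
    id year;      /* the year supplies the suffix of each new variable name */
    var value;    /* the multiplier becomes the value of that variable      */
run;
```

The ID statement combined with PREFIX= is what produces names that line up with `array m m1990-m1992`.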
I have been sorting out some glitches to get the actual result...
Now I have worked through your code and it has indeed yielded the desired result.
Thanks for your help!
Just to add on a bit, I would also recommend an array, but I would use a temporary array and you can initialize the values using your macro variables.
array m(200) _temporary_ (&m1 &m2 ... &m200);
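Spelled out for three years, the temporary-array idea might look like the sketch below. The %LET values are made up, and `have`/`want` are placeholder data set names. I have given the array an explicit size because, as far as I know, the asterisk form is not accepted together with _TEMPORARY_.

```sas
/* Hypothetical multipliers; in practice these already exist as macro variables */
%let m1990 = 1.10;
%let m1991 = 1.25;
%let m1992 = 0.95;

data want;
    set have;
    array years year_1990-year_1992;
    /* Temporary array: no variables are created in the output data set,
       and the initial values are kept for every observation */
    array m(3) _temporary_ (&m1990 &m1991 &m1992);
    do i = 1 to dim(years);
        years(i) = years(i) * m(i);
    end;
    drop i;
run;
```

Because the array is temporary, there are no extra m variables to drop afterwards.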
@SirFrank, you wrote:
"set of year-specific data (for which I have created a list of macro variables (&M1990, &M1991, &M1992, etc.))."
1) How did you create the list of macro variables?
You could instead create a one-row dataset, mvalues, that holds the values as variables m1990, m1991, etc., and use it as in the following code:
%let from_year = 1990;
%let upto_year = 1999;
data want;
retain m&from_year - m&upto_year;
set mvalues have;
array mv m&from_year - m&upto_year;
array yr year_&from_year - year_&upto_year;
do i=1 to dim(yr);
yr(i) = yr(i) * mv(i);
end;
run;
2) If you already have those macro variables as they are, you can create the mvalues dataset
with a macro loop that runs before the above code:
%macro m2data(from,upto);
data mvalues;
%do i=&from %to &upto;
m&i = &&m&i;
%end;
output;
run;
%mend;
%m2data(&from_year, &upto_year);
Adapt the years in the %LET statements to your needs.
The macro variables were created through proc sql.
I went through most of the method you mentioned, and it worked well until the step "yr(i) = yr(i) * mv(i)". The problem might be that the dimensions of the two datasets are different (in particular, "mvalues" is a 1-by-dim(yr) dataset while "have" has over 6k rows). As a consequence of appending the datasets, the mv values are placed only in the single newly created row. So in "yr(i) = yr(i) * mv(i)", every yr(i) is multiplied by a missing value and returns missing. Is there a way to "fill up" the mvs before this step? Or is there a way to transform the mvs into scalars?
I am particularly new to SAS so this question may really seem stupid...
Thank you so much for your help!
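One way to "fill up" the multipliers on every row is to read mvalues once instead of concatenating it with have, using the same `if _n_=1 then set` pattern as in the earlier reply. Variables read by a SET statement are retained across iterations, so the single row of m values stays available for all rows of have. A minimal sketch, assuming mvalues has exactly one observation containing m1990 through m1999:

```sas
%let from_year = 1990;
%let upto_year = 1999;

data want;
    /* Read the one row of multipliers on the first iteration only;
       its values are retained for every later observation */
    if _n_ = 1 then set mvalues;
    set have;
    array mv m&from_year - m&upto_year;
    array yr year_&from_year - year_&upto_year;
    do i = 1 to dim(yr);
        yr(i) = yr(i) * mv(i);
    end;
    /* Remove the loop index and the multipliers from the output */
    drop i m&from_year - m&upto_year;
run;
```

With this layout the output has the same number of rows as have, and no extra row is created for the multipliers.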
Given that you have macro variables, it's probably easiest to use macro language to construct your DATA step logic:
%macro multiply;
%local i;
data want;
set have;
%do i=1990 %to 1999;
year_&i = year_&i * &&m&i;
%end;
run;
%mend multiply;
%multiply
You end up with a bunch of assignment statements, created by the macro %DO loop.
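If you want to see the statements the %DO loop generates, the MPRINT system option writes them to the log. The values in the comment below are hypothetical and only illustrate the shape of the generated code.

```sas
options mprint;   /* write macro-generated statements to the log */
%multiply

/* Assuming, for illustration, that m1990 is 1.1 and m1991 is 1.2,
   the generated DATA step begins:
     data want;
     set have;
     year_1990 = year_1990 * 1.1;
     year_1991 = year_1991 * 1.2;
   and continues with one assignment per year through 1999. */
```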
To all of the above suggestions: using macro variables instead of data step variables works fine if the value is not formatted when the macro variable is created. If it is formatted, you would probably see some truncation of digits (perhaps significant digits, perhaps not, depending on the format used), and then using the macro variable would not produce the same answer as keeping everything in a data set variable and never using the macro variable.
Also, unless the user has hard-coded these macro variables earlier in the program, the values have to come from a previous data step or previous PROC output. In either case the values are already in a dataset. Why go through the effort of turning data into macro variables so you can use it later in another data step? It is extra work for no benefit.
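If the macro variables do have to be created with PROC SQL, the truncation concern can be reduced by requesting a wide format in the SELECT clause. As far as I know, INTO otherwise converts an unformatted numeric value with BEST8., which can drop digits. A sketch, assuming a hypothetical lookup table `small` with variables `year` and `value`:

```sas
proc sql noprint;
    /* Ask for a wide numeric format so fewer digits are lost
       when the value is converted to macro variable text */
    select value format=best32.
        into :m1990 trimmed
        from small
        where year = 1990;
quit;

%put NOTE: m1990=&m1990;   /* inspect the stored text in the log */
```

Even with a wide format the macro variable holds text rather than the stored numeric value, so keeping the values in a data set remains the safer route.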