&effects and &estimate are macro variables that hold regression results as space-delimited lists.
As shown in the pictures above, &effects holds the names of the effects
and &estimate holds the corresponding coefficients.
Say &effects is "a b c d e f g"
and &estimate is "1 2 3 4 5 6 7".
How can I use these two macro variables to build a data step expression that equals the dot product of the two vectors?
Here's the code that I tried, but it did not produce the correct result.
%do i=1 %to %sysfunc(countw(&effects));
%let effects&i = %scan(&effects, &i, %str( ));
%end;
%do i=1 %to %sysfunc(countw(&estimate));
%let estimate&i = %scan(&estimate, &i, %str( ));
%end;
%let total1=&effects1*estimate1;
%let largeN=%sysfunc(countw(&effects));
%do i=2 %to %sysfunc(countw(&effects));
%let total&i = total%eval(&i-1).+&effects&i*&estimate&i;
%end;
%put &total&largeN; [<-this should be the yhat definition, it is an expression like a*1 + b*2 + c*3 + d*4 + e*5 + f*6 + g*7]
data option2;
set opion1;
yhat=&total&largeN; [<-this equation should be the same as "yhat=a*1 + b*2 + c*3 + d*4 + e*5 + f*6 + g*7"]
run;
Can anyone help me to solve this problem? Thank you so much!
If @jint83 is correct and this is coming from proc glmselect then help us out by running the model and then posting the result of
%put _user_;
This will give us the actual names and contents of the macro variables.
Or try adding a line something like this to your code (before the run statement)
ods output parameterestimates= work.parms;
Which will create a data set with the effect and the parameters.
You don't mention whether BY groups are involved, but for a single model
data _null_;
set work.parms end=eof;
length longstr $ 500;
retain longstr;
longstr = catx(' + ',longstr, catx('*',effect,estimate));
call symputx('Formula',longstr);
run;
%put &formula;
appears to do what you want.
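Once &Formula exists, applying it to another sample could look like the sketch below. The data set names sampleB and scoredB are hypothetical, and the sketch assumes every name in &Formula is a numeric variable in the input data. It also assumes the parameter table labels the intercept row with the effect name Intercept, so a constant variable of that name is created for the Intercept*estimate term to resolve; models with CLASS effects would need more work than this.

```sas
/* Sketch only: sampleB and scoredB are hypothetical data set names.        */
/* Assumes &Formula was built by the DATA _NULL_ step from work.parms.       */
data scoredB;
   set sampleB;
   Intercept = 1;      /* assumption: lets an "Intercept*estimate" term resolve */
   yhat = &Formula;    /* expands to effect1*estimate1 + effect2*estimate2 ...  */
   drop Intercept;
run;
```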
Well, you could cut out a few pieces in this way:
data option2;
set option1;
yhat =
%do i=1 %to %sysfunc(countw(&effects, %str( )));
%scan(&effects, &i, %str( )) * (%scan(&estimate, &i, %str( )))
%if &i < %sysfunc(countw(&effects, %str( ))) %then + ;
%end;
;
run;
This does have to be inside a macro definition, to permit %if and %do.
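A minimal sketch of that wrapping, under some assumptions: the macro name score_linear and its in= and out= parameters are made up for illustration, and &effects and &estimate are taken to be blank-delimited lists of equal length as in the question. Splitting on blanks only keeps decimal points and minus signs in the estimates intact, since %scan otherwise treats those characters as delimiters.

```sas
/* Hypothetical macro; name and parameters are illustrative only. */
%macro score_linear(in=, out=);
   %local i n;
   /* count the effects once, splitting on blanks only */
   %let n = %sysfunc(countw(&effects, %str( )));
   data &out;
      set &in;
      yhat =
      %do i = 1 %to &n;
         %scan(&effects, &i, %str( )) * (%scan(&estimate, &i, %str( )))
         %if &i < &n %then + ;
      %end;
      ;
   run;
%mend score_linear;

%let effects  = a b c d e f g;
%let estimate = 1 2 3 4 5 6 7;
%score_linear(in=option1, out=option2)
```

With the example lists, the generated assignment should read yhat = a * (1) + b * (2) + ... + g * (7). If &effects contains an intercept term, that entry would need separate handling, because it is not a variable in the data.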
I tried the code, but it still did not work. Thank you anyway.
I would say, don’t create this problem in the first place.
There are multiple ways to score new data, this doesn’t seem like a good approach.
Assuming that’s what you’re doing of course.
The problem is that I fit the model on sample A, but I also want to test its accuracy on sample B. However, I only know the step that directly produces predicted yhat values for sample A. That is why I need to build the formula from macro variables.
Please explain the overall purpose of the generated data step code.
If your macro variable lists came from a data set it may be that using that data set and some data transformation in a data step will work in a much cleaner manner.
If you are going to use multiple identical calls such as
%sysfunc(countw(&effects))
perhaps you would be better off calling that once and assigning the value to another macro variable. Then you could make the code a little cleaner:
%do I = 1 %to &EffectCount;
Speaking on behalf of the poster: these macro variables did not come from a data set, so the question cannot be avoided. Their definitions come from PROC GLMSELECT output:
I don't know the answer to the poster's question, but I do think it is an important one that deserves to be addressed on its own terms.
Hi, the problem is that I fit the model on sample A.
Let's say the model is yhat=a + b*1 + c*2
I want to calculate yhat not only for the observations in sample A, but also for the observations in sample B.
Is there a simpler way that I can use to avoid using macro variables?
Thank you!
See the various methods here.
https://blogs.sas.com/content/iml/2014/02/19/scoring-a-regression-model-in-sas.html
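One of those methods, sketched here under assumptions, is the SCORE statement, which PROC GLMSELECT supports: it applies the selected model to a second data set in the same step, with no macro variables involved. The data set names sampleA, sampleB and scoredB, the response y, and the selection method are placeholders, not taken from the poster's actual model.

```sas
/* Sketch only: data set names, response, and selection method are hypothetical. */
proc glmselect data=sampleA;
   model y = a b c d e f g / selection=stepwise;
   /* score sample B with the selected model; predictions go to yhat */
   score data=sampleB out=scoredB predicted=yhat;
run;
```

If sample B is not available when the model is fit, storing the model and scoring it later with PROC PLM is the usual alternative described in the same article.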