I am trying to write a macro that runs regressions with or without BY variables. It works when I specify the BY variables. But what if I don't have any BY variable and want to run the regressions on the whole sample? That is, if the parameter by_var is empty, the BY statement should be skipped. My code so far is:
%macro regress(input,dep,indep,by_var,cluster_var,FE,modelNo);
*FE is time fixed effect;
data sample; set &input.;run;
ods output ParameterEstimates=para_est fitstatistics=fit;
proc surveyreg data=sample;
output out=fitted_&dep&modelNo.(keep=stock date predicted_value rename=(predicted_value=fitted&modelNo._&dep)) p= Predicted_value r=residual u=upper_bound l=lower_bound;
cluster &cluster_var;
by &by_var;
%if &FE.=1 %then %do;
class date;
model &dep = &indep date/solution;
%end;
%else %if &FE.=0 %then %do;
model &dep = &indep/solution;
%end;
run;
%mend;
Hello,
Make by_var a keyword parameter :
%macro regress(input,dep, indep, cluster_var, FE, modelNo, by_var=);
*FE is time fixed effect;
data sample; set &input.;run;
ods output ParameterEstimates=para_est fitstatistics=fit;
proc surveyreg data=sample;
output out=fitted_&dep&modelNo.(keep=stock date predicted_value rename=(predicted_value=fitted&modelNo._&dep)) p= Predicted_value r=residual u=upper_bound l=lower_bound;
cluster &cluster_var;
%if &by_var. ne %then %do;
by &by_var;
%end;
%if &FE.=1 %then %do;
class date;
model &dep = &indep date/solution;
%end;
%else %if &FE.=0 %then %do;
model &dep = &indep/solution;
%end;
run;
%mend;
Note, in the macro signature, how by_var is now after the positional parameters and is followed by "=".
You can now call the macro with or without giving a value for this parameter :
%regress(input1,dep1, indep1, cluster_var1, 1, model1);
%regress(input1,dep1, indep1, cluster_var1, 1, model1, by_var=var1);
In the first call, by_var is empty, so the condition "&by_var. ne " is false and the BY statement is not generated. In the second call, the macro generates "by var1;".
Hi,
I understand that the following checks whether by_var has a value:
%if &by_var. ne %then %do;by &by_var;%end;
What if I want to check the opposite, that by_var is empty? I tried
%if &by_var. e %then %do; something;%end;
but it does not work
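One likely reason the attempt fails is that "e" is not a macro comparison operator; the mnemonic for equality is "eq" (or the symbol "="), so "%if &by_var. eq %then" would be the direct counterpart of the "ne" test. A minimal sketch of an alternative that tests the length of the value instead is shown below. The macro name show_by is hypothetical and exists only for illustration; it is not part of the original macro.

```sas
/* Hypothetical helper, for illustration only: reports whether by_var is empty */
%macro show_by(by_var=);
    /* %length returns 0 when the parameter was not given a value */
    %if %length(&by_var.) = 0 %then %do;
        %put NOTE: by_var is empty, so the whole sample would be analysed.;
    %end;
    %else %do;
        %put NOTE: by_var has the value &by_var.;
    %end;
%mend show_by;

%show_by();                 /* takes the "empty" branch */
%show_by(by_var=country);   /* takes the "has a value" branch */
```

The length test avoids comparing against nothing, which some readers find hard to spot in "%if &by_var. eq %then". Both forms behave the same for ordinary variable names.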
Please also check the documentation for SURVEYREG (and the other survey analysis procedures) and the by statement.
From the online documentation:
Note that using a BY statement provides completely separate analyses of the BY groups. It does not provide a statistically valid domain (subpopulation) analysis, where the total number of units in the subpopulation is not known with certainty. You should use the DOMAIN statement to obtain domain analysis.
Be very sure that you actually want BY instead of DOMAIN.
There is more detail in the DOMAIN statement documentation.
Say, for example, I want to study the effect of food on health in different countries. Should I use PROC SURVEYREG with the BY or the DOMAIN statement on Country? Thanks!
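For comparison, here is a minimal sketch of what the DOMAIN statement looks like in PROC SURVEYREG for this kind of question. The dataset survey_demo and the variables country, psu, food and health are made up for illustration, and the data are simulated. It shows the syntax only; it does not settle which statement is statistically appropriate for a given design.

```sas
/* Hypothetical toy data: 3 countries, 20 clusters per country, 5 rows per cluster */
data survey_demo;
    call streaminit(2024);
    do country = 1 to 3;
        do j = 1 to 20;
            psu = (country - 1) * 20 + j;   /* unique cluster id */
            do i = 1 to 5;
                food   = rand('normal');
                health = 0.5 * food + rand('normal');
                output;
            end;
        end;
    end;
    drop i j;
run;

/* DOMAIN requests a subpopulation analysis for each country within one run, */
/* whereas BY country would fit completely separate analyses per country.    */
proc surveyreg data=survey_demo;
    cluster psu;
    domain country;
    model health = food / solution;
run;
```

Unlike BY, DOMAIN does not require the input data to be sorted by the grouping variable, and the output includes both the full-sample fit and one fit per domain.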