somebody
Lapis Lazuli | Level 10

I am writing a macro to run fixed-effects regressions with clustering using the demeaning method, because the usual procedures give memory errors. With my current code, I have to modify the macro every time I run a different regression, because the variables change and I have to compute their means. I would like to write a macro that works for any variables I pass in, without changing the macro itself. My current macro is:

%macro FEregression(dep,indep,clusterVar,FE_var);
  * To run with fixed effects use the method of subtracting off the mean for each date because the standard dummy variables approach needs too much memory;
  proc sort data=panel; by &FE_var; run;
  proc means data=panel print; by &FE_var;
    output out=means (drop=_TYPE_ _FREQ_)
      mean(&dep)=m&dep mean(A)=mA mean(B)=mB mean(C)=mC mean(D)=mD;
  run;
  data means; merge panel means; by &FE_var;
    &dep=&dep-m&dep;
    A=A-mA; B=B-mB; C=C-mC; D=D-mD;
  run;
  proc surveyreg data=means; class &FE_var; cluster &clusterVar; * Cluster by clusterVar;
    model &dep = &indep / solution;
  run; quit;
%mend;

%FEregression(Y, A B C D, , date)

So for this example, I am regressing Y on A B C D. 

Accepted Solutions
ballardw
Super User

So what you are asking is how to replace, in this code:

proc means data=panel print;
  by &FE_var;
  output out=means (drop=_TYPE_ _FREQ_)
    mean(&dep)=m&dep mean(A)=mA mean(B)=mB mean(C)=mC mean(D)=mD;
run;

data means;
  merge panel means;
  by &FE_var;
  &dep=&dep-m&dep;
  A=A-mA;
  B=B-mB;
  C=C-mC;
  D=D-mD;
run;

the hard-coded A B C D variables with whatever is in &indep, where that is a list of variables?

Or are A B C D a subset of the variables in &indep? If so, how do we know what the subset would be?

 

Things might get a lot simpler if you had a VAR statement in your PROC MEANS, like

Var &dep &indep;

and used the AUTONAME option on the OUTPUT statement instead of forcing the mA, mB variable names. mean(&dep &indep)= / autoname would append _mean to the name of each variable.
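As a quick check of the naming, here is a minimal sketch. It assumes the SASHELP.CLASS sample data set is available; SEX stands in for &FE_var, HEIGHT and WEIGHT stand in for &dep and &indep, and the data set names class_sorted and class_means are made up for the illustration.

```sas
/* Sketch only: SASHELP.CLASS stands in for the panel data,    */
/* and SEX plays the role of the fixed-effect (BY) variable.    */
proc sort data=sashelp.class out=class_sorted;
  by sex;
run;

proc means data=class_sorted noprint;
  by sex;
  var height weight;
  /* mean= with AUTONAME names the outputs Height_Mean, Weight_Mean */
  output out=class_means (drop=_TYPE_ _FREQ_) mean= / autoname;
run;

/* One row per value of SEX, with the two group means */
proc print data=class_means;
run;
```

Because the suffix is added to the existing name, variable names close to the 32-character limit may not leave room for it, so it is worth checking the output data set before relying on the generated names.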

 

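The data step could then become

data means;
  merge panel means;
  by &FE_var;
  /* subtract the group mean from the dependent variable */
  &dep = &dep - &dep._mean;
  /* loop over each independent variable (works only inside the macro definition) */
  %do i = 1 %to %sysfunc(countw(&indep));
    %let tvar = %scan(&indep, &i);
    &tvar = &tvar - &tvar._mean;
  %end;
run;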
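
Putting the pieces together, the whole macro might look something like the sketch below. This is an assembly of the original macro and the suggestions above, not tested against the poster's data. It assumes the input data set is still called panel, as in the original; the output data set name demeaned, the %local statement, NOPRINT, and the explicit blank delimiter in COUNTW and %SCAN are additions for illustration.

```sas
%macro FEregression(dep, indep, clusterVar, FE_var);
  %local i tvar;  /* keep loop variables out of the global scope */

  proc sort data=panel;
    by &FE_var;
  run;

  /* Group means of every model variable; AUTONAME names them <var>_Mean */
  proc means data=panel noprint;
    by &FE_var;
    var &dep &indep;
    output out=means (drop=_TYPE_ _FREQ_) mean= / autoname;
  run;

  /* Subtract each group mean from the dependent and independent variables */
  data demeaned;  /* hypothetical name for the demeaned copy */
    merge panel means;
    by &FE_var;
    &dep = &dep - &dep._mean;
    %do i = 1 %to %sysfunc(countw(&indep, %str( )));
      %let tvar = %scan(&indep, &i, %str( ));
      &tvar = &tvar - &tvar._mean;
    %end;
  run;

  /* Same regression step as in the original macro */
  proc surveyreg data=demeaned;
    class &FE_var;
    cluster &clusterVar;
    model &dep = &indep / solution;
  run;
%mend FEregression;

/* Example call, assuming panel contains Z, M, N, O, P, Q, clusterVar and date */
%FEregression(Z, M N O P Q, clusterVar, date)
```

Writing the demeaned values to a separate data set, rather than overwriting means, keeps the group means available for inspection if a result looks odd.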

somebody
Lapis Lazuli | Level 10

Yes, that is correct; I don't want to have to change those steps every time I run a different regression. A B C D would be all the independent variables in the list &indep. Essentially, I would like to run a new regression, say Z on M N O P Q, simply by calling %FEregression(Z, M N O P Q, clusterVar, date). With my current method, I would have to rewrite the PROC MEANS and DATA steps in my macro.
