This is part three of an ongoing series on how to accomplish a task WITHOUT using a macro.
In this post, I go over one method to run multiple regressions with multiple dependent variables.
This is based on a question that was recently asked on the communities forum:
For example, consider the case where you have 10 dependent variables, Y1-Y10, and you want to regress each of them on 3 independent variables, X1-X3.
EDIT: Rick Wicklin (a SAS employee) has published a blog post that has a fully worked example with code here:
https://blogs.sas.com/content/iml/2017/02/13/run-1000-regressions.html
Sample data:
Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 Y10 X1 X2 X3
For a single dependent variable, say Y1, the PROC GLM model would look like the following:
PROC GLM DATA=sample;
model y1 = x1 x2 x3;
run;
One method is to write a macro to loop through all the different Y variables using the above PROC GLM code. Another is to transpose your data to a different format that then allows you to use the BY group processing available in most PROCS.
Output data:
Y_Index Y X1 X2 X3
1 Y1
2 Y2
..
10 Y10
1 Y1
2 Y2
..
10 Y10
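One way to produce this layout is a DATA step that loops over an array of the dependent variables and writes one row per variable. This is a minimal sketch, not necessarily how the original data was reshaped: it assumes the input data set is named sample, the variables are literally named y1-y10 and x1-x3, and the output is named sample_transpose to match the PROC SORT step that follows. The array name ys is made up for illustration.

```sas
/* Reshape from wide (one row per observation with Y1-Y10)
   to long (ten rows per observation, one per dependent variable).
   Data set and variable names are assumptions based on the example. */
data sample_transpose;
    set sample;
    array ys{10} y1-y10;          /* hypothetical array name */
    do Y_Index = 1 to 10;
        Y = ys{Y_Index};          /* value of the current dependent variable */
        output;                   /* one output row per Y variable */
    end;
    keep Y_Index Y x1 x2 x3;
run;
```

The rows come out interleaved (1 to 10 for the first observation, then 1 to 10 for the second, and so on), which is why the data still needs to be sorted by Y_Index before BY-group processing.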
Then the PROC GLM code becomes (after a sort):
PROC SORT data=sample_transpose;
by Y_index;
run;
PROC GLM DATA=sample_transpose;
BY Y_Index;
model y = x1 x2 x3;
run;
quit;
This runs all 10 models in a single PROC GLM call. Additionally, if you capture model estimates with an OUTPUT or ODS OUTPUT statement, the results for every model land in the same table, identified by Y_Index, so there is no need to append separate tables from each model.
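As an illustration of capturing the estimates, the sketch below routes the PROC GLM ParameterEstimates table to a data set with ODS OUTPUT. It assumes the long data set is named sample_transpose and is already sorted by Y_Index; the output data set name all_estimates is made up for this example, and the SOLUTION option is added only to make sure the estimates table is produced.

```sas
/* Collect the parameter estimates from every BY group in one data set.
   all_estimates is a hypothetical name chosen for this example. */
ods output ParameterEstimates=all_estimates;

proc glm data=sample_transpose;
    by Y_Index;
    model y = x1 x2 x3 / solution;
run;
quit;

/* Each row is one parameter from one model; Y_Index says which model. */
proc print data=all_estimates;
run;
```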
Hope you found this useful!