I am trying to get this code to work:
PROC REG DATA=CITIES;
   ID CITY;
   BY YEAR STATE;
   %DO I=1 %TO 5;
      %DO J=(&I.+1) %TO 6;
         X&I._X&J.: MODEL YVAR = X&I. X&J.;
      %END;
   %END;
   ODS OUTPUT FitStatistics=OUTPUT_1;
RUN;
It does, and for C(6,2) = 15 models, it takes about 27 minutes. Not great, but I haven't found a more efficient way to run it.
Anyway, I tried it with a larger number of models yesterday so it would run overnight, with I from 1 to 25 and J from (I+1) to 26. That is, C(26,2) = 325 models. It apparently couldn't handle that many models; it crashed after 2+ hours with the error
ERROR: An exception has been encountered.
Please contact technical support and provide them with the following traceback information:
The SAS task name is [REG]
ERROR: Write Access Violation REG
Exception occurred at (########)
I really need to run it up to C(130,2) = 8385 models, while I go on vacation next week. But it doesn't seem to be able to handle this. So what is the limit on number of models REG can handle? And are there any thoughts on a better way to do this? (Previously I had macro loops with data steps and a PROC REG for each iteration, and that worked, but it was much, much slower.)
@mathmannix wrote:
ERROR: An exception has been encountered. Please contact technical support and provide them with the following traceback information: The SAS task name is [REG] ERROR: Write Access Violation REG Exception occurred at (########)
Whether there is a limit (which I doubt) or some other problem, I don't know. However, the error message is very clear: you need to contact technical support.
As for the idea of running 8385 regression models, each with two predictor variables, I'm sure there must be a better way, but we don't really know what the project is or what your goals are. There are pre-programmed methods in SAS that run all possible combinations of models, or a subset thereof, so I would see if those would work for you, specifically PROC REG with the option SELECTION=RSQUARE or SELECTION=CP. That said, I do not recommend these methods in general, for statistical reasons that I won't get into here. If you have a lot of variables and want to fit a model, I do recommend Partial Least Squares (PROC PLS): put all your potential variables into the model and let PLS figure out which are the important ones.
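As a rough sketch of the SELECTION=RSQUARE suggestion (not a tested solution for your data), the step below asks PROC REG to fit every two-variable subset of X1-X6 in a single MODEL statement. START=2 and STOP=2 restrict the search to subsets of exactly two regressors. The output data set name EST_OUT is made up for illustration, and I am assuming CITIES is already sorted by YEAR and STATE.

```sas
ods graphics off;

/* OUTEST= collects one row per fitted model; the RSQUARE option on the   */
/* PROC statement adds the R-square value to each row.                    */
/* EST_OUT is a hypothetical name chosen for this sketch.                 */
proc reg data=cities outest=est_out rsquare;
   by year state;
   /* all subsets of exactly two regressors out of x1-x6 */
   model yvar = x1-x6 / selection=rsquare start=2 stop=2;
run;
quit;
```

Whether this is faster than the labelled MODEL statements depends on the data, so it is worth timing on the 6-variable case first.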
Submit this to SAS technical support, as this is not the way SAS should react if it runs out of resources (it should throw the usual ERROR message with a code and a short description).
But I think you will run out of resources if you try to run that many models in one procedure call.
Have you considered wrapping the outer macro loop around the procedure call itself?
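A minimal sketch of that idea, assuming the same CITIES data set and X1-X6 variables as in the original post: the outer loop issues one PROC REG call per value of I, so each call carries only the models for that I. The macro name PAIR_MODELS, its parameter N, and the data set names FIT_n and OUTPUT_ALL are made up for illustration.

```sas
%macro pair_models(n=6);
   %local i j;
   %do i = 1 %to %eval(&n - 1);
      /* one procedure call per outer index */
      proc reg data=cities;
         by year state;
         %do j = %eval(&i + 1) %to &n;
            X&i._X&j.: model yvar = x&i. x&j.;
         %end;
         ods output FitStatistics=fit_&i.;
      run;
      quit;
   %end;

   /* Stack the per-call results. The LENGTH statement is a precaution   */
   /* (an assumption on my part) so that longer model labels from later  */
   /* data sets are not truncated.                                       */
   data output_all;
      length Model $ 32;
      set fit_:;
   run;
%mend pair_models;

%pair_models(n=6)
```

This trades one very large procedure call for N-1 smaller ones, which may keep memory use per call more manageable.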
You should show the code with two hard-coded cases that worked as intended.
And your BY statement may not work as intended with multiple model statements:
When a BY statement is used with PROC REG, interactive processing is not possible; that is, once the first RUN statement is encountered, processing proceeds for each BY group in the data set, and no further statements are accepted by the procedure.
@mathmannix wrote: It does, and for C(6,2) = 15 models, it takes about 27 minutes. Not great, but I haven't found a more efficient way to run it.
What is the size (numbers of observations and variables) of your input dataset and what is the number of BY groups? This information would help us to judge whether 27 min is reasonable.
On my computer the setting ods graphics on; has an enormous impact on the run times of procedures which (like PROC REG) produce ODS graphics by default. For example, the run time of a PROC REG step for a single model of the form y=x1 x2 with 6 BY groups on SASHELP.HEART (~5000 obs.) soars from 0.04 s to >10.0 s as soon as I switch ODS graphics on.
Note that in your example ("by year state") PROC REG would possibly (try to) create up to 33,540,000 (!) plots if you requested 8385 models for each of the 400 combinations of, say, 8 years and 50 states (10 plots per model).
Wow, just wow. I did not know ODS Graphics was on by default. This makes PROC REG so much faster, thank you so much!
You say you are running 15 or 325 models, but actually you are running that many times the number of joint levels in the YEAR and STATE variables. If you have 10 years and 50 states, that is 500*15 and 500*325 models.
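To see how many fits are really being requested, something like the following could be run first. It is only a sketch, assuming the CITIES data set from the original post; the column alias N_BY_GROUPS and the variable names in the DATA step are made up for illustration.

```sas
/* Count the distinct YEAR*STATE combinations, i.e. the BY groups */
proc sql;
   select count(*) as n_by_groups
   from (select distinct year, state from cities);
quit;

/* Number of two-variable models for 6, 26 and 130 candidate regressors */
data _null_;
   do n_vars = 6, 26, 130;
      n_models = comb(n_vars, 2);
      put n_vars= n_models=;
   end;
run;
```

Multiplying the number of BY groups by the number of models gives the total number of regressions PROC REG is asked to fit.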
It is possible that your problem is caused because PROC REG automatically produces ODS graphics for each model.
Prior to your PROC REG step, use
ODS GRAPHICS OFF;
to make sure the procedure does not create any graphs. You probably also want to suppress the thousands of other tables that are automatically created by using
ODS exclude all;
before the call.
I also think you need to use a VAR statement so that PROC REG will create and reuse a single X`X matrix that includes all the variables in the multiple models. For example,
VAR X: YVAR;
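Putting these suggestions together, a sketch might look like the step below. It assumes the CITIES data set and variable names from the original post and shows only two of the labelled models. Because a BY statement is used, all MODEL statements have to appear before the first RUN statement.

```sas
ods graphics off;      /* no plots */
ods exclude all;       /* no displayed tables; ODS OUTPUT still captures data */

proc reg data=cities;
   by year state;
   var x: yvar;        /* every variable used by any of the models */
   X1_X2: model yvar = x1 x2;
   X1_X3: model yvar = x1 x3;
   /* ...remaining labelled MODEL statements... */
   ods output FitStatistics=output_1;
run;
quit;

ods exclude none;      /* restore normal output for later steps */
```

The final ODS EXCLUDE NONE is there so that later procedures in the same session display their output again.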