That will allow SAS to use the maximum available memory for subsequent analyses.
You don't indicate how many observations you have (number of subjects and number of observations per subject), whether the predictor variables vary over time within subject, or how much RAM you have and what kind of hardware you will run these analyses on. There can also be differences across operating systems in how efficiently each processes large amounts of data. All of this is critical information for understanding the scope of the problem.
I tried specifying that option but I get a message that memsize is only valid at SAS startup. I tried restarting SAS and running the memsize option and got the same error message.
I have 3.5 GB of RAM and a 3.0 GHz processor. That's all I know. I have close to 1,000,000 subjects (I am looking at infant deaths over 4 years at the state level), with only 1 record per subject. Each risk factor is dichotomous (1=present, 2=not present), refGroup=1 indicates the subject is in the reference group, and refGroup=0 indicates the subject is not in the reference group (in this case black). I'm trying to get the relative prevalence of each risk factor comparing the black population to the reference group.
I took out all of the estimate statements and the out of memory error went away. Since I don't really need the estimate statements I'm not as concerned.
Since you seem to know quite a lot I wonder if you could answer the following questions I have:
1) Examples I've seen use the repeated statement when using Poisson. Is this necessary in my case?
2) It's my understanding that if the data do not converge I need to use Poisson regression. Is this correct? If not, what should I use?
3) It's my understanding that if I want to know the relative prevalence for a given risk factor I need to run a separate analysis (i.e., a separate PROC GENMOD for each risk factor). So one model statement would look like this:
model MultGest = refGroup smoke STD...;
And another like this:
model smoke = refGroup MultGest STD...;
Is that correct? If so, is my RR for a given risk factor the refGroup estimate (i.e. exp[refGroup estimate])? If so, what are the estimates next to the other risk factors (covariates)?
Sorry, I didn't pay attention to the note that the memsize option can be specified only in the SAS configuration file or at SAS startup. If you locate your SAS configuration file (it should be named sasv9.cfg), you can edit it to include a line:
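Presumably the line in question is the MEMSIZE setting; per the earlier advice about letting SAS use the maximum available memory, a sketch of what the sasv9.cfg entry would look like (the value 0 tells SAS to use as much memory as the system will allow; you could instead specify an explicit limit such as -MEMSIZE 2G):

```
-MEMSIZE 0
```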
With regard to the error message "ERROR: The mean parameter is either invalid or at a limit of its range for some observations", take a look at:
The author indicates that this error message signals the need for exact Poisson regression. You must have SAS version 9.22 in order to conduct exact Poisson regression (or use other software, as indicated in the above link). See the SAS documentation for version 9.22 for the appropriate syntax. (I would note that version 9.22 also allows exact logistic regression, which might be appropriate for your problem. But don't take that as a recommendation; I would have to spend more time thinking about the issues, and perhaps spend time with the data itself, in order to make a specific recommendation.) It is possible that exact Poisson and exact logistic regression were implemented in version 9.2 as undocumented features.
Finally, you might still run into memory issues even with the memsize option specified as indicated above, given the volume of data that you have. (In fact, since the memsize option defaults to -memsize 0, editing your configuration file will probably not change the behavior of SAS at all unless you had previously modified that specification.) And performing exact logistic or exact Poisson regression will add further performance issues on top of the problems you already have.

Since you have only a single observation per subject, you can greatly reduce the data processing requirements by constructing a summary data set in which each combination of the response and predictors appears only once, along with a variable that records the number of times that combination occurs in your data. You can construct it as follows:
data summary;
  set have;   /* HAVE = your input data set, pre-sorted by the BY variables below */
  by smoke refGroup MultGest HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID;
  if first.MEDICAID then count=0;
  count+1;    /* running count of records in the current combination */
  if last.MEDICAID then output;
run;
You can then supply the data set SUMMARY to the GENMOD procedure and specify a FREQ statement naming the variable COUNT. Since you apparently have only binary predictors and a binary response, the number of unique combinations of these 9 variables is at most 2^9=512. Using the summary data set should produce great improvements in data processing.
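A sketch of the corresponding GENMOD call for one of the models you describe, assuming the Poisson/log-link specification you have been discussing, the summary data set named SUMMARY with counts in COUNT, and the predictor list taken from the BY statement above (adjust to whatever model you settled on):

```
proc genmod data=summary;
  freq count;   /* each row in SUMMARY stands for COUNT subjects */
  model MultGest = refGroup smoke STD HighParity PreviousPreterm
                   WtGnLT15 Inadequate MEDICAID / dist=poisson link=log;
run;
```

With a log link, exponentiating the refGroup coefficient gives the relative prevalence comparison you are after; the coefficients for the other predictors are adjusted covariate effects on the same log scale.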