RyanD
Fluorite | Level 6
I submitted the following code to get adjusted risk ratios for smoking among a black cohort vs. a reference group adjusted for other risk factors and got an out of memory error:

proc genmod data=dataset;
class id refGroup (ref='1') MultGest (ref='2') HighParity (ref='2') PreviousPreterm (ref='2') STD (ref='2') WtGnLT15 (ref='2') Inadequate (ref='2') MEDICAID (ref='2')/param=glm;
model smoke = refGroup MultGest HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID/ link=log dist=poisson;
repeated subject=id/type=ind;
estimate 'Beta refGroup' refGroup 1 -1/ exp;
estimate 'Beta MultGest' MultGest 1 -1/ exp;
estimate 'Beta HighParity' HighParity 1 -1/ exp;
estimate 'Beta PreviousPreterm' PreviousPreterm 1 -1/ exp;
estimate 'Beta STD' STD 1 -1/ exp;
estimate 'Beta WtGnLT15' WtGnLT15 1 -1/ exp;
estimate 'Beta Inadequate' Inadequate 1 -1/ exp;
estimate 'Beta MEDICAID' MEDICAID 1 -1/ exp;
run;

When I ran this code without the poisson option (dist=bin) I received an error message "ERROR: The mean parameter is either invalid or at a limit of its range for some observations".

Question 1: Does this error message indicate that my data did not converge? If so, does that mean I should use the poisson option?

Question 2: What is causing the out of memory error?

Thanks for your help.
Dale
Pyrite | Level 9
Have you tried specifying the SAS option:

options memsize -0;

That will allow SAS to use the maximum available memory for subsequent analyses.

You don't indicate how many observations you have (number of subjects and number of observations per subject). You also don't indicate whether the predictor variables vary over time within subject, how much RAM you have, or what kind of hardware you are running these analyses on. Operating systems can differ in how efficiently they process large amounts of data. This is all critical information for understanding the scope of the problem.
RyanD
Fluorite | Level 6
I tried specifying that option but I get a message that memsize is only valid at SAS startup. I tried restarting SAS and running the memsize option and got the same error message.

I have 3.5 GB of RAM and I'm using a 3.0 GHz processor. That's all I know. I have close to 1,000,000 subjects (I am looking at infant deaths over 4 years at the state level), with only 1 record per subject. Each risk factor is dichotomous (1=present, 2=not present). refGroup=1 indicates the subject is in the reference group, and refGroup=0 indicates the subject is not in the reference group (in this case black). I'm trying to get the relative prevalence of each risk factor comparing the black population to the reference group.

I took out all of the estimate statements and the out of memory error went away. Since I don't really need the estimate statements I'm not as concerned.

Since you seem to know quite a lot I wonder if you could answer the following questions I have:

1) Examples I've seen use the repeated statement when using Poisson. Is this necessary in my case?

2) It's my understanding that if the data do not converge I need to use poisson. Is this correct? If not, what should I use?

3) It's my understanding that if I want to know the relative prevalence for a given risk factor I need to run a separate analysis (i.e. proc genmod for each risk factor). So one model statement would look like this:
model MultGest = refGroup smoke STD...;
And another like this
model smoke = refGroup MultGest STD...;
Is that correct? If so, is my RR for a given risk factor the refGroup estimate (i.e. exp[refGroup estimate])? If so, what are the estimates next to the other risk factors (covariates)?

I really appreciate the help you've given me.

Thanks,
Ryan
RyanD
Fluorite | Level 6
I should clarify that my non-poisson regression code now looks like this:

proc genmod data=test2 ;
class refGroup (ref='1') smoke (ref='2') HighParity (ref='2') PreviousPreterm (ref='2') STD (ref='2') WtGnLT15 (ref='2') Inadequate (ref='2') MEDICAID (ref='2')/param=ref;
model MultGest = refGroup smoke HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID/ link=log dist=bin;
run;


And my poisson regression code looks like this:

proc genmod data=test2 ;
class refGroup (ref='1') smoke (ref='2') HighParity (ref='2') PreviousPreterm (ref='2') STD (ref='2') WtGnLT15 (ref='2') Inadequate (ref='2') MEDICAID (ref='2')/param=ref;
model MultGest = refGroup smoke HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID/ link=log dist=poisson;
run;


The only difference is the dist= option.
Dale
Pyrite | Level 9
Ryan,

Sorry, I didn't pay attention to the note that the memsize option can be specified only in the SAS configuration file or at startup. If you locate your SAS configuration file (it should be named sasv9.cfg), then you can edit it to have a line:

-memsize 0
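
To see what SAS is actually using before and after you edit the configuration file, something like the following should work. This is only a minimal sketch; the values reported will depend on your installation.

```sas
/* Print the current MEMSIZE setting to the log */
proc options option=memsize;
run;

/* Show which configuration file(s) this session was started with */
proc options option=config;
run;

/* Same MEMSIZE value, retrieved as text through GETOPTION */
%put MEMSIZE is %sysfunc(getoption(memsize));
```

If the value in the log does not change after you restart SAS, the session is probably reading a different configuration file than the one you edited.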


With regard to the error message "ERROR: The mean parameter is either invalid or at a limit of its range for some observations", take a look at:

http://www.google.com/url?sa=t&source=web&cd=1&ved=0CBkQFjAA&url=http%3A%2F%2Fwww2.sas.com%2Fproceed...

The author indicates that this error message signals the need for exact Poisson regression. You must have SAS version 9.22 in order to conduct exact Poisson regression (or use other software as indicated in the above link). See the SAS documentation for version 9.22 for the appropriate syntax. (I would note that version 9.22 also allows exact logistic regression. I might think that exact logistic regression would be appropriate for your problem. But don't take that to be a recommendation. I would have to spend more time thinking about the issues and perhaps spend time with the data itself in order to make a specific recommendation.) It is possible that exact Poisson and exact logistic regression were implemented in version 9.2 as undocumented features.

Finally, you might still run into memory issues even when the memsize option is specified as indicated above given the volume of data that you have. (In fact, since the memsize option default is -memsize 0 and if you have not previously modified your SAS configuration file to change the memsize specification, then editing your configuration file will probably not change the behavior of SAS at all.) Certainly, to perform exact logistic or exact Poisson regression, you will encounter additional performance issues on top of the problems you already have. Since you have only a single observation per subject, you can greatly reduce the data processing requirements by constructing a summary data set where each combination of the response and predictors is observed only once and you record a variable that indicates the number of times that combination occurs in your data set. You can get this as follows:

proc sort data=dataset;
  by smoke refGroup MultGest HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID;
run;

data summary;
  set dataset;
  by smoke refGroup MultGest HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID;
  if first.MEDICAID then count=0;
  count+1;
  if last.MEDICAID then output;
run;

You can then pass the data set summary to the GENMOD procedure and specify a FREQ statement naming the variable COUNT. Since you apparently have only binary predictors and a binary response, the number of unique combinations of these 9 variables is at most 2^9=512. Using the summary data set should greatly reduce the data processing required.
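
As a rough sketch of that last step, the call might look like the following. It reuses the CLASS and MODEL specification from earlier in this thread with smoke as the response. The dist=bin choice is only for illustration, and you should check the response profile in the output to confirm which level of smoke is being modeled.

```sas
/* Fit the model on the collapsed data. The FREQ statement tells GENMOD
   that each row stands for COUNT identical original records. */
proc genmod data=summary;
  freq count;
  class refGroup (ref='1') MultGest (ref='2') HighParity (ref='2')
        PreviousPreterm (ref='2') STD (ref='2') WtGnLT15 (ref='2')
        Inadequate (ref='2') MEDICAID (ref='2') / param=ref;
  model smoke = refGroup MultGest HighParity PreviousPreterm STD
                WtGnLT15 Inadequate MEDICAID / link=log dist=bin;
run;
```

The sum of frequencies that GENMOD reports should match the number of records in the original data set, which is a quick check that the summary step did not lose anything.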
RyanD
Fluorite | Level 6
I ran across this macro that does just what I need and it looks like it works.
http://www.cdc.gov/niosh/ext-supp-mat/pr-sasmac/

Thanks for your help with this. You gave me some good tricks.
