Out of memory error using PROC GENMOD Poisson

06-20-2011 10:59 AM

I submitted the following code to get adjusted risk ratios for smoking among a black cohort vs. a reference group adjusted for other risk factors and got an out of memory error:

proc genmod data=dataset;
  class id refGroup (ref='1') MultGest (ref='2') HighParity (ref='2') PreviousPreterm (ref='2') STD (ref='2') WtGnLT15 (ref='2') Inadequate (ref='2') MEDICAID (ref='2') / param=glm;
  model smoke = refGroup MultGest HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID / link=log dist=poisson;
  repeated subject=id / type=ind;
  estimate 'Beta refGroup' refGroup 1 -1 / exp;
  estimate 'Beta MultGest' MultGest 1 -1 / exp;
  estimate 'Beta HighParity' HighParity 1 -1 / exp;
  estimate 'Beta PreviousPreterm' PreviousPreterm 1 -1 / exp;
  estimate 'Beta STD' STD 1 -1 / exp;
  estimate 'Beta WtGnLT15' WtGnLT15 1 -1 / exp;
  estimate 'Beta Inadequate' Inadequate 1 -1 / exp;
  estimate 'Beta MEDICAID' MEDICAID 1 -1 / exp;
run;

When I ran this code with dist=bin instead of dist=poisson, I received the error message "ERROR: The mean parameter is either invalid or at a limit of its range for some observations".

Question 1: Does this error message indicate that my data did not converge? If so, does that mean I should use the poisson option?

Question 2: What is causing the out of memory error?

Thanks for your help.

06-20-2011 07:20 PM

Have you tried specifying the SAS option:

options memsize -0;

That will allow SAS to use the maximum available memory for subsequent analyses.

You don't indicate how many observations you have (number of subjects and number of observations per subject). You also don't indicate whether the predictor variables vary over time within subject, or tell us how much RAM you have or what kind of hardware you are running these analyses on. Operating systems can differ in how efficiently they process large amounts of data. This is all critical information for understanding the scope of the problem.

06-21-2011 09:21 AM

I tried specifying that option but I get a message that memsize is only valid at SAS startup. I tried restarting SAS and running the memsize option and got the same error message.

I have 3.5 GB of RAM and I'm using a 3.0 GHz processor. That's all I know. I have close to 1,000,000 subjects (I am looking at infant deaths over 4 years at the state level), with only 1 record per subject. Each risk factor is dichotomous (1=present, 2=not present); refGroup=1 indicates the subject is in the reference group, and refGroup=0 indicates the subject is not in the reference group (in this case, black). I'm trying to get the relative prevalence of each risk factor comparing the black population to the reference group.

I took out all of the estimate statements and the out of memory error went away. Since I don't really need the estimate statements I'm not as concerned.

Since you seem to know quite a lot I wonder if you could answer the following questions I have:

1) Examples I've seen use the repeated statement when using Poisson. Is this necessary in my case?

2) It's my understanding that if the data do not converge I need to use poisson. Is this correct? If not, what should I use?

3) It's my understanding that if I want to know the relative prevalence for a given risk factor I need to run a separate analysis (i.e. proc genmod for each risk factor). So one model statement would look like this:

model MultGest = refGroup smoke STD...;

And another like this

model smoke = refGroup MultGest STD...;

Is that correct? If so, is my RR for a given risk factor the refGroup estimate (i.e. exp[refGroup estimate])? If so, what are the estimates next to the other risk factors (covariates)?

I really appreciate the help you've given me.

Thanks,

Ryan

06-21-2011 09:28 AM

I should clarify that my non-poisson regression code now looks like this:

proc genmod data=test2;
  class refGroup (ref='1') smoke (ref='2') HighParity (ref='2') PreviousPreterm (ref='2') STD (ref='2') WtGnLT15 (ref='2') Inadequate (ref='2') MEDICAID (ref='2') / param=ref;
  model MultGest = refGroup smoke HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID / link=log dist=bin;
run;

And my poisson regression code looks like this:

proc genmod data=test2;
  class refGroup (ref='1') smoke (ref='2') HighParity (ref='2') PreviousPreterm (ref='2') STD (ref='2') WtGnLT15 (ref='2') Inadequate (ref='2') MEDICAID (ref='2') / param=ref;
  model MultGest = refGroup smoke HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID / link=log dist=poisson;
run;

The only difference is the dist= option.

06-21-2011 02:33 PM

Ryan,

Sorry, I overlooked the note that the memsize option can only be specified in the SAS configuration file or at startup. If you locate your SAS configuration file (it should be named sasv9.cfg), you can edit it to include the line:

-memsize 0
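Before editing the file, it may be worth checking which MEMSIZE value is already in effect. A minimal sketch, assuming a standard SAS 9 session, is to ask PROC OPTIONS to report it in the log:

```sas
/* Print the MEMSIZE setting currently in effect to the SAS log. */
/* The VALUE option also reports how the option value was set.   */
proc options option=memsize value;
run;
```

If the log already shows the value you intended to set, changing the configuration file is unlikely to help.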

With regard to the error message "ERROR: The mean parameter is either invalid or at a limit of its range for some observations", take a look at:

http://www.google.com/url?sa=t&source=web&cd=1&ved=0CBkQFjAA&url=http%3A%2F%2Fwww2.sas.com%2Fproceed...

The author states that this error message signals the need for exact Poisson regression. You must have SAS version 9.22 in order to conduct exact Poisson regression (or use other software as indicated in the above link). See the SAS documentation for version 9.22 for appropriate syntax for exact Poisson regression. (I would note that version 9.22 also allows exact logistic regression. I might think that exact logistic regression would be appropriate for your problem. But don't take that to be a recommendation. I would have to spend more time thinking about the issues and perhaps spend time with the data itself in order to make a specific recommendation.) It is possible that exact Poisson and exact logistic regression were implemented in version 9.2 as undocumented features.

Finally, you might still run into memory issues even when the memsize option is specified as indicated above given the volume of data that you have. (In fact, since the memsize option default is -memsize 0 and if you have not previously modified your SAS configuration file to change the memsize specification, then editing your configuration file will probably not change the behavior of SAS at all.) Certainly, to perform exact logistic or exact Poisson regression, you will encounter additional performance issues on top of the problems you already have. Since you have only a single observation per subject, you can greatly reduce the data processing requirements by constructing a summary data set where each combination of the response and predictors is observed only once and you record a variable that indicates the number of times that combination occurs in your data set. You can get this as follows:

proc sort data=dataset;
  by smoke refGroup MultGest HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID;
run;

data summary;
  set dataset;
  by smoke refGroup MultGest HighParity PreviousPreterm STD WtGnLT15 Inadequate MEDICAID;
  if first.MEDICAID then count=0;
  count+1;
  if last.MEDICAID then output;
run;

You can then pass the data set SUMMARY to the GENMOD procedure and specify a FREQ statement naming the variable COUNT. Since you apparently have only binary predictors and a binary response, the number of unique combinations of these 9 variables is at most 2^9=512. Using the summary data set should greatly reduce the data processing.
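As a rough sketch of what that might look like, the step below fits the log-link Poisson model from earlier in the thread to the collapsed data. The data set name summary01 and the variable smoke01 are made up for illustration. The recode assumes smoke is numeric and uses the 1=present, 2=not present coding described above, so that the response is 0/1. The REPEATED and ESTIMATE statements from the original post are left out.

```sas
/* Recode the response to 0/1 (assumption: smoke is numeric, 1=present, 2=not present). */
data summary01;
  set summary;
  if missing(smoke) then smoke01 = .;   /* keep missing responses missing */
  else smoke01 = (smoke = 1);           /* 1 if smoker, 0 otherwise */
run;

/* Each row of summary01 stands for COUNT identical subjects; */
/* the FREQ statement tells GENMOD to weight rows that way.   */
proc genmod data=summary01;
  class refGroup (ref='1') MultGest (ref='2') HighParity (ref='2')
        PreviousPreterm (ref='2') STD (ref='2') WtGnLT15 (ref='2')
        Inadequate (ref='2') MEDICAID (ref='2') / param=ref;
  model smoke01 = refGroup MultGest HighParity PreviousPreterm STD
                  WtGnLT15 Inadequate MEDICAID / link=log dist=poisson;
  freq count;
run;
```

With PARAM=REF, each parameter estimate is a log ratio against the reference level named in the CLASS statement, so exponentiating an estimate gives the corresponding ratio.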

06-22-2011 09:11 AM

I ran across this macro that does just what I need and it looks like it works.

http://www.cdc.gov/niosh/ext-supp-mat/pr-sasmac/

Thanks for your help with this. You gave me some good tricks.
