Programming the statistical procedures from SAS

Integer overflow in Proc GLM

Occasional Contributor
Posts: 18
Integer overflow in Proc GLM

What does the message "Integer overflow on computing amount of memory required" in Proc GLM mean? The model is correct and has already worked with other data. Thanks for any help!



All Replies
Respected Advisor
Posts: 2,655

Re: Integer overflow in Proc GLM

Can you share your GLM code, and some info on the datasets where it worked, and on the dataset where it now does not?

Steve Denham

Occasional Contributor
Posts: 18

Re: Integer overflow in Proc GLM

The code is:

Proc GLM;
  class habitat site block treatment mesh substrate sampling;
  model trans =
    habitat
    treatment
    treatment*habitat
    substrate
    substrate*habitat
    mesh
    mesh*habitat
    sampling
    sampling*habitat
    substrate*treatment
    substrate*treatment*habitat
    mesh*treatment
    mesh*treatment*habitat
    sampling*treatment
    sampling*treatment*habitat
    substrate*mesh
    substrate*mesh*habitat
    substrate*sampling
    substrate*sampling*habitat
    mesh*sampling
    mesh*sampling*habitat
    substrate*mesh*treatment
    substrate*mesh*treatment*habitat
    substrate*mesh*sampling
    substrate*mesh*sampling*habitat
    substrate*treatment*sampling
    substrate*treatment*sampling*habitat
    mesh*treatment*sampling
    mesh*treatment*sampling*habitat
    substrate*mesh*treatment*sampling
    substrate*mesh*treatment*sampling*habitat
    site(habitat)
    block(site*habitat)
    treatment*site(habitat)
    treatment*block(site*habitat)
    substrate*site(habitat)
    substrate*block(site*habitat)
    mesh*site(habitat)
    mesh*block(site*habitat)
    sampling*site(habitat)
    sampling*block(site*habitat)
    substrate*treatment*site(habitat)
    substrate*treatment*block(site*habitat)
    mesh*treatment*site(habitat)
    mesh*treatment*block(site*habitat)
    sampling*treatment*site(habitat)
    sampling*treatment*block(site*habitat)
    substrate*mesh*site(habitat)
    substrate*mesh*block(site*habitat)
    substrate*sampling*site(habitat)
    substrate*sampling*block(site*habitat)
    mesh*sampling*site(habitat)
    mesh*sampling*block(site*habitat)
    substrate*mesh*treatment*site(habitat)
    substrate*mesh*treatment*block(site*habitat)
    substrate*mesh*sampling*site(habitat)
    substrate*mesh*sampling*block(site*habitat)
    substrate*treatment*sampling*site(habitat)
    substrate*treatment*sampling*block(site*habitat)
    mesh*treatment*sampling*site(habitat)
    mesh*treatment*sampling*block(site*habitat)
    substrate*mesh*treatment*sampling*site(habitat)
    substrate*mesh*treatment*sampling*block(site*habitat);
  test h = habitat e = site(habitat);
  test h = site(habitat) e = block(site*habitat);
  test h = treatment treatment*habitat e = treatment*site(habitat);
  test h = treatment*site(habitat) e = treatment*block(site*habitat);
  test h = substrate substrate*habitat e = substrate*site(habitat);
  test h = substrate*site(habitat) e = substrate*block(site*habitat);
  test h = mesh mesh*habitat e = mesh*site(habitat);
  test h = mesh*site(habitat) e = mesh*block(site*habitat);
  test h = sampling sampling*habitat e = sampling*site(habitat);
  test h = sampling*site(habitat) e = sampling*block(site*habitat);
  test h = substrate*treatment substrate*treatment*habitat e = substrate*treatment*site(habitat);
  test h = substrate*treatment*site(habitat) e = substrate*treatment*block(site*habitat);
  test h = mesh*treatment mesh*treatment*habitat e = mesh*treatment*site(habitat);
  test h = mesh*treatment*site(habitat) e = mesh*treatment*block(site*habitat);
  test h = sampling*treatment sampling*treatment*habitat e = sampling*treatment*site(habitat);
  test h = sampling*treatment*site(habitat) e = sampling*treatment*block(site*habitat);
  test h = substrate*mesh substrate*mesh*habitat e = substrate*mesh*site(habitat);
  test h = substrate*mesh*site(habitat) e = substrate*mesh*block(site*habitat);
  test h = substrate*sampling substrate*sampling*habitat e = substrate*sampling*site(habitat);
  test h = substrate*sampling*site(habitat) e = substrate*sampling*block(site*habitat);
  test h = mesh*sampling mesh*sampling*habitat e = mesh*sampling*site(habitat);
  test h = mesh*sampling*site(habitat) e = mesh*sampling*block(site*habitat);
  test h = substrate*mesh*treatment substrate*mesh*treatment*habitat e = substrate*mesh*treatment*site(habitat);
  test h = substrate*mesh*treatment*site(habitat) e = substrate*mesh*treatment*block(site*habitat);
  test h = substrate*mesh*sampling substrate*mesh*sampling*habitat e = substrate*mesh*sampling*site(habitat);
  test h = substrate*mesh*sampling*site(habitat) e = substrate*mesh*sampling*block(site*habitat);
  test h = substrate*treatment*sampling substrate*treatment*sampling*habitat e = substrate*treatment*sampling*site(habitat);
  test h = substrate*treatment*sampling*site(habitat) e = substrate*treatment*sampling*block(site*habitat);
  test h = mesh*treatment*sampling mesh*treatment*sampling*habitat e = mesh*treatment*sampling*site(habitat);
  test h = mesh*treatment*sampling*site(habitat) e = mesh*treatment*sampling*block(site*habitat);
  test h = substrate*mesh*treatment*sampling substrate*mesh*treatment*sampling*habitat e = substrate*mesh*treatment*sampling*site(habitat);
  test h = substrate*mesh*treatment*sampling*site(habitat) e = substrate*mesh*treatment*sampling*block(site*habitat);
run;

The design is a double-nested split-plot design. The factor sampling has 3 levels. If I use only 2 levels (= 2,700 replicates), the syntax works; if I use all 3 levels of "sampling" (= c. 4,000 replicates), I get this error message. However, even with 2 levels it takes about one hour... Initially I tried Proc Mixed, but then the calculation takes about 40 hours...

Regular Contributor
Posts: 152

Re: Integer overflow in Proc GLM

Read the section of the PROC GLM documentation, "Computational Resources".  Under the subsection, "Memory", this documentation states that the number of levels of classification variables and their interactions in the model cannot exceed 32,767.   With three levels of sampling, you probably have exceeded this limit, yielding the error message you observed.  This subsection shows ways that you might reduce the number of these levels.  Or, you might rethink your model.
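To see how close a data set comes to that limit, the levels can be counted directly. The following is only a minimal sketch: work.litter is a hypothetical data set name (the thread never names the input data), and the PROC SQL query counts observed combinations for just one effect, the highest-order one in the posted model.

```sas
/* work.litter is a hypothetical data set name; substitute your own. */

/* Number of levels of each classification variable. */
proc freq data=work.litter nlevels;
   tables habitat site block treatment mesh substrate sampling / noprint;
run;

/* Number of observed level combinations for the highest-order effect,
   substrate*mesh*treatment*sampling*block(site*habitat). */
proc sql;
   select count(*) as n_combinations
   from (select distinct substrate, mesh, treatment, sampling,
                         block, site, habitat
         from work.litter);
quit;
```

Running the same query with and without the third sampling level would show how much that level adds to the count.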

Respected Advisor
Posts: 2,655

Re: Integer overflow in Proc GLM

Or you might move to PROC HPMIXED.  With this many levels and a split-split plot, MIXED is dealing with a pretty sparse matrix.  This looks exactly like what HPMIXED was devised for.  The hardest part will be recoding all of the GLM TEST statements to HPMIXED TEST statements.
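As a rough illustration of what that move could look like, here is a heavily reduced sketch with only two of the fixed factors. The data set name work.litter is a hypothetical placeholder, and the choice of which terms to keep is an assumption made for brevity, not the poster's model. Fixed effects go in MODEL, nested terms go in RANDOM, and a TEST statement requests the fixed-effect tests.

```sas
/* Reduced sketch only: work.litter is a hypothetical data set name,
   and the model is cut down to habitat and treatment for brevity. */
proc hpmixed data=work.litter;
   class habitat site block treatment;
   /* Fixed effects only in the MODEL statement. */
   model trans = habitat treatment treatment*habitat;
   /* Nested terms that GLM used as error terms become random effects. */
   random site(habitat) block(site*habitat) treatment*site(habitat);
   /* Request tests of the fixed effects. */
   test habitat treatment treatment*habitat;
run;
```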

Steve Denham

Valued Guide
Posts: 684

Re: Integer overflow in Proc GLM

It looks like you have several random effects (based on the test statements in GLM), but you are not specifying any random effects with RANDOM statements. This is the very old-fashioned way of doing split plots, nested designs, etc. It is now much easier and simpler with true mixed-model software (such as MIXED, HPMIXED, and GLIMMIX), if you learn how to specify the random effects. (GLM is not a true mixed-model procedure.) But you have to learn how to list the random effects, and not put them in the model statement. Then all of these individual test statements will be handled automatically (you won't need to write them out). There is a bunch to learn, and you should start with the great book SAS for Mixed Models, second edition (2006), by Littell et al. When the MIXED procedure was new in the mid-1990s, several articles were written explaining the better way of fitting these models and testing effects. You might be able to find some online from the SAS Users Group International (now called SAS Global Forum).
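To make the contrast concrete, here is a small sketch for a reduced version of the design (habitat and treatment only). The data set name work.litter and the DDFM= choice are assumptions for illustration, not taken from the thread. The comment shows the GLM-style statements that the RANDOM statement replaces.

```sas
/* Sketch under assumptions: work.litter is a hypothetical data set name.

   GLM style, error terms named by hand:
     test h = habitat e = site(habitat);
     test h = treatment treatment*habitat e = treatment*site(habitat);

   Mixed-model style: the former error terms are declared random, and
   the fixed-effect tests are then built from the variance components. */
proc mixed data=work.litter;
   class habitat site block treatment;
   model trans = habitat treatment treatment*habitat / ddfm=satterthwaite;
   random site(habitat) block(site*habitat) treatment*site(habitat);
run;
```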

Occasional Contributor
Posts: 18

Re: Integer overflow in Proc GLM

Thank you for all your answers. I will check out HPMIXED when I am back in the office next week!

@lvm: I am experienced with Proc Mixed and I usually prefer it. However, as noted above, each calculation with Proc Mixed takes about 40 hours using this model and my data and often ends with "did not converge". This is why I used GLM.

Respected Advisor
Posts: 2,655

Re: Integer overflow in Proc GLM

If you are familiar with MIXED, the move to HPMIXED won't be difficult, especially if you are used to writing TEST statements.

Steve Denham

Occasional Contributor
Posts: 18

Re: Integer overflow in Proc GLM

2 years later... ;-)

I have now tried HPMIXED followed by a Proc Mixed step, mainly to use the covtest option in Mixed, as in http://support.sas.com/resources/papers/proceedings12/332-2012.pdf (pages 7-8).

I used HPMIXED with the statement "ods output covparms=covpe;":

Proc HPMixed;
  class habitat site block treatment mesh substrate sampling;
  model trans =
    habitat
    treatment
    treatment*habitat
    substrate
    substrate*habitat
    mesh
    mesh*habitat
    sampling
    sampling*habitat
    substrate*treatment
    substrate*treatment*habitat
    mesh*treatment
    mesh*treatment*habitat
    sampling*treatment
    sampling*treatment*habitat
    substrate*mesh
    substrate*mesh*habitat
    substrate*sampling
    substrate*sampling*habitat
    mesh*sampling
    mesh*sampling*habitat
    substrate*mesh*treatment
    substrate*mesh*treatment*habitat
    substrate*mesh*sampling
    substrate*mesh*sampling*habitat
    substrate*treatment*sampling
    substrate*treatment*sampling*habitat
    mesh*treatment*sampling
    mesh*treatment*sampling*habitat
    substrate*mesh*treatment*sampling
    substrate*mesh*treatment*sampling*habitat;
  lsmeans habitat|treatment|substrate|mesh / solution;
  random site(habitat)
    block(site*habitat)
    treatment*site(habitat)
    treatment*block(site*habitat)
    substrate*site(habitat)
    substrate*block(site*habitat)
    mesh*site(habitat)
    mesh*block(site*habitat)
    sampling*site(habitat)
    sampling*block(site*habitat)
    substrate*treatment*site(habitat)
    substrate*treatment*block(site*habitat)
    mesh*treatment*site(habitat)
    mesh*treatment*block(site*habitat)
    sampling*treatment*site(habitat)
    sampling*treatment*block(site*habitat)
    substrate*mesh*site(habitat)
    substrate*mesh*block(site*habitat)
    substrate*sampling*site(habitat)
    substrate*sampling*block(site*habitat)
    mesh*sampling*site(habitat)
    mesh*sampling*block(site*habitat)
    substrate*mesh*treatment*site(habitat)
    substrate*mesh*treatment*block(site*habitat)
    substrate*mesh*sampling*site(habitat)
    substrate*mesh*sampling*block(site*habitat)
    substrate*treatment*sampling*site(habitat)
    substrate*treatment*sampling*block(site*habitat)
    mesh*treatment*sampling*site(habitat)
    mesh*treatment*sampling*block(site*habitat)
    substrate*mesh*treatment*sampling*site(habitat)
    substrate*mesh*treatment*sampling*block(site*habitat);
  lsmeans habitat|treatment|substrate|mesh;
  ods output covparms=covpe;
run;

Then I used the Proc Mixed syntax again including the statement "parms/pdata=covpe noiter;"

Proc Mixed covtest;
  class habitat site block treatment mesh substrate sampling;
  model trans =...(see above)
  ....
  .....
  parms/pdata=covpe noiter;
run;

HPMixed worked well, but for Mixed I get the message that the variables habitat, site, etc. were not found.

Does anybody have an idea?

Solution
‎01-26-2015 10:52 AM
Valued Guide
Posts: 684

Re: Integer overflow in Proc GLM

You need a DATA= option in the PROC MIXED statement. Procedures use the most recently created data set as the default if a DATA= option is not given. The procedure is trying to use covpe as the input data set, since it is the most recently created one. By the way, you are estimating a LOT of random effects. I would be cautious -- your model could be overparameterized.
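A reduced sketch of the two-step workflow with the input data set named explicitly in both steps follows. The data set name work.litter is a hypothetical placeholder, and the model is cut down to two factors for brevity. The %PUT line writes the automatic macro variable SYSLAST to the log, which is one way to check which data set a procedure would pick up by default.

```sas
/* Sketch only: work.litter is a hypothetical name for the analysis data,
   and the model is reduced to habitat and treatment for brevity. */

/* Step 1: estimate covariance parameters and save them to WORK.COVPE. */
proc hpmixed data=work.litter;
   class habitat site block treatment;
   model trans = habitat treatment treatment*habitat;
   random site(habitat) block(site*habitat) treatment*site(habitat);
   ods output covparms=covpe;
run;

/* Show which data set is currently the "most recently created" one. */
%put NOTE: Default input data set would be &syslast;

/* Step 2: name the analysis data explicitly, so that COVPE is used only
   as the source of starting values through PDATA=. */
proc mixed data=work.litter covtest;
   class habitat site block treatment;
   model trans = habitat treatment treatment*habitat;
   random site(habitat) block(site*habitat) treatment*site(habitat);
   parms / pdata=covpe noiter;
run;
```

Naming DATA= in every procedure step avoids this class of error whenever an intermediate data set is created between steps.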

Occasional Contributor
Posts: 18

Re: Integer overflow in Proc GLM

Thank you, now it works!! Yes, I am aware of the problem with so many random terms.
