FrederikPL
Calcite | Level 5

Hi,

 

I am working with a fairly large data set: 500,000 individuals observed at a daily level over 13 years, for approximately 500 million observations.

I have a problem constructing clustered standard errors on the individual level.

It seems like I have a server size problem. I have access to 2 TB, which allows me to construct clustered standard errors for up to 60,000 individuals.

It seems like proc surveyreg can run with fewer clusters than proc genmod.

 

Does anyone have an idea how to overcome this, or what to try?

 

Best,

Frederik

 

I get this error message with proc surveyreg:

ERROR: The SAS System stopped processing this step because of insufficient
memory.
NOTE: PROCEDURE SURVEYREG used (Total process time):
real time 4:34:52.34
cpu time 4:34:23.42

 

This is my code, if I run with proc surveyreg: Attached.

 

1 REPLY
ballardw
Super User

It never hurts to show the code you are using. That way we can avoid making suggestions that look like what you are doing.

 

What exactly is the problem? No output, incorrect (or at least unexpected) output, missing errors for some records?

 

From the surveyreg documentation

Let

  • H be the total number of strata

  • $n_c$ be the total number of clusters in your sample across all H strata, if you specify a CLUSTER statement

  • p be the total number of parameters in the model

The memory needed (in bytes) is

\[ 48H+8pH+4p(p+1)H \]

For a cluster sample, the additional memory needed (in bytes) is

\[ 48H+8pH+4p(p+1)H+4p(p+1)n_c+16n_c \]

The SURVEYREG procedure also uses other small amounts of additional memory. However, when you have a large number of clusters or strata, or a large number of parameters in your model, the memory described previously dominates the total memory required by the procedure.

So, using the above information, does the memory requirement come close to fitting within your available memory?
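As a rough way to check this, here is a minimal sketch that plugs numbers into the two expressions quoted from the documentation. It is written in Python rather than SAS, purely as a calculator. The assumptions are mine: H = 1 (no STRATA statement), one cluster per individual (n_c = 500,000, from the original post), and hypothetical values of p, since the number of model parameters was not stated. It also assumes the "additional" cluster term is added on top of the first term.

```python
def base_bytes(H: int, p: int) -> int:
    """First expression quoted from the SURVEYREG docs: 48H + 8pH + 4p(p+1)H."""
    return 48 * H + 8 * p * H + 4 * p * (p + 1) * H


def cluster_extra_bytes(H: int, p: int, n_c: int) -> int:
    """Second expression quoted from the docs (additional memory for a cluster sample)."""
    return base_bytes(H, p) + 4 * p * (p + 1) * n_c + 16 * n_c


if __name__ == "__main__":
    H = 1          # assumption: no STRATA statement, so a single stratum
    n_c = 500_000  # one cluster per individual, as in the original post
    for p in (10, 100, 1000):  # hypothetical parameter counts, not from the post
        # Assumption: the cluster term is added to the base term.
        total = base_bytes(H, p) + cluster_extra_bytes(H, p, n_c)
        print(f"p={p:>5}: about {total / 1e9:,.1f} GB")
```

The term 4p(p+1)n_c grows linearly in the number of clusters but with the square of p, so by this formula the number of parameters in the model (for example, many dummy variables from a CLASS statement) matters at least as much as the number of individuals.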

 

Also there is the consideration of the output. ODS Select or Exclude might reduce some of the output table overhead.

 
