Hello, I am running the following regression for a sample of 104,000 observations (some are omitted because of missing values).
I want to cluster by firm (there are approx. 1,700 firms in the sample). I already tried increasing MEMSIZE manually to 6G, but with no luck.
I tried both locally and via WRDS servers.
This is the code:
%macro reg_s(var_name, data_name);
proc surveyreg data=&data_name;
cluster gvkey;
CLASS gvkey year_data;
MODEL &var_name=incu_post_ipo sign_inc_p sign_p sale_rank market_share ind_code_count gvkey year_data /SOLUTION;
ods output ParameterEstimates=&var_name._&data_name;
RUN;
ods trace off;
%mend;
%reg_s(capx_1,ff);
run;
This is the error output:
NOTE: In data set FF, total 104383 observations read, 43366 observations with missing values are
omitted.
ERROR: The SAS System stopped processing this step because of insufficient memory.
NOTE: PROCEDURE SURVEYREG used (Total process time):
real time 1.06 seconds
user cpu time 0.89 seconds
system cpu time 0.10 seconds
memory 51160.06k
OS Memory 77100.00k
Timestamp 01/02/2022 04:54:23
Step Count 20 Switch Count 0
2452 run;
P.S. When I run the code without the CLUSTER statement, this is what I get:
NOTE: In data set FF, total 104383 observations read, 43366 observations with missing values are
omitted.
NOTE: The data set WORK.CAPX_1_FF has 1127 observations and 6 variables.
NOTE: PROCEDURE SURVEYREG used (Total process time):
real time 42.59 seconds
user cpu time 39.06 seconds
system cpu time 0.87 seconds
memory 101172.81k
OS Memory 126420.00k
Timestamp 01/02/2022 05:06:44
Step Count 22 Switch Count 819
I already read some relevant Q&A but couldn't find an answer to my case.
Thank you in advance.
I am not an expert in survey regression, but are you sure you want to include gvkey (the clustering variable) on the MODEL statement? I noticed in the documentation examples, the variable on the CLUSTER statement is not included on the MODEL statement.
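As a rough sketch of that suggestion, the call below keeps gvkey on the CLUSTER statement only and removes it from CLASS and MODEL. It reuses the variable and data set names from the original post with the macro parameters filled in (capx_1, ff). Whether dropping the firm effects suits the research design is an assumption on my part, and I have not verified that this resolves the memory error.

```sas
/* Sketch: cluster by firm, with no firm-level CLASS effect in the model */
proc surveyreg data=ff;
   cluster gvkey;                 /* firm is the clustering unit only */
   class year_data;               /* year effects are kept */
   model capx_1 = incu_post_ipo sign_inc_p sign_p sale_rank market_share
                  ind_code_count year_data / solution;
   ods output ParameterEstimates=capx_1_ff;
run;
```

With gvkey out of the CLASS statement, the model no longer carries one parameter per firm, so the design matrix is much smaller.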
Please post the result of
proc options group=memory;
run;
and some information about the hardware used: how much memory is installed, how much is free?
It seems strange that the first run stops after using only about half the memory that the second run uses.
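In addition to the full memory group, it may help to confirm that the 6G setting actually took effect in the session that fails. MEMSIZE is a startup option, so it has to be set on the SAS command line or in the configuration file (for example `-memsize 6G`); it cannot be changed with an OPTIONS statement in a running session. A minimal check could look like this:

```sas
/* Show the MEMSIZE value in effect and how it was set */
proc options option=memsize value;
run;

/* The same value written to the log in one line */
%put MEMSIZE in effect: %sysfunc(getoption(memsize));
```

If the value shown is not 6G, the setting was probably not picked up at startup.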