Hello, All
I understand that, in a DATA step, SAS processes data record by record, so memory should not be a problem when dealing with large data sets.
Now suppose I have a 100G dataset "test" and only 16G of available memory. If I want to compute some statistics on the data, such as a logistic regression:
PROC LOGISTIC DATA=test;
How does SAS handle such PROCs? Still record by record? Will a memory shortage be a problem in that case?
Thank you in advance for educating me.
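(The snippet above is not runnable on its own; a minimal sketch of a complete call, with hypothetical variables y, x1, and x2 standing in for an actual model, would be:)

```sas
/* Hypothetical completion of the snippet above:
   y, x1, and x2 are placeholder variable names, not from the thread. */
proc logistic data=test;
    model y(event='1') = x1 x2;
run;
```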
Memory should only be a problem if the procedure or DATA step uses a method that requires that much memory, such as the hash object, which holds an entire table in memory.
In other cases, it should only affect performance. However, at least with some of the clustering algorithms, that degradation of performance is "almost" equivalent to non-functionality (e.g., a 2-second task still running after 10 hours).
However, with 16GB, I doubt you will often run into such issues.
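(To illustrate the memory-bound case mentioned above: a DATA step hash object loads an entire lookup table into memory, even though the main input is still read record by record. A sketch, assuming a hypothetical dataset "lookup" with variables id and value:)

```sas
/* Sketch only: "lookup", id, and value are hypothetical names. */
data joined;
    if _n_ = 1 then do;
        if 0 then set lookup;              /* add lookup's variables to the PDV */
        declare hash h(dataset: "lookup"); /* entire table loaded into memory   */
        h.defineKey("id");
        h.defineData("value");
        h.defineDone();
    end;
    set test;                              /* still processed record by record  */
    if h.find() = 0;                       /* keep rows that match in memory    */
run;
```

If "lookup" is larger than available memory, this step fails; that is the kind of memory-dependent method meant above.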
Thank you.
So, you are suggesting that memory should not be a problem even if it is well below 16G (say, 2G); only the calculation speed might suffer. Right?
Yes, with those few exceptions, at least from my own experience.
I guess that in "PROC LOGISTIC DATA=test;" SAS does NOT process the data record by record, or does it?
I think the correct answer depends on the options selected. Take a look at:
Thank you very much. I will look into the article you recommended.