Team,
I have a scenario where I am expecting a dataset of approximately 5.5 TB. My concern is whether I can analyze a dataset of this size using SAS EG 7.1 or 8.2. If so, how long would it take to run simple code such as a PROC FREQ? And how long would more complex code take to run?
Thanks in advance.
Best Regards,
Sairam
In general SAS EG doesn't care about the data size, as it's SAS on the server that does the processing. Is your data a SAS data set file or is it in a database?
Your data set is pretty large and PROC FREQ can be memory-intensive. So depending on what you're trying to do, you might consider looking at different methods like PROC SUMMARY or PROC SQL, if they can support the computations you need. If your data is in a database though, then SAS will try to push the operation down to the database to limit the amount of data movement.
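As a rough sketch of the alternatives mentioned above, one-way counts can be produced with PROC SUMMARY or PROC SQL. The data set name bigset and the variable var1 are placeholders, and the output table names are made up for illustration. Whether either step is actually lighter than PROC FREQ on your server is something to measure rather than assume.

```sas
/* One-way counts with PROC SUMMARY: _FREQ_ holds the row count per level. */
/* Add the MISSING option if missing values should count as a level.       */
proc summary data=bigset nway;
  class var1;
  output out=work.var1_counts (drop=_type_ rename=(_freq_=count));
run;

/* The same counts with PROC SQL. */
proc sql;
  create table work.var1_counts_sql as
  select var1, count(*) as count
  from bigset
  group by var1;
quit;
```

Both steps write the counts to a data set instead of the results window, which also keeps the output manageable when a variable has many levels.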
Thanks, Chris, for your input. The SAS dataset I am speaking of is a file, not a database table. I am worried that it will take a long time to run.
This is not a matter of Enterprise Guide, but of the workspace server that does all the work. The performance depends on the system/hardware of the server.
Say you've got average hardware that gives you a sustained data transfer rate of 200 MB/s.
This lets you read 1 GB in 5 seconds.
5.5 TB is roughly 5,500 GB, so one sequential pass (which is what FREQ needs) would take about 27,500 seconds; call it up to 30,000 seconds, or roughly 8 hours.
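The estimate can be reproduced in a short DATA step, so you can plug in your own numbers. The 200 MB/s rate is the assumed figure from above, not a measurement, and 1 GB is taken as 1,000 MB.

```sas
/* Back-of-the-envelope time for one sequential pass over the data. */
data _null_;
  size_gb   = 5500;                         /* 5.5 TB expressed in GB         */
  rate_mb_s = 200;                          /* assumed sustained read rate    */
  seconds   = size_gb * 1000 / rate_mb_s;   /* 1 GB taken as 1000 MB          */
  hours     = seconds / 3600;
  put seconds= comma10. hours= 5.1;         /* writes the result to the log   */
run;
```

With these inputs the log shows 27,500 seconds, or about 7.6 hours, per pass. Code that reads the data more than once (sorts, multiple steps) multiplies that accordingly.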
The cardinality of your variables will determine the amount of memory needed.
So it comes down to taking some measurements on your existing hardware to get your transfer rates, and getting to know your data.
For approximations, you can start with a small subset to get a feel for the dimensions.
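One way to take such a measurement, sketched here with placeholder names (bigset, var1) and an arbitrary subset size of one million rows: turn on FULLSTIMER so the log reports real time, CPU time, and memory for the step, then run the procedure on the first rows only.

```sas
/* Report detailed timing and memory statistics in the log. */
options fullstimer;

/* Run the frequency count on the first 1,000,000 rows only. */
proc freq data=bigset (obs=1000000);
  tables var1;
run;
```

Scaling the real time by the ratio of total rows to subset rows gives a first approximation. Treat it as a lower bound: the first rows may not show every level of a variable, and a small subset may be served from the file cache rather than from disk.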
Whatever you do, it may help to explicitly reduce the number of variables by dropping the ones that are not needed.
If by simple you meant:
Proc freq data=bigset; run;
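I suspect you may run out of results window space if you have many variables with many levels, because without a TABLES statement PROC FREQ builds a table for every variable in the data set. You can reduce the memory use a bit by keeping only the variables of interest or needed for the analysis, with something like this: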
Proc freq data=bigset (keep= var1 var2 var5 ); run;
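If the results window is the concern, a possible variation is to name the tables explicitly and send the counts to data sets instead of printed output. This is only a sketch: the output data set names are made up, and each TABLES statement gets its own OUT= because OUT= stores only the last table requested in that statement.

```sas
/* Keep only the needed variables, suppress printed output, */
/* and write each one-way table to its own data set.        */
proc freq data=bigset (keep=var1 var2);
  tables var1 / noprint out=work.var1_freq;
  tables var2 / noprint out=work.var2_freq;
run;
```

The resulting data sets contain the COUNT and PERCENT columns for each level and can be inspected or filtered afterwards.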