I was trying to use proc sgplot to generate a box plot, but it aborted due to a memory issue. The data set has about 1.7 million records, but there are only 16 distinct values for the x-axis. Other SAS procedures can also produce box plots, so I was wondering which one handles big data sets better. I would also like to add a limit so that the box plot is skipped if the number of rows exceeds it, but I'm not sure how to calculate that limit based on the memory I have.
You should show the code you used, in case you specified something that might be causing the problem. For example, requesting plots for multiple variables could cause issues.
I just generated a random data set with 1.7 million records, one analysis variable, and 16 x values, and proc boxplot completed the graph in under 1 second. So the file size is unlikely to be the main issue.
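A test along those lines might look like the sketch below. This is not the exact code that was run: the data set name work.test, the variable names week and y, and the seed are all made up for illustration. The rows are generated in week order because proc boxplot expects the data to be sorted by the group variable.

```sas
/* Hypothetical test data: 16 groups, 1.7 million rows in total. */
data work.test;
   call streaminit(12345);        /* arbitrary seed, for repeatability */
   do week = 1 to 16;             /* 16 distinct x values */
      do i = 1 to 106250;         /* 16 * 106250 = 1,700,000 rows */
         y = rand('normal');      /* one analysis variable */
         output;
      end;
   end;
   drop i;
run;

/* One box per week; the data are already in week order. */
proc boxplot data=work.test;
   plot y*week;
run;
```

If a plot like this runs quickly on the same machine, the row count alone is probably not what exhausts memory.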
The code is super simple:
proc sgplot data = t2ds;
   vbox T2 / category = fis_week;
   refline LCL;
   refline UCL;
   refline CTL;
   format fis_week $4.;
run;
Thanks, ballardw, for pointing me in the right direction. I think it's the format statement that's causing the problem. After I removed the format, it completed in seconds.
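For the row limit asked about in the original question, a minimal sketch is below. It assumes a hypothetical macro named guarded_vbox and an arbitrary default threshold of 5 million rows. Nothing in this thread says how to derive the threshold from available memory, so the number would have to be found by trial on your own system. The plot step is the code posted above with the format statement left out.

```sas
/* Hypothetical helper: skip the box plot when the data set is too large. */
/* The macro name and the default maxobs value are illustrative only.     */
%macro guarded_vbox(ds=, maxobs=5000000);
   %local dsid nobs rc;

   /* Open the data set just to read its metadata. */
   %let dsid = %sysfunc(open(&ds));
   %if &dsid = 0 %then %do;
      %put ERROR: Could not open &ds..;
      %return;
   %end;

   /* NLOBS is the count of non-deleted rows; it can be -1 for views. */
   %let nobs = %sysfunc(attrn(&dsid, nlobs));
   %let rc   = %sysfunc(close(&dsid));

   %if &nobs > &maxobs %then %do;
      %put NOTE: &ds has &nobs rows (limit &maxobs). Box plot skipped.;
      %return;
   %end;

   /* Same plot as posted above, without the format statement. */
   proc sgplot data=&ds;
      vbox T2 / category=fis_week;
      refline LCL;
      refline UCL;
      refline CTL;
   run;
%mend guarded_vbox;

%guarded_vbox(ds=t2ds);
```

A row count is only a rough proxy for memory use. As this thread shows, the same 1.7 million rows ran in seconds once the format statement was removed.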