10-25-2011 12:28 AM
I've been using Proc SQL summary functions for some time. Today a friend of mine (a student who has been using SAS for years at the university) advised me not to use Proc SQL for large data sets because it is slow. He suggests using a DOW loop instead, as in the following article:
Could you SAS experts please advise which one is better for large datasets (>5GB)? What is the main advantage of a DOW loop over a Proc SQL summary function? Say I want to compute the percentage of money that each mutual fund invested in each of the sample firms in each quarter over the 20 years. The Proc SQL summary function is simple to use, and I have been using it for this purpose.
Thank you so much
10-25-2011 01:13 AM
While DOW loops are a construct that can be beneficial, and can be faster (especially if your data are already sorted), the programming can easily get to be a lot more complex than what you are currently doing and, as a result, the likelihood of your producing erroneous results can increase significantly.
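To give a sense of the extra complexity, here is a rough sketch of a double DOW loop for the kind of percentage you describe. The data set name (holdings) and the variables (fund_id, qtr, firm_id, amount) are my assumptions, not something from your post, and the input must already be sorted by fund_id and qtr.

```sas
/* Sketch only: holdings, fund_id, qtr and amount are hypothetical names. */
/* Assumes holdings is already sorted by fund_id qtr.                     */
data pct_dow;
  /* First pass: read one fund-quarter group and accumulate its total. */
  do _n_ = 1 by 1 until (last.qtr);
    set holdings;
    by fund_id qtr;
    total_amount = sum(total_amount, amount);
  end;

  /* Second pass: re-read the same _n_ rows and write out each share. */
  do _n_ = 1 to _n_;
    set holdings;
    if total_amount then pct_of_fund = 100 * amount / total_amount;
    else pct_of_fund = .;   /* zero or missing total: leave share missing */
    output;
  end;
run;
```

The two SET statements keep separate read positions, so the second loop walks through exactly the rows the first one just totalled. Getting the BY variables, the loop bounds, or the explicit OUTPUT slightly wrong produces results that look plausible but are not, which is the risk I mean.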
Yes, SQL can be slow with large files (although that has improved quite a bit with recent SAS versions), but it is quite easy to code. If you are looking for faster processing and don't want to risk errors due to faulty programming, I would compare proc summary with proc sql for what you are doing.
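As a starting point for that comparison, something along these lines could be timed on a subset of your data. Again, the data set name (holdings) and the variables (fund_id, qtr, firm_id, amount) are assumptions on my part, so adjust them to your own layout.

```sas
/* Sketch only: holdings, fund_id, qtr, firm_id and amount are hypothetical names. */

/* Version 1: PROC SQL. The summary is remerged with the detail rows. */
proc sql;
  create table pct_sql as
  select fund_id, qtr, firm_id, amount,
         100 * amount / sum(amount) as pct_of_fund
  from holdings
  group by fund_id, qtr;
quit;

/* Version 2: PROC SUMMARY for the totals, then a merge back. */
proc summary data=holdings nway;
  class fund_id qtr;
  var amount;
  output out=fund_totals (drop=_type_ _freq_) sum=total_amount;
run;

proc sort data=holdings out=holdings_sorted;
  by fund_id qtr;
run;

data pct_summary;
  merge holdings_sorted fund_totals;
  by fund_id qtr;
  if total_amount then pct_of_fund = 100 * amount / total_amount;
  else pct_of_fund = .;   /* zero or missing total: leave share missing */
run;
```

Running both with options fullstimer; turned on writes real and CPU time for each step to the log. Which version wins will depend on your data and SAS version, so I would not assume either result in advance.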