Dear all,
I am merging two datasets of size 30 GB and 2 GB using PROC SQL. How much memory is required to accomplish this merge?
thanks
Lokendra
Many factors affect how a query performs, but this looks like a perfect case for a hash join instead of an SQL join. Please provide more information about the data so that someone can help you better. Databases can usually estimate the space and time a query will need through an explain plan. I am not sure, but SAS may have something similar that reports space usage.
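As a rough illustration of the hash join idea, here is a minimal DATA step sketch. The table names work.big and work.small and the key variable id are hypothetical, since the original post does not name them. It also assumes that id is unique in the small table and that the small table fits in memory, which at 2 GB would likely need a MEMSIZE setting comfortably above that.

```sas
/* Hypothetical names: work.big (30 GB), work.small (2 GB), join key id */
data work.joined;
    /* Define the small table's variables at compile time; no rows are read here */
    if 0 then set work.small;

    /* Load the small table into an in-memory hash object once */
    if _n_ = 1 then do;
        declare hash h(dataset: "work.small");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
    end;

    /* Read the large table sequentially; no sorting is needed */
    set work.big;

    /* find() returns 0 when the key exists, so only matching rows are kept (inner join) */
    if h.find() = 0;
run;
```

The large table is read once from disk and never sorted, which is the main reason this approach can outperform an SQL join when one table is much smaller than the other.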
If you run your query using the SAS option FULLSTIMER it will report how much memory it is using. You can increase your MEMSIZE setting if you are running out of memory - please note this option must be set at the invocation of your SAS session.
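A minimal sketch of that approach follows. The table names, the key id, and the column extra_var are hypothetical placeholders, and the 4G value is only an example.

```sas
/* Show the MEMSIZE value the current session was started with */
proc options option=memsize;
run;

/* Ask SAS to write detailed resource usage, including memory, to the log */
options fullstimer;

/* Hypothetical join: work.big, work.small, id and extra_var are placeholders */
proc sql;
    create table work.joined as
    select b.*, s.extra_var
    from work.big as b
         inner join work.small as s
         on b.id = s.id;
quit;

/* MEMSIZE cannot be changed inside a running session. It is set at startup, */
/* for example on the command line:  sas -memsize 4G myprogram.sas           */
```

After the step finishes, the log note for PROC SQL lists the memory used alongside real and CPU time.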
That's not so much a memory problem as a disk storage problem. PROC SQL will need space to do the sorting and for the utility file, and the size of your output will depend on the variables named in the SELECT and on the relationship between the two tables.
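To see where that disk space will be consumed and whether a sort is involved, something like the following may help. The table names and the key id are hypothetical. The _METHOD option makes PROC SQL write the execution methods it chose to the log, but note that the query still runs, so it may be worth trying on a subset first.

```sas
/* Where the WORK library lives on disk */
proc options option=work;
run;

/* Where utility files (for example, sort spill files) are written */
proc options option=utilloc;
run;

/* Print the chosen join and sort methods to the log (the query still executes) */
proc sql _method;
    create table work.joined as
    select b.*
    from work.big as b
         inner join work.small as s
         on b.id = s.id;
quit;
```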
Depending on the data and your needs, you might find that SQL is not a good solution at all. See Maxim 10.
This question cannot be answered without you giving a lot more information.
At least:
1. Are the tables sorted? indexed?
2. What sort of join is it? Cartesian, inner, left etc?
3. How many rows in each table?
4. Is(are) the join key(s) unique in either or both tables?
5. What is (roughly) the expected number of rows in the output table?
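Most of these questions can be answered with a few quick checks. The sketch below assumes the tables are work.big and work.small and the join key is id, all of which are hypothetical names.

```sas
/* Row counts for both tables, taken from the metadata without scanning the data */
proc sql;
    select memname, nobs
    from dictionary.tables
    where libname = "WORK" and memname in ("BIG", "SMALL");
quit;

/* Sort information and any indexes appear in the PROC CONTENTS output */
proc contents data=work.big;
run;

/* Key uniqueness: if n_rows equals n_keys, id is unique in the small table */
proc sql;
    select count(*) as n_rows,
           count(distinct id) as n_keys
    from work.small;
quit;
```

The uniqueness check scans the whole table, so running it on the 30 GB table will take noticeably longer than on the 2 GB one.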