double set that replicates a merge

Contributor
Posts: 43

[ Edited ]

I have heard that a double SET is faster than a MERGE, but the example I am looking at in Art Carpenter's Innovative SAS Techniques (page 219) keeps only matching observations. I need it to do exactly what a merge would do, only faster. Does anyone know of any references that show how to do this? Thx!

 

Super User
Posts: 24,026

Re: double set that replicates a merge

It would probably be helpful if you explained in some more detail with a small example. Have you looked at hash tables?
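
For readers unfamiliar with the hash suggestion, here is a minimal sketch of a hash lookup that keeps every observation from the larger table. The data set names (small, big, combined) and variables (id, x, y) are made up for illustration, and the sketch assumes id is unique in small. It behaves like a left join on big, so it is not a full replacement for MERGE: observations that exist only in small are not written out.

```sas
/* Hypothetical lookup table: one row per id */
data small;
  input id x;
  datalines;
1 10
3 30
;
run;

/* Hypothetical main table; it does not need to be sorted */
data big;
  input id y;
  datalines;
1 100
2 200
3 300
;
run;

data combined;
  if 0 then set small;              /* define small's variables at compile time */
  if _n_ = 1 then do;
    declare hash h(dataset: 'small');
    h.defineKey('id');
    h.defineData('x');
    h.defineDone();
  end;
  set big;
  /* find() returns 0 on a match; otherwise clear the retained value of x */
  if h.find() ne 0 then call missing(x);
run;
```

With this sample data, the observation with id=2 is kept and x is missing for it. Because the lookup table is held in memory, this approach only helps when small fits there.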

 


@proctice wrote:

I have a program that takes days to run, so I am experimenting with efficiency techniques. I have heard that a double SET is faster than a MERGE, but the example I am looking at in Art Carpenter's Innovative SAS Techniques (page 219) keeps only matching observations. I need it to do exactly what a merge would do, only faster. Does anyone know of any references that show how to do this? Thx!



Super User
Posts: 13,942

Re: double set that replicates a merge

Is the time concern only from the "merge"?

How many records are you dealing with in your source tables?

How many variables?

If you are merging BY variables, how many by variables are you using?

Does any of the data involved reside on a network resource or in an external DBMS? Both are potential bottlenecks.

Or can you show the code you are currently using to combine the data sets?

Super User
Posts: 10,594

Re: double set that replicates a merge

Is it a single step that takes days, or is it a large program with many steps?

If the latter, scan the log and identify the time-consuming steps.

In both cases, run them with fullstimer, and post code and log.
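
As a rough illustration of that last step, the sketch below turns on FULLSTIMER before a merge so the log reports real time, CPU time, and memory for each step. The data sets a and b and the variables id, x, and y are placeholders, not the poster's actual data.

```sas
options fullstimer;     /* log detailed resource usage for every step */

/* Placeholder inputs, already in id order */
data a;
  input id x;
  datalines;
1 10
2 20
;
run;

data b;
  input id y;
  datalines;
2 200
3 300
;
run;

/* The step being timed: MERGE keeps ids found in either input */
data combined;
  merge a b;
  by id;
run;

options nofullstimer;   /* return to the default, shorter statistics */
```

Comparing the real time and CPU time reported for each step is one way to tell whether a step is waiting on I/O or doing computation.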

Contributor
Posts: 43

Re: double set that replicates a merge

Posted in reply to KurtBremser

I modified my post to focus on the double set technique instead of the broader issue of efficiency.  If anyone knows how to do that or has a reference, it might be useful to many.
