DATA Step, Macro, Functions and more

DataSets match MERGE

Posts: 39

Is it possible to match data sets on different variables with MERGE?


If I have 3 DataSets (DS):

DS1 and DS2 can be matched by var1

DS2 and DS3 can be matched only by Var2 (Var2 holds different data than Var1)


I know this is possible with separate DATA steps or with PROC SQL, but I want to know whether it can be done with MERGE in a single step, or whether there is another method using SAS code.

Super User
Posts: 5,083

Re: DataSets match MERGE

MERGE won't do that at all.  You can force a DATA step to do what you are asking, but it is not necessarily simple.  And the complexities multiply if there is a possibility of mismatches or of a many-to-one (or worse yet, many-to-many) match.  Here are a couple of ideas.


  • Create an index for one data set.  Merge the other two, and in the same DATA step use SET with KEY= to retrieve the matching data.
  • Create a hash table from one data set.  Merge the other two, and in the same DATA step look up matching information in the hash table.
  • Create a format that maps from one of your BY variables to the observation number that holds a unique value for that BY variable.  Then in a DATA step, merge the other two data sets and use the format to locate the matching observation from the third data set.  Use SET with POINT= to retrieve the matching information.
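
As a rough illustration of the first idea (index plus SET with KEY=), here is a minimal sketch. The data set layout is an assumption made for the example: DS1(var1, x), DS2(var1, var2, y) and DS3(var2, z), with x, y and z as hypothetical payload variables. It also assumes var2 is unique in DS3 and that DS1 and DS2 are already sorted by var1.

```sas
/* Hypothetical layout: DS1(var1, x), DS2(var1, var2, y), DS3(var2, z). */
/* Assumes var2 is unique in DS3 and DS1/DS2 are sorted by var1.        */

/* Build a simple index on var2 so SET ... KEY= can look rows up. */
proc datasets library=work nolist;
  modify ds3;
  index create var2 / unique;
quit;

data want;
  merge ds1 ds2;
  by var1;
  /* Keyed read: fetch the DS3 row whose var2 equals the current var2. */
  set ds3 key=var2 / unique;
  if _iorc_ ne 0 then do;
    /* No match: reset the error flag and clear the stale DS3 value. */
    _error_ = 0;
    call missing(z);
  end;
run;
```

The CALL MISSING matters because variables read with SET keep their values from the last successful lookup. Rows of DS3 that are never looked up simply do not appear in the output.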

In terms of speed and simplicity, I would probably go with the hash table.  But nothing is simple.  For example, what should happen if the hash table contains a data value that does not have a match in either of the other two data sets?
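
A minimal sketch of the hash table approach might look like the following. The layout is assumed for illustration only: DS1(var1, x), DS2(var1, var2, y) and DS3(var2, z), where x, y and z are hypothetical variable names, var2 is unique in DS3, and DS1 and DS2 are sorted by var1.

```sas
/* Hypothetical layout: DS1(var1, x), DS2(var1, var2, y), DS3(var2, z). */
data want;
  /* Put z in the program data vector without reading any rows. */
  if 0 then set ds3(keep=z);

  /* Load DS3 into memory once, keyed on var2. */
  if _n_ = 1 then do;
    declare hash h3(dataset: 'ds3');
    h3.defineKey('var2');
    h3.defineData('z');
    h3.defineDone();
  end;

  merge ds1 ds2;
  by var1;

  /* find() returns 0 on a match and fills z; otherwise clear z. */
  if h3.find() ne 0 then call missing(z);
run;
```

In this sketch, a var2 value that exists only in the hash table is never written out, which is one possible answer to the question above but not the only one. Unlike the indexed lookup, DS3 needs no index and no particular sort order, but it must fit in memory.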

Respected Advisor
Posts: 4,649

Re: DataSets match MERGE

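PROC SQL is the exact tool for that task. Be aware, however, that for many-to-many merges it is not equivalent to the DATA step: SQL returns every combination of matching rows, whereas MERGE pairs observations one by one within each BY group.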
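
For example, assuming a hypothetical layout of DS1(var1, x), DS2(var1, var2, y) and DS3(var2, z), a single query could join all three. This is only a sketch; the payload variables x, y and z are invented for illustration.

```sas
/* Hypothetical layout: DS1(var1, x), DS2(var1, var2, y), DS3(var2, z). */
proc sql;
  create table want as
  select a.var1, a.x, b.var2, b.y, c.z
  from ds1 as a
       inner join ds2 as b on a.var1 = b.var1  /* DS1-DS2 link on var1 */
       inner join ds3 as c on b.var2 = c.var2; /* DS2-DS3 link on var2 */
quit;
```

The inner joins drop rows without a match; LEFT JOIN or FULL JOIN would keep them, depending on what the result should contain. No prior sorting is needed.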

Super User
Posts: 6,938

Re: DataSets match MERGE

Since a MERGE statement is governed by a single BY statement, this is not possible with one MERGE.

Use SQL for doing that in one step. But be aware that joining multiple large tables in SQL will often perform horribly compared to a sequence of SORT and MERGE steps.
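
A sequence of SORT and MERGE steps could be sketched as follows, again assuming a hypothetical layout of DS1(var1, x), DS2(var1, var2, y) and DS3(var2, z). STEP1 and WANT are made-up names for the intermediate and final data sets.

```sas
/* Hypothetical layout: DS1(var1, x), DS2(var1, var2, y), DS3(var2, z). */

/* First match: DS1 and DS2 on var1. */
proc sort data=ds1; by var1; run;
proc sort data=ds2; by var1; run;

data step1;
  merge ds1 ds2;
  by var1;
run;

/* Second match: the intermediate result and DS3 on var2. */
proc sort data=step1; by var2; run;
proc sort data=ds3;   by var2; run;

data want;
  merge step1 ds3;
  by var2;
run;
```

Note that PROC SORT without OUT= sorts the data sets in place. IN= data set options on the MERGE statements can be added to keep only matched observations.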
