Desktop productivity for business analysts and programmers

Merge Table

Reply
N/A
Posts: 0

Merge Table

I want to create a table by merging the 2 datasets below:

My raw datasets like this:

Data1:
ID Month
123 12/08
123 01/09
234 07/07
456 08/07
456 10/07
456 11/07


Data2:
ID Month Name
001 01/08 XYZ
001 02/08 XYZ
001 03/08 XYZ
123 12/08 ABC
123 01/09 ABC
123 02/09 ABC
123 03/09 ABC
234 07/07 EEE
234 08/07 EEE
456 08/07 PAS
456 09/07 PAS
456 10/07 PAS
456 11/07 PAS


Result:
ID Month Name
123 12/08 ABC
123 01/09 ABC
123 02/09 ABC
123 03/09 ABC
234 07/07 EEE
234 08/07 EEE
456 08/07 PAS
456 09/07 PAS
456 10/07 PAS
456 11/07 PAS
Super Contributor
Posts: 309

Re: Merge Table

Using a DATA STEP with the MERGE statement and the in= Option for Data2 is the common way to get the result. Both datasets must be sorted by "Id" and "Month".
SAS Employee
Posts: 149

Re: Merge Table

Hi Clau,

It looks like you want all the stuff in Data2 which matches an ID value from Data1. The month values in Data1 don't look they're being used for verification, yes?

As Andreas suggested, a merge with an IN= option would be a good way to go. Since this is an EG forum, I'll mention there's an alternative way to do this without writing any code.

1. Use a Query Task on Data1 to create a new data set containing only the ID column, and only the distinct values in that column. Within the query task, you'd drag ID into the Select and right below that there's a check box for distinct values.
2. In a second Query Task, join the newly created data set (let's call it "Distinct") and Data2. In the join window, verify that the join is ID to ID, and change the join type to left or right (as appropriate) so that only matches from data2 are coming in. Then add a filter on the ID column from the Distinct data set so that ID is not missing. You can type a space in the filter value box or just leave it blank for a missing value.

We had to include the step of getting just the distinct values first because otherwise the SQL join would create a cartesian product giving you 2*4=8 ID 123 values, for example.
Ask a Question
Discussion stats
  • 2 replies
  • 96 views
  • 0 likes
  • 3 in conversation