DATA Step, Macro, Functions and more

Merging Multiple Datasets

Accepted Solution Solved
Reply
Super Contributor
Posts: 326
Accepted Solution

Merging Multiple Datasets

I have tables with these structure, two IDs and a score.

 

 

id_A,id_B,score_01

id_A,id_B,score_02

id_A,id_B,score_03

id_A,id_B,score_04

id_A,id_B,score_0n

 

What's the most efficient way to combine so that i have a structure

id_A,id_B,score_01,score_02,score_03,score_04,score_0n?

 

Not that the combination of id_A and id_B may not be existing in all datasets.

 

Thank you very much!


Accepted Solutions
Solution
‎04-03-2016 11:44 PM
Super User
Posts: 19,855

Re: Merging Multiple Datasets

If you want something more efficient than your code, post your code.

You can try a data step merge.

 

data want;
merge data1 (rename=score=score1)
           data2 (rename=score=score2)
           ....
           data6 (rename=score=score6);
by idA idB;


run;

View solution in original post


All Replies
Super User
Posts: 19,855

Re: Merging Multiple Datasets

Do you have multiple records per IDs? Is it a one to one join, or many to many join? 

Super Contributor
Posts: 326

Re: Merging Multiple Datasets

For each dataset, there is only one instance of the combination of an id_A and id_B.

 

Currently I'm using full outer join but I think there's a more efficient way.

Solution
‎04-03-2016 11:44 PM
Super User
Posts: 19,855

Re: Merging Multiple Datasets

If you want something more efficient than your code, post your code.

You can try a data step merge.

 

data want;
merge data1 (rename=score=score1)
           data2 (rename=score=score2)
           ....
           data6 (rename=score=score6);
by idA idB;


run;
Super Contributor
Posts: 326

Re: Merging Multiple Datasets

Thanks for the idea. My code previously was like using full outer join like these

 

select coalesce(a.id_A,b.id_A,....j.id_A) as id_A,
select coalesce(a.id_B,b.id_B,....j.id_B) as id_B,
score_1,...score_10

 

 

Super User
Super User
Posts: 7,979

Re: Merging Multiple Datasets

As always in these type of questions, my first response is, is the strcuture you require the best fit from a programming perspective.  The answer here is no.  Personally I would set these together and get one long table with each row being a separate result.  This simplfies any coding you need to do across the variables, and seeing as the strcuture wouldn't change if the number of items changed (i.e. increase or decrease) then the maintenance of such code also becomes far easier.  I find transposed data is only useful in output reports.

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 436 views
  • 1 like
  • 3 in conversation