DATA Step, Macro, Functions and more

Merging 3 datasets

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Merging 3 datasets

Hi, 

 

I want to merge 3 datasets dataA, dataB and dataC, keeping records only found in dataA or dataB and exclude anything found in  dataC. What approch would be more effective? SET or MERGE? I'm not able to run code at the moment so as to test the effeciency. Thus, I am trying to figure out through support and Help the merging funcions. I could do it with two datasets but still struggle with keeping only records found in A and B. I'd appreciate your ideas


Accepted Solutions
Solution
‎09-25-2015 06:23 AM
Super User
Posts: 17,963

Re: Merging 3 datasets

Close, but no cigar Woman Tongue

 

Double check your output, especially if you have duplicate X's in any of your datasets.

 

Data Merged_data
Merge dataA(IN=fromdataA) 
           dataB(IN=fromdataB) 
           dataC(in=fromdataC)
BY x;

*Create indicator variables to show in which dataset the observation is present;
SourceA=fromdataA;
SourceB=fromdataB;
SourceC=fromdataC;

*keep only from A and B;
if fromdataA AND fromdataB;

run; 

View solution in original post


All Replies
Super User
Posts: 17,963

Re: Merging 3 datasets

Are you merging (side by side) or appending(adding more rows)?
Occasional Contributor
Posts: 5

Re: Merging 3 datasets

[ Edited ]

It's an exercise and the description doesn't give more details on that. In my understanding it is asked to merge the datasets side by side

Super User
Posts: 17,963

Re: Merging 3 datasets

[ Edited ]

Well, set appends data sets so you're looking for a MERGE then. Look at the dataset IN option,example 4.
http://www.ats.ucla.edu/stat/sas/modules/merge.htm

Occasional Contributor
Posts: 5

Re: Merging 3 datasets

[ Edited ]

So, something like the following would be correct? inputs and numbers are totally random

 

Data dataA

input x, y, z;

cards;

5, 10, 15

55, 24, 29

run; 

 

proc sort data=dataA;

by x;

run; 

 

Data dataB

input x, y, z;

cards;

105, 110, 150

run; 

 

proc sort data=dataB;

by x;

run; 

 

Data dataB

input x, y, z;

cards;

665, 1110, 1150

780, 455, 447

159, 668, 31

;

run; 

 

proc sort data=dataC

by x;

run; 

 

Data Merged_data

Merge dataA(IN fromdataA) dataB(IN fromdataB) dataC 

BY x;

fromdataA=fromdataB; 

run; 

 

Solution
‎09-25-2015 06:23 AM
Super User
Posts: 17,963

Re: Merging 3 datasets

Close, but no cigar Woman Tongue

 

Double check your output, especially if you have duplicate X's in any of your datasets.

 

Data Merged_data
Merge dataA(IN=fromdataA) 
           dataB(IN=fromdataB) 
           dataC(in=fromdataC)
BY x;

*Create indicator variables to show in which dataset the observation is present;
SourceA=fromdataA;
SourceB=fromdataB;
SourceC=fromdataC;

*keep only from A and B;
if fromdataA AND fromdataB;

run; 
🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 265 views
  • 1 like
  • 2 in conversation