Hi Folks, I'm new to SAS programming. I'm trying to merge two files and i'm not able to apply if the condition is Many to one.
merge inputfile_1 (in=a) inputfile_2 ;
by ID;
if a ; --> When i use this statement, I get output like below:
This just matched the first ID and not rest of the ID's present in the file1. Please help me what I'm missing.
Here is my below example.
inputfile_1
ID | Amount | Description |
12345 | 500,000 | Test 1 |
12345 | 600,000 | Test 2 |
12345 | 700,000 | Test 3 |
23456 | 500,000 | Test 1 |
23456 | 600,000 | Test 2 |
23456 | 700,000 | Test 3 |
34567 | 500,000 | Test 1 |
34567 | 600,000 | Test 2 |
34567 | 700,000 | Test 3 |
45678 | 800,000 | Test 33 |
inputfile_2
ID | Amount_new | Description_new |
12345 | 450,000 | Test 4 |
23456 | 350,000 | Test 5 |
34567 | 250,000 | Test 6 |
Desired Output :
ID | Amount | Description | Amount_new | Description_new |
12345 | 500,000 | Test 1 | 450,000 | Test 4 |
12345 | 600,000 | Test 2 | 450,000 | Test 4 |
12345 | 700,000 | Test 3 | 450,000 | Test 4 |
23456 | 500,000 | Test 1 | 350,000 | Test 5 |
23456 | 600,000 | Test 2 | 350,000 | Test 5 |
23456 | 700,000 | Test 3 | 350,000 | Test 5 |
34567 | 500,000 | Test 1 | 250,000 | Test 6 |
34567 | 600,000 | Test 2 | 250,000 | Test 6 |
34567 | 700,000 | Test 3 | 250,000 | Test 6 |
45678 | 800,000 | Test 33 |
Output I get :
ID | Amount | Description | Amount_new | Description_new |
12345 | 500,000 | Test 1 | 450,000 | Test 4 |
12345 | 600,000 | Test 2 | ||
12345 | 700,000 | Test 3 | ||
23456 | 500,000 | Test 1 | 350,000 | Test 5 |
23456 | 600,000 | Test 2 | ||
23456 | 700,000 | Test 3 | ||
34567 | 500,000 | Test 1 | 250,000 | Test 6 |
34567 | 600,000 | Test 2 | ||
34567 | 700,000 | Test 3 | ||
45678 | 800,000 | Test 33 |
SQL left join would be more straight forward.
Thanks for your Prompt reply Linus Hjorth. But, I have the data in Excel file.
How does the fact that the data is in Excel matter? Once you import the data into SAS they're SAS datasets and I agree with Linus - use a left join.
I don't even think there's an easy way for a data step merge to accomplish this.
You may want to list out your variables but the code should look something like this:
proc sql;
create table want as
select a.*, b.*
from table1 as a
left join table2 as b
on a.id=b.id;
quit;
Thank you Reeza. I will try this option and keep you posted. Also, thanks for your patience on providing a solution to newbie.
There must be more to your code in the data step than you posted
This code:
data in1;
infile cards dlm='09'x;
input ID amount :comma. description :$7.;
cards;
12345 500,000 Test 1
12345 600,000 Test 2
12345 700,000 Test 3
23456 500,000 Test 1
23456 600,000 Test 2
23456 700,000 Test 3
34567 500,000 Test 1
34567 600,000 Test 2
34567 700,000 Test 3
45678 800,000 Test 33
;
run;
data in2;
infile cards dlm='09'x;
input ID amount_new :comma. description_new :$7.;
cards;
12345 450,000 Test 4
23456 350,000 Test 5
34567 250,000 Test 6
;
run;
data want;
merge in1 (in=a) in2;
by ID;
if a ;
run;
gave me this result:
12345 500000 Test 1 450000 Test 4
12345 600000 Test 2 450000 Test 4
12345 700000 Test 3 450000 Test 4
23456 500000 Test 1 350000 Test 5
23456 600000 Test 2 350000 Test 5
23456 700000 Test 3 350000 Test 5
34567 500000 Test 1 250000 Test 6
34567 600000 Test 2 250000 Test 6
34567 700000 Test 3 250000 Test 6
45678 800000 Test 33 .
which is exactly how the data step merge should work.
Hi KurtBremser,
I executed the same code from above in SAS University, here is my output.
Attached my Sample code and result for your reference. Is results variation is because of the versions?
ALWAYS check ALL your steps.
Run a proc print on IN1 and IN2 from your code above and see what the output is.
Try this version which works for me in SAS UE:
data in1;
input ID amount :comma. description :$7.;
cards;
12345 500,000 Test 1
12345 600,000 Test 2
12345 700,000 Test 3
23456 500,000 Test 1
23456 600,000 Test 2
23456 700,000 Test 3
34567 500,000 Test 1
34567 600,000 Test 2
34567 700,000 Test 3
45678 800,000 Test 33
;
run;
proc print data=in1; run;
data in2;
input ID amount_new :comma. description_new :$7.;
cards;
12345 450,000 Test 4
23456 350,000 Test 5
34567 250,000 Test 6
;
run;
proc print data=in2; run;
data want;
merge in1 (in=a) in2;
by ID;
if a ;
run;
proc print data=want; run;
Since you have missing values, there must have been a problem copy/pasting the code. Inspect the data in the 2 inputfiles.
Is it possible that your INPUTFILE_1 already contains AMOUNT_NEW and DESCRIPTION_NEW? That would be consistent with the results you are seeing, and those variables could be dropped as late as on the MERGE statement:
merge inputfile_1 (in=1 drop=amount_new description_new) inputfile_2;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.