Hi Team,
I'm trying to merge/append/combine two datasets. there are few overlaps in both the datsets. I just want the data that is present only in Dataset 2 and not in Dataset1. any help is truly appreiciated!
using SAS EG 6.1
Dataset1
Date | Ord_no | item | update_date |
4/9/2017 | 111 | 56 | 4/10/2017 |
4/9/2017 | 112 | 45 | 4/10/2017 |
4/9/2017 | 113 | 32 | 4/10/2017 |
4/9/2017 | 114 | 46 | 4/10/2017 |
4/9/2017 | 115 | 64 | 4/10/2017 |
Dataset 2:
Date | Ord_no | item | update_date |
|
4/9/2017 | 111 | 56 | 4/11/2017 |
|
4/9/2017 | 112 | 78 | 4/11/2017 |
|
4/9/2017 | 113 | 32 | 4/11/2017 |
|
4/9/2017 | 114 | 46 | 4/11/2017 |
|
4/9/2017 | 122 | 34 | 4/11/2017 |
|
4/9/2017 | 124 | 56 | 4/11/2017 |
|
4/10/2017 | 134 | 75 | 4/11/2017 |
|
4/10/2017 | 135 | 76 | 4/11/2017 |
|
4/10/2017 | 136 | 77 | 4/11/2017 |
|
Output:
Date | Ord_no | item | update_date |
|
4/9/2017 | 111 | 56 | 4/10/2017 |
|
4/9/2017 | 112 | 45 | 4/10/2017 |
|
4/9/2017 | 113 | 32 | 4/10/2017 |
|
4/9/2017 | 114 | 46 | 4/11/2017 |
|
4/9/2017 | 122 | 34 | 4/11/2017 |
|
4/9/2017 | 124 | 56 | 4/11/2017 |
|
4/10/2017 | 134 | 75 | 4/11/2017 |
|
4/10/2017 | 135 | 76 | 4/11/2017 |
|
4/10/2017 | 136 | 77 | 4/11/2017 |
|
Try this:
Proc sort data=dat1 nodupkey out=dat1_updated;
by Date Ord_no item update_date;
run;
Proc sort data=dat2 nodupkey out=dat2_updated;
by Date Ord_no item update_date;
run;
data finaldat;
merge dat2_updated(in=a) dat1_updated(in=b);
by Date Ord_no item update_date;
if a and not b;
run;
Proc sql can also be used for this scenario.
I must admit it's not entirely clear to me what it is you want to do for the following reasons:
1. You say you want to merge/append/combine the two files however merging and appending are two very different things. Which is it you want to do?
2. Is the Ouput file in your example what you want to get or is it what you are getting?
3. In your output the update_date comes from Dataset! for order_no 111, 112 and 113 but from Dataset2 for 114 - what are the rules for this combination process?
Hi Chris,
Sorry for not being clear here.
To keep it simple.
Thanks
You can try to use join (inner join, left join, right join, full join) by PROC SQL, according to what output you want to report.
Try this:
Proc sort data=dat1 nodupkey out=dat1_updated;
by Date Ord_no item update_date;
run;
Proc sort data=dat2 nodupkey out=dat2_updated;
by Date Ord_no item update_date;
run;
data finaldat;
merge dat2_updated(in=a) dat1_updated(in=b);
by Date Ord_no item update_date;
if a and not b;
run;
Proc sql can also be used for this scenario.
@CiCi thank you!
Try this (it's close to CiCi's but I think gives you want you want):
data dataset1;
infile datalines delimiter=',';
input date ord_no item update_date ;
informat date ddmmyy10. update_date ddmmyy10.;
format date ddmmyy10. update_date ddmmyy10.;
datalines;
04/09/2017,111,56,04/10/2017
04/09/2017,112,45,04/10/2017
04/09/2017,113,32,04/10/2017
04/09/2017,114,46,04/10/2017
04/09/2017,115,64,04/10/2017
;
run;
data dataset2;
infile datalines delimiter=',';
input date ord_no item update_date ;
informat date ddmmyy10. update_date ddmmyy10.;
format date ddmmyy10. update_date ddmmyy10.;
datalines;
04/09/2017,111,56,04/11/2017
04/09/2017,112,78,04/11/2017
04/09/2017,113,32,04/11/2017
04/09/2017,114,46,04/11/2017
04/09/2017,122,34,04/11/2017
04/09/2017,124,56,04/11/2017
04/10/2017,134,75,04/11/2017
04/10/2017,135,76,04/11/2017
04/10/2017,136,77,04/11/2017
;
run;
proc sort data=dataset1;
by ord_no;
run;
proc sort data=dataset2;
by ord_no;
run;
data output;
merge dataset2(in=a) dataset1(in=b);
by ord_no;
if b and not a then delete;
run;
The only anomaly is in ord_no 114 which has a different ord_no than your output but I thbink follows your update rules
If it's transactional data with updates look at an UPDATE statement?
All the different methods are illustrated here:
@ChrisBrooks Thank you. I used the smiliar logic.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.