BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Skn
Fluorite | Level 6 Skn
Fluorite | Level 6

Hi Team,

 

I'm trying to merge/append/combine two datasets. there are few overlaps in both the datsets. I just want the data that is present only in Dataset 2 and not in Dataset1. any help is truly appreiciated!

 

using SAS EG 6.1

 

Dataset1 

Date

Ord_no

item

update_date

4/9/2017

111

56

4/10/2017

4/9/2017

112

45

4/10/2017

4/9/2017

113

32

4/10/2017

4/9/2017

114

46

4/10/2017

4/9/2017

115

64

4/10/2017

 

Dataset 2:

 

Date

Ord_no

item

update_date

 

4/9/2017

111

56

4/11/2017

 

4/9/2017

112

78

4/11/2017

 

4/9/2017

113

32

4/11/2017

 

4/9/2017

114

46

4/11/2017

 

4/9/2017

122

34

4/11/2017

 

4/9/2017

124

56

4/11/2017

 

4/10/2017

134

75

4/11/2017

 

4/10/2017

135

76

4/11/2017

 

4/10/2017

136

77

4/11/2017

 

 

Output:

Date

Ord_no

item

update_date

 

4/9/2017

111

56

4/10/2017

 

4/9/2017

112

45

4/10/2017

 

4/9/2017

113

32

4/10/2017

 

4/9/2017

114

46

4/11/2017

 

4/9/2017

122

34

4/11/2017

 

4/9/2017

124

56

4/11/2017

 

4/10/2017

134

75

4/11/2017

 

4/10/2017

135

76

4/11/2017

 

4/10/2017

136

77

4/11/2017

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
CiCi
Fluorite | Level 6

Try this:

 

Proc sort data=dat1 nodupkey out=dat1_updated;

 by Date Ord_no item update_date;

run;

 

Proc sort data=dat2 nodupkey out=dat2_updated;

 by Date Ord_no item update_date;

run;

 

data finaldat;

 merge dat2_updated(in=a) dat1_updated(in=b);

 by Date Ord_no item update_date;

 if a and not b;

run;

 

Proc sql can also be used for this scenario.

View solution in original post

9 REPLIES 9
ChrisBrooks
Ammonite | Level 13

I must admit it's not entirely clear to me what it is you want to do for the following reasons:

 

1. You say you want to merge/append/combine the two files however merging and appending are two very different things. Which is it you want to do?

 

2. Is the Ouput file in your example what you want to get or is it what you are getting?

 

3. In your output the update_date comes from Dataset! for order_no 111, 112 and 113 but from Dataset2 for 114 - what are the rules for this combination process?

 

 

Skn
Fluorite | Level 6 Skn
Fluorite | Level 6
Hi Chris,



Sorry for not being clear here.

To keep it simple.

1. I basically want a output dataset which will be result of dataset1 and dataset2. Either of the function from above works

2. Output data I have provided, is the desired result I wanted.

3. Rules:

- you can think this is a transactional data. Dataset1 is existing, Dataset2 is new. Update date is the date on which Dataset2 is run

- Every time when we run the Dataset2, there will be new records and overlap records on primary key.

- As an output I want, those new records from Dataset2. I don't want to change the overlapping records.



Thanks


Skn
Fluorite | Level 6 Skn
Fluorite | Level 6

Hi Chris,

 

Sorry for not being clear here.

To keep it simple.

  1. I basically want a output dataset which will be result of dataset1 and dataset2. Either of the function from above works
  2. Output data I have provided, is the desired result I wanted.
  3. Rules:
  • you can think this is a transactional data. Dataset1 is existing, Dataset2 is new. Update date is the date on which Dataset2 is run
  • Every time when we run the Dataset2, there will be new records and overlap records on primary key.
  • As an output I want, those new records from Dataset2. I don’t want to change the overlapping records.

 

Thanks

YiqunDai
Fluorite | Level 6

You can try to use join (inner join, left join, right join, full join) by PROC SQL, according to what output you want to report.

 

http://www2.sas.com/proceedings/sugi27/p070-27.pdf

CiCi
Fluorite | Level 6

Try this:

 

Proc sort data=dat1 nodupkey out=dat1_updated;

 by Date Ord_no item update_date;

run;

 

Proc sort data=dat2 nodupkey out=dat2_updated;

 by Date Ord_no item update_date;

run;

 

data finaldat;

 merge dat2_updated(in=a) dat1_updated(in=b);

 by Date Ord_no item update_date;

 if a and not b;

run;

 

Proc sql can also be used for this scenario.

Skn
Fluorite | Level 6 Skn
Fluorite | Level 6

@CiCi thank you!

ChrisBrooks
Ammonite | Level 13

Try this (it's close to CiCi's but I think gives you want you want):

 

 

data dataset1;
   infile datalines delimiter=','; 
   input date  ord_no  item  update_date ;
   informat date ddmmyy10. update_date ddmmyy10.;
   format date ddmmyy10. update_date ddmmyy10.;
   datalines;                      
04/09/2017,111,56,04/10/2017
04/09/2017,112,45,04/10/2017
04/09/2017,113,32,04/10/2017
04/09/2017,114,46,04/10/2017
04/09/2017,115,64,04/10/2017
;
run;

data dataset2;
   infile datalines delimiter=','; 
   input date  ord_no  item  update_date ;
   informat date ddmmyy10. update_date ddmmyy10.;
   format date ddmmyy10. update_date ddmmyy10.;
   datalines;                      
04/09/2017,111,56,04/11/2017
04/09/2017,112,78,04/11/2017
04/09/2017,113,32,04/11/2017
04/09/2017,114,46,04/11/2017
04/09/2017,122,34,04/11/2017
04/09/2017,124,56,04/11/2017
04/10/2017,134,75,04/11/2017
04/10/2017,135,76,04/11/2017
04/10/2017,136,77,04/11/2017
;
run;


proc sort data=dataset1;
	by ord_no;
run;
proc sort data=dataset2;
	by ord_no;
run;
data output;
	merge dataset2(in=a) dataset1(in=b);
	by ord_no;
	if b and not a then delete;
run;

The only anomaly is in ord_no 114 which has a different ord_no than your output but I thbink follows your update rules

Reeza
Super User

If it's transactional data with updates look at an UPDATE statement? 

 

All the different methods are illustrated here:

http://support.sas.com/documentation/cdl/en/basess/68381/HTML/default/viewer.htm#p19t3r1gxavpoin14jd...

Skn
Fluorite | Level 6 Skn
Fluorite | Level 6

@ChrisBrooks Thank you. I used the smiliar logic.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2132 views
  • 2 likes
  • 5 in conversation