BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nduksy
Obsidian | Level 7

Hello All,

I have two datasets I will like to merge, they both have the same columns. I want to check to see if the counter column for  a row has changed, if it hasn't, do nothing, if it has, create a new row

 

Dataset A

ID RunDate Count
S001 06OCT2021 6
S002 06OCT2021 2

 

Dataset B

ID RunDate Count
S001 07OCT2021 6
S002 07OCT2021 5
S003 07OCT2021 1

 

MERGED - The resultant dataset compares datasets A and B and will only to Dataset A where a record from B where the ID is same, but the Count is different and will also add a new row if an ID that isn't in A is now in B.

ID RunDate Count
S001 06OCT2021 6
S002 06OCT2021 2
S002 07OCT2021 5
S003 07OCT2021 1

 

I hope my explanation makes sense.

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Merging in SAS means to put data side-by-side. What you intend to do here is called appending or stacking.

You can use BY with a SET and use the resulting LAST. variable:

data want;
set
  a
  b
;
by id count; /* add notsorted after count if counts can decrease */
if last.count;
run;

Untested; for tested code, provide data in (correctly working!) data steps with datalines.

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

Merging in SAS means to put data side-by-side. What you intend to do here is called appending or stacking.

You can use BY with a SET and use the resulting LAST. variable:

data want;
set
  a
  b
;
by id count; /* add notsorted after count if counts can decrease */
if last.count;
run;

Untested; for tested code, provide data in (correctly working!) data steps with datalines.

nduksy
Obsidian | Level 7

Thank you. This works as I'd like it to. The only problem I have is that the runDate for all the records is showing the runDate for the records in the second dataset. I want to maintain the runDates from their respective datasets. See my code below:

 

data keep.blankreport2;
	set keep.blankreport keep.blankrun%sysfunc(today(),date9.);
	by datasetName count;
	if last.count;
run;
nduksy
Obsidian | Level 7

I reordered the datasets in my set statement and that seems to have fixed that.

Reeza
Super User

SQL Union which will not insert duplicate rows?

 

 

proc sql;
create table3 as
select * from table1
union 
select * from table2
quit;

Or UPDATE on all three columns?

data want;
update A B;
by ID RunDate Count;
run;

Untested.

Sajid01
Meteorite | Level 14

Hello @nduksy 
The following code should give what you want.

 

data one ;
informat rundate date9.;
format rundate date9.;
input ID $ 	RunDate 	Count;
datalines;
S001 	06OCT2021 	6
S002 	06OCT2021 	2
;
run;
data two ;
informat rundate date9.;
format rundate date9.;
input ID $	RunDate 	Count;
datalines;
S001 	07OCT2021 	6
S002 	07OCT2021 	5
S003 	07OCT2021 	1
;
run;
/* merge the datasets*/
data merged;
merged two one;
by ID count;
run;
data merged;
merge two one;
by ID count;
run;
data merged;
merge two one;
by ID count;
run;

The output will be as follows

Sajid01_0-1633621063835.png

 

And lastly ,in future, please post the tables as a datastep .