BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mrzlatan91
Obsidian | Level 7

Hey guys, 

I have the following 2 tables old and update and want to have the output like want:

 

data work.old;
input (Date cusip fundid mod) ($);
datalines;
Mar2012 123  A  5$
Mar2012 124  A  5$
Apr2012 124  B  6$
;

data work.update;
input (Date cusip fundid mod) ($);
datalines;
Apr2012 124  B  7$
May2012 124  B  10$
;

data work.want;
input (Date cusip fundid mod) ($);
datalines;
Mar2012 123  A  5$
Mar2012 124  A  5$
Apr2012 124  B  7$
May2012 124  B  10$
;


That means: Table want should have all (date-cusip)-observations from work.update and 

only those (date-cusip)-observations from work.old that are not in work.update. 

Each date-cusip-Combination can exist more than once in both tables (with different fundids) but always have the same

value for mod. 

It can be said, that work.old is a bigger table than work.update, so it would be perfect if someone could provide me a hash solution 

(for better performance).

 

Thanks in advance 🙂

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

There's probably a hash solution for this, but here's my plain vanilla solution:

 

proc sort data=old;

by date cusip fundid;

run;

 

proc sort data=update;

by date cusip fundid;

run;

 

data want;

merge old update;

by date cusip fundid;

run;

 

It brings in all the data, but when there is a match uses the data from UPDATE to overwrite the data from OLD.

View solution in original post

7 REPLIES 7
ChrisNZ
Tourmaline | Level 20

A simple merge by should do this perfectly. Why a hash solution?

Astounding
PROC Star

There's probably a hash solution for this, but here's my plain vanilla solution:

 

proc sort data=old;

by date cusip fundid;

run;

 

proc sort data=update;

by date cusip fundid;

run;

 

data want;

merge old update;

by date cusip fundid;

run;

 

It brings in all the data, but when there is a match uses the data from UPDATE to overwrite the data from OLD.

mrzlatan91
Obsidian | Level 7

thanks, your solution has the best performance 🙂

Reeza
Super User

Would an UPDATE statement work? You need to sort the data ahead of time but it works well. 

 

data want;
update old update;
by date cusip fundid;
run;

@mrzlatan91 wrote:

Hey guys, 

I have the following 2 tables old and update and want to have the output like want:

 

data work.old;
input (Date cusip fundid mod) ($);
datalines;
Mar2012 123  A  5$
Mar2012 124  A  5$
Apr2012 124  B  6$
;

data work.update;
input (Date cusip fundid mod) ($);
datalines;
Apr2012 124  B  7$
May2012 124  B  10$
;

data work.want;
input (Date cusip fundid mod) ($);
datalines;
Mar2012 123  A  5$
Mar2012 124  A  5$
Apr2012 124  B  7$
May2012 124  B  10$
;


That means: Table want should have all (date-cusip)-observations from work.update and 

only those (date-cusip)-observations from work.old that are not in work.update. 

Each date-cusip-Combination can exist more than once in both tables (with different fundids) but always have the same

value for mod. 

It can be said, that work.old is a bigger table than work.update, so it would be perfect if someone could provide me a hash solution 

(for better performance).

 

Thanks in advance 🙂

 

 

 


 

mrzlatan91
Obsidian | Level 7

yes, this worked as well. thank you 🙂

novinosrin
Tourmaline | Level 20
Looks a plain append/interleave to me
s_lassen
Meteorite | Level 14

If your OLD dataset is large, and you do not need to keep the previous version, updating in place may be the fast way to do it:

data work.old;
  modify old update;
  by date cusip;
  if _iorc_ then do;
    output;
	_error_=0;
	end;
  else replace;
run;

The _IORC_ is set when there is an observation in UPDATE that does not have a correspondent key in OLD. 

 

If you need to improve the performance more, index your OLD table:

Proc sql;
  create Unique index idx on old(date,cusip);
quit;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1702 views
  • 2 likes
  • 6 in conversation