BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mrzlatan91
Obsidian | Level 7

Hey guys, 

I have the following 2 tables old and update and want to have the output like want:

 

data work.old;
input (Date cusip fundid mod) ($);
datalines;
Mar2012 123  A  5$
Mar2012 124  A  5$
Apr2012 124  B  6$
;

data work.update;
input (Date cusip fundid mod) ($);
datalines;
Apr2012 124  B  7$
May2012 124  B  10$
;

data work.want;
input (Date cusip fundid mod) ($);
datalines;
Mar2012 123  A  5$
Mar2012 124  A  5$
Apr2012 124  B  7$
May2012 124  B  10$
;


That means: Table want should have all (date-cusip)-observations from work.update and 

only those (date-cusip)-observations from work.old that are not in work.update. 

Each date-cusip-Combination can exist more than once in both tables (with different fundids) but always have the same

value for mod. 

It can be said, that work.old is a bigger table than work.update, so it would be perfect if someone could provide me a hash solution 

(for better performance).

 

Thanks in advance 🙂

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

There's probably a hash solution for this, but here's my plain vanilla solution:

 

proc sort data=old;

by date cusip fundid;

run;

 

proc sort data=update;

by date cusip fundid;

run;

 

data want;

merge old update;

by date cusip fundid;

run;

 

It brings in all the data, but when there is a match uses the data from UPDATE to overwrite the data from OLD.

View solution in original post

7 REPLIES 7
ChrisNZ
Tourmaline | Level 20

A simple merge by should do this perfectly. Why a hash solution?

Astounding
PROC Star

There's probably a hash solution for this, but here's my plain vanilla solution:

 

proc sort data=old;

by date cusip fundid;

run;

 

proc sort data=update;

by date cusip fundid;

run;

 

data want;

merge old update;

by date cusip fundid;

run;

 

It brings in all the data, but when there is a match uses the data from UPDATE to overwrite the data from OLD.

mrzlatan91
Obsidian | Level 7

thanks, your solution has the best performance 🙂

Reeza
Super User

Would an UPDATE statement work? You need to sort the data ahead of time but it works well. 

 

data want;
update old update;
by date cusip fundid;
run;

@mrzlatan91 wrote:

Hey guys, 

I have the following 2 tables old and update and want to have the output like want:

 

data work.old;
input (Date cusip fundid mod) ($);
datalines;
Mar2012 123  A  5$
Mar2012 124  A  5$
Apr2012 124  B  6$
;

data work.update;
input (Date cusip fundid mod) ($);
datalines;
Apr2012 124  B  7$
May2012 124  B  10$
;

data work.want;
input (Date cusip fundid mod) ($);
datalines;
Mar2012 123  A  5$
Mar2012 124  A  5$
Apr2012 124  B  7$
May2012 124  B  10$
;


That means: Table want should have all (date-cusip)-observations from work.update and 

only those (date-cusip)-observations from work.old that are not in work.update. 

Each date-cusip-Combination can exist more than once in both tables (with different fundids) but always have the same

value for mod. 

It can be said, that work.old is a bigger table than work.update, so it would be perfect if someone could provide me a hash solution 

(for better performance).

 

Thanks in advance 🙂

 

 

 


 

mrzlatan91
Obsidian | Level 7

yes, this worked as well. thank you 🙂

novinosrin
Tourmaline | Level 20
Looks a plain append/interleave to me
s_lassen
Meteorite | Level 14

If your OLD dataset is large, and you do not need to keep the previous version, updating in place may be the fast way to do it:

data work.old;
  modify old update;
  by date cusip;
  if _iorc_ then do;
    output;
	_error_=0;
	end;
  else replace;
run;

The _IORC_ is set when there is an observation in UPDATE that does not have a correspondent key in OLD. 

 

If you need to improve the performance more, index your OLD table:

Proc sql;
  create Unique index idx on old(date,cusip);
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 873 views
  • 2 likes
  • 6 in conversation