BookmarkSubscribeRSS Feed
dwah
Fluorite | Level 6

Hi!

 

Currently, I am trying to write an etl process to update a main table with periodic updates(these updates could either be changes to existing rows or could contain new rows or both) 

 

For example:

 

Main table has 5 rows with 2 columns(name and age):

  

Name            Age

John Doe        13

John Smith     15

Bo Derek        21

Randy Snow   28

Matt Damon    38

 

New incoming table(6 rows, one update to the age and 1 new row)

Name               Age

Bo Derek          28

Camilla White   39

John Doe          13

John Smith        15

Matt Damon      38

Randy Snow     28

 

 

Ideal Result Table:

 

Name            Age

John Doe        13

John Smith     15

Bo Derek        28

Randy Snow   28

Matt Damon    38

Camilla White   39

 

What i'm hoping to do is merge the two tables(main_tbl and updt_tbl)  into a new table and have "Bo Derek"'s age be updated with the new age in the second table and the insertion of the new row "Camilla White" all to the result_tbl. The repeated rows in the update table should not be appended to the new table to prevent duplicates.

 

 

proc sql;

create table s.result_tbl as

select *

from s.main_tbl, s.updt_tbl
where main_tbl.name  = updt_tbl.name
order by name;

 

is my current solution, but I don't think that's the right way at all of approaching this. I would prefer to stick with using proc sql; and not having to resort to data steps. What would be the most efficient and effective way of achieving the desired outcome? Thanks!

5 REPLIES 5
Reeza
Super User
Have you tried a data step update?

data s.result_tbl;
update main_tbl updt_tb;
by name;
run;

dwah
Fluorite | Level 6
The code seemed to work but im getting a teradata error. This issue might no longer be related to the query in itself but i'm not sure either.

ERROR: Teradata insert: Response limit exceeded.
Reeza
Super User
Looks like a network time out issue?
Patrick
Opal | Level 21

@dwah wrote:
The code seemed to work but im getting a teradata error. This issue might no longer be related to the query in itself but i'm not sure either.

ERROR: Teradata insert: Response limit exceeded.

Oh, your table(s) are in Teradata. That makes a difference.

I'd be using explicit pass-through SQL and then use a Teradata SQL MERGE. If your source table is in SAS and not already Teradata then I'd upload this table first into the database - either into a staging area or then as a temporary table.

novinosrin
Tourmaline | Level 20

Hello @dwah 

 

data main;
input Name    & $20.            Age;
cards;
John Doe        13
John Smith     15
Bo Derek        21
Randy Snow   28
Matt Damon    38
;


data new;
input Name    & $20.            Age;
cards;
Bo Derek          28
Camilla White   39
John Doe          13
John Smith        15
Matt Damon      38
Randy Snow     28
;

proc sql;
create table result as
select coalescec(a.name,b.name) as name ,b.age 
from main a full join new b
on a.name=b.name;
quit;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 664 views
  • 5 likes
  • 4 in conversation