BookmarkSubscribeRSS Feed
dwah
Fluorite | Level 6

Hi!

 

Currently, I am trying to write an etl process to update a main table with periodic updates(these updates could either be changes to existing rows or could contain new rows or both) 

 

For example:

 

Main table has 5 rows with 2 columns(name and age):

  

Name            Age

John Doe        13

John Smith     15

Bo Derek        21

Randy Snow   28

Matt Damon    38

 

New incoming table(6 rows, one update to the age and 1 new row)

Name               Age

Bo Derek          28

Camilla White   39

John Doe          13

John Smith        15

Matt Damon      38

Randy Snow     28

 

 

Ideal Result Table:

 

Name            Age

John Doe        13

John Smith     15

Bo Derek        28

Randy Snow   28

Matt Damon    38

Camilla White   39

 

What i'm hoping to do is merge the two tables(main_tbl and updt_tbl)  into a new table and have "Bo Derek"'s age be updated with the new age in the second table and the insertion of the new row "Camilla White" all to the result_tbl. The repeated rows in the update table should not be appended to the new table to prevent duplicates.

 

 

proc sql;

create table s.result_tbl as

select *

from s.main_tbl, s.updt_tbl
where main_tbl.name  = updt_tbl.name
order by name;

 

is my current solution, but I don't think that's the right way at all of approaching this. I would prefer to stick with using proc sql; and not having to resort to data steps. What would be the most efficient and effective way of achieving the desired outcome? Thanks!

5 REPLIES 5
Reeza
Super User
Have you tried a data step update?

data s.result_tbl;
update main_tbl updt_tb;
by name;
run;

dwah
Fluorite | Level 6
The code seemed to work but im getting a teradata error. This issue might no longer be related to the query in itself but i'm not sure either.

ERROR: Teradata insert: Response limit exceeded.
Reeza
Super User
Looks like a network time out issue?
Patrick
Opal | Level 21

@dwah wrote:
The code seemed to work but im getting a teradata error. This issue might no longer be related to the query in itself but i'm not sure either.

ERROR: Teradata insert: Response limit exceeded.

Oh, your table(s) are in Teradata. That makes a difference.

I'd be using explicit pass-through SQL and then use a Teradata SQL MERGE. If your source table is in SAS and not already Teradata then I'd upload this table first into the database - either into a staging area or then as a temporary table.

novinosrin
Tourmaline | Level 20

Hello @dwah 

 

data main;
input Name    & $20.            Age;
cards;
John Doe        13
John Smith     15
Bo Derek        21
Randy Snow   28
Matt Damon    38
;


data new;
input Name    & $20.            Age;
cards;
Bo Derek          28
Camilla White   39
John Doe          13
John Smith        15
Matt Damon      38
Randy Snow     28
;

proc sql;
create table result as
select coalescec(a.name,b.name) as name ,b.age 
from main a full join new b
on a.name=b.name;
quit;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 944 views
  • 5 likes
  • 4 in conversation