ETL Question - Merging two tables using proc sql

dwah · Posted 07-16-2019 11:46 AM

Hi!

Currently, I am trying to write an etl process to update a main table with periodic updates(these updates could either be changes to existing rows or could contain new rows or both)

For example:

Main table has 5 rows with 2 columns(name and age):

Name Age

John Doe 13

John Smith 15

Bo Derek 21

Randy Snow 28

Matt Damon 38

New incoming table(6 rows, one update to the age and 1 new row)

Name Age

Bo Derek 28

Camilla White 39

John Doe 13

John Smith 15

Matt Damon 38

Randy Snow 28

Ideal Result Table:

Name Age

John Doe 13

John Smith 15

Bo Derek 28

Randy Snow 28

Matt Damon 38

Camilla White 39

What i'm hoping to do is merge the two tables(main_tbl and updt_tbl) into a new table and have "Bo Derek"'s age be updated with the new age in the second table and the insertion of the new row "Camilla White" all to the result_tbl. The repeated rows in the update table should not be appended to the new table to prevent duplicates.

proc sql;

create table s.result_tbl as

select *

from s.main_tbl, s.updt_tbl
where main_tbl.name = updt_tbl.name
order by name;

is my current solution, but I don't think that's the right way at all of approaching this. I would prefer to stick with using proc sql; and not having to resort to data steps. What would be the most efficient and effective way of achieving the desired outcome? Thanks!

Reeza · Posted 07-16-2019 11:49 AM

Have you tried a data step update?

data s.result_tbl;
update main_tbl updt_tb;
by name;
run;

dwah · Posted 07-16-2019 12:09 PM

The code seemed to work but im getting a teradata error. This issue might no longer be related to the query in itself but i'm not sure either.

ERROR: Teradata insert: Response limit exceeded.

Reeza · Posted 07-16-2019 12:26 PM

Looks like a network time out issue?

Patrick · Posted 07-17-2019 05:54 AM

@dwah wrote:
The code seemed to work but im getting a teradata error. This issue might no longer be related to the query in itself but i'm not sure either.

ERROR: Teradata insert: Response limit exceeded.

Oh, your table(s) are in Teradata. That makes a difference.

I'd be using explicit pass-through SQL and then use a Teradata SQL MERGE. If your source table is in SAS and not already Teradata then I'd upload this table first into the database - either into a staging area or then as a temporary table.

novinosrin · Posted 07-16-2019 12:09 PM

Hello @dwah

data main;
input Name    & $20.            Age;
cards;
John Doe        13
John Smith     15
Bo Derek        21
Randy Snow   28
Matt Damon    38
;


data new;
input Name    & $20.            Age;
cards;
Bo Derek          28
Camilla White   39
John Doe          13
John Smith        15
Matt Damon      38
Randy Snow     28
;

proc sql;
create table result as
select coalescec(a.name,b.name) as name ,b.age 
from main a full join new b
on a.name=b.name;
quit;

ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Re: ETL Question - Merging two tables using proc sql

Ready to join fellow brilliant minds for the SAS Hackathon?

Click image to register for webinar

Classroom Training Available!