BookmarkSubscribeRSS Feed
Swapnil_21
Obsidian | Level 7

Hi There,

 

I have a monthly table containing 4000000 observations and Historical table containing around 16700000 observations (Oracle table) On monthly loading day, I use proc append to append data to historical table. However, after 1-2 hours loading, it usually fails giving unique key violation error. Hence, I need to write validation code before proc append. There is unique key on target table for 4 variables. 

 

Dummy Source data looks like as below 

 

KeyAccount NumPremiumBalanceOutstandingSource_IDReporing DateYear
1236665200677444Customer Care12-Oct-192
34445899556888Bank17-Oct-192
789554900777998Partner Site04-Oct-192
747947829087655978Bank16-Oct-191
765754688865467665Bank16-Oct-191

 

And below is my Target Historical Table before  loading 

 

KeyAccount NumPremiumBalanceOutstandingSource_IDReporing DateYear
12345489777887899Customer Care12-Oct-181
34445899556888Bank17-Oct-181
789554900777998Partner Site04-Oct-181
789554900777998Partner Site04-Oct-192

 

Unique Key constraints is on Key, Account Num, Reporting Date and Year. 

 

 

Below is the code 

 

Proc append base=Hist_mart data=month_tab_oct2019 force; run;

 

So this fails for account no 554 (Record no 3 in source) which is correct.  My problem is how do I findout before applying proc append.  Any help or ideas is really appreciated. 

5 REPLIES 5
Kurt_Bremser
Super User
proc sort data=month_tab_oct2019;
by key account_num  reporting_date year;
run;

proc sort
  data=hist_mart (keep=key account_num  reporting_date year)
  out=lookup
;
by key account_num  reporting_date year;
run;

data month_append;
merge
  month_tab_oct2019 (in=in_app)
  lookup (in=in_look)
;
by key account_num  reporting_date year;
if in_app and not in_look;
run;
Shmuel
Garnet | Level 18

@Kurt_Bremser ,

Assuming that monthly data are all OCT2019 and the fact that historic data is huge

I suggest to add a WHERE statement to the 2nd sort step, in order to eliminate time execution:

proc sort
  data=hist_mart (keep=key account_num  reporting_date year
        where=(reporting_date between '01oct2019'd and '31oct2019'd))
  out=lookup
;
by key account_num  reporting_date year;
run;
Kurt_Bremser
Super User

Good idea. Thinking about it, I'd draw the min/max date first from the update dataset, just to be sure:

proc sql no print;
select min(reporting_date), max(reporting_date) into :mindate, maxdate
from month_tab_oct2019;
quit;

proc sort
  data=hist_mart (
    keep=key account_num  reporting_date year
    where=(reporting_date between &mindate. and &maxdate.))
  )
  out=lookup
;
by key account_num reporting_date year;
run;

You never know what people put into datasets 😉

Patrick
Opal | Level 21

@Swapnil_21 

What should happen if there is a matching key? Update, just ignore or raise an error? Do you want to insert whatever is possible or not insert anything if there is a matching key.

 

Using Oracle:

I would first upload your SAS data into an Oracle staging table and have all the remaining processing done in Oracle using explicit pass-through SQL. Depending on the details of the load logic you need using an Oracle SQL Merge statement could eventually make it very simple to deal with your data. 

 

2+ hours for Inserting 400K records sounds like bad performance. Changing some of the SAS parameters like insertbuff and dbcommit might help to improve performance. Also bulk-loading could eventually speed-up time especially if loading into a staging table.

s_lassen
Meteorite | Level 14

Instead of using append, you could use a data step to add the values:

data oralib.target;
  modify oralib.target monthly_source;
  by Key Account_Num Reporting_Date Year;
  if _iorc_=%sysrc(_DSENMR) then do; /* key not found in master data */
    _error_=0;
    output;
    end;
run;

As the Oracle database is checking the key values anyway (because of the unique key restraint), this may be just as fast as PROC APPEND.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1366 views
  • 0 likes
  • 5 in conversation