Question translated to English via ChatGPT
Could you help me with the error below? Do you know where I can find the error in my tables? I've already created the keys as specified in the program, but I can't find any duplicate information. Thank you for your help.
Your master table IFRSSDM.INSURANCE_CASHFLOW has a primary key integrity constraint defined. Which variables are part of this primary key is visible in the table metadata definition. If the table already exists physically, you can also query it using, for example, Proc Contents. Below is an example for a table with a primary key constraint defined over a variable named key_var.
proc contents data=work.master;
run;
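If you prefer a queryable result over the Proc Contents listing, the SAS dictionary tables also expose integrity constraints. A sketch (the exact column set of dictionary.table_constraints can vary between SAS versions, hence the select *):

```sas
/* List all integrity constraints defined on WORK.MASTER.
   dictionary.table_constraints is only available inside Proc SQL. */
proc sql;
  select *
  from dictionary.table_constraints
  where libname='WORK' and memname='MASTER'
  ;
quit;
```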
SAS hits the integrity constraint and throws the warning because you are trying to insert (append) rows from your transaction table work.W37MYOB with key values that already exist in the master table.
If your DIS job doesn't re-create the master table and you re-run the job, then it's possible that you hit this issue simply because you already loaded the same transactions in an earlier run.
When using Proc Append with SAS datasets, SAS will still insert all rows that don't violate the integrity constraint. If you used a Proc SQL insert instead, it would be an ERROR and the rows wouldn't get inserted. You would also get an error if the master table were a database table, as the database would then throw the error.
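To illustrate the Proc SQL behaviour, a minimal sketch, assuming work.master and work.transaction exist as created in the sample code further down:

```sas
/* Attempt to insert all transaction rows into the master table.
   Because key_var=2 already exists in work.master, Proc SQL throws
   an ERROR instead of a warning, and with the default setting
   UNDO_POLICY=REQUIRED the statement gets rolled back, so none of
   the rows end up in the master table. */
proc sql;
  insert into work.master
  select key_var, value_var
  from work.transaction
  ;
quit;
```

This is the key behavioural difference to Proc Append, which rejects only the conflicting rows and appends the rest.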
If you want to ensure that you never get such a warning in any case, then you first need to work out the delta. Alternatively you could try the DIS SQL Merge transformation and just do nothing in case of a match (not sure if the transformation allows for this).
Below is some sample code to replicate what you observe.
/* create sample data */
data work.master;
input key_var value_var $;
datalines;
1 A
2 A
;
data work.transaction;
input key_var value_var $;
datalines;
2 X
5 X
6 X
;
/* add primary key to sample data */
proc datasets lib=work nolist;
modify master;
ic create prim_key=primary key(key_var);
run;
quit;
/* replicate the warning by appending/inserting the rows */
proc append base=work.master data=work.transaction;
run;quit;
NOTE: Appending WORK.TRANSACTION to WORK.MASTER.
WARNING: Add/Update failed for data set WORK.MASTER because data value(s) do not comply with integrity constraint prim_key, 1
observations rejected.
You get the warning because key value 2 in the transaction table already exists in the master table.
The log also tells you that it rejected 1 observation, which means the other two rows, where there was no conflict, got inserted.
If you re-run the Proc Append on its own then the SAS log will tell you that 3 rows have been rejected.
If you want to ensure that you're only ever loading new transactions, then you first need to identify the delta. This is of course additional processing with some performance impact. In the code below, work.transaction_delta must be a table and not a view, because a view would still be reading from work.master at the moment Proc Append writes to it.
proc sql;
create table work.transaction_delta as
select t.key_var, t.value_var
from
work.transaction as t
left join
work.master m
on t.key_var=m.key_var
having missing(m.key_var)
;
quit;
proc append base=work.master data=work.transaction_delta;
run;quit;
And as already said earlier, an alternative to the code above could be the DIS SQL Merge transformation. I've never used it, so I can't tell whether it allows you to define "no action" in case of a key match (the update case).
From a performance perspective, a plain Proc Append will be fastest. If you can be sure that you never get "old" transactions from the source and that you only hit the integrity constraint in case of a re-run, then I'd consider the warning acceptable: document it as something that can happen during development/re-runs but that is of no substantial concern in a Prod environment. What's right really depends on how important performance is and whether you can be certain that you only ever get the delta transactions from the source (or have already worked out the delta earlier).
And last but not least:
I recommend using tables rather than views during DIS development, and only switching over to views for a final test once things work.
The reason is that with views you only get errors in the node where the view executes, not where you define it, which can make debugging quite a bit harder.