I have an APPEND process that duplicates data if I run it twice. Is there some way to avoid data duplication using APPEND?
For production-ready code, you shouldn't be able to run the process twice with the same data.
If you can't guarantee that, redesign the job to use an Update loading strategy, or, as @Ksharp suggests, add a post-process that removes duplicates.
The post-process might be fine if your data volume is manageable, but it will certainly hurt performance as the table grows.
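For illustration, here is a minimal post-process sketch, assuming the appended table is WORK.TARGET and ID identifies a unique row (both names are hypothetical placeholders):

/* Remove duplicate keys left behind by a repeated append.        */
/* WORK.TARGET and ID are placeholder names for this sketch only. */
proc sort data=work.target nodupkey;
   by id;   /* keeps the first observation per key, drops the rest */
run;

Note that this rewrites the whole table each run, which is why it gets expensive as the data grows.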
To avoid duplicating data with the APPEND, I created an INDEX in DI Studio on some columns. When I run the job it executes, but it writes the message below and the job finishes with an EXIT in Flow Manager.
Error in the log:
Add/Update failed for data set <TABLE TARGET> because data value(s) do not comply with integrity constraint.
How could I solve that?
Thanks!
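For context, a minimal sketch that reproduces the kind of rejection described above: a unique index on the target table makes PROC APPEND refuse rows whose key already exists. All names (WORK.TARGET, WORK.NEW, ID, VALUE) are hypothetical:

/* Build a small target table and put a unique index on it. */
data work.target;
   id = 1; value = 10;
run;

proc datasets lib=work nolist;
   modify target;
   index create id / unique;   /* enforces uniqueness of ID, like the poster's index */
quit;

data work.new;
   id = 1; value = 99;   /* same key as an existing row */
run;

/* The duplicate row is rejected with a message like the one quoted above: */
/* ERROR: Add/Update failed for data set WORK.TARGET because data          */
/* value(s) do not comply with integrity constraint.                       */
proc append base=work.target data=work.new;
run;

So the index is doing its job: it blocks the duplicates, but the rejection also raises an ERROR, which is why the job exits in Flow Manager.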
I wonder how you came to get duplicate data in the first place. The DI Studio Append transformation (at least in my version 4.9) always deletes the output table before appending (check the transformation code under Properties -> Code), so it should be impossible to get duplicates unless your output table is also an input to a step before the append, in which case the existing data goes into the append together with the new data.
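For reference, a sketch of the delete-then-append pattern described above. This is only an illustration with placeholder names (WORK.TARGET, WORK.STAGING); check Properties -> Code in your installation for the actual generated code:

/* Drop the output table first, then append, so re-running */
/* the job on its own cannot duplicate rows.               */
proc datasets lib=work nolist;
   delete target;
quit;

/* PROC APPEND recreates the base table from the incoming data */
/* when it does not exist.                                     */
proc append base=work.target data=work.staging;
run;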
It is a bit difficult to imagine what's going on from the description alone, so please post a screenshot of your job canvas.