Rogerio_Alves
Obsidian | Level 7

I have an APPEND process that duplicates data if I run it twice. Is there some way to avoid data duplication using APPEND?

 

Rogerio Alves
Ksharp
Super User
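Add a step after PROC APPEND to remove the duplicate observations. For example:

/* Append the new rows; FORCE appends even if variable lists or attributes differ */
proc append base=have data=temp force;
run;

/* Keep one copy of each fully identical row */
proc sql;
    create table want as
    select distinct * from have;
quit;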
ralves2
Calcite | Level 5
Thanks a lot for the suggestion! Do you think PROC SORT would perform better?
Ksharp
Super User
I think both perform about the same, because both can run multithreaded.
But you could test it. You are welcome to post the comparison results.
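For anyone who wants to run that comparison, here is a minimal sketch of the PROC SORT variant. It reuses the `have` and `want` table names from the PROC SQL example above; the FULLSTIMER option is only there so that the log shows resource usage for each step. Treat it as an illustration, not a benchmark.

```sas
/* Write detailed timing and memory statistics to the log for each step */
options fullstimer;

/* Sorting BY _ALL_ makes fully identical rows adjacent,
   so NODUPKEY keeps exactly one copy of each distinct row */
proc sort data=have out=want nodupkey;
    by _all_;
run;
```

Like SELECT DISTINCT, this removes only rows that match on every column. If duplicates should be decided by a business key instead, list those key columns in the BY statement.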
ralves2
Calcite | Level 5
Thanks a lot for all the attention here. I will try both and let you know.
LinusH
Tourmaline | Level 20

For production-ready code, it shouldn't be possible to run the process twice with the same data.

If you can't guarantee that, redesign the job to either use an Update loading strategy or, as @Ksharp suggests, add a post-process that removes duplicates.

The post-process might be fine if your data volume is manageable, but it will certainly affect performance.
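One way to make the load safe to rerun is to append only the rows that are not already in the target. The sketch below assumes a hypothetical key column `id` that uniquely identifies a row (replace it with your real key columns) and reuses the `have` and `temp` table names from earlier in the thread. It is not the code that the DI Studio transformation generates.

```sas
/* Assumption: ID is a hypothetical key column that identifies a row.
   Select only the incoming rows whose key is not yet in the base table. */
proc sql;
    create table work.new_rows as
    select t.*
    from temp as t
    where not exists
        (select h.id from have as h where h.id = t.id);
quit;

/* Append just the new rows; rerunning with the same TEMP data adds nothing */
proc append base=have data=work.new_rows force;
run;
```

This filters against the target only. If `temp` itself contains repeated keys, those repeats still need to be removed before the append.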

Data never sleeps
Rogerio_Alves
Obsidian | Level 7

To avoid duplicating data with an APPEND, I created an INDEX in DI on some columns. When I run the job, it executes but writes the message below to the log, and the JOB finishes with EXIT in FLOW MANAGER.

Error in the log:

Add/Update failed for data set <TABLE TARGET> because data value(s) do not comply with integrity constraint.

How can I solve that?

Thanks!

Rogerio Alves
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Rogerio_Alves

I wonder how you end up with duplicate data. The DI Studio Append Transformation (at least in my version, 4.9) always deletes the output table before appending (check the transformation code in Properties -> Code). It should therefore be impossible to get duplicates unless your output table is also an input to a step before the append, so that existing data goes into the append together with the new data.

It is a little difficult to imagine what is going on, so please post a screenshot of your job canvas.