DATA Step, Macro, Functions and more

proc append- stop appending duplicate observation in Base table

Reply
Contributor
Posts: 40

proc append- stop appending duplicate observation in Base table

I have a question in proc append.

I have two data sets.

inv_base

 

part_no Part_name description
1002 windshield wind shield
1003 steering steering wheel
1004 Tire Good Year tire

inv_tr

 

part_no Part_name description
1001 mirror front mirror
1003 steering steering wheel

 

I am using proc append to concatenate these two datasets.

proc append base=inv_base data=inv_tr force;

run;

After append I have duplicate records in base dataset.

How can I stop appending duplicate records in other words how can I stop adding a record that is already exist in base dataset.?

is there any other way I can get the required result.

 

Thank you in advance

 

Super User
Posts: 5,498

Re: proc append- stop appending duplicate observation in Base table

There are other choices:  SET statement, MERGE statement, UPDATE statement, even PROC SQL.  All of these choices will take longer to run.  PROC APPEND does not have to process records in the BASE= data set, only add to it.  But other choices require processing the records in both data sets.

 

To pick the best method, you'll have to think about how to handle conflicts.  For example, suppose one data set contains:

 

10003   steering   steering wheel

 

But the other contains

 

10003   steering   leather steering wheel

 

What should the outcome be in that case?  Once you have made your choices, the programming can begin.

Contributor
Posts: 56

Re: proc append- stop appending duplicate observation in Base table

Detailed discussion with a few solutions already on this forum here.

Valued Guide
Posts: 2,177

Re: proc append- stop appending duplicate observation in Base table

To identify that a new record is a duplicate either needs some kind of merge or for the "master" table to be indexed on the keys that would define duplicates.
Proc Append is very effective without duplicates check because of that reason. With an indexed "master" similar efficiency - avoiding the rewriting of all rows in the "master" - is available by using SQL INSERT or by using the data step MODIFY statement. See the examples in the online doc.
Regular Contributor
Posts: 161

Re: proc append- stop appending duplicate observation in Base table

Please try the SYNCADD option in your PROC APPEND step as shown below. The rejects will be saved with the UNIQUESAVE option. 

 

(syncadd=no uniquesave=yes)

 

Please refer to the following URL for detailed explanation with example & log.....

 

http://support.sas.com/documentation/cdl/en/engspde/67961/HTML/default/viewer.htm#p105mkhis1sjfln1qv...

 

Good Luck...!!!

Kannan Deivasigamani
Regular Contributor
Posts: 161

Re: proc append- stop appending duplicate observation in Base table

[ Edited ]

It might be worthwhile to note that the OPTIONS syntax SYNCADD and UNIQUESAVE does not work on normal libraries. I tried and the syntax wasn't recognized (without SPDE).  You need to define your library where you have the master dataset as as SPD engine. I found some literature on SPD (Scalable Performance Data) listed below that may help provide some background from SAS 9. The document reveals many things of which the primary focus of SPD is for rapid data access with heavy CPU operation through parallel processing and multithreading operations. 

 

https://support.sas.com/documentation/cdl/en/engspde/62981/PDF/default/engspde.pdf

 

Here is the example, I tried out using SPDE in the LIBNAME statement that provided the desired PROC APPEND result eliminating duplicates.

 

libname lib1 spde '/folders/myfolders/.sasstudio/';
data lib1.d1(index=(name/unique));
     set sashelp.class(obs=5);
run;
data d2(index=(name/unique));
     set sashelp.class(obs=7 firstobs=4);
run;
proc append base=lib1.d1(syncadd=no uniquesave=yes) data=work.d2 ;
run;
proc print data=lib1.d1; title '==d1 after unique append==';
run;
proc print data=&spdsusds; title '==Uniques append rejects==';
run;

The result it produced is here for your reference:

 

==d1 after unique append==
Obs	Name	Sex	Age	Height	Weight
1	Alfred	M	14	69.0	112.5
2	Alice	F	13	56.5	84.0
3	Barbara	F	13	65.3	98.0
4	Carol	F	14	62.8	102.5
5	Henry	M	14	63.5	102.5
6	James	M	12	57.3	83.0
7	Jane	F	12	59.8	84.5
==Uniques append rejects==
Obs	Name	Sex	Age	Height	Weight	XXX00000
1	Carol	F	14	62.8	102.5	Name
2	Henry	M	14	63.5	102.5	Name

Hope this helps...!!!

 

 

Kannan Deivasigamani
Ask a Question
Discussion stats
  • 5 replies
  • 2180 views
  • 4 likes
  • 5 in conversation