BookmarkSubscribeRSS Feed
nbonda
Obsidian | Level 7

I have a question in proc append.

I have two data sets.

inv_base

 

part_no Part_name description
1002 windshield wind shield
1003 steering steering wheel
1004 Tire Good Year tire

inv_tr

 

part_no Part_name description
1001 mirror front mirror
1003 steering steering wheel

 

I am using proc append to concatenate these two datasets.

proc append base=inv_base data=inv_tr force;

run;

After append I have duplicate records in base dataset.

How can I stop appending duplicate records in other words how can I stop adding a record that is already exist in base dataset.?

is there any other way I can get the required result.

 

Thank you in advance

 

5 REPLIES 5
Astounding
PROC Star

There are other choices:  SET statement, MERGE statement, UPDATE statement, even PROC SQL.  All of these choices will take longer to run.  PROC APPEND does not have to process records in the BASE= data set, only add to it.  But other choices require processing the records in both data sets.

 

To pick the best method, you'll have to think about how to handle conflicts.  For example, suppose one data set contains:

 

10003   steering   steering wheel

 

But the other contains

 

10003   steering   leather steering wheel

 

What should the outcome be in that case?  Once you have made your choices, the programming can begin.

JoshB
Quartz | Level 8

Detailed discussion with a few solutions already on this forum here.

Peter_C
Rhodochrosite | Level 12
To identify that a new record is a duplicate either needs some kind of merge or for the "master" table to be indexed on the keys that would define duplicates.
Proc Append is very effective without duplicates check because of that reason. With an indexed "master" similar efficiency - avoiding the rewriting of all rows in the "master" - is available by using SQL INSERT or by using the data step MODIFY statement. See the examples in the online doc.
kannand
Lapis Lazuli | Level 10

Please try the SYNCADD option in your PROC APPEND step as shown below. The rejects will be saved with the UNIQUESAVE option. 

 

(syncadd=no uniquesave=yes)

 

Please refer to the following URL for detailed explanation with example & log.....

 

http://support.sas.com/documentation/cdl/en/engspde/67961/HTML/default/viewer.htm#p105mkhis1sjfln1qv...

 

Good Luck...!!!

Kannan Deivasigamani
kannand
Lapis Lazuli | Level 10

It might be worthwhile to note that the OPTIONS syntax SYNCADD and UNIQUESAVE does not work on normal libraries. I tried and the syntax wasn't recognized (without SPDE).  You need to define your library where you have the master dataset as as SPD engine. I found some literature on SPD (Scalable Performance Data) listed below that may help provide some background from SAS 9. The document reveals many things of which the primary focus of SPD is for rapid data access with heavy CPU operation through parallel processing and multithreading operations. 

 

https://support.sas.com/documentation/cdl/en/engspde/62981/PDF/default/engspde.pdf

 

Here is the example, I tried out using SPDE in the LIBNAME statement that provided the desired PROC APPEND result eliminating duplicates.

 

libname lib1 spde '/folders/myfolders/.sasstudio/';
data lib1.d1(index=(name/unique));
     set sashelp.class(obs=5);
run;
data d2(index=(name/unique));
     set sashelp.class(obs=7 firstobs=4);
run;
proc append base=lib1.d1(syncadd=no uniquesave=yes) data=work.d2 ;
run;
proc print data=lib1.d1; title '==d1 after unique append==';
run;
proc print data=&spdsusds; title '==Uniques append rejects==';
run;

The result it produced is here for your reference:

 

==d1 after unique append==
Obs	Name	Sex	Age	Height	Weight
1	Alfred	M	14	69.0	112.5
2	Alice	F	13	56.5	84.0
3	Barbara	F	13	65.3	98.0
4	Carol	F	14	62.8	102.5
5	Henry	M	14	63.5	102.5
6	James	M	12	57.3	83.0
7	Jane	F	12	59.8	84.5
==Uniques append rejects==
Obs	Name	Sex	Age	Height	Weight	XXX00000
1	Carol	F	14	62.8	102.5	Name
2	Henry	M	14	63.5	102.5	Name

Hope this helps...!!!

 

 

Kannan Deivasigamani

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 12655 views
  • 5 likes
  • 5 in conversation