11-11-2015 04:02 PM
I have a question in proc append.
I have two data sets.
|1004||Tire||Good Year tire|
I am using proc append to concatenate these two datasets.
proc append base=inv_base data=inv_tr force;
After append I have duplicate records in base dataset.
How can I stop appending duplicate records in other words how can I stop adding a record that is already exist in base dataset.?
is there any other way I can get the required result.
Thank you in advance
11-11-2015 04:23 PM
There are other choices: SET statement, MERGE statement, UPDATE statement, even PROC SQL. All of these choices will take longer to run. PROC APPEND does not have to process records in the BASE= data set, only add to it. But other choices require processing the records in both data sets.
To pick the best method, you'll have to think about how to handle conflicts. For example, suppose one data set contains:
10003 steering steering wheel
But the other contains
10003 steering leather steering wheel
What should the outcome be in that case? Once you have made your choices, the programming can begin.
11-11-2015 05:00 PM
11-11-2015 10:19 PM
Please try the SYNCADD option in your PROC APPEND step as shown below. The rejects will be saved with the UNIQUESAVE option.
Please refer to the following URL for detailed explanation with example & log.....
11-12-2015 12:50 AM - edited 11-12-2015 12:52 AM
It might be worthwhile to note that the OPTIONS syntax SYNCADD and UNIQUESAVE does not work on normal libraries. I tried and the syntax wasn't recognized (without SPDE). You need to define your library where you have the master dataset as as SPD engine. I found some literature on SPD (Scalable Performance Data) listed below that may help provide some background from SAS 9. The document reveals many things of which the primary focus of SPD is for rapid data access with heavy CPU operation through parallel processing and multithreading operations.
Here is the example, I tried out using SPDE in the LIBNAME statement that provided the desired PROC APPEND result eliminating duplicates.
libname lib1 spde '/folders/myfolders/.sasstudio/'; data lib1.d1(index=(name/unique)); set sashelp.class(obs=5); run; data d2(index=(name/unique)); set sashelp.class(obs=7 firstobs=4); run; proc append base=lib1.d1(syncadd=no uniquesave=yes) data=work.d2 ; run; proc print data=lib1.d1; title '==d1 after unique append=='; run; proc print data=&spdsusds; title '==Uniques append rejects=='; run;
The result it produced is here for your reference:
==d1 after unique append== Obs Name Sex Age Height Weight 1 Alfred M 14 69.0 112.5 2 Alice F 13 56.5 84.0 3 Barbara F 13 65.3 98.0 4 Carol F 14 62.8 102.5 5 Henry M 14 63.5 102.5 6 James M 12 57.3 83.0 7 Jane F 12 59.8 84.5 ==Uniques append rejects== Obs Name Sex Age Height Weight XXX00000 1 Carol F 14 62.8 102.5 Name 2 Henry M 14 63.5 102.5 Name
Hope this helps...!!!