BookmarkSubscribeRSS Feed
fredbell
Fluorite | Level 6
Hello

I need to append records from one table to another but have the append fuction not append if record already exists in Base dataset.

Example of Base Dataset, Name of Base dataset "Employees"

Name Age Sex Province
--------------------------------------------------------------
FRED 48 M ONT.
SAM 55 M ONT.
KIM 19 F ONT.
KAREN 33 F ONT.


Example of dataset that i want to append to Base, Name "NewEmployees"

Name Age Sex Province
-------------------------------------------------------------
FRED 48 M ONT.
Tony 51 M ONT.

I don't want it to append record below because it already exists in Base.

Name Age Sex Province
-------------------------------------------------------------
FRED 48 M ONT.

I do want it to append this record because it does not appear in Base.

Name Age Sex Province
-------------------------------------------------------------
Tony 51 M ONT.


proc append Base = Employees Data = NewEmployees FORCE ;
run;


Thanks and i hope this is clear and possible.

Fred
14 REPLIES 14
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Consider using a DATA step approach (on sorted input files using your defined BY statement variable criteria for matching "like" conditions), and use the SET statement with a BY statement, and combine that with FIRST. and LAST. to detect conditions where you do or do not want to OUTPUT an observation. Suggest you add for diagnostics while programming the DATA step, this statement which will help in debugging the DATA step processing (where you might have one or more of these statements in a DATA step and the "nn" will help reveal in your SAS log just which one is being executed at a given time):

PUT '>DIAG-nn>' / _ALL_;


Scott Barry
SBBWorks, Inc.


Suggested Google advanced search arguments, this topic / post:

data step by group procesing site:sas.com

data step append site:sas.com
fredbell
Fluorite | Level 6
I am very new to SAS but i will try to work through what you have posted,
any chance you could post example?

Thanks sbb

Fred
Cynthia_sas
SAS Super FREQ
Hi:
In addition to Scott's suggestion, you may also want to move away from the simple PROC APPEND approach and move toward using an INDEX with the UNIQUE option (assuming you can find a combination of variables that define "uniqueness -- other than an ID):

http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000440261.htm
http://support.sas.com/kb/24/603.html
http://www2.sas.com/proceedings/sugi30/008-30.pdf

cynthia
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
The SAS-hosted DOC is your friend...especially now, with much information and technical reference materials available at the SAS Support website.

In addition to the Google searches already provided, here's where I would encourage your attention, to get started:

Methods of Combining SAS Data Sets (and related sections that follow):
http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/a001303248.htm


Scott Barry
SBBWorks, Inc.
SASPhile
Quartz | Level 8
if the purpose is to eliminate the dup records then you can use proc sort the following way:

data base1;
input name$1-5 age 6-7 sex $9-10 province $11-15;
datalines;
FRED 48 M ONT.
SAM 55 M ONT.
KIM 19 F ONT.
KAREN 33 F ONT.
;
run;


data base2;
input name$1-5 age 6-7 sex $9-10 province $11-14;
datalines;
FRED 48 M ONT.
TONY 51 M ONT.
;
run;



proc append base=base1 data=base2 force;
run;


proc sort data=base1 nodup;
by _all_;
run;
data_null__
Jade | Level 19
MODIFY with KEYed access.

Create a key on BASE1. Then code a MODIFY step to trap the transaction records that are not in BASE1 and OUTPUT. As written this cannot be used to update obs in BASE1 with changes to age sex etc., you have to turn the method over for that.

[pre]
data base1(index=(name));
input name$ age sex $ province $;
datalines;
FRED 48 M ONT.
SAM 55 M ONT.
KIM 19 F ONT.
KAREN 33 F ONT.
;
run;


data base2;
input name $ age sex $ province $;
datalines;
FRED 48 M ONT.
TONY 51 M ONT.
MARY 42 F ONT.
;
run;


data base1;
set base2;
modify base1 key=name/unique;
if _iorc_ ne 0 then do;
output;
_error_=0;
end;
run;
proc print data=base1;
run;
[/pre]
Ksharp
Super User
@data _null_; gives a method, and there is another way (proc sql) to get it.


[pre]
data base1;
input name$1-5 age 6-7 sex $9-10 province $11-15;
datalines;
FRED 48 M ONT.
SAM 55 M ONT.
KIM 19 F ONT.
KAREN 33 F ONT.
;
run;


data base2;
input name$1-5 age 6-7 sex $9-10 province $11-14;
datalines;
FRED 48 M ONT.
TONY 51 M ONT.
;
run;

proc sql feedback;
create table append as
select * from base1
union corresponding
select * from base2
;
quit;
proc print;
run;

[/pre]




Ksharp
atul_desh
Quartz | Level 8

Hey KSharp,

 

when I'm running this query its checking for all variables for duplicity.

 

I need to check based on only 3 Variables.

 

Is there any other way around. 

 

Thanks

Sunny
Calcite | Level 5
here is another approach to get the final output. its quite simple approach.

data base1(index=(name));
input name$ age sex $ province $;
datalines;
FRED 48 M ONT.
SAM 55 M ONT.
KIM 19 F ONT.
KAREN 33 F ONT.
;
run;


data base2;
input name $ age sex $ province $;
datalines;
FRED 48 M ONT.
TONY 51 M ONT.
MARY 42 F ONT.
;
run;

proc sort data = base1;
by _all_;
run;

proc sort data = base2;
by _all_;
run;

data final;
merge base1(in = ina) base2(in = inb);
by _all_;
if ina = 1 or inb = 1 then output;
run;
data_null__
Jade | Level 19
While the result may be the same you did not append records to an existing file.
Sunny
Calcite | Level 5
yes. I thought, we are more concerned about output as compare to appending the datasets.

So i wrote the script keeping output in my mind. But there may be other way to achieve it.
Ksharp
Super User
Hi.
Definitely using merge-by and sort can do it.
But there is a disadvantage that the order of obs is changed.
And Somebody do not like it.
The @data_null_;'s advantage is to keep the original order.


Ksharp
nbonda
Obsidian | Level 7
Thank you Everybody for giving suggestions. At last I decided to use modify statement. this update (overwrite) if observation exits the master dataset. I believe this is slower process than append.
shaffner
Calcite | Level 5

Can you post your code please?

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 12922 views
  • 2 likes
  • 10 in conversation