Rahul_SAS
Quartz | Level 8

Hi Experts,

 

I am trying to write a daily incremental load for Impala tables in SAS using PROC APPEND. But if I run PROC APPEND twice or more, the data gets appended multiple times.

How can I avoid this duplication of data? Or is there any way to overwrite or update the data in an Impala table using SAS?

Please help.

 

-Rahul

4 REPLIES
ChrisBrooks
Ammonite | Level 13

Assuming you have SAS/ACCESS Interface to Impala, there are several ways: a DATA step MERGE to update existing records and add new ones at the same time, a PROC SQL UNION to append, a DATA step UPDATE statement, and so on.
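
As an illustration of the DATA step UPDATE option mentioned above, here is a minimal sketch on plain SAS datasets. The dataset names (master, daily) and the variables (id, amount) are hypothetical, and it assumes the key is unique in the master table. Whether the same pattern performs acceptably against an Impala libref is something to test in your own environment.

```sas
/* Hypothetical master table, keyed by id */
data master;
  input id amount;
  datalines;
1 100
2 200
;
run;

/* Hypothetical daily increment: one changed record, one new record */
data daily;
  input id amount;
  datalines;
2 250
3 300
;
run;

/* UPDATE requires both inputs to be sorted by the BY variable */
proc sort data=master; by id; run;
proc sort data=daily;  by id; run;

/* Apply the increment: matching ids are updated, new ids are added */
data master;
  update master daily;
  by id;
run;
```

Because rows are matched on id, running the final step a second time with the same daily data leaves master unchanged, which is the behavior PROC APPEND lacks.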

Kurt_Bremser
Super User

I use two different ways to avoid duplicate data:

- Set a variable that identifies a group of new records. This can be an infile name, a date, or similar. While concatenating (I do not use PROC APPEND), observations in the master dataset that carry the same identifying value as the records to be appended are excluded.

- Identify a unique key (this may be one or more variables). After appending/concatenating, run PROC SORT with NODUPKEY.
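
The second approach could be sketched roughly as follows. The dataset names (work.master, work.new_records) and the key variable record_id are made up for illustration. The second PROC APPEND simulates an accidental rerun.

```sas
/* Hypothetical master dataset with a unique key record_id */
data work.master;
  input record_id value $;
  datalines;
1 a
2 b
;
run;

/* Hypothetical batch of new records */
data work.new_records;
  input record_id value $;
  datalines;
3 c
;
run;

/* Append the batch, then append it again to simulate a rerun */
proc append base=work.master data=work.new_records;
run;
proc append base=work.master data=work.new_records;
run;

/* Keep only one observation per key value */
proc sort data=work.master nodupkey;
  by record_id;
run;
```

Note that NODUPKEY keeps one observation per key and discards the rest, so this removes repeated loads but does not pick up a changed version of an existing record.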

Rahul_SAS
Quartz | Level 8

Hi Kurt_Bremser, could you please share sample code for the first scenario?

Kurt_Bremser
Super User

A piece of blueprint code might look like this:

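/* Input file for today's load, target library and master dataset */
%let infile1=/shared/data/data_20170913.dat;
%let outlib=out;
%let masterfile=my_dataset;

/* Read today's file and tag every record with the file name */
data infile;
infile "&infile1";
input
  indata $
;
todays_file = "&infile1";
run;

/* Build a new master: old records not from today's file, plus today's records */
data &outlib..&masterfile._new;
set
  &outlib..&masterfile (where=(todays_file ne "&infile1"))
  infile
;
run;

/* Replace the old master with the new one */
proc datasets library=&outlib nolist;
delete &masterfile;
change &masterfile._new=&masterfile;
run;
quit;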

Note that I do the "append" in a separate step. That way I can wrap the final proc datasets into a macro that checks for &syscc=0, to prevent replacing the master dataset if anything went wrong.
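
A wrapper along those lines might look like the sketch below. The macro name replace_master and its parameters are hypothetical. It assumes the _new dataset has already been built by the preceding steps, and it uses the library and dataset names from the blueprint.

```sas
%macro replace_master(lib=, master=);
  /* Swap in the new master only if no step has raised the condition code */
  %if &syscc = 0 %then %do;
    proc datasets library=&lib nolist;
      delete &master;
      change &master._new=&master;
    run;
    quit;
  %end;
  %else %do;
    /* Leave the existing master untouched and report the condition code */
    %put ERROR: syscc=&syscc, &lib..&master was not replaced.;
  %end;
%mend replace_master;

%replace_master(lib=out, master=my_dataset)
```

If an earlier step fails, the old master and the _new dataset both remain in the library, so the load can be inspected and rerun.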

Also note that this is a SAS-only solution; you may have to check your options with the administrators of the Impala DBMS.
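
For the Impala side, one option to discuss with those administrators is explicit SQL pass-through that overwrites one partition per load, so a rerun replaces that day's data instead of duplicating it. The following is only a sketch under several assumptions: the host, database, credentials macro variables, and table names (master_table, staging_table) are placeholders, the master table is assumed to be partitioned by a load_date string column, and today's data is assumed to be already loaded into the staging table. Check the connection options against your own SAS/ACCESS to Impala setup.

```sas
/* Placeholder credentials; replace with your site's values */
%let impala_user=my_user;
%let impala_pw=my_password;

proc sql;
  /* Connection values are assumptions, not taken from the thread */
  connect to impala (server="impala-host" port=21050
                     user="&impala_user" password="&impala_pw"
                     database=mydb);

  /* Overwrite only the partition for this load date; rerunning the
     statement replaces the partition rather than appending to it */
  execute (
    insert overwrite table master_table partition (load_date='2017-09-13')
    select indata from staging_table
  ) by impala;

  disconnect from impala;
quit;
```

The statement inside EXECUTE is sent to Impala as written, so it must be valid Impala SQL rather than SAS SQL.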
