Hi Experts,
I am trying to build a daily incremental load for Impala tables in SAS with PROC APPEND. But if I run PROC APPEND two or more times, the same data gets appended multiple times.
How can I avoid this duplication of data? Alternatively, is there a way to overwrite or update the data in an Impala table using SAS?
Please help.
-Rahul
Assuming you have SAS/ACCESS to Impala there are lots of ways - you can use a data step merge to update existing records and add any new ones at the same time, Proc SQL Union to append, data step Update statement etc.
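As a rough illustration of the data step merge option, here is a minimal sketch. The names work.master, work.daily and the key variable id are hypothetical, and it assumes id is unique within each table. It uses plain SAS datasets; whether the result can be written back to Impala in the same way depends on your SAS/ACCESS setup.

```sas
/* Hypothetical names: work.master (existing data), work.daily (new data), key variable id. */
/* Assumes id is unique within each dataset.                                              */
proc sort data=work.master; by id; run;
proc sort data=work.daily;  by id; run;

data work.master_new;
  /* For matching ids the values from work.daily (listed last) replace those from master; */
  /* ids that exist only in work.daily are added as new observations.                     */
  merge work.master work.daily;
  by id;
run;
```

Running this twice with the same daily data gives the same result, because each key appears only once in the output.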
I use two different ways to avoid duplicate data:
- set a variable that identifies a group of new records. This can be an infile name, a date, or similar. While concatenating (I do not use proc append), observations in the master dataset that carry the same value as the records to be appended are excluded, so a rerun replaces them instead of duplicating them.
- identify a unique key (this may be one or more variables). After appending/concatenating, do a proc sort with nodupkey.
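The second approach could be sketched as follows. The dataset names and the key variables cust_id and load_date are hypothetical placeholders, not taken from the original post.

```sas
/* Hypothetical names: work.master, work.daily, key variables cust_id and load_date. */
data work.master_all;
  set work.master work.daily;   /* concatenate old and new records */
run;

/* Keep one observation per key. With EQUALS, observations that share a key keep their */
/* input order, so the record already in the master is the one that is retained.       */
proc sort data=work.master_all out=work.master_dedup nodupkey equals;
  by cust_id load_date;
run;
```

If the newly delivered record should win instead, list work.daily first in the SET statement.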
Hi KurtBremser, could you please share some sample code for the first scenario?
A piece of blueprint code might look like this:
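/* Today's input file, target library, and master dataset name */
%let infile1=/shared/data/data_20170913.dat;
%let outlib=out;
%let masterfile=my_dataset;

/* Read the new file and tag every record with the name of its source file */
data infile;
  infile "&infile1";
  input
    indata $
  ;
  todays_file = "&infile1";
run;

/* Build the new master: drop records loaded earlier from this file, then add the new ones */
data &outlib..&masterfile._new;
  set
    &outlib..&masterfile (where=(todays_file ne "&infile1"))
    infile
  ;
run;

/* Swap the new dataset in for the old master */
proc datasets library=&outlib nolist;
  delete &masterfile;
  change &masterfile._new=&masterfile;
quit;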
Note that I do the "append" in a separate step. That way I can wrap the final proc datasets into a macro that checks for &syscc=0, to prevent replacing the master dataset if anything went wrong.
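A minimal sketch of such a wrapper might look like this. The macro name replace_master and its parameters lib and ds are made up for illustration. Note that warnings also set &syscc to a non-zero value, so this check is fairly strict.

```sas
/* Hypothetical macro: swaps <ds>_new in for <ds> only if no earlier step raised a problem. */
%macro replace_master(lib=, ds=);
  %if &syscc = 0 %then %do;
    proc datasets library=&lib nolist;
      delete &ds;
      change &ds._new=&ds;
    quit;
  %end;
  %else %do;
    /* Leave the old master untouched and report the condition code in the log */
    %put ERROR: SYSCC=&syscc, master dataset &lib..&ds was not replaced.;
  %end;
%mend replace_master;

%replace_master(lib=&outlib, ds=&masterfile);
```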
Also note that this is a SAS-only solution; you may have to check your options with the administrators of the Impala DBMS.