Hello SAS Community,
I have an ETL process that loads data into a versioned table (ESPORTA.ANALISI_CREDITI).
Currently, every time I run the process, a new version is created and appended, even if the data for the same period is identical to the previous version.
What I need:
I want to increment the version and append data only if the new data is different from the previous version for the same period (DATA_REF).
If the data is identical, I want to avoid creating a new version and not append anything.
Question:
What is the best way in SAS to compare the new data with the previous version and only increment the version if there are actual changes?
Any code example or best practice is welcome!
Thank you!
sas
%macro fix_versioning;
%global next_version;
%let next_version = 1;
%put NOTE: === AVVIO MACRO FIX_VERSIONING ===;
%put NOTE: DATA_REF = &DATA_PERIO;
%put NOTE: DT_VERSION = &dt_version;
/* STEP 1: Determina il numero di versione */
%if %sysfunc(exist(ESPORTA.ANALISI_CREDITI)) %then %do;
proc sql noprint;
select max(VERSION) into :last_ver
from ESPORTA.ANALISI_CREDITI
where DATA_REF = &DATA_PERIO_NUM;
quit;
%if %sysevalf(%superq(last_ver) = , boolean) or %sysevalf(&last_ver = ., boolean) %then %do;
%let next_version = 1;
%put NOTE: Prima versione per DATA_REF=&DATA_PERIO;
%end;
%else %do;
%let next_version = %eval(&last_ver + 1);
%put NOTE: Incremento a versione &next_version;
%end;
/* STEP 2: Crea il dataset temporaneo con la versione corretta */
data WORK.ANALISI_CREDITI;
set WORK.ANALISI_CREDITI;
VERSION = &next_version;
DT_VERSION = &dt_version;
DT_VERSION= input(put(DT_VERSION, 8.), yymmdd8.);
format DT_VERSION ddmmyy10.;
VERSION_ID = cats(put(DT_VERSION, yymmddn8.), '_V', put(VERSION, z2.));
run;
/* STEP 3: Backup tabella esistente */
%let timestamp = %sysfunc(datetime(), datetime20.);
%let timestamp = %sysfunc(translate(×tamp, '_', ' '));
%let timestamp = %sysfunc(translate(×tamp, '_', ':'));
data ESPORTA.ANALISI_CREDITI_BACKUP;
set ESPORTA.ANALISI_CREDITI;
backup_date = "×tamp";
run;
/* STEP 4: Append dei nuovi dati */
proc append base=ESPORTA.ANALISI_CREDITI data=WORK.ANALISI_CREDITI force;
run;
%put NOTE: Nuovi dati con versione &next_version aggiunti ad ANALISI_CREDITI esistente;
%end;
%else %do;
/* Se il dataset non esiste, crealo direttamente */
data ESPORTA.ANALISI_CREDITI;
set WORK.ANALISI_CREDITI;
run;
%put NOTE: Creato nuovo dataset ESPORTA.ANALISI_CREDITI (prima versione);
%end;
/* STEP 5: Aggiorna version_history */
%if %sysfunc(exist(ESPORTA.version_history)) %then %do;
proc sql;
insert into ESPORTA.version_history (DT_VERSION, VERSION)
values(&dt_version, &next_version);
quit;
%end;
%else %do;
data ESPORTA.version_history;
length DT_VERSION 8 VERSION 8;
DT_VERSION = &dt_version;
VERSION = &next_version;
output;
run;
%end;
/* STEP 6: Verifica finale */
proc sql;
title "Verifica record dopo aggiornamento";
select DATA_REF, VERSION, count(*) as num_records
from ESPORTA.ANALISI_CREDITI
group by DATA_REF, VERSION
order by DATA_REF, VERSION;
quit;
%put NOTE: === FINE MACRO FIX_VERSIONING - VERSIONE FINALE = &next_version ===;
%mend;
The simplest approach would probably be:
1) create data set with "old" data,
2) create data set with "new" data
3) use proc compare
4) get value of SYSINFO (SAS Help Center: Results: PROC COMPARE)
5) looking at returned value decide (%if-%then) shoud the data be updated
6) if "yes" then updata
7) if "no" then do not update
Bart
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.