Okay, thanks Patrick. Now by applying the code to numerous data, the number of final rows corresponds, but I encounter 2 problems:
1) the date filter doesn't seem to work, as shown by the put I inserted, from which it can be seen in the log that it doesn't print, therefore it doesn't enter the "if".
data RUOLO_DANNI;
input cod_source:$1. cod_anagrafico:$9. ts_inizio_validita:datetime20. ts_fine_validita:datetime20.;
format ts_inizio_validita ts_fine_validita datetime20.;
datalines;
1 1044576 28JUN2019:10:57:45 28JUN2019:10:57:51
1 1044576 28JUN2019:10:57:45 28JUN2019:10:57:51
1 1044576 28JUN2019:10:57:52 28JUN2019:10:58:36
;
data T_A_SOGGETTI_DN;
input cod_source:$2. cod_anagrafico:$9. cod_anagrafico_edwh:$20. ts_inizio_validita:datetime20. ts_fine_validita:datetime20.;
format ts_inizio_validita ts_fine_validita datetime20.;
datalines;
1 1044576 F_LLSSTR03A61D612H 12APR2019:03:32:12 29MAY2019:13:22:11
1 1044576 F_LLSSTR03A61D612H 29MAY2019:13:22:12 29MAY2019:13:22:45
;
/*SQL*/
proc sql noprint;
create table ruolo_danni_SQL as
select RU.*,
SO.cod_anagrafico_EDWH
from RUOLO_DANNI RU
left join T_A_SOGGETTI_DN SO
on
RU.cod_anagrafico = SO.cod_anagrafico
and ru.ts_inizio_validita between so.ts_inizio_validita and so.ts_fine_validita
and compress(SO.cod_source)='1';
quit;
/*HASH*/
data ruolo_danni_hash;
set ruolo_danni;
if _n_ = 1 then
do;
length ti tf 8;
length cod_anagrafico_edwh $20.;
format ti tf datetime20.;
declare hash so (multidata:'y',dataset:"t_a_soggetti_dn (where=(compress(cod_source)='1')
rename=(ts_inizio_validita=ti ts_fine_validita=tf))");
so.definekey("cod_anagrafico");
so.definedata("cod_anagrafico_edwh","ti","tf");
so.definedone();
end;
do while(so.do_over()=0);
if
(
ti<=ts_inizio_validita<= tf
or
tf<=ts_inizio_validita<= ti
)
and compress(cod_source)='1'
then do; put cod_anagrafico=; _true_flg='1';output; end;
end;
if _true_flg ne '1' then output;
drop _true_flg ;
run;
2) the processing time compared to proc sql has increased rather than decreased (2:08.60 versus 1:56.14), it would make the conversion of this step to a hash useless:
NOTE: PROCEDURE SQL ha utilizzato (tempo totale di elaborazione): real time 1:56.14 user cpu time 1:41.01 system cpu time 23.65 seconds memory 9414131.15k OS Memory 9432964.00k Timestamp 01/05/2023 08:14:10 m. Step Count 10 Switch Count 85
----------------------------------------------------------------------------
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column). NOTE: Variable ti is uninitialized. NOTE: Variable tf is uninitialized. NOTE: Variable cod_anagrafico_edwh is uninitialized. NOTE: There were 6927902 observations read from the data set EDWH_ODS.T_A_SOGGETTI_DN. WHERE COMPRESS(cod_source)='1'; NOTE: There were 66878514 observations read from the data set STG_VT.RUOLO_DANNI. NOTE: The data set WORK.RUOLO_DANNI_HASH2 has 66878637 observations and 11 variables. NOTE: DATA statement ha utilizzato (tempo totale di elaborazione): real time 2:08.60 user cpu time 1:07.26 system cpu time 13.15 seconds memory 1252530.56k OS Memory 1275076.00k Timestamp 01/05/2023 08:16:18 m. Step Count 11 Switch Count 208
... View more