BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.

I should transform a proc sql join into a hash, but I'm having trouble because I don't know how to define the "DefineKey" since there are various operations, not a single key to define.
this is the example:

 

data RUOLO_DANNI;
input cod_source:$2. cod_anagrafico:$9. ts_inizio_validita:datetime20.  ts_fine_validita:datetime20.;
datalines;
1 A 02JAN2015:00:00:00 01JAN5999:00:00:00
5 B 17MAY2017:00:00:00 01JAN5999:00:00:00
5 C 09JAN2022:00:00:00 01JAN5999:00:00:00
;



data T_A_SOGGETTI_DN;
input cod_source:$2. cod_anagrafico:$9. cod_anagrafico_edwh:$20. ts_inizio_validita:datetime20.  ts_fine_validita:datetime20.;
datalines;
1 A AAA 02JAN2015:00:00:00 01JAN5999:00:00:00
;


   proc sql noprint;
   create table ruolo_danni_NEW
   as select RU.*
			,SO.cod_anagrafico_EDWH
  from      RUOLO_DANNI RU

   left join   T_A_SOGGETTI_DN SO
   on        
             RU.cod_anagrafico = SO.cod_anagrafico
             and ru.ts_inizio_validita between so.ts_inizio_validita and so.ts_fine_validita
             and compress(SO.cod_source)='1';
quit;

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Then it should be

data ruolo_danni_NEW;
set ruolo_danni;
if _n_ = 1
then do;
  length ts_inizio_validita ts_fine_validita 8;
  declare hash so (dataset:"ta_soggetti_dn (where=(compress(cod_source)='1'))");
  so.definekey("cod_anagrafico");
  so.definedata("ts_inizio_validita","ts_fine_validita");
  so.definedone();
  call missing(ts_inizio_validita,ts_fine_validita);
end;
rc = so.find();
if rc = 0 and ts_inizio_validita le  ts_inizio_validita le ts_fine_validita;
drop rc ts_inizio_validita ts_fine_validita;
run;
Spoiler
Spoiler
 

View solution in original post

9 REPLIES 9
mariopellegrini
Quartz | Level 8

no because there is also cod_source

Kurt_Bremser
Super User

But you don't use that in your ON condition in the SQL? This would mean that you may get multiple observations in your SQL for a single obs in the "left" dataset.

mariopellegrini
Quartz | Level 8

no because there is the final expression and compress(SO.cod_source)='1', so you only take the cod_source = '1'

Kurt_Bremser
Super User

Then it should be

data ruolo_danni_NEW;
set ruolo_danni;
if _n_ = 1
then do;
  length ts_inizio_validita ts_fine_validita 8;
  declare hash so (dataset:"ta_soggetti_dn (where=(compress(cod_source)='1'))");
  so.definekey("cod_anagrafico");
  so.definedata("ts_inizio_validita","ts_fine_validita");
  so.definedone();
  call missing(ts_inizio_validita,ts_fine_validita);
end;
rc = so.find();
if rc = 0 and ts_inizio_validita le  ts_inizio_validita le ts_fine_validita;
drop rc ts_inizio_validita ts_fine_validita;
run;
Spoiler
Spoiler
 
mariopellegrini
Quartz | Level 8

thanks Kurt_Bremser. for completeness I report the final code to obtain the same result of the proc sql;

 

data ruolo_danni_HASH;
set ruolo_danni;
if _n_ = 1
then do;
  length ts_inizio_validita ts_fine_validita 8;
  length cod_anagrafico_edwh $20.;
  declare hash so (dataset:"t_a_soggetti_dn (where=(compress(cod_source)='1'))");
  so.definekey("cod_anagrafico");
  so.definedata("cod_anagrafico_edwh","ts_inizio_validita","ts_fine_validita");
  so.definedone();
  call missing(cod_anagrafico_edwh, ts_inizio_validita,ts_fine_validita);
end;
rc = so.find();
if rc >= 0 and ts_inizio_validita le  ts_inizio_validita le ts_fine_validita;
drop rc ;
run;
mariopellegrini
Quartz | Level 8

Kurt  I found some data inconsistencies between sql and hash, because the "ON" condition is different: ts_start_validation is taken from the left dataset, while ts_start_validation  ts_end_validation of the "beetween" are taken from the right dataset. This example explains the problem with a particular case ((the results of ruolo_danni_sql and ruolo_danni_hash are different):

data RUOLO_DANNI;
input cod_source:$2. cod_anagrafico:$9. ts_inizio_validita:datetime20.;
datalines;
1 21278 02JUL11:03:25:44
1 21278 02JUL11:03:25:45
1 21278 02JUL11:03:25:46
;
data T_A_SOGGETTI_DN;
input cod_source:$2. cod_anagrafico:$9. cod_anagrafico_edwh:$20. ts_inizio_validita:datetime20.  ts_fine_validita:datetime20.;
datalines;
1 21278 F_SCDDNL64D45Z133D 01JUL2011:11:15:02 01JUL2011:11:15:01
;
proc sql noprint;
create table ruolo_danni_sql as 
select RU.cod_anagrafico,
       RU.ts_inizio_validita,
	   RU.cod_source,
	   SO.cod_anagrafico_edwh
from RUOLO_DANNI RU
left join   T_A_SOGGETTI_DN SO
on      RU.cod_anagrafico = SO.cod_anagrafico
    and ru.ts_inizio_validita between so.ts_inizio_validita and so.ts_fine_validita
    and compress(SO.cod_source)='1';
quit;
data ruolo_danni_HASH;
set ruolo_danni;
if _n_ = 1
then do;
  length ts_inizio_validita ts_fine_validita 8;
  length cod_anagrafico_edwh $20.;
  declare hash so (dataset:"t_a_soggetti_dn (where=(compress(cod_source)='1'))");
  so.definekey("cod_anagrafico");
  so.definedata("cod_anagrafico_edwh","ts_inizio_validita","ts_fine_validita");
  so.definedone();
  call missing(cod_anagrafico_edwh, ts_inizio_validita,ts_fine_validita);
end;
rc = so.find();
if rc >= 0 and ts_inizio_validita le ts_inizio_validita le ts_fine_validita;
drop rc ;
run;

 

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 286 views
  • 2 likes
  • 2 in conversation