@smilbuta
To generate code in DIS as below:
proc sql;
create table work.Target as
select
source_for_subselect.Id length = 8,
source_for_subselect.Group length = 8,
source_for_subselect.effective_date length = 8
format = date9.,
max_group_effective_date length = 8
format = date9.
from
tests.source_for_subselect as source_for_subselect,
(
select
source_for_subselect_i.Group length = 8,
source_for_subselect_i.effective_date as max_group_effective_date length = 8
format = date9.
from
tests.source_for_subselect as source_for_subselect_i
group by
source_for_subselect_i.Group
)
;
quit;
Your flow needs to look like:
I personally find sub-selects in DIS too labor intensive to implement and even more importantly I don't like how deep the logic gets hidden in the transformation and that I can't see "everything" visually at once.
For this reason I normally don't implement using sub-selects but I'm using two SQL joins and my job flow would look like:
The code generated by the Join node looks then like:
proc sql;
create table work.Target as
select
source_for_subselect.Id length = 8,
source_for_subselect.Group length = 8,
source_for_subselect.effective_date length = 8
format = date9.,
max_eff_dt.max_group_effective_date length = 8
format = date9.
from
tests.source_for_subselect as source_for_subselect,
work.max_eff_dt as max_eff_dt
where
source_for_subselect.Group = max_eff_dt.Group
;
quit;
I've created source work.max_eff_dt as a view with a definition of:
proc sql;
create view work.max_eff_dt as
select
Group,
(max(effective_date)) as max_group_effective_date length = 8
format = date9.
from &SYSLAST
order by
Group
;
quit;
Given that a view can be seen as encapsulated SQL code which only gets executed when used.... if you compare what the flow using a sub-select generates as code with what my 2nd approach creates then you'll see that this is basically the same code when it comes to execution time.
I personally find the 2nd option easier to implement and much easier to "read" and maintain.
... View more