Hi,
I have a use case where a table contains one or more rows for the same email address. There is also a column "date".
From all rows with the same email address, only the one with the newest date should be kept in the result table, regardless of the values in the other columns (the table has about 20 columns altogether).
Is there a way to handle it in DI Studio without using custom written code?
Thanks!
Best regards,
I found this article that deals with a quite similar problem by using the Query Builder in Enterprise Guide. Maybe you can apply the method to DI Studio as well.
The two coding options below should both be quite simple to implement in DIS using standard transformations.
data have;
  set sashelp.class;
  email_addr='abc.efg@blah.com';
  dt=today()+_n_;
  format dt date9.;
run;

/* option 1 */
proc sort data=have out=want1;
  by email_addr DESCENDING dt;
run;
proc sort data=want1 nodupkey;
  by email_addr;
run;

/* option 2 */
proc sql;
  create table want2 as
  select *
  from have
  group by email_addr
  having max(dt)=dt
  ;
quit;
Personally, I see this as just a bit of "SQL maths", easily done with Extract nodes.
First Extract node:
select email, max(date) as maxdate from table group by email, written into work.interim
Second Extract node:
join that back onto the original table, joining on email = email and date = maxdate
All done in Extract nodes.
Remember, DI Studio isn't supposed to be a programming tool.
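For readers who want to see the logic of those two steps spelled out, here is a minimal sketch in PROC SQL. It assumes the sample table have (columns email_addr and dt) created earlier in this thread; want3 is a made-up output name. It only illustrates the logic and is not necessarily the code that DI Studio would generate.

```sas
proc sql;
  /* Step 1: newest date per email address */
  create table work.interim as
  select email_addr,
         max(dt) as maxdate format=date9.
  from have
  group by email_addr;

  /* Step 2: join back to pick up the remaining columns of the newest row */
  create table want3 as
  select h.*
  from have as h
       inner join work.interim as i
       on  h.email_addr = i.email_addr
       and h.dt         = i.maxdate;
quit;
```

Note that if two rows for the same email address share the newest date, this approach returns both of them.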
Hi,
Makes sense. Thanks!
In the meantime, I checked the RANK transformation and it does the trick: it gives me a ranking within the parameters I need. Then I just keep the rows with rank=1, for example.
I completely agree about DI Studio not being a programming tool. It is possible to run everything in user-written code, but we are avoiding that by all means.
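As a rough illustration of the ranking logic, the same idea can be sketched with PROC RANK. This assumes the sample table have from earlier in the thread; the names have_sorted, ranked, dt_rank and want4 are made up for the example, and the options chosen in the actual RANK transformation may differ.

```sas
/* PROC RANK with a BY statement needs the input sorted by the BY variable */
proc sort data=have out=have_sorted;
  by email_addr;
run;

/* DESCENDING gives the newest date rank 1 within each email address;
   TIES=LOW gives tied dates the lowest rank of the tie group */
proc rank data=have_sorted out=ranked descending ties=low;
  by email_addr;
  var dt;
  ranks dt_rank;
run;

/* Keep only the newest row(s) per email address */
data want4;
  set ranked;
  where dt_rank = 1;
run;
```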
Best regards,
Just to clarify: the two coding options I posted weren't meant to be implemented as user-written code, but as logic using standard DIS transformations.
Option 1 can be implemented using two SORT transformations.
Option 2 can be implemented using a SQL Join transformation.
Option 1 should also make it easy to collect the rejected records in a second table.
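In plain code terms, collecting the rejected records with Option 1 could look like the sketch below, which uses the DUPOUT= option of PROC SORT. It assumes the sample table have from earlier in the thread; sorted, latest and rejected are made-up table names, and whether the SORT transformation exposes this option directly depends on how it is configured.

```sas
/* First sort: newest date first within each email address */
proc sort data=have out=sorted;
  by email_addr descending dt;
run;

/* Second sort: NODUPKEY keeps the first row per email address (the newest),
   DUPOUT= writes the discarded rows to a second table */
proc sort data=sorted out=latest nodupkey dupout=rejected;
  by email_addr;
run;
```

This relies on PROC SORT preserving the relative order of rows with equal keys, which is its default behaviour (the EQUALS option).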