SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Picking rows based on a newst date

Reply
Contributor
Posts: 28

Picking rows based on a newst date

Hi,

 

I have a use-case where table contains 1 or more rows for the same email address column. There is a columd "date".

From all rows with same email address, only the one with newest date should be taken in a result table, regardless of values of other columns (20-ish columns are in the table all together).

 

Is there a way to handle it in DI Studio without using custom written code?

 

Thanks!

 

Best regards,

 

--
Mario
Super User
Posts: 10,571

Re: Picking rows based on a newst date

I found this article that deals with a quite similar problem by using the Query Builder in Enterprise Guide. Maybe you can apply the method to DI Studio as well.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Respected Advisor
Posts: 4,797

Re: Picking rows based on a newst date

@strsljen

Below two coding options should both be quite simple to implement in DIS using standard transformations.

data have;
  set sashelp.class;
  email_addr='abc.efg@blah.com';
  dt=today()+_n_;
  format dt date9.;
run;

/* option 1 */
proc sort data=have out=want1;
  by email_addr DESCENDING dt;
run;

proc sort data=want1 nodupkey;
  by email_addr;
run;

/* option 2 */
proc sql;
  create table want2 as 
    select *
    from have
    group by email_addr
    having max(dt)=dt
  ;
quit;
SAS Employee
Posts: 17

Re: Picking rows based on a newst date

Personally, this is just a bit of "SQL maths", easily done with Extract nodes.

 

extract node to

  select email, max(date) as maxdate from table group by email into work.interim

 

extract node

  join that back onto the original table, joining on email = email and date = maxdate

 

All done in extract nodes.

 

Remember, DI Studio isn't a supposed to be programming tool.

Contributor
Posts: 28

Re: Picking rows based on a newst date

Posted in reply to AngusLooney

Hi,

 

 

Makes sense. Thanks!

 

In the meantime, I checked RANK transformation and that does the trick - gives me ranking within parameters I need. Then I just take out rows with rank=1 for example.

I completelly agree about DI Studio not being programming tool. It is possible to run all in user-written code but we are avoiding it by all means.

 

 

Best regards,

--
Mario
Respected Advisor
Posts: 4,797

Re: Picking rows based on a newst date

[ Edited ]

@strsljen

Just to clarify: The two coding option I've posted weren't meant to be implemented as user written code but as logic using standard DIS transformations.

 

Option 1 can get implemented using two SORT transformations. 

Option 2 can get implemented using a SQL Join transformation.

 

Option 1 would should also allow to easily collect the rejected records in a second table.

Ask a Question
Discussion stats
  • 5 replies
  • 190 views
  • 0 likes
  • 4 in conversation