Help using Base SAS procedures

Help selecting a particular date from a data set

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 89
Accepted Solution

Help selecting a particular date from a data set

Hello,

I'd appreciate some help regarding user-written base code. Consider the following example input table, which is filtered on ORDER_NO=4687:

ORDER_NOORDER_SEQ_NOORDER_STATUSCOMMENTDATE
46871OPP15SEP2011
46872FIN27SEP2011
46873AB05OCT2011
46874FIN06OCT2011
46875FINHello world!27JAN2014

Each time a change is made to an order, a new row is triggered (the one with the highest ORDER_SEQ_NO being the most recent). This ideally only happens when the order status changes. However, if a consultant manually edits the order (here by adding a comment), another row is triggered despite the status not having changed.

So far, my job has always selected the date from the most recent row (27JAN2014), i.e. from the row with the highest ORDER_SEQ_NO. However, I need it to select the first date for the most recently triggered order status. So, in the example above, the logic should select the date 06OCT2011. Not 27SEP2011, as there's been a different status since back then.

I can't simply group by the most recent ORDER_STATUS and select min(DATE), as that would select the date from row 2. I need row 4 from this example, and of course I can't just select ORDER_SEQ_NO=4 directly since the logic must apply to numerous other orders as well.


Accepted Solutions
Solution
‎10-27-2014 10:45 AM
Regular Contributor
Posts: 180

Re: Help selecting a particular date from a data set

Posted in reply to TurnTheBacon

Consider this code asuming the dataset is sorted by ORDER_NO and ORDER_SEQ_NO:

data have;
input ORDER_NO 4. ORDER_SEQ_NO 2. ORDER_STATUS $3. COMMENT $13. DATE date9.;
format date date9.;
cards;
4687 1OPP        15-Sep-11
4687 2FIN        27-Sep-11
4687 3AB        5-Oct-11
4687 4FIN        6-Oct-11
4687 5FINHello world! 27-Jan-14
;
run;

data want;
  set have;
  retain STATUS ORDER_DATE;
  format order_date date9.;
  by order_no;
  if first.order_no
     then do;
             status=order_status;
             order_date=date;
          end;
     else if order_status ne status
             then do;
                     status=order_status;
                     order_date=date;
                  end;
  if last.order_no then output;
  keep order_no status order_date;
run;

View solution in original post


All Replies
Super Contributor
Posts: 490

Re: Help selecting a particular date from a data set

Posted in reply to TurnTheBacon

So what you want is the most recent.

But if there exist earlier record with the

  1. Same details
  2. Without comment
  3. and there is no any in between record with different variable values between it and the most recent

then this record should be selected.

Is that what you want?

Frequent Contributor
Posts: 89

Re: Help selecting a particular date from a data set

Posted in reply to mohamed_zaki

It should select the date from the most recent row, unless an earlier row has the same status sequentially, in which case it should select the date from that row.

You might think of it like this: 'The most recent status change was from AB to FIN. That happened on 06OCT2011, so I select that date."

Solution
‎10-27-2014 10:45 AM
Regular Contributor
Posts: 180

Re: Help selecting a particular date from a data set

Posted in reply to TurnTheBacon

Consider this code asuming the dataset is sorted by ORDER_NO and ORDER_SEQ_NO:

data have;
input ORDER_NO 4. ORDER_SEQ_NO 2. ORDER_STATUS $3. COMMENT $13. DATE date9.;
format date date9.;
cards;
4687 1OPP        15-Sep-11
4687 2FIN        27-Sep-11
4687 3AB        5-Oct-11
4687 4FIN        6-Oct-11
4687 5FINHello world! 27-Jan-14
;
run;

data want;
  set have;
  retain STATUS ORDER_DATE;
  format order_date date9.;
  by order_no;
  if first.order_no
     then do;
             status=order_status;
             order_date=date;
          end;
     else if order_status ne status
             then do;
                     status=order_status;
                     order_date=date;
                  end;
  if last.order_no then output;
  keep order_no status order_date;
run;

Super User
Posts: 10,044

Re: Help selecting a particular date from a data set

CTorres,

Your code is not right . Check the following data :

data have;

input ORDER_NO 4. ORDER_SEQ_NO 2. ORDER_STATUS $3. COMMENT $13. DATE date9.;

format date date9.;

cards;

4687 1OPP        15-Sep-11

4687 2FIN        27-Sep-11

4687 3AB        5-Oct-11

4687 4AB        6-Oct-11

4687 5FINHello world! 27-Jan-14

;

run;

data have; 
input ORDER_NO  ORDER_SEQ_NO  ORDER_STATUS $ COMMENT & $18. DATE : date9.;
format date date9.;
cards;
4687 1 OPP    .    15-Sep-11
4687 2 FIN    .    27-Sep-11
4687 3 AB     .   5-Oct-11
4687 4 FIN    .    6-Oct-11
4687 5 FIN Hello world!  27-Jan-14
;
run;


proc sql;
create table want as
select a.* from have as a,(select * from have group by ORDER_NO having ORDER_SEQ_NO=max(ORDER_SEQ_NO)) as b
 where a.ORDER_NO=b.ORDER_NO and a.ORDER_STATUS=b.ORDER_STATUS and a.ORDER_SEQ_NO ne b.ORDER_SEQ_NO
  group by a.ORDER_NO
   having a.ORDER_SEQ_NO=max(a.ORDER_SEQ_NO);
quit;

Xia Keshan

Message was edited by: xia keshan Sorry. Missing  a GROUP clause.

Regular Contributor
Posts: 180

Re: Help selecting a particular date from a data set

Xia,

I think the code works fine, even with the change you suggest.

If I understand correctly, the purpose of the program is to get the most recent update record that includes a change of status, independently of any other change.

CTorres

Super User
Posts: 10,044

Re: Help selecting a particular date from a data set

CTorres,

Yeah. If I understand OP correctly . He also want the SAME ORDER_STATUS . But apparently you didn't consider it .

Xia Keshan

Trusted Advisor
Posts: 1,228

Re: Help selecting a particular date from a data set

Posted in reply to TurnTheBacon

data want;
set have;
by ORDER_NO ORDER_SEQ_NO;
if ORDER_STATUS ne lag(ORDER_STATUS) then flag+1;
run;

proc sql;

select order_no,date from want

group by order_no, flag

having ORDER_SEQ_NO=min(ORDER_SEQ_NO)

and count(ORDER_STATUS)>1;

quit;

Regular Contributor
Posts: 180

Re: Help selecting a particular date from a data set

I think your code does not work for the general case: more than one Order with different number of order changes. For example with the following "have" dataset:

data have;

input ORDER_NO 4. ORDER_SEQ_NO 2. ORDER_STATUS $3. COMMENT $13. DATE date9.;

format date date9.;

cards;

4687 1OPP       15-Sep-11

4687 2FIN       27-Sep-11

4687 3AB       5-Oct-11

4687 4FIN       6-Oct-11

4687 5FINHello world! 27-Jan-14

4999 1FINComplete     20-Aug-14

5680 1OPP             18-Dec-13

5680 2AB              19-Dec-13

5700 1OPP             10-Dec-13

5700 2OPPXXX          20-Jan-14

;

run;

Trusted Advisor
Posts: 1,228

Re: Help selecting a particular date from a data set

Hi,

I think it should work. Please try to validate.

Regards,

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 406 views
  • 3 likes
  • 5 in conversation