BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
TurnTheBacon
Fluorite | Level 6

Hello,

I'd appreciate some help regarding user-written base code. Consider the following example input table, which is filtered on ORDER_NO=4687:

ORDER_NOORDER_SEQ_NOORDER_STATUSCOMMENTDATE
46871OPP15SEP2011
46872FIN27SEP2011
46873AB05OCT2011
46874FIN06OCT2011
46875FINHello world!27JAN2014

Each time a change is made to an order, a new row is triggered (the one with the highest ORDER_SEQ_NO being the most recent). This ideally only happens when the order status changes. However, if a consultant manually edits the order (here by adding a comment), another row is triggered despite the status not having changed.

So far, my job has always selected the date from the most recent row (27JAN2014), i.e. from the row with the highest ORDER_SEQ_NO. However, I need it to select the first date for the most recently triggered order status. So, in the example above, the logic should select the date 06OCT2011. Not 27SEP2011, as there's been a different status since back then.

I can't simply group by the most recent ORDER_STATUS and select min(DATE), as that would select the date from row 2. I need row 4 from this example, and of course I can't just select ORDER_SEQ_NO=4 directly since the logic must apply to numerous other orders as well.

1 ACCEPTED SOLUTION

Accepted Solutions
CTorres
Quartz | Level 8

Consider this code asuming the dataset is sorted by ORDER_NO and ORDER_SEQ_NO:

data have;
input ORDER_NO 4. ORDER_SEQ_NO 2. ORDER_STATUS $3. COMMENT $13. DATE date9.;
format date date9.;
cards;
4687 1OPP        15-Sep-11
4687 2FIN        27-Sep-11
4687 3AB        5-Oct-11
4687 4FIN        6-Oct-11
4687 5FINHello world! 27-Jan-14
;
run;

data want;
  set have;
  retain STATUS ORDER_DATE;
  format order_date date9.;
  by order_no;
  if first.order_no
     then do;
             status=order_status;
             order_date=date;
          end;
     else if order_status ne status
             then do;
                     status=order_status;
                     order_date=date;
                  end;
  if last.order_no then output;
  keep order_no status order_date;
run;

View solution in original post

9 REPLIES 9
mohamed_zaki
Barite | Level 11

So what you want is the most recent.

But if there exist earlier record with the

  1. Same details
  2. Without comment
  3. and there is no any in between record with different variable values between it and the most recent

then this record should be selected.

Is that what you want?

TurnTheBacon
Fluorite | Level 6

It should select the date from the most recent row, unless an earlier row has the same status sequentially, in which case it should select the date from that row.

You might think of it like this: 'The most recent status change was from AB to FIN. That happened on 06OCT2011, so I select that date."

CTorres
Quartz | Level 8

Consider this code asuming the dataset is sorted by ORDER_NO and ORDER_SEQ_NO:

data have;
input ORDER_NO 4. ORDER_SEQ_NO 2. ORDER_STATUS $3. COMMENT $13. DATE date9.;
format date date9.;
cards;
4687 1OPP        15-Sep-11
4687 2FIN        27-Sep-11
4687 3AB        5-Oct-11
4687 4FIN        6-Oct-11
4687 5FINHello world! 27-Jan-14
;
run;

data want;
  set have;
  retain STATUS ORDER_DATE;
  format order_date date9.;
  by order_no;
  if first.order_no
     then do;
             status=order_status;
             order_date=date;
          end;
     else if order_status ne status
             then do;
                     status=order_status;
                     order_date=date;
                  end;
  if last.order_no then output;
  keep order_no status order_date;
run;

Ksharp
Super User

CTorres,

Your code is not right . Check the following data :

data have;

input ORDER_NO 4. ORDER_SEQ_NO 2. ORDER_STATUS $3. COMMENT $13. DATE date9.;

format date date9.;

cards;

4687 1OPP        15-Sep-11

4687 2FIN        27-Sep-11

4687 3AB        5-Oct-11

4687 4AB        6-Oct-11

4687 5FINHello world! 27-Jan-14

;

run;

data have; 
input ORDER_NO  ORDER_SEQ_NO  ORDER_STATUS $ COMMENT & $18. DATE : date9.;
format date date9.;
cards;
4687 1 OPP    .    15-Sep-11
4687 2 FIN    .    27-Sep-11
4687 3 AB     .   5-Oct-11
4687 4 FIN    .    6-Oct-11
4687 5 FIN Hello world!  27-Jan-14
;
run;


proc sql;
create table want as
select a.* from have as a,(select * from have group by ORDER_NO having ORDER_SEQ_NO=max(ORDER_SEQ_NO)) as b
 where a.ORDER_NO=b.ORDER_NO and a.ORDER_STATUS=b.ORDER_STATUS and a.ORDER_SEQ_NO ne b.ORDER_SEQ_NO
  group by a.ORDER_NO
   having a.ORDER_SEQ_NO=max(a.ORDER_SEQ_NO);
quit;

Xia Keshan

Message was edited by: xia keshan Sorry. Missing  a GROUP clause.

CTorres
Quartz | Level 8

Xia,

I think the code works fine, even with the change you suggest.

If I understand correctly, the purpose of the program is to get the most recent update record that includes a change of status, independently of any other change.

CTorres

Ksharp
Super User

CTorres,

Yeah. If I understand OP correctly . He also want the SAME ORDER_STATUS . But apparently you didn't consider it .

Xia Keshan

stat_sas
Ammonite | Level 13

data want;
set have;
by ORDER_NO ORDER_SEQ_NO;
if ORDER_STATUS ne lag(ORDER_STATUS) then flag+1;
run;

proc sql;

select order_no,date from want

group by order_no, flag

having ORDER_SEQ_NO=min(ORDER_SEQ_NO)

and count(ORDER_STATUS)>1;

quit;

CTorres
Quartz | Level 8

I think your code does not work for the general case: more than one Order with different number of order changes. For example with the following "have" dataset:

data have;

input ORDER_NO 4. ORDER_SEQ_NO 2. ORDER_STATUS $3. COMMENT $13. DATE date9.;

format date date9.;

cards;

4687 1OPP       15-Sep-11

4687 2FIN       27-Sep-11

4687 3AB       5-Oct-11

4687 4FIN       6-Oct-11

4687 5FINHello world! 27-Jan-14

4999 1FINComplete     20-Aug-14

5680 1OPP             18-Dec-13

5680 2AB              19-Dec-13

5700 1OPP             10-Dec-13

5700 2OPPXXX          20-Jan-14

;

run;

stat_sas
Ammonite | Level 13

Hi,

I think it should work. Please try to validate.

Regards,

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1900 views
  • 3 likes
  • 5 in conversation