BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lizzy28
Quartz | Level 8

In the following file, I need to identify any "Paid" that happens after the first "Shipped", in terms of both order and time.  For example, for OrderID#1, the third record counts as it occurs after at least one shipment takes place. OrderID#2 does not have any record that counts because "Paid" cannot happen on the same day as "Shipped". For OrderID#4, its third record counts as "Paid" is supposed to occur after at least one "Shipped" and it needs be on the days after "Shipped". The last column "Completed" is my target.

Any suggestions?

OrderIDShippedPaiddateCompleted
1011/1/20140
1101/2/20140
1111/5/20141
2111/2/20140
2101/4/20140
3101/5/20140
3111/6/20141
3101/8/20140
4102/2/20140
4112/2/20140
4012/5/20141
4102/6/20140

Thanks a bunch!

Lizi

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

A Proc SQL alternative if your data is not too big: (raw data stoled from Ksharp):

data have;

     infile cards expandtabs truncover;

     input OrderID Shipped  Paid  date : mmddyy10.;

     format date mmddyy10.;

     cards;

1     0 1     1/1/2014

1     1 0     1/2/2014

1     1 1     1/5/2014

2     1 1     1/2/2014

2     1 0     1/4/2014

3     1 0     1/5/2014

3     1 1     1/6/2014

3     1 0     1/8/2014

4     1 0     2/2/2014

4     1 1     2/2/2014

4     0 1     2/5/2014

4     1 0     2/6/2014

;

run;

proc sql;

     create table want as

           select *, exists(select * from have where shipped =1 and orderid=a.orderid

                and a.paid=1 having  min(date) < a.date) as completed

           from have a

                order by a.orderid, a.date, a.paid

     ;

quit;

View solution in original post

8 REPLIES 8
Ksharp
Super User

Why 4 shouldn't be

OrderIDShippedPaiddateCompleted
1011/1/20140
1101/2/20140
1111/5/20141
2111/2/20140
2101/4/20140
3101/5/20140
3111/6/20141
3101/8/20140
4102/2/20140
4112/2/20140
4012/5/20141
4102/6/20140


data have;
infile cards expandtabs truncover;
input OrderID Shipped  Paid  date : mmddyy10.;
format date mmddyy10.;
cards;     
1     0     1     1/1/2014
1     1     0     1/2/2014
1     1     1     1/5/2014
2     1     1     1/2/2014
2     1     0     1/4/2014
3     1     0     1/5/2014
3     1     1     1/6/2014
3     1     0     1/8/2014
4     1     0     2/2/2014
4     1     1     2/2/2014
4     0     1     2/5/2014
4     1     0     2/6/2014
;
run;
data want;
 set have;
 by  OrderID;
 retain found flag .;
 if first.OrderID then do;found=.; flag=.;end;
 if found and Paid and not flag then do;Completed=1; flag=1; end;
  else Completed=0;
 if Shipped then found=1;
 drop found flag;
run;

Xia Keshan

lizzy28
Quartz | Level 8

Thank you so much, Keshan! But I realized that the time issue is not taken care of in your program. I have to consider both order and time.

Haikuo
Onyx | Level 15

A Proc SQL alternative if your data is not too big: (raw data stoled from Ksharp):

data have;

     infile cards expandtabs truncover;

     input OrderID Shipped  Paid  date : mmddyy10.;

     format date mmddyy10.;

     cards;

1     0 1     1/1/2014

1     1 0     1/2/2014

1     1 1     1/5/2014

2     1 1     1/2/2014

2     1 0     1/4/2014

3     1 0     1/5/2014

3     1 1     1/6/2014

3     1 0     1/8/2014

4     1 0     2/2/2014

4     1 1     2/2/2014

4     0 1     2/5/2014

4     1 0     2/6/2014

;

run;

proc sql;

     create table want as

           select *, exists(select * from have where shipped =1 and orderid=a.orderid

                and a.paid=1 having  min(date) < a.date) as completed

           from have a

                order by a.orderid, a.date, a.paid

     ;

quit;

lizzy28
Quartz | Level 8

Thanks a lot, Kuo.

Your way is pretty intuitive to me as I am still more familiar with small datahase management than the SAS retain function.

Ksharp
Super User

I just realize Shipped and Paid should not be the same day.

data have;
infile cards expandtabs truncover;
input OrderID Shipped  Paid  date : mmddyy10.;
format date mmddyy10.;
cards;     
1     0     1     1/1/2014
1     1     0     1/2/2014
1     1     1     1/5/2014
2     1     1     1/2/2014
2     1     0     1/4/2014
3     1     0     1/5/2014
3     1     1     1/6/2014
3     1     0     1/8/2014
4     1     0     2/2/2014
4     1     1     2/2/2014
4     0     1     2/5/2014
4     1     0     2/6/2014
;
run;
data want;
 set have;
 by  OrderID;
 retain found flag _date .;
 if first.OrderID then do;found=.; flag=.;_date=.;end;
 if found and Paid and not flag and date ne _date then do;Completed=1; flag=1; end;
  else Completed=0;
 if Shipped then do;found=1;_date=date;end;
 drop found flag _date;
run;

Xia Keshan

lizzy28
Quartz | Level 8

Keshan, it is more the matter of combination of timing and order than timing only or order only. Say if I have OrderID#5 that has the following shipping and payment timing, the program above would not be able to catch this order (Completed=0)

OrderIDShippedPaiddateCompleted
5106/3/20140
5106/5/20140
5116/5/20141
5106/5/20140

However, as you can see, Order #5 does have Completed=1.

In sum, I am looking for the record per order that indicates "Paid" occurs after (both in terms of order and time) at least one "Shipped" takes place.

Thanks again!

Ksharp
Super User

It is more complicated than I image .This time could work.

data have;

infile cards expandtabs truncover;

input OrderID Shipped  Paid  date : mmddyy10.;

format date mmddyy10.;

cards;    

1     0     1     1/1/2014

1     1     0     1/2/2014

1     1     1     1/5/2014

2     1     1     1/2/2014

2     1     0     1/4/2014

3     1     0     1/5/2014

3     1     1     1/6/2014

3     1     0     1/8/2014

4     1     0     2/2/2014

4     1     1     2/2/2014

4     0     1     2/5/2014

4     1     0     2/6/2014

5 1 0 6/3/2014

5 1 0 6/5/2014

5 1 1 6/5/2014

5 1 0 6/5/2014

;

run;

data want;

set have;

by  OrderID;

retain found flag _date .;

if first.OrderID then do;found=.; flag=.;_date=.;end;

if found and Paid and not flag and date ne _date then do;Completed=1; flag=1; end;

  else Completed=0;

if not found and Shipped then do;found=1;_date=date;end;

drop found flag _date;

run;

Xia Keshan

lizzy28
Quartz | Level 8

I think you got it right. Thank you for spending so much time, Keshan.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 962 views
  • 6 likes
  • 3 in conversation