Hi everyone, I am working on order data. In my data as below, each row represents a package for an order, and each order has at least one package. When shipped=1, it means that the package was shipped. paid=1 means that the package was paid. I need to identify 1st payment for 1st shipment of an order. Some special examples are highlighted in my data. For example, for order #2, the 5th package (pack_num=5) counts; for #4, the 3rd package (pack_num=3) counts, and for #7, the 2nd visit (pack_num=3) counts, and for 17, the 3rd package counts.
pack_tot is the total number of packages per order.
Thanks a lot!
orderid | pack_num | pack_tot | shipped | paid |
1 | 1 | 2 | 0 | 0 |
1 | 2 | 2 | 1 | 1 |
2 | 1 | 5 | 0 | 1 |
2 | 2 | 5 | 0 | 0 |
2 | 3 | 5 | 0 | 0 |
2 | 4 | 5 | 0 | 0 |
2 | 5 | 5 | 1 | 1 |
3 | 1 | 9 | 1 | 1 |
3 | 2 | 9 | 0 | 0 |
3 | 3 | 9 | 0 | 0 |
3 | 4 | 9 | 1 | 1 |
3 | 5 | 9 | 0 | 1 |
3 | 6 | 9 | 0 | 1 |
3 | 7 | 9 | 0 | 1 |
3 | 8 | 9 | 0 | 0 |
3 | 9 | 9 | 0 | 1 |
4 | 1 | 4 | 0 | 0 |
4 | 2 | 4 | 0 | 1 |
4 | 3 | 4 | 1 | 1 |
4 | 4 | 4 | 1 | 1 |
5 | 1 | 23 | 0 | 0 |
5 | 2 | 23 | 0 | 0 |
5 | 3 | 23 | 0 | 0 |
5 | 4 | 23 | 0 | 0 |
5 | 5 | 23 | 1 | 1 |
5 | 6 | 23 | 1 | 1 |
5 | 7 | 23 | 0 | 0 |
5 | 8 | 23 | 1 | 1 |
5 | 9 | 23 | 0 | 0 |
5 | 10 | 23 | 0 | 0 |
5 | 11 | 23 | 1 | 1 |
5 | 12 | 23 | 0 | 0 |
5 | 13 | 23 | 1 | 1 |
5 | 14 | 23 | 0 | 0 |
5 | 15 | 23 | 1 | 1 |
5 | 16 | 23 | 1 | 1 |
5 | 17 | 23 | 1 | 1 |
5 | 18 | 23 | 1 | 1 |
5 | 19 | 23 | 1 | 1 |
5 | 20 | 23 | 1 | 1 |
5 | 21 | 23 | 1 | 1 |
5 | 22 | 23 | 0 | 0 |
5 | 23 | 23 | 1 | 1 |
6 | 1 | 4 | 0 | 0 |
6 | 2 | 4 | 1 | 1 |
6 | 3 | 4 | 0 | 0 |
6 | 4 | 4 | 0 | 0 |
7 | 1 | 6 | 0 | 1 |
7 | 2 | 6 | 1 | 1 |
7 | 3 | 6 | 0 | 0 |
7 | 4 | 6 | 0 | 0 |
7 | 5 | 6 | 0 | 0 |
7 | 6 | 6 | 0 | 0 |
8 | 1 | 4 | 0 | 0 |
8 | 2 | 4 | 0 | 0 |
8 | 3 | 4 | 0 | 0 |
8 | 4 | 4 | 1 | 1 |
9 | 1 | 2 | 0 | 1 |
9 | 2 | 2 | 1 | 1 |
10 | 1 | 1 | 1 | 1 |
11 | 1 | 1 | 1 | 1 |
12 | 1 | 4 | 0 | 0 |
12 | 2 | 4 | 0 | 0 |
12 | 3 | 4 | 0 | 0 |
12 | 4 | 4 | 1 | 1 |
13 | 1 | 1 | 1 | 1 |
14 | 1 | 3 | 0 | 0 |
14 | 2 | 3 | 0 | 0 |
14 | 3 | 3 | 1 | 1 |
15 | 1 | 4 | 0 | 0 |
15 | 2 | 4 | 1 | 1 |
15 | 3 | 4 | 0 | 0 |
15 | 4 | 4 | 0 | 0 |
16 | 1 | 4 | 0 | 0 |
16 | 2 | 4 | 0 | 0 |
16 | 3 | 4 | 0 | 0 |
16 | 4 | 4 | 1 | 1 |
17 | 1 | 5 | 0 | 1 |
17 | 2 | 5 | 1 | 0 |
17 | 3 | 5 | 0 | 1 |
17 | 4 | 5 | 1 | 0 |
17 | 5 | 5 | 0 | 1 |
18 | 1 | 2 | 1 | 1 |
18 | 2 | 2 | 0 | 1 |
19 | 1 | 1 | 1 | 1 |
20 | 1 | 1 | 1 | 1 |
I think you need something like:
data want;
do until(last.orderId);
set have; by orderId;
hasShipped = hasShipped or shipped;
if hasShipped and not isPaid then
if paid then do;
output;
isPaid = 1;
end;
end;
drop hasShipped isPaid;
run;
proc print data=want noobs; run;
PG
I think you need something like:
data want;
do until(last.orderId);
set have; by orderId;
hasShipped = hasShipped or shipped;
if hasShipped and not isPaid then
if paid then do;
output;
isPaid = 1;
end;
end;
drop hasShipped isPaid;
run;
proc print data=want noobs; run;
PG
Thank you so much, PG. You are AWESOME!
Hi PG, actually is it possible to make flag variables for the first Shipped and the first Paid after the first Shipped in the same dataset?
Thanks again!
Please give an example of the result you want. - PG
Thanks so much, PG. My target data is listed as below.
So the target variables are Firstship and Firstpay_afterfirstship. Basically the same question as previously, but I need the binary variable Firstship to indicate a row/record has the first shipping. Firstpay_afterfirstship is generated to flag that a row/record has the first payment after first shipping.
orderid | pack_num | pack_tot | shipped | paid | Firstship | Firstpay_afterfirstship |
1 | 1 | 2 | 0 | 0 | 0 | 0 |
1 | 2 | 2 | 1 | 1 | 1 | 1 |
2 | 1 | 5 | 0 | 1 | 0 | 0 |
2 | 2 | 5 | 0 | 0 | 0 | 0 |
2 | 3 | 5 | 0 | 0 | 0 | 0 |
2 | 4 | 5 | 0 | 0 | 0 | 0 |
2 | 5 | 5 | 1 | 1 | 1 | 1 |
3 | 1 | 9 | 1 | 1 | 1 | 1 |
3 | 2 | 9 | 0 | 0 | 0 | 0 |
3 | 3 | 9 | 0 | 0 | 0 | 0 |
3 | 4 | 9 | 1 | 1 | 0 | 0 |
3 | 5 | 9 | 0 | 1 | 0 | 0 |
3 | 6 | 9 | 0 | 1 | 0 | 0 |
3 | 7 | 9 | 0 | 1 | 0 | 0 |
3 | 8 | 9 | 0 | 0 | 0 | 0 |
3 | 9 | 9 | 0 | 1 | 0 | 0 |
4 | 1 | 4 | 0 | 0 | 0 | 0 |
4 | 2 | 4 | 0 | 1 | 0 | 0 |
4 | 3 | 4 | 1 | 1 | 1 | 1 |
4 | 4 | 4 | 1 | 1 | 0 | 0 |
5 | 1 | 23 | 0 | 0 | 0 | 0 |
5 | 2 | 23 | 0 | 0 | 0 | 0 |
5 | 3 | 23 | 0 | 0 | 0 | 0 |
5 | 4 | 23 | 0 | 0 | 0 | 0 |
5 | 5 | 23 | 1 | 1 | 1 | 1 |
5 | 6 | 23 | 1 | 1 | 0 | 0 |
5 | 7 | 23 | 0 | 0 | 0 | 0 |
5 | 8 | 23 | 1 | 1 | 0 | 0 |
5 | 9 | 23 | 0 | 0 | 0 | 0 |
5 | 10 | 23 | 0 | 0 | 0 | 0 |
5 | 11 | 23 | 1 | 1 | 0 | 0 |
5 | 12 | 23 | 0 | 0 | 0 | 0 |
5 | 13 | 23 | 1 | 1 | 0 | 0 |
5 | 14 | 23 | 0 | 0 | 0 | 0 |
5 | 15 | 23 | 1 | 1 | 0 | 0 |
5 | 16 | 23 | 1 | 1 | 0 | 0 |
5 | 17 | 23 | 1 | 1 | 0 | 0 |
5 | 18 | 23 | 1 | 1 | 0 | 0 |
5 | 19 | 23 | 1 | 1 | 0 | 0 |
5 | 20 | 23 | 1 | 1 | 0 | 0 |
5 | 21 | 23 | 1 | 1 | 0 | 0 |
5 | 22 | 23 | 0 | 0 | 0 | 0 |
5 | 23 | 23 | 1 | 1 | 0 | 0 |
6 | 1 | 4 | 0 | 0 | 0 | 0 |
6 | 2 | 4 | 1 | 1 | 1 | 1 |
6 | 3 | 4 | 0 | 0 | 0 | 0 |
6 | 4 | 4 | 0 | 0 | 0 | 0 |
7 | 1 | 6 | 0 | 1 | 0 | 0 |
7 | 2 | 6 | 1 | 1 | 1 | 1 |
7 | 3 | 6 | 0 | 0 | 0 | 0 |
7 | 4 | 6 | 0 | 0 | 0 | 0 |
7 | 5 | 6 | 0 | 0 | 0 | 0 |
7 | 6 | 6 | 0 | 0 | 0 | 0 |
8 | 1 | 4 | 0 | 0 | 0 | 0 |
8 | 2 | 4 | 0 | 0 | 0 | 0 |
8 | 3 | 4 | 0 | 0 | 0 | 0 |
8 | 4 | 4 | 1 | 1 | 1 | 1 |
9 | 1 | 2 | 0 | 1 | 0 | 0 |
9 | 2 | 2 | 1 | 1 | 1 | 1 |
10 | 1 | 1 | 1 | 1 | 1 | 1 |
11 | 1 | 1 | 1 | 1 | 1 | 1 |
12 | 1 | 4 | 0 | 0 | 0 | 0 |
12 | 2 | 4 | 0 | 0 | 0 | 0 |
12 | 3 | 4 | 0 | 0 | 0 | 0 |
12 | 4 | 4 | 1 | 1 | 1 | 1 |
13 | 1 | 1 | 1 | 1 | 1 | 1 |
14 | 1 | 3 | 0 | 0 | 0 | 0 |
14 | 2 | 3 | 0 | 0 | 0 | 0 |
14 | 3 | 3 | 1 | 1 | 1 | 1 |
15 | 1 | 4 | 0 | 0 | 0 | 0 |
15 | 2 | 4 | 1 | 1 | 1 | 1 |
15 | 3 | 4 | 0 | 0 | 0 | 0 |
15 | 4 | 4 | 0 | 0 | 0 | 0 |
16 | 1 | 4 | 0 | 0 | 0 | 0 |
16 | 2 | 4 | 0 | 0 | 0 | 0 |
16 | 3 | 4 | 0 | 0 | 0 | 0 |
16 | 4 | 4 | 1 | 1 | 1 | 1 |
17 | 1 | 5 | 0 | 1 | 0 | 0 |
17 | 2 | 5 | 1 | 0 | 1 | 0 |
17 | 3 | 5 | 0 | 1 | 0 | 1 |
17 | 4 | 5 | 1 | 0 | 0 | 0 |
17 | 5 | 5 | 0 | 1 | 0 | 0 |
18 | 1 | 2 | 1 | 1 | 1 | 1 |
18 | 2 | 2 | 0 | 1 | 0 | 0 |
19 | 1 | 1 | 1 | 1 | 1 | 1 |
20 | 1 | 1 | 1 | 1 | 1 | 1 |
That's a bit different:
data want;
retain hasShipped isPaid;
set have; by orderId;
if first.orderId then call missing(hasShipped, isPaid); /* missing = FALSE */
firstShip = shipped and not hasShipped;
hasShipped = hasShipped or firstShip;
Firstpay_afterfirstship = hasShipped and paid and not isPaid;
isPaid = isPaid or Firstpay_afterfirstship;
drop hasShipped isPaid;
run;
proc print data=want noobs; run;
PG
Thanks so much again, PG!
It's a nice learning process to me.
Lizi
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.