Hi everyone, I am working on order data. In my data as below, each row represents a package for an order, and each order has at least one package. When shipped=1, it means that the package was shipped. paid=1 means that the package was paid. I need to identify 1st payment for 1st shipment of an order. Some special examples are highlighted in my data. For example, for order #2, the 5th package (pack_num=5) counts; for #4, the 3rd package (pack_num=3) counts, and for #7, the 2nd visit (pack_num=3) counts, and for 17, the 3rd package counts.
pack_tot is the total number of packages per order.
Thanks a lot!
orderid | pack_num | pack_tot | shipped | paid |
1 | 1 | 2 | 0 | 0 |
1 | 2 | 2 | 1 | 1 |
2 | 1 | 5 | 0 | 1 |
2 | 2 | 5 | 0 | 0 |
2 | 3 | 5 | 0 | 0 |
2 | 4 | 5 | 0 | 0 |
2 | 5 | 5 | 1 | 1 |
3 | 1 | 9 | 1 | 1 |
3 | 2 | 9 | 0 | 0 |
3 | 3 | 9 | 0 | 0 |
3 | 4 | 9 | 1 | 1 |
3 | 5 | 9 | 0 | 1 |
3 | 6 | 9 | 0 | 1 |
3 | 7 | 9 | 0 | 1 |
3 | 8 | 9 | 0 | 0 |
3 | 9 | 9 | 0 | 1 |
4 | 1 | 4 | 0 | 0 |
4 | 2 | 4 | 0 | 1 |
4 | 3 | 4 | 1 | 1 |
4 | 4 | 4 | 1 | 1 |
5 | 1 | 23 | 0 | 0 |
5 | 2 | 23 | 0 | 0 |
5 | 3 | 23 | 0 | 0 |
5 | 4 | 23 | 0 | 0 |
5 | 5 | 23 | 1 | 1 |
5 | 6 | 23 | 1 | 1 |
5 | 7 | 23 | 0 | 0 |
5 | 8 | 23 | 1 | 1 |
5 | 9 | 23 | 0 | 0 |
5 | 10 | 23 | 0 | 0 |
5 | 11 | 23 | 1 | 1 |
5 | 12 | 23 | 0 | 0 |
5 | 13 | 23 | 1 | 1 |
5 | 14 | 23 | 0 | 0 |
5 | 15 | 23 | 1 | 1 |
5 | 16 | 23 | 1 | 1 |
5 | 17 | 23 | 1 | 1 |
5 | 18 | 23 | 1 | 1 |
5 | 19 | 23 | 1 | 1 |
5 | 20 | 23 | 1 | 1 |
5 | 21 | 23 | 1 | 1 |
5 | 22 | 23 | 0 | 0 |
5 | 23 | 23 | 1 | 1 |
6 | 1 | 4 | 0 | 0 |
6 | 2 | 4 | 1 | 1 |
6 | 3 | 4 | 0 | 0 |
6 | 4 | 4 | 0 | 0 |
7 | 1 | 6 | 0 | 1 |
7 | 2 | 6 | 1 | 1 |
7 | 3 | 6 | 0 | 0 |
7 | 4 | 6 | 0 | 0 |
7 | 5 | 6 | 0 | 0 |
7 | 6 | 6 | 0 | 0 |
8 | 1 | 4 | 0 | 0 |
8 | 2 | 4 | 0 | 0 |
8 | 3 | 4 | 0 | 0 |
8 | 4 | 4 | 1 | 1 |
9 | 1 | 2 | 0 | 1 |
9 | 2 | 2 | 1 | 1 |
10 | 1 | 1 | 1 | 1 |
11 | 1 | 1 | 1 | 1 |
12 | 1 | 4 | 0 | 0 |
12 | 2 | 4 | 0 | 0 |
12 | 3 | 4 | 0 | 0 |
12 | 4 | 4 | 1 | 1 |
13 | 1 | 1 | 1 | 1 |
14 | 1 | 3 | 0 | 0 |
14 | 2 | 3 | 0 | 0 |
14 | 3 | 3 | 1 | 1 |
15 | 1 | 4 | 0 | 0 |
15 | 2 | 4 | 1 | 1 |
15 | 3 | 4 | 0 | 0 |
15 | 4 | 4 | 0 | 0 |
16 | 1 | 4 | 0 | 0 |
16 | 2 | 4 | 0 | 0 |
16 | 3 | 4 | 0 | 0 |
16 | 4 | 4 | 1 | 1 |
17 | 1 | 5 | 0 | 1 |
17 | 2 | 5 | 1 | 0 |
17 | 3 | 5 | 0 | 1 |
17 | 4 | 5 | 1 | 0 |
17 | 5 | 5 | 0 | 1 |
18 | 1 | 2 | 1 | 1 |
18 | 2 | 2 | 0 | 1 |
19 | 1 | 1 | 1 | 1 |
20 | 1 | 1 | 1 | 1 |
I think you need something like:
data want;
do until(last.orderId);
set have; by orderId;
hasShipped = hasShipped or shipped;
if hasShipped and not isPaid then
if paid then do;
output;
isPaid = 1;
end;
end;
drop hasShipped isPaid;
run;
proc print data=want noobs; run;
PG
I think you need something like:
data want;
do until(last.orderId);
set have; by orderId;
hasShipped = hasShipped or shipped;
if hasShipped and not isPaid then
if paid then do;
output;
isPaid = 1;
end;
end;
drop hasShipped isPaid;
run;
proc print data=want noobs; run;
PG
Thank you so much, PG. You are AWESOME!
Hi PG, actually is it possible to make flag variables for the first Shipped and the first Paid after the first Shipped in the same dataset?
Thanks again!
Please give an example of the result you want. - PG
Thanks so much, PG. My target data is listed as below.
So the target variables are Firstship and Firstpay_afterfirstship. Basically the same question as previously, but I need the binary variable Firstship to indicate a row/record has the first shipping. Firstpay_afterfirstship is generated to flag that a row/record has the first payment after first shipping.
orderid | pack_num | pack_tot | shipped | paid | Firstship | Firstpay_afterfirstship |
1 | 1 | 2 | 0 | 0 | 0 | 0 |
1 | 2 | 2 | 1 | 1 | 1 | 1 |
2 | 1 | 5 | 0 | 1 | 0 | 0 |
2 | 2 | 5 | 0 | 0 | 0 | 0 |
2 | 3 | 5 | 0 | 0 | 0 | 0 |
2 | 4 | 5 | 0 | 0 | 0 | 0 |
2 | 5 | 5 | 1 | 1 | 1 | 1 |
3 | 1 | 9 | 1 | 1 | 1 | 1 |
3 | 2 | 9 | 0 | 0 | 0 | 0 |
3 | 3 | 9 | 0 | 0 | 0 | 0 |
3 | 4 | 9 | 1 | 1 | 0 | 0 |
3 | 5 | 9 | 0 | 1 | 0 | 0 |
3 | 6 | 9 | 0 | 1 | 0 | 0 |
3 | 7 | 9 | 0 | 1 | 0 | 0 |
3 | 8 | 9 | 0 | 0 | 0 | 0 |
3 | 9 | 9 | 0 | 1 | 0 | 0 |
4 | 1 | 4 | 0 | 0 | 0 | 0 |
4 | 2 | 4 | 0 | 1 | 0 | 0 |
4 | 3 | 4 | 1 | 1 | 1 | 1 |
4 | 4 | 4 | 1 | 1 | 0 | 0 |
5 | 1 | 23 | 0 | 0 | 0 | 0 |
5 | 2 | 23 | 0 | 0 | 0 | 0 |
5 | 3 | 23 | 0 | 0 | 0 | 0 |
5 | 4 | 23 | 0 | 0 | 0 | 0 |
5 | 5 | 23 | 1 | 1 | 1 | 1 |
5 | 6 | 23 | 1 | 1 | 0 | 0 |
5 | 7 | 23 | 0 | 0 | 0 | 0 |
5 | 8 | 23 | 1 | 1 | 0 | 0 |
5 | 9 | 23 | 0 | 0 | 0 | 0 |
5 | 10 | 23 | 0 | 0 | 0 | 0 |
5 | 11 | 23 | 1 | 1 | 0 | 0 |
5 | 12 | 23 | 0 | 0 | 0 | 0 |
5 | 13 | 23 | 1 | 1 | 0 | 0 |
5 | 14 | 23 | 0 | 0 | 0 | 0 |
5 | 15 | 23 | 1 | 1 | 0 | 0 |
5 | 16 | 23 | 1 | 1 | 0 | 0 |
5 | 17 | 23 | 1 | 1 | 0 | 0 |
5 | 18 | 23 | 1 | 1 | 0 | 0 |
5 | 19 | 23 | 1 | 1 | 0 | 0 |
5 | 20 | 23 | 1 | 1 | 0 | 0 |
5 | 21 | 23 | 1 | 1 | 0 | 0 |
5 | 22 | 23 | 0 | 0 | 0 | 0 |
5 | 23 | 23 | 1 | 1 | 0 | 0 |
6 | 1 | 4 | 0 | 0 | 0 | 0 |
6 | 2 | 4 | 1 | 1 | 1 | 1 |
6 | 3 | 4 | 0 | 0 | 0 | 0 |
6 | 4 | 4 | 0 | 0 | 0 | 0 |
7 | 1 | 6 | 0 | 1 | 0 | 0 |
7 | 2 | 6 | 1 | 1 | 1 | 1 |
7 | 3 | 6 | 0 | 0 | 0 | 0 |
7 | 4 | 6 | 0 | 0 | 0 | 0 |
7 | 5 | 6 | 0 | 0 | 0 | 0 |
7 | 6 | 6 | 0 | 0 | 0 | 0 |
8 | 1 | 4 | 0 | 0 | 0 | 0 |
8 | 2 | 4 | 0 | 0 | 0 | 0 |
8 | 3 | 4 | 0 | 0 | 0 | 0 |
8 | 4 | 4 | 1 | 1 | 1 | 1 |
9 | 1 | 2 | 0 | 1 | 0 | 0 |
9 | 2 | 2 | 1 | 1 | 1 | 1 |
10 | 1 | 1 | 1 | 1 | 1 | 1 |
11 | 1 | 1 | 1 | 1 | 1 | 1 |
12 | 1 | 4 | 0 | 0 | 0 | 0 |
12 | 2 | 4 | 0 | 0 | 0 | 0 |
12 | 3 | 4 | 0 | 0 | 0 | 0 |
12 | 4 | 4 | 1 | 1 | 1 | 1 |
13 | 1 | 1 | 1 | 1 | 1 | 1 |
14 | 1 | 3 | 0 | 0 | 0 | 0 |
14 | 2 | 3 | 0 | 0 | 0 | 0 |
14 | 3 | 3 | 1 | 1 | 1 | 1 |
15 | 1 | 4 | 0 | 0 | 0 | 0 |
15 | 2 | 4 | 1 | 1 | 1 | 1 |
15 | 3 | 4 | 0 | 0 | 0 | 0 |
15 | 4 | 4 | 0 | 0 | 0 | 0 |
16 | 1 | 4 | 0 | 0 | 0 | 0 |
16 | 2 | 4 | 0 | 0 | 0 | 0 |
16 | 3 | 4 | 0 | 0 | 0 | 0 |
16 | 4 | 4 | 1 | 1 | 1 | 1 |
17 | 1 | 5 | 0 | 1 | 0 | 0 |
17 | 2 | 5 | 1 | 0 | 1 | 0 |
17 | 3 | 5 | 0 | 1 | 0 | 1 |
17 | 4 | 5 | 1 | 0 | 0 | 0 |
17 | 5 | 5 | 0 | 1 | 0 | 0 |
18 | 1 | 2 | 1 | 1 | 1 | 1 |
18 | 2 | 2 | 0 | 1 | 0 | 0 |
19 | 1 | 1 | 1 | 1 | 1 | 1 |
20 | 1 | 1 | 1 | 1 | 1 | 1 |
That's a bit different:
data want;
retain hasShipped isPaid;
set have; by orderId;
if first.orderId then call missing(hasShipped, isPaid); /* missing = FALSE */
firstShip = shipped and not hasShipped;
hasShipped = hasShipped or firstShip;
Firstpay_afterfirstship = hasShipped and paid and not isPaid;
isPaid = isPaid or Firstpay_afterfirstship;
drop hasShipped isPaid;
run;
proc print data=want noobs; run;
PG
Thanks so much again, PG!
It's a nice learning process to me.
Lizi
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.