Hi all,
I have to admit my do-loop skill is too weak. I need to sort out the first and last months when shipping was made for each year within a year. As shown below, the columns of startmon and endmon are my objective variables I want.
OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year | startmon | endmon |
1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 | 1 | 4 |
2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 | 2 | 12 |
3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 | 1 | 12 |
4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 | 1 | 1 |
5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 | 4 | 9 |
Also, for the orders across years as below, I need to abstract a separate file with startmon, endmon, startyr and endyr.
Original data:
OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
1 | 12 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
4 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 |
1 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
2 | 10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 2013 |
3 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2013 |
5 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
Objective data:
OrderID | startmon | endmon | staryr | endyr |
1 | 1 | 4 | 2011 | 2012 |
2 | 2 | 10 | 2012 | 2013 |
3 | 10 | 2 | 2011 | 2013 |
4 | 2 | 1 | 2011 | 2012 |
5 | 4 | 12 | 2012 | 2013 |
Your ideas are greatly appreciated!
Lizi
I want know if there are some non-consecutive range ,like :
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 1 1 1 1 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
For the first one:
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon; cards; 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9 ; run; data temp; set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};output; end; keep OrderID i value; run; data want; set temp; by OrderID value notsorted; retain start_mon ; if first.value then start_mon=i; if last.value and value=1 then do;end_mon=i;output;end; drop i value; run;
For the second one:
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ; cards; 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011 3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011 4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013 3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013 5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 ; run; data temp; set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};date=mdy(i,1,year);output; end; format date date9.; keep OrderID date value; run; proc sort data=temp; by OrderID date;run; data want; set temp; by OrderID value notsorted; retain start_mon start_year; if first.value then do;start_mon=month(date); start_year=year(date); end; if last.value and value=1 then do;end_mon=month(date); end_year=year(date);output;end; drop value; run;
Xia Keshan
Here's the first part, first/last month. I'm not sure about the logic of first/last year? You just want the first/last year per order id? Which start/end month should align with the years?
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
run;
data want;
set have;
length string $12.;
array mnth(12) mon1-mon12;
string=cats(of mon1-mon12);
first_month=find(string, '1');
last_month=12-find(reverse(string), '1')+1;
run;
Thanks a lot, Reeza.
You addressed my first question.
For the second part, I need to observe the months across several years and find the first and last months over the years as well as the first and last years of shipping per order. I think I can use your idea once I transform the data from long to wide. Thanks again!
I want know if there are some non-consecutive range ,like :
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 1 1 1 1 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
For the first one:
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon; cards; 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9 ; run; data temp; set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};output; end; keep OrderID i value; run; data want; set temp; by OrderID value notsorted; retain start_mon ; if first.value then start_mon=i; if last.value and value=1 then do;end_mon=i;output;end; drop i value; run;
For the second one:
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ; cards; 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011 3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011 4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013 3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013 5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 ; run; data temp; set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};date=mdy(i,1,year);output; end; format date date9.; keep OrderID date value; run; proc sort data=temp; by OrderID date;run; data want; set temp; by OrderID value notsorted; retain start_mon start_year; if first.value then do;start_mon=month(date); start_year=year(date); end; if last.value and value=1 then do;end_mon=month(date); end_year=year(date);output;end; drop value; run;
Xia Keshan
Thanks so much, Keshan! You have what I've been looking for.
Lizi
Keshan, I didn't pay attention to your comments. But yes, there can be some nonconsecutive ranges. If I only care about the start month and the end month only despite non-consecutive ranges, how would the program change? Thanks again!
You mean consecutive range's start month and end month? If it was, keep my code intact .
I mean there can be breaks during a year. Just like OrderID#6 as below:
OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
1 | 12 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
4 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
6 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 |
6 | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 2012 |
1 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
2 | 10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 2013 |
3 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2013 |
5 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
6 | 8 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 2013 |
OrderID #6 has 4 breaks, but I only need the first start, month mon1 (January) of 2011, and the last end month, month mon10 (October) of 2013
OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
6 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
6 | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 2012 |
6 | 8 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 2013 |
So, the target data is
OrderID | startmon | startyr | endmon | endyr |
1 | 1 | 2011 | 4 | 2012 |
2 | 2 | 2012 | 10 | 2013 |
3 | 10 | 2011 | 2 | 2013 |
4 | 2 | 2011 | 1 | 2012 |
5 | 4 | 2012 | 12 | 2013 |
6 | 1 | 2011 | 10 | 2013 |
Thanks for all the help, Keshan!
OK. Here is .
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ; cards; 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011 3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011 4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011 6 5 1 1 0 0 0 0 0 0 0 1 1 1 2011 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 6 11 1 1 1 1 1 1 1 1 1 1 1 0 2012 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013 3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013 5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 6 8 0 1 1 1 1 1 1 0 1 1 0 0 2013 ; run; data temp(where=(value=1)); set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};date=mdy(i,1,year);output; end; format date date9.; keep OrderID date value; run; proc sort data=temp; by OrderID date;run; data want; set temp; by OrderID; retain start_mon start_year; if first.OrderID then do;start_mon=month(date); start_year=year(date); end; if last.OrderID then do;end_mon=month(date); end_year=year(date);output;end; drop value; run;
Xia Keshan
Thanks a lot, Keshan. You helped me tons.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.