Hi all,
I have to admit my do-loop skill is too weak. I need to sort out the first and last months when shipping was made for each year within a year. As shown below, the columns of startmon and endmon are my objective variables I want.
OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year | startmon | endmon |
1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 | 1 | 4 |
2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 | 2 | 12 |
3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 | 1 | 12 |
4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 | 1 | 1 |
5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 | 4 | 9 |
Also, for the orders across years as below, I need to abstract a separate file with startmon, endmon, startyr and endyr.
Original data:
OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
1 | 12 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
4 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 |
1 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
2 | 10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 2013 |
3 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2013 |
5 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
Objective data:
OrderID | startmon | endmon | staryr | endyr |
1 | 1 | 4 | 2011 | 2012 |
2 | 2 | 10 | 2012 | 2013 |
3 | 10 | 2 | 2011 | 2013 |
4 | 2 | 1 | 2011 | 2012 |
5 | 4 | 12 | 2012 | 2013 |
Your ideas are greatly appreciated!
Lizi
I want know if there are some non-consecutive range ,like :
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 1 1 1 1 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
For the first one:
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon; cards; 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9 ; run; data temp; set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};output; end; keep OrderID i value; run; data want; set temp; by OrderID value notsorted; retain start_mon ; if first.value then start_mon=i; if last.value and value=1 then do;end_mon=i;output;end; drop i value; run;
For the second one:
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ; cards; 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011 3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011 4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013 3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013 5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 ; run; data temp; set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};date=mdy(i,1,year);output; end; format date date9.; keep OrderID date value; run; proc sort data=temp; by OrderID date;run; data want; set temp; by OrderID value notsorted; retain start_mon start_year; if first.value then do;start_mon=month(date); start_year=year(date); end; if last.value and value=1 then do;end_mon=month(date); end_year=year(date);output;end; drop value; run;
Xia Keshan
Here's the first part, first/last month. I'm not sure about the logic of first/last year? You just want the first/last year per order id? Which start/end month should align with the years?
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
run;
data want;
set have;
length string $12.;
array mnth(12) mon1-mon12;
string=cats(of mon1-mon12);
first_month=find(string, '1');
last_month=12-find(reverse(string), '1')+1;
run;
Thanks a lot, Reeza.
You addressed my first question.
For the second part, I need to observe the months across several years and find the first and last months over the years as well as the first and last years of shipping per order. I think I can use your idea once I transform the data from long to wide. Thanks again!
I want know if there are some non-consecutive range ,like :
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 1 1 1 1 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
For the first one:
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon; cards; 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9 ; run; data temp; set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};output; end; keep OrderID i value; run; data want; set temp; by OrderID value notsorted; retain start_mon ; if first.value then start_mon=i; if last.value and value=1 then do;end_mon=i;output;end; drop i value; run;
For the second one:
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ; cards; 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011 3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011 4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013 3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013 5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 ; run; data temp; set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};date=mdy(i,1,year);output; end; format date date9.; keep OrderID date value; run; proc sort data=temp; by OrderID date;run; data want; set temp; by OrderID value notsorted; retain start_mon start_year; if first.value then do;start_mon=month(date); start_year=year(date); end; if last.value and value=1 then do;end_mon=month(date); end_year=year(date);output;end; drop value; run;
Xia Keshan
Thanks so much, Keshan! You have what I've been looking for.
Lizi
Keshan, I didn't pay attention to your comments. But yes, there can be some nonconsecutive ranges. If I only care about the start month and the end month only despite non-consecutive ranges, how would the program change? Thanks again!
You mean consecutive range's start month and end month? If it was, keep my code intact .
I mean there can be breaks during a year. Just like OrderID#6 as below:
OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
1 | 12 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
4 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
6 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 |
6 | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 2012 |
1 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
2 | 10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 2013 |
3 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2013 |
5 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
6 | 8 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 2013 |
OrderID #6 has 4 breaks, but I only need the first start, month mon1 (January) of 2011, and the last end month, month mon10 (October) of 2013
OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
6 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
6 | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 2012 |
6 | 8 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 2013 |
So, the target data is
OrderID | startmon | startyr | endmon | endyr |
1 | 1 | 2011 | 4 | 2012 |
2 | 2 | 2012 | 10 | 2013 |
3 | 10 | 2011 | 2 | 2013 |
4 | 2 | 2011 | 1 | 2012 |
5 | 4 | 2012 | 12 | 2013 |
6 | 1 | 2011 | 10 | 2013 |
Thanks for all the help, Keshan!
OK. Here is .
data have; input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ; cards; 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011 3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011 4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011 6 5 1 1 0 0 0 0 0 0 0 1 1 1 2011 1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 6 11 1 1 1 1 1 1 1 1 1 1 1 0 2012 1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013 3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013 5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013 6 8 0 1 1 1 1 1 1 0 1 1 0 0 2013 ; run; data temp(where=(value=1)); set have; array m{*} mon1-mon12; do i=1 to dim(m); value=m{i};date=mdy(i,1,year);output; end; format date date9.; keep OrderID date value; run; proc sort data=temp; by OrderID date;run; data want; set temp; by OrderID; retain start_mon start_year; if first.OrderID then do;start_mon=month(date); start_year=year(date); end; if last.OrderID then do;end_mon=month(date); end_year=year(date);output;end; drop value; run;
Xia Keshan
Thanks a lot, Keshan. You helped me tons.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.