Hi all,
I have to admit my do-loop skill is too weak. I need to sort out the first and last months when shipping was made for each year within a year. As shown below, the columns of startmon and endmon are my objective variables I want.
| OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year | startmon | endmon |
| 1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 | 1 | 4 |
| 2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 | 2 | 12 |
| 3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 | 1 | 12 |
| 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 | 1 | 1 |
| 5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 | 4 | 9 |
Also, for the orders across years as below, I need to abstract a separate file with startmon, endmon, startyr and endyr.
Original data:
| OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
| 1 | 12 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
| 3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
| 4 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
| 1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
| 2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
| 3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
| 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
| 5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 |
| 1 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
| 2 | 10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 2013 |
| 3 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2013 |
| 5 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
Objective data:
| OrderID | startmon | endmon | staryr | endyr |
| 1 | 1 | 4 | 2011 | 2012 |
| 2 | 2 | 10 | 2012 | 2013 |
| 3 | 10 | 2 | 2011 | 2013 |
| 4 | 2 | 1 | 2011 | 2012 |
| 5 | 4 | 12 | 2012 | 2013 |
Your ideas are greatly appreciated!
Lizi
I want know if there are some non-consecutive range ,like :
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 1 1 1 1 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
For the first one:
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
run;
data temp;
set have;
array m{*} mon1-mon12;
do i=1 to dim(m);
value=m{i};output;
end;
keep OrderID i value;
run;
data want;
set temp;
by OrderID value notsorted;
retain start_mon ;
if first.value then start_mon=i;
if last.value and value=1 then do;end_mon=i;output;end;
drop i value;
run;
For the second one:
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ;
cards;
1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011
3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011
4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011
1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012
1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013
2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013
3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013
5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013
;
run;
data temp;
set have;
array m{*} mon1-mon12;
do i=1 to dim(m);
value=m{i};date=mdy(i,1,year);output;
end;
format date date9.;
keep OrderID date value;
run;
proc sort data=temp; by OrderID date;run;
data want;
set temp;
by OrderID value notsorted;
retain start_mon start_year;
if first.value then do;start_mon=month(date); start_year=year(date); end;
if last.value and value=1 then do;end_mon=month(date); end_year=year(date);output;end;
drop value;
run;
Xia Keshan
Here's the first part, first/last month. I'm not sure about the logic of first/last year? You just want the first/last year per order id? Which start/end month should align with the years?
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
run;
data want;
set have;
length string $12.;
array mnth(12) mon1-mon12;
string=cats(of mon1-mon12);
first_month=find(string, '1');
last_month=12-find(reverse(string), '1')+1;
run;
Thanks a lot, Reeza.
You addressed my first question.
For the second part, I need to observe the months across several years and find the first and last months over the years as well as the first and last years of shipping per order. I think I can use your idea once I transform the data from long to wide. Thanks again!
I want know if there are some non-consecutive range ,like :
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 1 1 1 1 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
For the first one:
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year startmon endmon;
cards;
1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012 1 4
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012 2 12
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012 1 12
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012 1 1
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012 4 9
;
run;
data temp;
set have;
array m{*} mon1-mon12;
do i=1 to dim(m);
value=m{i};output;
end;
keep OrderID i value;
run;
data want;
set temp;
by OrderID value notsorted;
retain start_mon ;
if first.value then start_mon=i;
if last.value and value=1 then do;end_mon=i;output;end;
drop i value;
run;
For the second one:
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ;
cards;
1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011
3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011
4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011
1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012
1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013
2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013
3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013
5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013
;
run;
data temp;
set have;
array m{*} mon1-mon12;
do i=1 to dim(m);
value=m{i};date=mdy(i,1,year);output;
end;
format date date9.;
keep OrderID date value;
run;
proc sort data=temp; by OrderID date;run;
data want;
set temp;
by OrderID value notsorted;
retain start_mon start_year;
if first.value then do;start_mon=month(date); start_year=year(date); end;
if last.value and value=1 then do;end_mon=month(date); end_year=year(date);output;end;
drop value;
run;
Xia Keshan
Thanks so much, Keshan! You have what I've been looking for.
Lizi
Keshan, I didn't pay attention to your comments. But yes, there can be some nonconsecutive ranges. If I only care about the start month and the end month only despite non-consecutive ranges, how would the program change? Thanks again!
You mean consecutive range's start month and end month? If it was, keep my code intact .
I mean there can be breaks during a year. Just like OrderID#6 as below:
| OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
| 1 | 12 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
| 3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
| 4 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2011 |
| 6 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
| 1 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
| 2 | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
| 3 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2012 |
| 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 |
| 5 | 6 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 2012 |
| 6 | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 2012 |
| 1 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
| 2 | 10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 2013 |
| 3 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2013 |
| 5 | 12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2013 |
| 6 | 8 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 2013 |
OrderID #6 has 4 breaks, but I only need the first start, month mon1 (January) of 2011, and the last end month, month mon10 (October) of 2013
| OrderID | mons | mon1 | mon2 | mon3 | mon4 | mon5 | mon6 | mon7 | mon8 | mon9 | mon10 | mon11 | mon12 | year |
| 6 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2011 |
| 6 | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 2012 |
| 6 | 8 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 2013 |
So, the target data is
| OrderID | startmon | startyr | endmon | endyr |
| 1 | 1 | 2011 | 4 | 2012 |
| 2 | 2 | 2012 | 10 | 2013 |
| 3 | 10 | 2011 | 2 | 2013 |
| 4 | 2 | 2011 | 1 | 2012 |
| 5 | 4 | 2012 | 12 | 2013 |
| 6 | 1 | 2011 | 10 | 2013 |
Thanks for all the help, Keshan!
OK. Here is .
data have;
input OrderID mons mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 year ;
cards;
1 12 1 1 1 1 1 1 1 1 1 1 1 1 2011
3 3 0 0 0 0 0 0 0 0 0 1 1 1 2011
4 11 0 1 1 1 1 1 1 1 1 1 1 1 2011
6 5 1 1 0 0 0 0 0 0 0 1 1 1 2011
1 4 1 1 1 1 0 0 0 0 0 0 0 0 2012
2 11 0 1 1 1 1 1 1 1 1 1 1 1 2012
3 12 1 1 1 1 1 1 1 1 1 1 1 1 2012
4 1 1 0 0 0 0 0 0 0 0 0 0 0 2012
5 6 0 0 0 1 1 1 1 1 1 0 0 0 2012
6 11 1 1 1 1 1 1 1 1 1 1 1 0 2012
1 12 1 1 1 1 1 1 1 1 1 1 1 1 2013
2 10 1 1 1 1 1 1 1 1 1 1 0 0 2013
3 2 1 1 0 0 0 0 0 0 0 0 0 0 2013
5 12 1 1 1 1 1 1 1 1 1 1 1 1 2013
6 8 0 1 1 1 1 1 1 0 1 1 0 0 2013
;
run;
data temp(where=(value=1));
set have;
array m{*} mon1-mon12;
do i=1 to dim(m);
value=m{i};date=mdy(i,1,year);output;
end;
format date date9.;
keep OrderID date value;
run;
proc sort data=temp; by OrderID date;run;
data want;
set temp;
by OrderID;
retain start_mon start_year;
if first.OrderID then do;start_mon=month(date); start_year=year(date); end;
if last.OrderID then do;end_mon=month(date); end_year=year(date);output;end;
drop value;
run;
Xia Keshan
Thanks a lot, Keshan. You helped me tons.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.