Hi all,
Here is my data. Data A has ID with different lines with start and end dates.
ID | LINE | LOSTART | LOEND |
1 | 4 | 11/27/2017 | 1/17/2018 |
1 | 1 | 12/1/2015 | 4/1/2016 |
1 | 2 | 4/5/2016 | 8/16/2016 |
1 | 3 | 8/23/2016 | 11/14/2016 |
2 | 1 | 6/4/2016 | 9/18/2016 |
2 | 2 | 9/19/2016 | 11/11/2016 |
3 | 2 | 5/6/2016 | 7/17/2016 |
3 | 3 | 7/18/2016 | 10/3/2016 |
4 | 1 | 11/2/2015 | 5/30/2016 |
4 | 2 | 5/31/2016 | 12/28/2016 |
4 | 3 | 4/8/2018 | 1/9/2019 |
5 | 1 | 4/3/2015 | 6/4/2015 |
5 | 2 | 6/5/2015 | 12/27/2015 |
5 | 3 | 8/5/2016 | 9/7/2016 |
6 | 1 | 10/12/2015 | 6/2/2016 |
6 | 4 | 11/7/2016 | 4/20/2017 |
6 | 2 | 6/3/2016 | 8/27/2016 |
7 | 1 | 6/5/2016 | 6/15/2016 |
Here is data B:
ID | Order |
1 | 7/16/2016 |
1 | 10/2/2017 |
2 | 7/18/2016 |
3 | 7/18/2016 |
4 | 7/26/2016 |
5 | 7/31/2016 |
6 | 7/31/2016 |
7 | 5/10/2016 |
The end result table is below. I am looking for the line by the order date.
The rule for the line variable is as follows:
Line = the order date is in between LOstart and LOend; This is pretty straight forward. For example: ID#1 row 1: Line 2 falls between LOstart and LOend. i.e. 7/16/2016 falls between 4/5/2016 and 8/16/2016
When order date is not in between LOstart and LOend, then Line = before the start of next line. For example: ID#1 row 2: order date 10/2/2017 is considered line 3 because it occurs before the beginning of line 4 LOstart date of 11/27/2018.
Similarly with ID#5 order date is 7/31/2016 is considered line 2 because it occurs before the beginning of line 3 LOstart date of 8/5/2016
Finally for only 1 order and if this order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016.
I tried using SQL and only could get partial results.
Row | ID | Order | LINE |
1 | 1 | 7/16/2016 | 2 |
2 | 1 | 10/2/2017 | 3 |
3 | 2 | 7/18/2016 | 1 |
4 | 3 | 7/18/2016 | 3 |
5 | 4 | 7/26/2016 | 2 |
6 | 5 | 7/31/2016 | 2 |
7 | 6 | 7/31/2016 | 2 |
8 | 7 | 5/10/2016 | 0 |
Thanks a ton!
Okay @rajd1 Keeping it simple-
data a;
input ID LINE (LOSTART LOEND) (:mmddyy10.);
format LOSTART LOEND mmddyy10.;
cards;
1 4 11/27/2017 1/17/2018
1 1 12/1/2015 4/1/2016
1 2 4/5/2016 8/16/2016
1 3 8/23/2016 11/14/2016
2 1 6/4/2016 9/18/2016
2 2 9/19/2016 11/11/2016
3 2 5/6/2016 7/17/2016
3 3 7/18/2016 10/3/2016
4 1 11/2/2015 5/30/2016
4 2 5/31/2016 12/28/2016
4 3 4/8/2018 1/9/2019
5 1 4/3/2015 6/4/2015
5 2 6/5/2015 12/27/2015
5 3 8/5/2016 9/7/2016
6 1 10/12/2015 6/2/2016
6 4 11/7/2016 4/20/2017
6 2 6/3/2016 8/27/2016
7 1 6/5/2016 6/15/2016
;
data b;
input ID Order :mmddyy10.;
format order mmddyy10.;
cards;
1 7/16/2016
1 10/2/2017
2 7/18/2016
3 7/18/2016
4 7/26/2016
5 7/31/2016
6 7/31/2016
7 5/10/2016
;
proc sql;
create table temp(drop=_True_Flag) as
select b.id,order,max(LOSTART<=order<=LOEND) as _True_Flag,line,
min(ifn(lostart>order,line,.))-1 as _line
from b b left join a a
on a.id=b.id
group by a.id,order
having _True_Flag and LOSTART<=order<=LOEND
or (not _True_Flag and _line=line)
or (line=1 and _line=0 and not _True_Flag)
order by a.id,order,line;
quit;
data want;
set temp;
if line=1 and _line=0 then line=_line;
drop _line;
run;
ID Order LINE
1 07/16/2016 2
1 10/02/2017 3
2 07/18/2016 1
3 07/18/2016 3
4 07/26/2016 2
5 07/31/2016 2
6 07/31/2016 2
7 05/10/2016 0
Finally for only 1 order and if this order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016?where is this in Data B?
Hi @rajd1 Alright no worries. Please try and test properly as I didn't for the reason I'm extremely tired and am about to doze off. If it doesnt work, Iet me know. I shall try and fix it tomoorow. Oh hey, in any case somebody else would have answered before I wake. Good night from CT
data a;
input ID LINE (LOSTART LOEND) (:mmddyy10.);
format LOSTART LOEND mmddyy10.;
cards;
1 4 11/27/2017 1/17/2018
1 1 12/1/2015 4/1/2016
1 2 4/5/2016 8/16/2016
1 3 8/23/2016 11/14/2016
2 1 6/4/2016 9/18/2016
2 2 9/19/2016 11/11/2016
3 2 5/6/2016 7/17/2016
3 3 7/18/2016 10/3/2016
4 1 11/2/2015 5/30/2016
4 2 5/31/2016 12/28/2016
4 3 4/8/2018 1/9/2019
5 1 4/3/2015 6/4/2015
5 2 6/5/2015 12/27/2015
5 3 8/5/2016 9/7/2016
6 1 10/12/2015 6/2/2016
6 4 11/7/2016 4/20/2017
6 2 6/3/2016 8/27/2016
7 1 6/5/2016 6/15/2016
;
data b;
input ID Order :mmddyy10.;
format order mmddyy10.;
cards;
1 7/16/2016
1 10/2/2017
2 7/18/2016
3 7/18/2016
4 7/26/2016
5 7/31/2016
6 7/31/2016
7 5/10/2016
;
proc sql;
create table want as
select b.* , ifn(max(line)=1 and order<lostart,0,line) as line
from b b left join a a
on a.id=b.id and (LOSTART<=order<=LOEND or LOSTART>order)
group by a.id,order
having LOSTART<=order<=LOEND or
^max(LOSTART<=order<=LOEND) and min(lostart)=lostart
order by a.id,order;
quit;
Hi @novinosrin , Thanks for this. Just a subtle change that i was looking for:
When order date is not in between LOstart and LOend, then Line = before the start of next line. For example: ID#1 row 2: order date 10/2/2017 is considered line 3 because it occurs before the beginning of line 4 LOstart date of 11/27/2018.
Similarly with ID#5 order date is 7/31/2016 is considered line 2 because it occurs before the beginning of line 3 LOstart date of 8/5/2016
Hi @rajd1 Good afternoon. I guess there is some problem in the communication or perhaps I am just lacking the comprehension skill. Let's do in piecemeal-
1. Once you JOIN/LOOK UP using self join, the sql processor see's this in-memory. The true flag =1 denotes the order falls between lostart and loend. So for these ID,Order combinations, it's easy to filter i.e. where true_flag=1. And for those cases, where it isn't true, your statement line= before the start of next line- is kinda confusing. Please explain the comparsion for ID1 10/02/2017 from the below
ID | Order | LINE | True_Flag | LOSTART | LOEND |
---|---|---|---|---|---|
1 | 07/16/2016 | 1 | 0 | 12/01/2015 | 04/01/2016 |
1 | 07/16/2016 | 2 | 1 | 04/05/2016 | 08/16/2016 |
1 | 07/16/2016 | 3 | 0 | 08/23/2016 | 11/14/2016 |
1 | 07/16/2016 | 4 | 0 | 11/27/2017 | 01/17/2018 |
1 | 10/02/2017 | 1 | 0 | 12/01/2015 | 04/01/2016 |
1 | 10/02/2017 | 2 | 0 | 04/05/2016 | 08/16/2016 |
1 | 10/02/2017 | 3 | 0 | 08/23/2016 | 11/14/2016 |
1 | 10/02/2017 | 4 | 0 | 11/27/2017 | 01/17/2018 |
2 | 07/18/2016 | 1 | 1 | 06/04/2016 | 09/18/2016 |
2 | 07/18/2016 | 2 | 0 | 09/19/2016 | 11/11/2016 |
3 | 07/18/2016 | 2 | 0 | 05/06/2016 | 07/17/2016 |
3 | 07/18/2016 | 3 | 1 | 07/18/2016 | 10/03/2016 |
4 | 07/26/2016 | 1 | 0 | 11/02/2015 | 05/30/2016 |
4 | 07/26/2016 | 2 | 1 | 05/31/2016 | 12/28/2016 |
4 | 07/26/2016 | 3 | 0 | 04/08/2018 | 01/09/2019 |
5 | 07/31/2016 | 1 | 0 | 04/03/2015 | 06/04/2015 |
5 | 07/31/2016 | 2 | 0 | 06/05/2015 | 12/27/2015 |
5 | 07/31/2016 | 3 | 0 | 08/05/2016 | 09/07/2016 |
6 | 07/31/2016 | 1 | 0 | 10/12/2015 | 06/02/2016 |
6 | 07/31/2016 | 2 | 1 | 06/03/2016 | 08/27/2016 |
6 | 07/31/2016 | 4 | 0 | 11/07/2016 | 04/20/2017 |
7 | 05/10/2016 | 1 | 0 | 06/05/2016 | 06/15/2016 |
Good afternoon,
This is great! Hope i explain this better this time. The first rule between dates is good
So, say for ID#1 orderdate= 10/02/2017 will be Line = 3; It falls between line 3 LOEND 11/14/2016 and line 4 LOSTART 11/27/2017 and is still considered line 3.
Again, for ID#5, orderdate = 7/31/2016 will be line 2; it falls between line 2 LOEND 12/27/2015 and line 3 LOSTART 08.05/2016 and is still considered line 2.
You had this in the solution and it works great: If order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016.
I have the final solution below.
ID | Order | LINE |
1 | 7/16/2016 | 2 |
1 | 10/2/2017 | 3 |
2 | 7/18/2016 | 1 |
3 | 7/18/2016 | 3 |
4 | 7/26/2016 | 2 |
5 | 7/31/2016 | 2 |
6 | 7/31/2016 | 2 |
7 | 5/10/2016 | 0 |
Much better Sir @rajd1 . lol. Thank you!!!. Hmm, you "have"-'I have the final solution below.'
Do you still want the solution or you have done writing it?
@novinosrin Can I please get your modified code? Thank you sir 🙂
Okay @rajd1 Keeping it simple-
data a;
input ID LINE (LOSTART LOEND) (:mmddyy10.);
format LOSTART LOEND mmddyy10.;
cards;
1 4 11/27/2017 1/17/2018
1 1 12/1/2015 4/1/2016
1 2 4/5/2016 8/16/2016
1 3 8/23/2016 11/14/2016
2 1 6/4/2016 9/18/2016
2 2 9/19/2016 11/11/2016
3 2 5/6/2016 7/17/2016
3 3 7/18/2016 10/3/2016
4 1 11/2/2015 5/30/2016
4 2 5/31/2016 12/28/2016
4 3 4/8/2018 1/9/2019
5 1 4/3/2015 6/4/2015
5 2 6/5/2015 12/27/2015
5 3 8/5/2016 9/7/2016
6 1 10/12/2015 6/2/2016
6 4 11/7/2016 4/20/2017
6 2 6/3/2016 8/27/2016
7 1 6/5/2016 6/15/2016
;
data b;
input ID Order :mmddyy10.;
format order mmddyy10.;
cards;
1 7/16/2016
1 10/2/2017
2 7/18/2016
3 7/18/2016
4 7/26/2016
5 7/31/2016
6 7/31/2016
7 5/10/2016
;
proc sql;
create table temp(drop=_True_Flag) as
select b.id,order,max(LOSTART<=order<=LOEND) as _True_Flag,line,
min(ifn(lostart>order,line,.))-1 as _line
from b b left join a a
on a.id=b.id
group by a.id,order
having _True_Flag and LOSTART<=order<=LOEND
or (not _True_Flag and _line=line)
or (line=1 and _line=0 and not _True_Flag)
order by a.id,order,line;
quit;
data want;
set temp;
if line=1 and _line=0 then line=_line;
drop _line;
run;
ID Order LINE
1 07/16/2016 2
1 10/02/2017 3
2 07/18/2016 1
3 07/18/2016 3
4 07/26/2016 2
5 07/31/2016 2
6 07/31/2016 2
7 05/10/2016 0
The rule for ID#5 as well as ID#1 row2, so is the line of ID#5 line 3?I am not sure that I understood what you said. Here is my code.
proc sort data=a ;
by id lostart;
run;
data a1;
set a;
by id lostart;
lostart1=lag(loend)+1;
if first.id then lostart1=lostart;
format lagstdat mmddyy10.;
run;
proc sql;
create table b1 as
select b.id,b.order
,case when not missing(a.line) then a.line
else 0
end as line
from b
left join a1 as a
on b.id=a.id and b.order between a.lostart1 and a.loend
order by b.id,b.order
;
quit;
According to your rule, the result for ID 5 should be 3, not 2. Please revise your intended result.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.