BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rajd1
Quartz | Level 8

Hi all,

Here is my data. Data A has ID with different lines with start and end dates.

IDLINELOSTARTLOEND
1411/27/20171/17/2018
1112/1/20154/1/2016
124/5/20168/16/2016
138/23/201611/14/2016
216/4/20169/18/2016
229/19/201611/11/2016
325/6/20167/17/2016
337/18/201610/3/2016
4111/2/20155/30/2016
425/31/201612/28/2016
434/8/20181/9/2019
514/3/20156/4/2015
526/5/201512/27/2015
538/5/20169/7/2016
6110/12/20156/2/2016
6411/7/20164/20/2017
626/3/20168/27/2016
716/5/20166/15/2016

Here is data B:

IDOrder
17/16/2016
110/2/2017
27/18/2016
37/18/2016
47/26/2016
57/31/2016
67/31/2016
75/10/2016

 

The end result table is below. I am looking for the line by the order date.

The rule for the line variable is as follows:

Line = the order date is in between LOstart and LOend; This is pretty straight forward. For example: ID#1 row 1: Line 2 falls between LOstart and  LOend. i.e. 7/16/2016 falls between 4/5/2016 and 8/16/2016

 

When order date is not in between LOstart and LOend, then Line = before the start of next line. For example: ID#1 row 2: order date 10/2/2017 is considered line 3 because it occurs before the beginning of line 4 LOstart date of 11/27/2018.

Similarly with ID#5 order date is 7/31/2016 is considered line 2 because it occurs before the beginning of line 3 LOstart date of 8/5/2016

 

Finally for only 1 order and if this order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016.

I tried using SQL and only could get partial results. 

 

RowIDOrderLINE
117/16/20162
2110/2/20173
327/18/20161
437/18/20163
547/26/20162
657/31/20162
767/31/20162
875/10/20160

Thanks a ton!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Okay @rajd1  Keeping it simple-


data a;
input ID	LINE	(LOSTART	LOEND) (:mmddyy10.);
format 	LOSTART	LOEND mmddyy10.;
cards;
1	4	11/27/2017	1/17/2018
1	1	12/1/2015	4/1/2016
1	2	4/5/2016	8/16/2016
1	3	8/23/2016	11/14/2016
2	1	6/4/2016	9/18/2016
2	2	9/19/2016	11/11/2016
3	2	5/6/2016	7/17/2016
3	3	7/18/2016	10/3/2016
4	1	11/2/2015	5/30/2016
4	2	5/31/2016	12/28/2016
4	3	4/8/2018	1/9/2019
5	1	4/3/2015	6/4/2015
5	2	6/5/2015	12/27/2015
5	3	8/5/2016	9/7/2016
6	1	10/12/2015	6/2/2016
6	4	11/7/2016	4/20/2017
6	2	6/3/2016	8/27/2016
7	1	6/5/2016	6/15/2016
;

data b;
input ID	Order :mmddyy10.;
format order mmddyy10.;
cards;
1	7/16/2016
1	10/2/2017
2	7/18/2016
3	7/18/2016
4	7/26/2016
5	7/31/2016
6	7/31/2016
7	5/10/2016
;

proc sql;
create table temp(drop=_True_Flag) as
select b.id,order,max(LOSTART<=order<=LOEND) as _True_Flag,line,
min(ifn(lostart>order,line,.))-1 as _line
from b b left join a a
on a.id=b.id 
group by a.id,order 
having _True_Flag and LOSTART<=order<=LOEND
or (not _True_Flag and _line=line)
or (line=1 and _line=0 and not _True_Flag)
order by a.id,order,line;
quit;

data want;
 set temp;
 if line=1 and _line=0 then line=_line;
 drop _line;
run;
ID Order LINE 
1 07/16/2016 2 
1 10/02/2017 3 
2 07/18/2016 1 
3 07/18/2016 3 
4 07/26/2016 2 
5 07/31/2016 2 
6 07/31/2016 2 
7 05/10/2016 0 

View solution in original post

17 REPLIES 17
novinosrin
Tourmaline | Level 20

Finally for only 1 order and if this order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016?where is this in Data B?

rajd1
Quartz | Level 8
Yes sir, its the last row in data set B.
rajd1
Quartz | Level 8
Yes, for some reason the row was missing but i edited it. Thanks
novinosrin
Tourmaline | Level 20

Hi @rajd1  Alright no worries. Please try and test properly as I didn't for the reason I'm extremely tired and am about to doze off. If it doesnt work, Iet me know. I shall try and fix it tomoorow. Oh hey, in any case somebody else would have answered before I wake. Good night from CT

 

data a;
input ID	LINE	(LOSTART	LOEND) (:mmddyy10.);
format 	LOSTART	LOEND mmddyy10.;
cards;
1	4	11/27/2017	1/17/2018
1	1	12/1/2015	4/1/2016
1	2	4/5/2016	8/16/2016
1	3	8/23/2016	11/14/2016
2	1	6/4/2016	9/18/2016
2	2	9/19/2016	11/11/2016
3	2	5/6/2016	7/17/2016
3	3	7/18/2016	10/3/2016
4	1	11/2/2015	5/30/2016
4	2	5/31/2016	12/28/2016
4	3	4/8/2018	1/9/2019
5	1	4/3/2015	6/4/2015
5	2	6/5/2015	12/27/2015
5	3	8/5/2016	9/7/2016
6	1	10/12/2015	6/2/2016
6	4	11/7/2016	4/20/2017
6	2	6/3/2016	8/27/2016
7	1	6/5/2016	6/15/2016
;

data b;
input ID	Order :mmddyy10.;
format order mmddyy10.;
cards;
1	7/16/2016
1	10/2/2017
2	7/18/2016
3	7/18/2016
4	7/26/2016
5	7/31/2016
6	7/31/2016
7	5/10/2016
;
proc sql;
create table want as
select b.* , ifn(max(line)=1 and order<lostart,0,line) as line
from b b left join a a
on a.id=b.id and (LOSTART<=order<=LOEND or LOSTART>order)
group by a.id,order 
having 	LOSTART<=order<=LOEND or 
^max(LOSTART<=order<=LOEND) and min(lostart)=lostart
order by a.id,order;
quit;
rajd1
Quartz | Level 8

Hi @novinosrin , Thanks for this. Just a subtle change that i was looking for:

 

When order date is not in between LOstart and LOend, then Line = before the start of next line. For example: ID#1 row 2: order date 10/2/2017 is considered line 3 because it occurs before the beginning of line 4 LOstart date of 11/27/2018.

Similarly with ID#5 order date is 7/31/2016 is considered line 2 because it occurs before the beginning of line 3 LOstart date of 8/5/2016

novinosrin
Tourmaline | Level 20

Hi @rajd1  Good afternoon. I guess there is some problem in the communication or perhaps I am just lacking the comprehension skill. Let's do in piecemeal-

1. Once you JOIN/LOOK UP using self join, the sql processor see's this in-memory. The true flag =1 denotes the order falls between lostart and loend. So for these ID,Order combinations, it's easy to filter i.e. where true_flag=1. And for those cases, where it isn't true, your statement line= before the start of next line- is kinda confusing. Please explain the comparsion for ID1 10/02/2017 from the below

ID Order LINE True_Flag LOSTART LOEND
1 07/16/2016 1 0 12/01/2015 04/01/2016
1 07/16/2016 2 1 04/05/2016 08/16/2016
1 07/16/2016 3 0 08/23/2016 11/14/2016
1 07/16/2016 4 0 11/27/2017 01/17/2018
1 10/02/2017 1 0 12/01/2015 04/01/2016
1 10/02/2017 2 0 04/05/2016 08/16/2016
1 10/02/2017 3 0 08/23/2016 11/14/2016
1 10/02/2017 4 0 11/27/2017 01/17/2018
2 07/18/2016 1 1 06/04/2016 09/18/2016
2 07/18/2016 2 0 09/19/2016 11/11/2016
3 07/18/2016 2 0 05/06/2016 07/17/2016
3 07/18/2016 3 1 07/18/2016 10/03/2016
4 07/26/2016 1 0 11/02/2015 05/30/2016
4 07/26/2016 2 1 05/31/2016 12/28/2016
4 07/26/2016 3 0 04/08/2018 01/09/2019
5 07/31/2016 1 0 04/03/2015 06/04/2015
5 07/31/2016 2 0 06/05/2015 12/27/2015
5 07/31/2016 3 0 08/05/2016 09/07/2016
6 07/31/2016 1 0 10/12/2015 06/02/2016
6 07/31/2016 2 1 06/03/2016 08/27/2016
6 07/31/2016 4 0 11/07/2016 04/20/2017
7 05/10/2016 1 0 06/05/2016 06/15/2016

 

rajd1
Quartz | Level 8

Good afternoon,
This is great! Hope i explain this better this time. The first rule between dates is good
So, say for ID#1 orderdate= 10/02/2017 will be Line = 3; It falls between line 3 LOEND 11/14/2016 and line 4 LOSTART 11/27/2017 and is still considered line 3.
Again, for ID#5, orderdate = 7/31/2016 will be line 2; it falls between line 2 LOEND 12/27/2015 and line 3 LOSTART 08.05/2016 and is still considered line 2.
You had this in the solution and it works great: If order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016.
I have the final solution below.

IDOrderLINE
17/16/20162
110/2/20173
27/18/20161
37/18/20163
47/26/20162
57/31/20162
67/31/20162
75/10/20160



novinosrin
Tourmaline | Level 20

Much better Sir @rajd1 .  lol. Thank you!!!. Hmm, you "have"-'I have the final solution below.'

 

Do you still want the solution or you have done writing it?

rajd1
Quartz | Level 8

@novinosrin Can I please get your modified code? Thank you sir 🙂

novinosrin
Tourmaline | Level 20

Okay @rajd1  Keeping it simple-


data a;
input ID	LINE	(LOSTART	LOEND) (:mmddyy10.);
format 	LOSTART	LOEND mmddyy10.;
cards;
1	4	11/27/2017	1/17/2018
1	1	12/1/2015	4/1/2016
1	2	4/5/2016	8/16/2016
1	3	8/23/2016	11/14/2016
2	1	6/4/2016	9/18/2016
2	2	9/19/2016	11/11/2016
3	2	5/6/2016	7/17/2016
3	3	7/18/2016	10/3/2016
4	1	11/2/2015	5/30/2016
4	2	5/31/2016	12/28/2016
4	3	4/8/2018	1/9/2019
5	1	4/3/2015	6/4/2015
5	2	6/5/2015	12/27/2015
5	3	8/5/2016	9/7/2016
6	1	10/12/2015	6/2/2016
6	4	11/7/2016	4/20/2017
6	2	6/3/2016	8/27/2016
7	1	6/5/2016	6/15/2016
;

data b;
input ID	Order :mmddyy10.;
format order mmddyy10.;
cards;
1	7/16/2016
1	10/2/2017
2	7/18/2016
3	7/18/2016
4	7/26/2016
5	7/31/2016
6	7/31/2016
7	5/10/2016
;

proc sql;
create table temp(drop=_True_Flag) as
select b.id,order,max(LOSTART<=order<=LOEND) as _True_Flag,line,
min(ifn(lostart>order,line,.))-1 as _line
from b b left join a a
on a.id=b.id 
group by a.id,order 
having _True_Flag and LOSTART<=order<=LOEND
or (not _True_Flag and _line=line)
or (line=1 and _line=0 and not _True_Flag)
order by a.id,order,line;
quit;

data want;
 set temp;
 if line=1 and _line=0 then line=_line;
 drop _line;
run;
ID Order LINE 
1 07/16/2016 2 
1 10/02/2017 3 
2 07/18/2016 1 
3 07/18/2016 3 
4 07/26/2016 2 
5 07/31/2016 2 
6 07/31/2016 2 
7 05/10/2016 0 

rajd1
Quartz | Level 8
Works fantastic , thanks a lot
Eigth_Below
Calcite | Level 5

The rule for ID#5 as well as ID#1 row2, so is the line of ID#5  line 3?I am not sure that I understood what you said. Here is my code.

proc sort data=a ;
by id lostart;
run;

data a1;
set a;
by id lostart;
lostart1=lag(loend)+1;
if first.id then lostart1=lostart;
format lagstdat mmddyy10.;
run;


proc sql;
create table b1 as
select b.id,b.order
,case when not missing(a.line) then a.line
else 0
end as line
from b
left join a1 as a
on b.id=a.id and b.order between a.lostart1 and a.loend
order by b.id,b.order
;
quit;

 

rajd1
Quartz | Level 8
Hi, This works great.
I need a subtle change:
When order date is not in between LOstart and LOend, then Line = before the start of next line. For example: ID#1 row 2: order date 10/2/2017 is considered line 3 because it occurs before the beginning of line 4 LOstart date of 11/27/2018.

Similarly with ID#5 order date is 7/31/2016 is considered line 2 because it occurs before the beginning of line 3 LOstart date of 8/5/2016

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 17 replies
  • 1765 views
  • 5 likes
  • 4 in conversation