BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rajd1
Quartz | Level 8

Hi all,

Here is my data. Data A has ID with different lines with start and end dates.

IDLINELOSTARTLOEND
1411/27/20171/17/2018
1112/1/20154/1/2016
124/5/20168/16/2016
138/23/201611/14/2016
216/4/20169/18/2016
229/19/201611/11/2016
325/6/20167/17/2016
337/18/201610/3/2016
4111/2/20155/30/2016
425/31/201612/28/2016
434/8/20181/9/2019
514/3/20156/4/2015
526/5/201512/27/2015
538/5/20169/7/2016
6110/12/20156/2/2016
6411/7/20164/20/2017
626/3/20168/27/2016
716/5/20166/15/2016

Here is data B:

IDOrder
17/16/2016
110/2/2017
27/18/2016
37/18/2016
47/26/2016
57/31/2016
67/31/2016
75/10/2016

 

The end result table is below. I am looking for the line by the order date.

The rule for the line variable is as follows:

Line = the order date is in between LOstart and LOend; This is pretty straight forward. For example: ID#1 row 1: Line 2 falls between LOstart and  LOend. i.e. 7/16/2016 falls between 4/5/2016 and 8/16/2016

 

When order date is not in between LOstart and LOend, then Line = before the start of next line. For example: ID#1 row 2: order date 10/2/2017 is considered line 3 because it occurs before the beginning of line 4 LOstart date of 11/27/2018.

Similarly with ID#5 order date is 7/31/2016 is considered line 2 because it occurs before the beginning of line 3 LOstart date of 8/5/2016

 

Finally for only 1 order and if this order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016.

I tried using SQL and only could get partial results. 

 

RowIDOrderLINE
117/16/20162
2110/2/20173
327/18/20161
437/18/20163
547/26/20162
657/31/20162
767/31/20162
875/10/20160

Thanks a ton!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Okay @rajd1  Keeping it simple-


data a;
input ID	LINE	(LOSTART	LOEND) (:mmddyy10.);
format 	LOSTART	LOEND mmddyy10.;
cards;
1	4	11/27/2017	1/17/2018
1	1	12/1/2015	4/1/2016
1	2	4/5/2016	8/16/2016
1	3	8/23/2016	11/14/2016
2	1	6/4/2016	9/18/2016
2	2	9/19/2016	11/11/2016
3	2	5/6/2016	7/17/2016
3	3	7/18/2016	10/3/2016
4	1	11/2/2015	5/30/2016
4	2	5/31/2016	12/28/2016
4	3	4/8/2018	1/9/2019
5	1	4/3/2015	6/4/2015
5	2	6/5/2015	12/27/2015
5	3	8/5/2016	9/7/2016
6	1	10/12/2015	6/2/2016
6	4	11/7/2016	4/20/2017
6	2	6/3/2016	8/27/2016
7	1	6/5/2016	6/15/2016
;

data b;
input ID	Order :mmddyy10.;
format order mmddyy10.;
cards;
1	7/16/2016
1	10/2/2017
2	7/18/2016
3	7/18/2016
4	7/26/2016
5	7/31/2016
6	7/31/2016
7	5/10/2016
;

proc sql;
create table temp(drop=_True_Flag) as
select b.id,order,max(LOSTART<=order<=LOEND) as _True_Flag,line,
min(ifn(lostart>order,line,.))-1 as _line
from b b left join a a
on a.id=b.id 
group by a.id,order 
having _True_Flag and LOSTART<=order<=LOEND
or (not _True_Flag and _line=line)
or (line=1 and _line=0 and not _True_Flag)
order by a.id,order,line;
quit;

data want;
 set temp;
 if line=1 and _line=0 then line=_line;
 drop _line;
run;
ID Order LINE 
1 07/16/2016 2 
1 10/02/2017 3 
2 07/18/2016 1 
3 07/18/2016 3 
4 07/26/2016 2 
5 07/31/2016 2 
6 07/31/2016 2 
7 05/10/2016 0 

View solution in original post

17 REPLIES 17
novinosrin
Tourmaline | Level 20

Finally for only 1 order and if this order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016?where is this in Data B?

rajd1
Quartz | Level 8
Yes sir, its the last row in data set B.
rajd1
Quartz | Level 8
Yes, for some reason the row was missing but i edited it. Thanks
novinosrin
Tourmaline | Level 20

Hi @rajd1  Alright no worries. Please try and test properly as I didn't for the reason I'm extremely tired and am about to doze off. If it doesnt work, Iet me know. I shall try and fix it tomoorow. Oh hey, in any case somebody else would have answered before I wake. Good night from CT

 

data a;
input ID	LINE	(LOSTART	LOEND) (:mmddyy10.);
format 	LOSTART	LOEND mmddyy10.;
cards;
1	4	11/27/2017	1/17/2018
1	1	12/1/2015	4/1/2016
1	2	4/5/2016	8/16/2016
1	3	8/23/2016	11/14/2016
2	1	6/4/2016	9/18/2016
2	2	9/19/2016	11/11/2016
3	2	5/6/2016	7/17/2016
3	3	7/18/2016	10/3/2016
4	1	11/2/2015	5/30/2016
4	2	5/31/2016	12/28/2016
4	3	4/8/2018	1/9/2019
5	1	4/3/2015	6/4/2015
5	2	6/5/2015	12/27/2015
5	3	8/5/2016	9/7/2016
6	1	10/12/2015	6/2/2016
6	4	11/7/2016	4/20/2017
6	2	6/3/2016	8/27/2016
7	1	6/5/2016	6/15/2016
;

data b;
input ID	Order :mmddyy10.;
format order mmddyy10.;
cards;
1	7/16/2016
1	10/2/2017
2	7/18/2016
3	7/18/2016
4	7/26/2016
5	7/31/2016
6	7/31/2016
7	5/10/2016
;
proc sql;
create table want as
select b.* , ifn(max(line)=1 and order<lostart,0,line) as line
from b b left join a a
on a.id=b.id and (LOSTART<=order<=LOEND or LOSTART>order)
group by a.id,order 
having 	LOSTART<=order<=LOEND or 
^max(LOSTART<=order<=LOEND) and min(lostart)=lostart
order by a.id,order;
quit;
rajd1
Quartz | Level 8

Hi @novinosrin , Thanks for this. Just a subtle change that i was looking for:

 

When order date is not in between LOstart and LOend, then Line = before the start of next line. For example: ID#1 row 2: order date 10/2/2017 is considered line 3 because it occurs before the beginning of line 4 LOstart date of 11/27/2018.

Similarly with ID#5 order date is 7/31/2016 is considered line 2 because it occurs before the beginning of line 3 LOstart date of 8/5/2016

novinosrin
Tourmaline | Level 20

Hi @rajd1  Good afternoon. I guess there is some problem in the communication or perhaps I am just lacking the comprehension skill. Let's do in piecemeal-

1. Once you JOIN/LOOK UP using self join, the sql processor see's this in-memory. The true flag =1 denotes the order falls between lostart and loend. So for these ID,Order combinations, it's easy to filter i.e. where true_flag=1. And for those cases, where it isn't true, your statement line= before the start of next line- is kinda confusing. Please explain the comparsion for ID1 10/02/2017 from the below

ID Order LINE True_Flag LOSTART LOEND
1 07/16/2016 1 0 12/01/2015 04/01/2016
1 07/16/2016 2 1 04/05/2016 08/16/2016
1 07/16/2016 3 0 08/23/2016 11/14/2016
1 07/16/2016 4 0 11/27/2017 01/17/2018
1 10/02/2017 1 0 12/01/2015 04/01/2016
1 10/02/2017 2 0 04/05/2016 08/16/2016
1 10/02/2017 3 0 08/23/2016 11/14/2016
1 10/02/2017 4 0 11/27/2017 01/17/2018
2 07/18/2016 1 1 06/04/2016 09/18/2016
2 07/18/2016 2 0 09/19/2016 11/11/2016
3 07/18/2016 2 0 05/06/2016 07/17/2016
3 07/18/2016 3 1 07/18/2016 10/03/2016
4 07/26/2016 1 0 11/02/2015 05/30/2016
4 07/26/2016 2 1 05/31/2016 12/28/2016
4 07/26/2016 3 0 04/08/2018 01/09/2019
5 07/31/2016 1 0 04/03/2015 06/04/2015
5 07/31/2016 2 0 06/05/2015 12/27/2015
5 07/31/2016 3 0 08/05/2016 09/07/2016
6 07/31/2016 1 0 10/12/2015 06/02/2016
6 07/31/2016 2 1 06/03/2016 08/27/2016
6 07/31/2016 4 0 11/07/2016 04/20/2017
7 05/10/2016 1 0 06/05/2016 06/15/2016

 

rajd1
Quartz | Level 8

Good afternoon,
This is great! Hope i explain this better this time. The first rule between dates is good
So, say for ID#1 orderdate= 10/02/2017 will be Line = 3; It falls between line 3 LOEND 11/14/2016 and line 4 LOSTART 11/27/2017 and is still considered line 3.
Again, for ID#5, orderdate = 7/31/2016 will be line 2; it falls between line 2 LOEND 12/27/2015 and line 3 LOSTART 08.05/2016 and is still considered line 2.
You had this in the solution and it works great: If order occurs before LINE 1 then it is considered 0. For Example: ID#7 row 8: Line is considered 0 because the order date of 5/10/2016 occurs before LOSTART date of 6/5/2016.
I have the final solution below.

IDOrderLINE
17/16/20162
110/2/20173
27/18/20161
37/18/20163
47/26/20162
57/31/20162
67/31/20162
75/10/20160



novinosrin
Tourmaline | Level 20

Much better Sir @rajd1 .  lol. Thank you!!!. Hmm, you "have"-'I have the final solution below.'

 

Do you still want the solution or you have done writing it?

rajd1
Quartz | Level 8

@novinosrin Can I please get your modified code? Thank you sir 🙂

novinosrin
Tourmaline | Level 20

Okay @rajd1  Keeping it simple-


data a;
input ID	LINE	(LOSTART	LOEND) (:mmddyy10.);
format 	LOSTART	LOEND mmddyy10.;
cards;
1	4	11/27/2017	1/17/2018
1	1	12/1/2015	4/1/2016
1	2	4/5/2016	8/16/2016
1	3	8/23/2016	11/14/2016
2	1	6/4/2016	9/18/2016
2	2	9/19/2016	11/11/2016
3	2	5/6/2016	7/17/2016
3	3	7/18/2016	10/3/2016
4	1	11/2/2015	5/30/2016
4	2	5/31/2016	12/28/2016
4	3	4/8/2018	1/9/2019
5	1	4/3/2015	6/4/2015
5	2	6/5/2015	12/27/2015
5	3	8/5/2016	9/7/2016
6	1	10/12/2015	6/2/2016
6	4	11/7/2016	4/20/2017
6	2	6/3/2016	8/27/2016
7	1	6/5/2016	6/15/2016
;

data b;
input ID	Order :mmddyy10.;
format order mmddyy10.;
cards;
1	7/16/2016
1	10/2/2017
2	7/18/2016
3	7/18/2016
4	7/26/2016
5	7/31/2016
6	7/31/2016
7	5/10/2016
;

proc sql;
create table temp(drop=_True_Flag) as
select b.id,order,max(LOSTART<=order<=LOEND) as _True_Flag,line,
min(ifn(lostart>order,line,.))-1 as _line
from b b left join a a
on a.id=b.id 
group by a.id,order 
having _True_Flag and LOSTART<=order<=LOEND
or (not _True_Flag and _line=line)
or (line=1 and _line=0 and not _True_Flag)
order by a.id,order,line;
quit;

data want;
 set temp;
 if line=1 and _line=0 then line=_line;
 drop _line;
run;
ID Order LINE 
1 07/16/2016 2 
1 10/02/2017 3 
2 07/18/2016 1 
3 07/18/2016 3 
4 07/26/2016 2 
5 07/31/2016 2 
6 07/31/2016 2 
7 05/10/2016 0 

rajd1
Quartz | Level 8
Works fantastic , thanks a lot
Eigth_Below
Calcite | Level 5

The rule for ID#5 as well as ID#1 row2, so is the line of ID#5  line 3?I am not sure that I understood what you said. Here is my code.

proc sort data=a ;
by id lostart;
run;

data a1;
set a;
by id lostart;
lostart1=lag(loend)+1;
if first.id then lostart1=lostart;
format lagstdat mmddyy10.;
run;


proc sql;
create table b1 as
select b.id,b.order
,case when not missing(a.line) then a.line
else 0
end as line
from b
left join a1 as a
on b.id=a.id and b.order between a.lostart1 and a.loend
order by b.id,b.order
;
quit;

 

rajd1
Quartz | Level 8
Hi, This works great.
I need a subtle change:
When order date is not in between LOstart and LOend, then Line = before the start of next line. For example: ID#1 row 2: order date 10/2/2017 is considered line 3 because it occurs before the beginning of line 4 LOstart date of 11/27/2018.

Similarly with ID#5 order date is 7/31/2016 is considered line 2 because it occurs before the beginning of line 3 LOstart date of 8/5/2016
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 17 replies
  • 3265 views
  • 5 likes
  • 4 in conversation