Solved: Finding overlapping durations

wizkid2050 · Posted 09-14-2016 06:13 AM

data test;
input start end_o id d1 d2 d3 ;
informat start date9. end_o date9. ;
format start date9. end_o date9. ;
datalines;
1jan95 20sep95 10 23 11 13
10feb96 21may98 10 9 8 10
20jan95 30apr95 10 11 17 16
1jan95 30dec95 10 23 11 43
3jan95 20may95 10 12 13 9
3feb10 28apr12 11 9 8 10
10feb99 21may00 11 9 8 10
20jan95 30apr00 11 11 17 16
1jan95 30dec95 11 23 11 43
;
proc print;
run;

I have this sample data set

I need to find out the same director (d1,d2 or d3 ) being shown for overlapping duration in multiple rows for same id.

Ksharp · Posted 09-14-2016 11:54 PM

That is not called overlap durations.
If I understood right. you just want which director have duplicated value 
in the same ID group.



 data test;
input start end_o id d1 d2 d3 ;
informat start date9. end_o date9. ;
format start date9. end_o date9. ;
datalines;
1jan95 20sep95 10 23 11 13
10feb96 21may98 10 9 8 10
20jan95 30apr95 10 11 17 16
1jan95 30dec95 10 23 11 43
3jan95 20may95 10 12 13 9
3feb10 28apr12 11 9 8 10
10feb99 21may00 11 9 8 10
20jan95 30apr00 11 11 17 16
1jan95 30dec95 11 23 11 43
;
run;

*first, transpose;

data have (keep= id director);
set test;
array dir {*} d1-d3;
do i = 1 to dim(dir);
  director = dir{i};
  output;
end;
run;
proc sql;
create table want as
 select *
  from have
   group by id,director
    having count(*) gt 1;
quit;

View solution in original post

Reeza · Posted 09-14-2016 06:20 AM

Can you show what you'd expect as output?

wizkid2050 · Posted 09-14-2016 06:42 AM

ID Common_Directors

10 23,11,13,9

11 9,8,10,11

or

ID Common_directors

10 23

10 11

10 13

10 9

11 9

11 8

11 10

11 11

This should be the output for above sample dataset.

The comparison has to be made for d1 with all observation in d1, d2, d3 and like wise for d2 with all and d3 with all.

Hope you understand the problem.

Kurt_Bremser · Posted 09-14-2016 07:32 AM

Try this:

data test;
input start end_o id d1 d2 d3 ;
informat start date9. end_o date9. ;
format start date9. end_o date9. ;
datalines;
1jan95 20sep95 10 23 11 13
10feb96 21may98 10 9 8 10
20jan95 30apr95 10 11 17 16
1jan95 30dec95 10 23 11 43
3jan95 20may95 10 12 13 9
3feb10 28apr12 11 9 8 10
10feb99 21may00 11 9 8 10
20jan95 30apr00 11 11 17 16
1jan95 30dec95 11 23 11 43
;
run;

*first, transpose;

data have (keep=start end_o id director);
set test;
array dir {*} d1-d3;
do i = 1 to dim(dir);
  director = dir{i};
  output;
end;
run;

proc sort data=have;
by id director start;
run;

data want (keep=id director);
set have;
by id director;
oldend = lag(end_o);
if not first.director and start <= oldend then output;
run;

proc sort data=want nodupkey;
by id director;
run;

proc print data=want noobs;
run;

Result:

id    director

10       11   
10       13   
10       23   
11       11

Note that in your example result, ID 10 with director 9 and ID 11 with directors 8, 9 and 10 don't have overlaps.

Unless you define "overlap" differently.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

wizkid2050 · Posted 09-14-2016 09:30 AM

See in second observation d1 = 9
and in 5th observation d3 = 9

I want to check d1 with all variables( d1,d2,d3). and then d2...
and if there is a duplicate found then store them with the respective id.

Kurt_Bremser · Posted 09-15-2016 02:33 AM

@wizkid2050 wrote:

See in second observation d1 = 9
and in 5th observation d3 = 9

I want to check d1 with all variables( d1,d2,d3). and then d2...
and if there is a duplicate found then store them with the respective id.

So, if the dates play no role at all, why are you looking for "overlapping durations"??

You might consider to be more clear in the future.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Reeza · Posted 09-14-2016 06:20 AM

Can you show what you'd expect as output?

Tom_C_Mortensen · Posted 09-14-2016 07:26 AM

Hi try below code, from your output exsample I think this will work.

data test;

input start end_o id d1 d2 d3 ;

informat start date9. end_o date9. ;

format start date9. end_o date9. ;

datalines;

1jan95 20sep95 10 23 11 13

10feb96 21may98 10 9 8 10

20jan95 30apr95 10 11 17 16

1jan95 30dec95 10 23 11 43

3jan95 20may95 10 12 13 9

3feb10 28apr12 11 9 8 10

10feb99 21may00 11 9 8 10

20jan95 30apr00 11 11 17 16

1jan95 30dec95 11 23 11 43

;

proc print;

run;

proc sort data=test;

by start end_o id;

run;

proc transpose data=test out=test1(rename=(col1=director));

by start end_o id;

var d:;

run;

proc sort data=test1 out=test2(drop=_name_);

by id director start end_o id;

run;

Ksharp · Posted 09-14-2016 11:54 PM

That is not called overlap durations.
If I understood right. you just want which director have duplicated value 
in the same ID group.



 data test;
input start end_o id d1 d2 d3 ;
informat start date9. end_o date9. ;
format start date9. end_o date9. ;
datalines;
1jan95 20sep95 10 23 11 13
10feb96 21may98 10 9 8 10
20jan95 30apr95 10 11 17 16
1jan95 30dec95 10 23 11 43
3jan95 20may95 10 12 13 9
3feb10 28apr12 11 9 8 10
10feb99 21may00 11 9 8 10
20jan95 30apr00 11 11 17 16
1jan95 30dec95 11 23 11 43
;
run;

*first, transpose;

data have (keep= id director);
set test;
array dir {*} d1-d3;
do i = 1 to dim(dir);
  director = dir{i};
  output;
end;
run;
proc sql;
create table want as
 select *
  from have
   group by id,director
    having count(*) gt 1;
quit;

wizkid2050 · Posted 09-15-2016 02:43 AM

Excellent. This is exactly what i was looking for.

Thanks man.

Finding overlapping durations

Re: Finding overlapping durations

Re: BASE Sas Problem

Re: BASE Sas Problem

Re: Finding Overlapping Durations

Re: Finding Overlapping Durations

Re: Finding Overlapping Durations

Re: BASE Sas Problem

Re: BASE Sas Problem

Re: Finding overlapping durations

Re: Finding overlapping durations

SAS Innovate 2025: Call for Content

Classroom Training Available!