data test;
input start end_o id d1 d2 d3 ;
informat start date9. end_o date9. ;
format start date9. end_o date9. ;
datalines;
1jan95 20sep95 10 23 11 13
10feb96 21may98 10 9 8 10
20jan95 30apr95 10 11 17 16
1jan95 30dec95 10 23 11 43
3jan95 20may95 10 12 13 9
3feb10 28apr12 11 9 8 10
10feb99 21may00 11 9 8 10
20jan95 30apr00 11 11 17 16
1jan95 30dec95 11 23 11 43
;
proc print;
run;
I have this sample data set
I need to find out the same director (d1,d2 or d3 ) being shown for overlapping duration in multiple rows for same id.
That is not called overlap durations. If I understood right. you just want which director have duplicated value in the same ID group. data test; input start end_o id d1 d2 d3 ; informat start date9. end_o date9. ; format start date9. end_o date9. ; datalines; 1jan95 20sep95 10 23 11 13 10feb96 21may98 10 9 8 10 20jan95 30apr95 10 11 17 16 1jan95 30dec95 10 23 11 43 3jan95 20may95 10 12 13 9 3feb10 28apr12 11 9 8 10 10feb99 21may00 11 9 8 10 20jan95 30apr00 11 11 17 16 1jan95 30dec95 11 23 11 43 ; run; *first, transpose; data have (keep= id director); set test; array dir {*} d1-d3; do i = 1 to dim(dir); director = dir{i}; output; end; run; proc sql; create table want as select * from have group by id,director having count(*) gt 1; quit;
Can you show what you'd expect as output?
ID Common_Directors
10 23,11,13,9
11 9,8,10,11
or
ID Common_directors
10 23
10 11
10 13
10 9
11 9
11 8
11 10
11 11
This should be the output for above sample dataset.
The comparison has to be made for d1 with all observation in d1, d2, d3 and like wise for d2 with all and d3 with all.
Hope you understand the problem.
Try this:
data test;
input start end_o id d1 d2 d3 ;
informat start date9. end_o date9. ;
format start date9. end_o date9. ;
datalines;
1jan95 20sep95 10 23 11 13
10feb96 21may98 10 9 8 10
20jan95 30apr95 10 11 17 16
1jan95 30dec95 10 23 11 43
3jan95 20may95 10 12 13 9
3feb10 28apr12 11 9 8 10
10feb99 21may00 11 9 8 10
20jan95 30apr00 11 11 17 16
1jan95 30dec95 11 23 11 43
;
run;
*first, transpose;
data have (keep=start end_o id director);
set test;
array dir {*} d1-d3;
do i = 1 to dim(dir);
director = dir{i};
output;
end;
run;
proc sort data=have;
by id director start;
run;
data want (keep=id director);
set have;
by id director;
oldend = lag(end_o);
if not first.director and start <= oldend then output;
run;
proc sort data=want nodupkey;
by id director;
run;
proc print data=want noobs;
run;
Result:
id director 10 11 10 13 10 23 11 11
Note that in your example result, ID 10 with director 9 and ID 11 with directors 8, 9 and 10 don't have overlaps.
Unless you define "overlap" differently.
See in second observation d1 = 9
and in 5th observation d3 = 9
I want to check d1 with all variables( d1,d2,d3). and then d2...
and if there is a duplicate found then store them with the respective id.
@wizkid2050 wrote:
See in second observation d1 = 9
and in 5th observation d3 = 9I want to check d1 with all variables( d1,d2,d3). and then d2...
and if there is a duplicate found then store them with the respective id.
So, if the dates play no role at all, why are you looking for "overlapping durations"??
You might consider to be more clear in the future.
Can you show what you'd expect as output?
Hi try below code, from your output exsample I think this will work.
data test;
input start end_o id d1 d2 d3 ;
informat start date9. end_o date9. ;
format start date9. end_o date9. ;
datalines;
1jan95 20sep95 10 23 11 13
10feb96 21may98 10 9 8 10
20jan95 30apr95 10 11 17 16
1jan95 30dec95 10 23 11 43
3jan95 20may95 10 12 13 9
3feb10 28apr12 11 9 8 10
10feb99 21may00 11 9 8 10
20jan95 30apr00 11 11 17 16
1jan95 30dec95 11 23 11 43
;
proc print;
run;
proc sort data=test;
by start end_o id;
run;
proc transpose data=test out=test1(rename=(col1=director));
by start end_o id;
var d:;
run;
proc sort data=test1 out=test2(drop=_name_);
by id director start end_o id;
run;
That is not called overlap durations. If I understood right. you just want which director have duplicated value in the same ID group. data test; input start end_o id d1 d2 d3 ; informat start date9. end_o date9. ; format start date9. end_o date9. ; datalines; 1jan95 20sep95 10 23 11 13 10feb96 21may98 10 9 8 10 20jan95 30apr95 10 11 17 16 1jan95 30dec95 10 23 11 43 3jan95 20may95 10 12 13 9 3feb10 28apr12 11 9 8 10 10feb99 21may00 11 9 8 10 20jan95 30apr00 11 11 17 16 1jan95 30dec95 11 23 11 43 ; run; *first, transpose; data have (keep= id director); set test; array dir {*} d1-d3; do i = 1 to dim(dir); director = dir{i}; output; end; run; proc sql; create table want as select * from have group by id,director having count(*) gt 1; quit;
Excellent. This is exactly what i was looking for.
Thanks man.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.