self join

Reply
Contributor BSL
Contributor
Posts: 25

self join

dear experts,

please help me to understand self join in as i have searched on google as well but couldn't get whats happening in the examples like given below:

data roster;

do i = 1 to 3e4;

drop i;

name = i;

phone = i + 0.1;

email = i + 0.2;

output;

if ranuni(111) > 0.8 then name = name + 0.01;

if ranuni(111) > 0.8 then output;

end;

run;

proc sql stimer;

create table same_phone as

select distinct roster.name, copy.name as diff_name

from roster join roster as copy

on   roster.phone = copy.phone

;

quit;

-anand

Super User
Super User
Posts: 7,401

Re: self join

Not actually seen just "join" used on its own like that.  I imagine that defaults to inner join, though check that.  Plenty of guidance online about joins, what they do, etc:

http://en.wikipedia.org/wiki/Join_(SQL)

http://blog.codinghorror.com/a-visual-explanation-of-sql-joins/

http://www.w3schools.com/sql/sql_join.asp

More importantly though, what is it you are trying do, i.e. why have you put join rather than left, inner etc.

Super User
Posts: 17,828

Re: self join

There are many reasons why a SELF JOIN is useful, so listing them all would be difficult.

In general its a way to look up information within the same table, in this case, for records with the same phone number, add the other name to the record for easy comparison. Note that the name will be added to BOTH records with the same phone number. You may want to check what happens when you have multiple matches.

Regular Contributor
Posts: 184

Re: self join

Thought this code looked familiar. See http://www2.sas.com/proceedings/sugi31/242-31.pdf page 16.

Respected Advisor
Posts: 4,647

Re: self join

If I understand what your are trying to show, you should add a second condition to extract only the phone numbers that are shared by two different names:

data roster;

do i = 1 to 30;

    name = i;

    phone = i + 0.1;

    email = i + 0.2;

    output;

    if ranuni(111) > 0.8 then do;

        name = name + 0.01;

        output;

        end;

    end;

drop i;

run;

proc sql stimer;

create table same_phone as

select roster.phone, roster.name, copy.name as diff_name

from roster join roster as copy

on  roster.phone = copy.phone and

      roster.name < copy.name

;

quit;

Note that you could extract equivalent information, albeit not in the same table shape, with the query:

proc sql stimer;

create table same_phone_2 as

select phone, name

from roster

group by phone

having count(*) > 1;

quit;

PG

PG
Ask a Question
Discussion stats
  • 4 replies
  • 343 views
  • 0 likes
  • 5 in conversation