BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dustinsaunders
Fluorite | Level 6

I have a task to use SAS to write some data aggregation in SAS Studio University.  I've been using SQL Server and MySQL for over a decade.

 

However... I'm trying to do a simple self join on a table and I have no idea how to explain the results I'm getting....

 

I ended up going all the way down to a small table to figure out what is going on with 1 column called 'Gender' and 2 rows 'Male' and 'Female':

 

select
	m.Gender,
	f.Gender
from DATA.GENDERS m
inner join DATA.GENDERS f 
	on f.Gender = 'Female'
where m.Gender = 'Male';

However, the results I get are:

 GenderGender
1MaleMale
2MaleMale

 

I've tried several different ways of putting it together and it seriously doesn't make any sense why I'm getting this join result?  It's driving me crazy!

On a side note...  it looks like INNER JOINs return null values?

 

Thanks...

1 ACCEPTED SOLUTION

Accepted Solutions
dustinsaunders
Fluorite | Level 6

Well, I figured out what is going on here, I was getting the issue because it was creating a view and I didn't have an alias for the columns which is invalid.  What is interesting is that it proceeds with the view creation but uses only the first instance of the column name rather than throwing a warning.

 

data genders;
  input gender $10.;
cards;
Male
Female
;

proc sql;
create view GENDERS_JOINED as
select
   m.Gender,
   f.Gender
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender='Male';

This is resulting in the following:

result.png

However, providing an alias for the fields it doesn't create the issue:

data genders;
  input gender $10.;
cards;
Male
Female
;

proc sql;
create view GENDERS_JOINED as
select
   m.Gender as M_G,
   f.Gender as F_G
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender='Male';

dustinsaunders_0-1587511902664.png

If you aren't creating the view with no alias it still works fine as expected:

data genders;
  input gender $10.;
cards;
Male
Female
;

proc sql;

select
   m.Gender,
   f.Gender
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender='Male';

worked.png

View solution in original post

9 REPLIES 9
dustinsaunders
Fluorite | Level 6

For reference, here's the exact same query in SQL Server.  Do the joins just work completely different that standard TSQL?

 

sql_query.png

Tom
Super User Tom
Super User

Are you sure those are the values?  Do you have a format attached to the variable?

data genders;
  input gender $10.;
cards;
Male
Female
;

proc sql;
select
   m.Gender as M_Gender
 , f.Gender as F_Gender
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender='Male'
;
quit;
M_Gender    F_Gender
----------------------
Male        Female

 

Tom
Super User Tom
Super User

What is the actual problem that lead you to create this crazy example?
What real life situation would you have where you use a join condition that only referenced values from one of the tables?

dustinsaunders
Fluorite | Level 6

Thanks for the response.  The type for the field is CHAR(13).

 

Self joins aren't uncommon in the normal breadth of SQL work I do, but to elaborate the data actually looks more like this:

 GenderQuestionResult
1MaleA100
2MaleB45
3FemaleA66

 

So the join is Male to Female to compare the differences in values for their responses to Questions A, B, C, etc...

So the join is more like:

f.Gender='Female' and f.Question = m.Question

but even going back to just the one column I am getting the unexpected result.

dustinsaunders
Fluorite | Level 6

Well, I figured out what is going on here, I was getting the issue because it was creating a view and I didn't have an alias for the columns which is invalid.  What is interesting is that it proceeds with the view creation but uses only the first instance of the column name rather than throwing a warning.

 

data genders;
  input gender $10.;
cards;
Male
Female
;

proc sql;
create view GENDERS_JOINED as
select
   m.Gender,
   f.Gender
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender='Male';

This is resulting in the following:

result.png

However, providing an alias for the fields it doesn't create the issue:

data genders;
  input gender $10.;
cards;
Male
Female
;

proc sql;
create view GENDERS_JOINED as
select
   m.Gender as M_G,
   f.Gender as F_G
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender='Male';

dustinsaunders_0-1587511902664.png

If you aren't creating the view with no alias it still works fine as expected:

data genders;
  input gender $10.;
cards;
Male
Female
;

proc sql;

select
   m.Gender,
   f.Gender
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender='Male';

worked.png

Tom
Super User Tom
Super User

You cannot have two variables with the same name in a DATASET (or a view which has to mimic a dataset structure). 

 

But you can, like in both your example and mine, make a REPORT from PROC SQL that is pulling values FROM two variables that USED TO have the same name, since that doesn't make an actual dataset .

 

So you essentially ran:

select gender,gender from have;

 

Kurt_Bremser
Super User

To give you even more food for thought, run this:

data genders;
input gender $;
datalines;
Female
Male
;

proc sql;
create view GENDERS_JOINED as
select
   m.Gender,
   f.Gender
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender='Male';
quit;

data test1;
set genders_joined;
run;

proc sql;
create table test2 as
select * from genders_joined;
quit;

and look at the two resulting tables.

Kurt_Bremser
Super User

When I run this:

data genders;
input gender $;
datalines;
Female
Male
;

proc sql;
select
	m.Gender,
	f.Gender
from GENDERS m
inner join GENDERS f 
	on f.Gender = 'Female'
where m.Gender = 'Male';
quit;

I get this:

gender	gender
Male	Female

so your data has to be different from what you think.

dustinsaunders
Fluorite | Level 6

Hello Kurt,

 

I found the issue in an earlier post-- the problem was with creating the query as view without an alias which is invalid.  However, instead of throwing an error SAS will generate the view but use the first instance of the key for all results.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 674 views
  • 0 likes
  • 3 in conversation