BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Markov
Calcite | Level 5

Dear experts,

I have tried to merge some variables with numerical values from one table(table b) to another table(table a) using SQL left join (on a.year=b.year and a.codeA=b.codeA and a.codeB=b.codeB).

What it returns me is the table with the variables what I wanted from table b (let's say INCOME AGE etc) but no numerical values from table b. I see no error message in log.

What could be the reason for this results?

Best,

MB

-------------------------------

Thanks for your replies, experts.

I have attached the data sets (table a and table b) as well as result table where I am having trouble mentioned in the original discussion).

SAS Code I used is as the following;

proc sql; create table sample_result

as select a.*, b.edu, b.mhincome, b.mage

from sample_a as a left join sample_b as b

on a.year=b.year and a.state_code=b.state_code and a.co_code=b.co_code;

quit;

Thank you for your help!

MB

1 ACCEPTED SOLUTION

Accepted Solutions
Fugue
Quartz | Level 8

There are two potential problems.

You are joining table B to table A using a LEFT JOIN. This means you will get all records in table A even if there are no matching rows in table B. This also means that you will not get rows in table B that do NOT exist in table A. This is fine if this is what you want.

For example, table A does not have a row where "state_code = 55 and co_code = 87 and year=1980". But, there is a corresponding row in table B. But, because you are left joining B to A, the resulting table will not have a row where "state_code = 55 and co_code = 87 and year=1980".

Your second problem is probably why you are not getting the results you expect. Your two tables have inconsistent formats for variables in table A vs table B. In particular, the year variable in table B has a YEARw format whereas the same variable in table A is formatted BEST12. The table B format is interpreting the year variable as a date value.

Try this:

proc sql; create table sample_result2

as select a.*, b.edu, b.mhincome, b.mage

from tmp2.sample_a as a

left join tmp3.sample_b as b

on put(a.year, 4.0) = put(b.year, year4.) and a.state_code=b.state_code and a.co_code=b.co_code;

quit;

View solution in original post

6 REPLIES 6
Astounding
PROC Star

Well, we're looking at neither your code nor your data so it's a bit of a guess.  Here are a couple of basic things to check.

Does Table B actually contain non-missing data for INCOME, AGE, etc.?

Since you are performing a LEFT JOIN, is it possible you are getting records from Table A that have no match in Table B?

There are plenty of SQL gurus lurking here who might have better ideas if they could see your code.

Good luck.

Reeza
Super User

You probably did something wrong.  But we can't tell you what because you haven't told us what you did Smiley Happy

Fugue
Quartz | Level 8

There are two potential problems.

You are joining table B to table A using a LEFT JOIN. This means you will get all records in table A even if there are no matching rows in table B. This also means that you will not get rows in table B that do NOT exist in table A. This is fine if this is what you want.

For example, table A does not have a row where "state_code = 55 and co_code = 87 and year=1980". But, there is a corresponding row in table B. But, because you are left joining B to A, the resulting table will not have a row where "state_code = 55 and co_code = 87 and year=1980".

Your second problem is probably why you are not getting the results you expect. Your two tables have inconsistent formats for variables in table A vs table B. In particular, the year variable in table B has a YEARw format whereas the same variable in table A is formatted BEST12. The table B format is interpreting the year variable as a date value.

Try this:

proc sql; create table sample_result2

as select a.*, b.edu, b.mhincome, b.mage

from tmp2.sample_a as a

left join tmp3.sample_b as b

on put(a.year, 4.0) = put(b.year, year4.) and a.state_code=b.state_code and a.co_code=b.co_code;

quit;

Reeza
Super User

To add to Fuge response, its that the underlying data is actually different.

proc print data=sample_b;

format year;

var year;

run;

Markov
Calcite | Level 5

Thank you Reeza! Yes, I can now see that it was creating the trouble. I learned one lessen today Smiley Happy

Markov
Calcite | Level 5

Thank you Fugue! I was aware of the first concern but your second concern and suggestion actually made it work!

Thanks for your help.

MB

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to choose a machine learning algorithm

Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 1230 views
  • 6 likes
  • 4 in conversation