Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Variables are merged but not the actual values are not merged

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

Variables are merged but not the actual values are not merged

Dear experts,

I have tried to merge some variables with numerical values from one table(table b) to another table(table a) using SQL left join (on a.year=b.year and a.codeA=b.codeA and a.codeB=b.codeB).

What it returns me is the table with the variables what I wanted from table b (let's say INCOME AGE etc) but no numerical values from table b. I see no error message in log.

What could be the reason for this results?

Best,

MB

-------------------------------

Thanks for your replies, experts.

I have attached the data sets (table a and table b) as well as result table where I am having trouble mentioned in the original discussion).

SAS Code I used is as the following;

proc sql; create table sample_result

as select a.*, b.edu, b.mhincome, b.mage

from sample_a as a left join sample_b as b

on a.year=b.year and a.state_code=b.state_code and a.co_code=b.co_code;

quit;

Thank you for your help!

MB

Attachment
Attachment
Attachment

Accepted Solutions
Solution
‎01-30-2014 03:09 PM
Super Contributor
Posts: 307

Re: Variables are merged but not the actual values are not merged

There are two potential problems.

You are joining table B to table A using a LEFT JOIN. This means you will get all records in table A even if there are no matching rows in table B. This also means that you will not get rows in table B that do NOT exist in table A. This is fine if this is what you want.

For example, table A does not have a row where "state_code = 55 and co_code = 87 and year=1980". But, there is a corresponding row in table B. But, because you are left joining B to A, the resulting table will not have a row where "state_code = 55 and co_code = 87 and year=1980".

Your second problem is probably why you are not getting the results you expect. Your two tables have inconsistent formats for variables in table A vs table B. In particular, the year variable in table B has a YEARw format whereas the same variable in table A is formatted BEST12. The table B format is interpreting the year variable as a date value.

Try this:

proc sql; create table sample_result2

as select a.*, b.edu, b.mhincome, b.mage

from tmp2.sample_a as a

left join tmp3.sample_b as b

on put(a.year, 4.0) = put(b.year, year4.) and a.state_code=b.state_code and a.co_code=b.co_code;

quit;

View solution in original post


All Replies
Super User
Posts: 5,498

Re: Variables are merged but not the actual values are not merged

Well, we're looking at neither your code nor your data so it's a bit of a guess.  Here are a couple of basic things to check.

Does Table B actually contain non-missing data for INCOME, AGE, etc.?

Since you are performing a LEFT JOIN, is it possible you are getting records from Table A that have no match in Table B?

There are plenty of SQL gurus lurking here who might have better ideas if they could see your code.

Good luck.

Super User
Posts: 19,772

Re: Variables are merged but not the actual values are not merged

You probably did something wrong.  But we can't tell you what because you haven't told us what you did Smiley Happy

Solution
‎01-30-2014 03:09 PM
Super Contributor
Posts: 307

Re: Variables are merged but not the actual values are not merged

There are two potential problems.

You are joining table B to table A using a LEFT JOIN. This means you will get all records in table A even if there are no matching rows in table B. This also means that you will not get rows in table B that do NOT exist in table A. This is fine if this is what you want.

For example, table A does not have a row where "state_code = 55 and co_code = 87 and year=1980". But, there is a corresponding row in table B. But, because you are left joining B to A, the resulting table will not have a row where "state_code = 55 and co_code = 87 and year=1980".

Your second problem is probably why you are not getting the results you expect. Your two tables have inconsistent formats for variables in table A vs table B. In particular, the year variable in table B has a YEARw format whereas the same variable in table A is formatted BEST12. The table B format is interpreting the year variable as a date value.

Try this:

proc sql; create table sample_result2

as select a.*, b.edu, b.mhincome, b.mage

from tmp2.sample_a as a

left join tmp3.sample_b as b

on put(a.year, 4.0) = put(b.year, year4.) and a.state_code=b.state_code and a.co_code=b.co_code;

quit;

Super User
Posts: 19,772

Re: Variables are merged but not the actual values are not merged

To add to Fuge response, its that the underlying data is actually different.

proc print data=sample_b;

format year;

var year;

run;

Occasional Contributor
Posts: 15

Re: Variables are merged but not the actual values are not merged

Thank you Reeza! Yes, I can now see that it was creating the trouble. I learned one lessen today Smiley Happy

Occasional Contributor
Posts: 15

Re: Variables are merged but not the actual values are not merged

Thank you Fugue! I was aware of the first concern but your second concern and suggestion actually made it work!

Thanks for your help.

MB

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 510 views
  • 6 likes
  • 4 in conversation