I have been joining several tables one step at a time as I build my project (e.g., table A to Table B; Table C to Table D; etc.). I am at the point where I am joining those table to each other (e.g., Table AB to Table CD, etc.). Below is my code for two of those tables and the results are good. However, I use the exact same coding all the way through and on other tables and I get the join but the volume of lines result in about 1600 more lines than the two table combined contain. I am not sure why and have been looking for a reason. Any thoughts?
PROC SQL;
CREATE TABLE WORK.QUERY_JOIN7(label="QUERY_JOIN7") AS
SELECT /* REFERRED_DATE */
(IFN(t1.REFERRED_DATE<>.,t1.REFERRED_DATE,t2.REFERRED_DATE)) FORMAT=MMDDYYS10. LABEL="REFERRED DATE" AS
REFERRED_DATE,
/* STATE */
(IFC(t1.STATE<>" ",t1.STATE,t2.STATE)) LABEL="STATE" AS STATE,
/* PRODUCT_TYPE */
(IFC(t1.PRODUCT_TYPE<>" ",t1.PRODUCT_TYPE,t2.PRODUCT_TYPE)) LABEL="PRODUCT TYPE" AS PRODUCT_TYPE,
/* LOB */
(IFC(t1.LOB<>" ",t1.LOB,t2.LOB)) LABEL="LOB" AS LOB,
/* REFERRAL_TYPE */
(IFC(t1.REFERRAL_TYPE<>" ",t1.REFERRAL_TYPE,t2.REFERRAL_TYPE)) LABEL="REFERRAL TYPE" AS REFERRAL_TYPE,
/* VOLUME */
(1) FORMAT=BESTX10. LABEL="VOLUME" AS VOLUME
FROM WORK.QUERY_JOIN5 t1
NATURAL FULL JOIN WORK.QUERY_JOIN6 t2;
QUIT;
Thanks!
You're not 'joining' use an APPEND Task instead.
A brief example may help:
data one; input a b; datalines; 1 1 1 2 1 3 2 4 2 5 ; run; data two; input a c; datalines; 1 1 1 2 3 3 2 4 2 5 ; run; proc sql; select one.a, one.b, two.c from one natural full join two; quit;
Two data sets of 5 records and the join has 11 in the result because it includes values from the second set that do not have a match under the "natural full" join.
Perhaps you want to specify
proc sql; select one.a, one.b, two.c from one natural join two; quit;
Note that "full" is not present.
Without seeing some example data and the desired result it is hard to be more specific.
Thanks ballardw. I tried your suggestion with the following code.
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_QUERY_JOIN2_0000 AS
SELECT /* STATE */
(IFC(t1.STATE<>" ",t1.STATE,t2.STATE)) LABEL="STATE" AS STATE
FROM WORK.QUERY_JOIN2 t1
NATURAL JOIN WORK.QUERY_JOIN4 t2;
QUIT;
However, now I get way too few lines of output (2030 when I should have 61,266). Not sure where I am going wrong.
In general you get 'more' records when you have duplicates along your join variables.
So run a proc freq on the join fields in each table and see where you have duplicates. If you have two in one table that matches with two on another table you end up with 4.
Thanks Reeza...I understand what you are saying and that is what I want (essentially merging the two tables (both have the same columns) together. However, when I join them the 2 + 2 = 5 rows of data instead of 4.
Can you post an example, simplified that replicates your issue?
Also, you should look up the COALESCE() function, it's easier to work with IMO.
I am really trying to get the natural full outer join (the output rows include all rows from both tables and rows that do not have a match are filled in with missing values). This has worked fine in my earlier joins, but is giving me extra rows of data in my later joins.
I have tables A and B below. I want to combine them to get table C below.
Table A | |||
REFERRED_DATE | STATE | PRODUCT_TYPE | LOB |
07/04/2016 | WI | CLASSIC | AUTO |
07/04/2016 | KS | CLASSIC | AUTO |
07/04/2016 | OH | CLASSIC | AUTO |
Table B | |||
REFERRED_DATE | STATE | PRODUCT_TYPE | LOB |
07/06/2016 | MN | CLASSIC | AUTO |
07/06/2016 | MN | CLASSIC | AUTO |
07/06/2016 | AZ | CLASSIC | AUTO |
Table C | |||
REFERRED_DATE | STATE | PRODUCT_TYPE | LOB |
07/04/2016 | WI | CLASSIC | AUTO |
07/04/2016 | KS | CLASSIC | AUTO |
07/04/2016 | OH | CLASSIC | AUTO |
07/06/2016 | MN | CLASSIC | AUTO |
07/06/2016 | MN | CLASSIC | AUTO |
07/06/2016 | AZ | CLASSIC | AUTO |
You're not 'joining' use an APPEND Task instead.
Thanks Reeza...you are correct! Sometimes being too close doesn't allow one to see the forest that is right in front of one!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.