10-25-2015 10:20 PM
I am trying to merge two datasets according to a unique customer IDs (NO_CUST) and date (DATE). My data consists of usage data on TV (table 1) and web applications (table 2) and I want to merge customer usage from both tables only for those in table 1 and drop any users from table 2 that are not in table 1.
I am running a LEFT join using PROC SQL, however, my original t1 sample has 30k unique IDs, and once the tables are merged, I am loosing 20k of those users although I want to keep them and lose those in t2.
Here is the code, any idea what is wrong here ?
CREATE TABLE LIB.USAGE_TV_WEB AS
FROM LIB.MERGE_TV t1
LEFT JOIN LIB.MERGE_WEB t2 ON (t1.NO_CUST = t2.NO_CUST) AND (t1.DATE=t2.DATE);
10-25-2015 10:43 PM
Puzzling... Make sure you are looking at the right version of your tables. After running your query, try
SELECT COUNT(*) AS N_TV FROM LIB.MERGE_TV;
SELECT COUNT(*) AS N_WEB FROM LIB.MERGE_WEB;
SELECT COUNT(*) AS N_USAGE FROM LIB.USAGE_TV_WEB;
N_USAGE should be >= N_TV
10-25-2015 10:55 PM - edited 10-25-2015 10:58 PM
Thank you so much for your reply.
Once running the sql above, and / or filtering for missing values in the customer account # (NO_CLI), N_USAGE = N_TV
Is there a reason why the LEFT JOIN SQL keeps these missing values in my original outout ?
10-25-2015 11:01 PM
Just a suggestion.....
Have you considered merging using a SAS MERGE? Not that it should matter but it will be easy to view the counts from the logs... not so much with the SQL... If you are able to get the counts that PG recommended, it might bring to light if there is anything missing. The SQL looks good to me... can't spot anything....
10-25-2015 11:12 PM
Thank you very much PG, with the counts and your latest reply, the issue seems to be solved.
I drop all missing values in a seperate SQL and I have the same counts between my original TV table and the final merge with my web table.
Thank you very much for your help and quick reply !
Need further help from the community? Please ask a new question.