11-18-2015 08:37 AM
I'm combining 2 files by an ID variable. Each of the input files has approximately 5000 records, and none of them have missing data for the ID variable. The output file has about 7400 records, about 800 of which have missing data for the ID variable. How can that be?
Create Table &state..&state.&year._IMG8_matched_DX_IMAGED
as Select a.MSIS_ID, a.DX, b.MSIS_ID, b.IMGPROC, a.SRVC_BGN_DT, b.SRVC_BGN_DT,
a.SRVC_BGN_DT as DxDatenum,
b.SRVC_BGN_DT as IMGDatenum
from &state..ipot&year._&state._IMGmeas8_DX as a
RIGHT JOIN &state..ipot&year._&state._IMGmeas8_IMAGED as b
ON a.MSIS_ID = b.MSIS_ID;
11-18-2015 09:00 AM
11-18-2015 09:06 AM - edited 11-18-2015 09:06 AM
Which MISS_ID variables are you talking about:
as Select a.MSIS_ID, a.DX, b.MSIS_ID,
I will assume it is A.MSIS_ID. The reason is your right joining the A dataset onto the B dataset, all records from B will be present, even if there is no match in A. Consider:
If I right join A to B, 2-2, 3-3, -1, so B's id is present, but A has no match and so becomes missing. What you should do is either use B.MSIS_ID as that will always be present.
11-18-2015 12:23 PM