11-04-2015 02:44 PM
Hi! I need to join two tables by PatientID and EncDate from table A to the matching PatientID and EncDate in table B and remove all other observations in table B that are not matched to PatientID and EncDate in table A. SHould I rename variables in table B and then join? If so, what type of join would I use? This should be a side by side join. Thanks
11-04-2015 02:51 PM
If I'm understanding this correctly, this is just a left join in SQL, i.e.
PROC SQL; CREATE TABLE WANT AS SELECT A.* FROM HAVEA A LEFT JOIN HAVEB B ON (A.PATIENTID = B.PATIENTID AND A.ENCDATE = B.ENCDATE); QUIT;
This would keep all columns in dataset HAVEA. If you wanted to pull columns from HAVEB as well, you would just need to add them to the select statment.
11-04-2015 02:54 PM
Hi, thank you. What if i wanted to just select two columns in table A (PatiendID and EncDate) to merge with all columns in B (which are 100+)?
11-04-2015 02:58 PM
I haven't personally tried this way before, but assuming the two columns you want from A are on A and B (as they would need to be for the join to work this way in the first place) you should in theory be able to just change the select statement to
The FROM table (A) is still going to be the determinant of what to keep, given that you are left joining B. (untested, but this is what I would expect the behavior to be).
11-04-2015 02:53 PM
As a side note, I had this printed out and hanging on my office wall when I began using sql/PROC SQL. I thought it was a nice visual reminder.
11-04-2015 04:23 PM
CREATE TABLE ENC_Join AS
SELECT a.patID a.encdate1
FROM Caper.OBS_12ND as A
left join caper.encounterdate as b
on(a.patID = b.patID and a.encdate1= b.encdate1);
This is the error I'm getting.
1057 PROC SQL;
1058 CREATE TABLE ENC_Join AS
1059 SELECT a.patID, a.encdate1
1060 FROM Caper.OBS_12ND as A
1061 left join caper.encounterdate as b
1062 on(a.patID = b.patID and a.encdate1= b.encdate1);
ERROR: Sort initialization failure.
How can I resolve this error? Thanks!
11-04-2015 04:39 PM
I think you were missing a ", " between a.patID and a.encDate1 ...
PROC SQL; CREATE TABLE ENC_Join AS SELECT a.patID, a.encdate1 FROM Caper.OBS_12ND as A left join caper.encounterdate as b on(a.patID = b.patID and a.encdate1= b.encdate1); QUIT;
If you still get the error "Sort initialization failure", you might want to: (1) adjust memsize, or (2) add a "DISTINCT" in order to limit duplicate row output.
PROC SQL; CREATE TABLE ENC_Join AS SELECT DISTINCT a.patID, a.encdate1 FROM Caper.OBS_12ND as A left join caper.encounterdate as b on(a.patID = b.patID and a.encdate1= b.encdate1); QUIT;