DATA Step, Macro, Functions and more

Join tables and drop observations

Reply
Frequent Contributor
Posts: 121

Join tables and drop observations

Hi!  I need to join two tables by PatientID and EncDate from table A to the matching PatientID and EncDate in table B and remove all other observations in table B that are not matched to PatientID and EncDate in table A.    SHould I rename variables in table B and then join?  If so, what type of join would I use?  This should be a  side by side join.  Thanks

 

 

Contributor
Posts: 55

Re: Join tables and drop observations

If I'm understanding this correctly, this is just a left join in SQL, i.e.

 

PROC SQL;
  CREATE TABLE WANT AS
    SELECT A.*
    FROM HAVEA A
        LEFT JOIN HAVEB B
          ON (A.PATIENTID = B.PATIENTID AND A.ENCDATE = B.ENCDATE);
QUIT;

This would keep all columns in dataset HAVEA. If you wanted to pull columns from HAVEB as well, you would just need to add them to the select statment.

Frequent Contributor
Posts: 121

Re: Join tables and drop observations

Hi, thank you.  What if i wanted to just select two columns in table A (PatiendID and EncDate) to merge with all columns in B (which are 100+)?

Contributor
Posts: 55

Re: Join tables and drop observations

I haven't personally tried this way before, but assuming the two columns you want from A are on A and B (as they would need to be for the join to work this way in the first place) you should in theory be able to just change the select statement to 

 

 

SELECT B.*

The FROM table (A) is still going to be the determinant of what to keep, given that you are left joining B. (untested, but this is what I would expect the behavior to be).

Contributor
Posts: 55

Re: Join tables and drop observations

As a side note, I had this printed out and hanging on my office wall when I began using sql/PROC SQL. I thought it was a nice visual reminder.

Frequent Contributor
Posts: 121

Re: Join tables and drop observations

 

 

PROC SQL;

CREATE TABLE ENC_Join AS

SELECT a.patID a.encdate1

FROM Caper.OBS_12ND as A

left join caper.encounterdate as b

on(a.patID = b.patID and a.encdate1= b.encdate1);

quit;

 

This is the error I'm getting.

 

1057 PROC SQL;

1058 CREATE TABLE  ENC_Join AS

1059 SELECT a.patID, a.encdate1

1060 FROM Caper.OBS_12ND as A

1061 left join caper.encounterdate as b

1062 on(a.patID = b.patID and a.encdate1= b.encdate1);

ERROR: Sort initialization failure.

 

How can I resolve this error? Thanks!

Contributor hbi
Contributor
Posts: 66

Re: Join tables and drop observations

I think you were missing a ", " between a.patID and a.encDate1 ...

 

 

PROC SQL;
  CREATE TABLE ENC_Join AS
  SELECT a.patID, a.encdate1
  FROM Caper.OBS_12ND as A
  left join caper.encounterdate as b
    on(a.patID = b.patID and a.encdate1= b.encdate1);
QUIT;

 

If you still get the error "Sort initialization failure", you might want to: (1) adjust memsize, or (2) add a "DISTINCT" in order to limit duplicate row output.  

 

PROC SQL;
  CREATE TABLE ENC_Join AS
  SELECT DISTINCT a.patID, a.encdate1
  FROM Caper.OBS_12ND as A
  left join caper.encounterdate as b
    on(a.patID = b.patID and a.encdate1= b.encdate1);
QUIT;
Ask a Question
Discussion stats
  • 6 replies
  • 343 views
  • 1 like
  • 3 in conversation