I am trying to put together some output displaying all subjects in the database with their medications and adverse events side by side, e.g. Subj, Med, AE. I need to use a join to pull the data from the CM and AE tables so I used the SUBJID in a left join (thinking this is correct).
The problem is that in the output, I am getting a mashup of every med lnked with every AE for each subject, which appears to be a cartesian join and not what I was expecting in the output.
The code is below, any suggestions in correcting this? I've plyed around with various other joins that Enterprise Guide allows but all seem to come up with a massive cartesian output.
CREATE TABLE SASUSER.QUERY_FOR_CM_SAS7BDAT_0002 AS
FROM EC100045.cm AS t1 LEFT JOIN EC100047.ae AS t2 ON (t1.SUBJID = t2.SUBJID)
WHERE t1.SUBJID = '201-008' AND t1.CMTRT NOT = '' AND t2.AETRT NOT = ''
ORDER BY t1.SUBJID, t1.CMTRT;
PROC SQL did what you asked it to do. To get the dataset you want with that data, you are going to have to code it with a DATA step. To use PROC SQL, you will need to add a 'row' number in addition to the subject ID to use in the LEFT JOIN.
As my father in law would say, PROC SQL in this case is doomed as doomed can be. You're talking about a many-to-many combination, which for PROC SQL always means an internal cartesian product.
As Doc suggested, you could do this with DATA step code although you'd have to really know what you're doing to come up with a way to get 002 CARDIZEM (null) instead of 002 CARDIZEM VOMITING.
One alternative if the data is small: you could go ahead with the Cartesian product, then on the back end use a sort data task with no duplicates based on the values of subjid cmtrt. Note this ability is only built into the sort data task in EG 4.2 -- in previous versions, you'd have to manually add the NODUPKEY option to the code yourself. This still doesn't get exactly what you want, in that you would see 002 CARDIZEM VOMITING instead of 002 CARDIZEM (null). And if your data is large and/or in a database, your system administrator / database administrator is going to kill me for even mentioning this. IT IS MASSIVELY INEFFICIENT!! You've been warned.