I Have the following data sets: AllData, Death, and Live. AllData has more observations than Death and Live combined.
I want to have a data set of AllData excluding Death observations. I keep getting these errors when I'm using the code:
proc sql;
create table Live as
select * from AllData
except
select * from Death
;
quit;
WARNING: A table has been extended with null columns to perform the EXCEPT set operation.
ERROR: Column 5 from the first contributor of EXCEPT is not the same type as its counterpart from the second.
ERROR: Numeric expression requires a numeric format.
ERROR: Column 6 from the first contributor of EXCEPT is not the same type as its counterpart from the second.
The variable causing the errors is the DOB having the Types as follows:
I'm not sure why DOB in both datasets (imported from .xlsx with a 'date' type) are imported into SAS as Num and why are they causing this issue.
**I also used this code but it's giving AllData dataset without any death exclusions!
data live2;
set AllData;
if _n_ = 1
then do;
declare hash d (dataset:"Death");
d.definekey("Patient_Name");
d.definedone();
end;
if d.check() ne 0;
run;
Your help and support is greatly appreciated.
Thank you
That is simple to do:
proc sql;
create table Live as
select * from AllData
where Patient_Name not in (select Patient_Name from Death)
;
quit;
But depending on how clear (or really how consistent) your data is might not do exactly what you want.
At a minimum you might want to ignore the case of the letters in the Patient_Name variable.
where upcase(Patient_Name) not in (select upcase(Patient_Name) from Death)
ERROR: Column 5 from the first contributor of EXCEPT is not the same type as its counterpart from the second. ERROR: Column 6 from the first contributor of EXCEPT is not the same type as its counterpart from the second.
The mismatch in data type is in Column 5 and Column 6 in ALLDATA. Please examine those variables carefully in both data sets.
I had this in my proc contents for both datasets. I can't find any discrepancies. As I mentioned before, DOB seems to cause the issue but I'm not sure how!
Please don't be so stingy with your screen captures, show us more of the output from PROC CONTENTS so we can see the relevant parts including which variable is 5 in ALLDATA and which variable is 6 in ALLDATA, and including the part of the PROC CONTENTS output that indicates which data set is which..
As I mentioned before, DOB seems to cause the issue but I'm not sure how!
No! It's column 5 and column 6, one of which may be DOB, but there's another variable we need information on as well.
Thank you PaigeMiller.
Here's the proc contents of AllData
And the proc contents of Death:
Please let me know if I'm missing any other information.
Just be explicit about which variables you want to select. (Don't use the * shorthand).
But what is your criteria for saying there is a match?
Do you only need to have them match on SSN?
Or perhaps on NAME and DOB?
Hi Tom,
Thank you for the response.
For the second code, I tried it because I had the errors in the first code, I was trying to match on Patient_Name only.
That is simple to do:
proc sql;
create table Live as
select * from AllData
where Patient_Name not in (select Patient_Name from Death)
;
quit;
But depending on how clear (or really how consistent) your data is might not do exactly what you want.
At a minimum you might want to ignore the case of the letters in the Patient_Name variable.
where upcase(Patient_Name) not in (select upcase(Patient_Name) from Death)
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.