I am combine two data sets based on a patient's date of surgery falling between +/- 1 month of when a patient completed their survey.
dataset abc has variables lastname, firstname, dob, and survey_completed
dataset xyz has variables last, first, date_birth, and date_surgery
Here's my code:
proc sql;
create table want as
select distinct t1.*,
intnx('month', t1.Survey_Completed, -1, 'sameday') as first_survey format=date9.,
intnx('month', t1.survey_completed, 1, 'sameday') as last_survey format=date9.,
t2.last,
t2.first,
t2.date_birth,
t2.date_surgery as DOS format date9.
from abc as t1
left join xyz as t2
on t1.dob=t2.date_birth and t1.last_name=t2.last and t1.first_name=t2.first
where DOS between first_survey and last_survey
order by last_name;
quit;
with the above code I am getting the following error message:
ERROR: Expression using BETWEEN has components that are of different data types.
ERROR: The following columns were not found in the contributing tables: first_survey, last_survey.
What am i doing wrong?
t2.date_surgery as DOS format date9. <- missing the equal sign here
What's the type/format on date_surgery before the query? I suspect it's a character variable and you'd need to convert it to a numeric variable and a SAS date first.
type is numeric and format is date9.
You must use the calculated keyword when referring to a calculated column.
where DOS between calculated first_survey and calculated last_survey
It works after using 'calculated'.
Thank you very much.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.