11-29-2016 06:51 PM
I have a database of time periods that represent different environments (the same study subject can have multiple time periods depending on how many environments they were in). I have a second dataset containing clinic visits and the corresponding dates of those visits. I would like to input these clinic visits (and dates) into the appropriate time periods (delineated by startdate and enddate). To do this, I have a code like this, which sort of works:
/*combine time periods and visit datasets together, keep records where visit date between start and end dates for each period*/
create table allvisits as
select i.*, v.*
from kelsey.periods i full outer join kelsey.clinicvisits v
where i.startdate<=v.visitdate<=i.enddate ;
The problem is that some time periods have no clinic visits. The code above drops those time periods, and I don't want that because I am also interested in situations where no clinic visit was done. Is there a way to modify this code so that when no visit date falls between a time period's start and end dates, I still retain that time period with visitdate set to missing? Thank you in advance.
11-29-2016 07:06 PM
You could, but you're probably better off handling that in your end reporting. Depending on the proc used to summarize results, you may be able to add the time period in at that point more easily.
11-29-2016 11:02 PM
It is not clear why you use a full outer join. I would do:
proc sql;
create table allvisits as
select i.*, v.subjectId as visitSubjectId, v.visitdate /* More visits fields */
from kelsey.periods as i left join kelsey.clinicvisits as v
    on i.subjectid = v.subjectid and v.visitdate between i.startdate and i.enddate;
quit;
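To see why the date range belongs in the ON clause rather than in a WHERE clause, here is a minimal sketch on made-up data. The WORK datasets, their values, and the assumption that both tables share a subjectid column are hypothetical and only for illustration; they are not the poster's actual data.

```sas
/* Hypothetical sample data: subject 1 has two periods, but only the first contains a visit */
data work.periods;
    input subjectid startdate :date9. enddate :date9.;
    format startdate enddate date9.;
    datalines;
1 01JAN2016 31MAR2016
1 01APR2016 30JUN2016
2 01JAN2016 30JUN2016
;
run;

data work.clinicvisits;
    input subjectid visitdate :date9.;
    format visitdate date9.;
    datalines;
1 15FEB2016
2 10MAY2016
;
run;

proc sql;
    /* The left join keeps every period; the date range is part of the match condition */
    create table work.allvisits as
    select i.*, v.visitdate
    from work.periods as i
    left join work.clinicvisits as v
        on i.subjectid = v.subjectid
        and v.visitdate between i.startdate and i.enddate
    order by i.subjectid, i.startdate;
quit;
```

With this sample data, the second period for subject 1 should remain in work.allvisits with a missing visitdate. If the BETWEEN condition were moved to a WHERE clause, it would be applied after the join, and rows with a missing visitdate would fail the test and be dropped again.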