Dealing mainly with the following varaibles:
Case_ind
Dateofvisit
PersonID
What i have done is define with other variables whether or not a doctor visit was related to my condition of interest (Case_ind 0/1). Now i need to pull out all other visits for the same person that are within 3 weeks before or after the visit for the condititon of interest (case_ind=1). I know how to do this with only 3 weeks after by using a retain statement in a data set by personID, dateofvisit, and case_ind but I am not sure how to grab the ones 3 weeks before?
Any help would be greatly appreciated!!
data have; informat obsid Case_ind best. Dateofvisit mmddyy10. Personid best.; format Dateofvisit mmddyy10. ; input obsid Case_ind Dateofvisit Personid ; datalines; 1 0 1/1/2017 1 2 0 1/4/2017 1 3 0 2/1/2017 1 4 0 3/20/2017 1 5 0 3/24/2017 1 6 1 4/3/2017 1 7 0 4/3/2017 1 8 0 4/4/2017 1 9 0 4/20/2017 1 10 0 5/11/2017 1 11 0 5/31/2017 1 12 0 7/25/2017 1 13 0 9/1/2017 1 14 0 9/2/2017 1 ; run; proc sql; create table want as select b.* from (select * from have where case_ind=1) as a left join have as b on a.Personid=b.Personid where b.dateofvisit between intnx('week',a.dateofvisit,-3,'s') and intnx('week',a.dateofvisit,+3,'s') order by b.personid,b.Dateofvisit ; quit;
Should do the basic time request.
You will have to provide an exampled data set, in the form of a data step similar to above, with an example of that "One thing I should mention is someone could have a case of the condition more than once, so we would need to take out the observations that are within 3 weeks of both cases even though they are for the same person." AND exactly what the result should look like.
There are so many ways that dates could overlap that you may be providing a number of examples. I'm envisioning Case_ind at two week intervals two or more times.
It may be that all that would be needed is Select Distinct b.* in the code above to remove duplicates depending on what your overlap rules actually entail.
Can you provide an example of the dataset you have, and (based on that example) what you want?
You can probably do it using a DOW loop, but I'd rather see what you have before suggesting code.
Art, CEO, AnalystFinder.com
Here is a very simple example of what I am looking for (data are confidential so I cannot provide an actual example).
Input data set:
OBS ID | Case_ind | Dateofvisit | Personid |
1 | 0 | 1/1/2017 | 1 |
2 | 0 | 1/4/2017 | 1 |
3 | 0 | 2/1/2017 | 1 |
4 | 0 | 3/20/2017 | 1 |
5 | 0 | 3/24/2017 | 1 |
6 | 1 | 4/3/2017 | 1 |
7 | 0 | 4/3/2017 | 1 |
8 | 0 | 4/4/2017 | 1 |
9 | 0 | 4/20/2017 | 1 |
10 | 0 | 5/11/2017 | 1 |
11 | 0 | 5/31/2017 | 1 |
12 | 0 | 7/25/2017 | 1 |
13 | 0 | 9/1/2017 | 1 |
14 | 0 | 9/2/2017 | 1 |
Ideal output dataset:
OBS ID | Case_ind | Dateofvisit | Personid |
4 | 0 | 3/20/2017 | 1 |
5 | 0 | 3/24/2017 | 1 |
6 | 1 | 4/3/2017 | 1 |
7 | 0 | 4/3/2017 | 1 |
8 | 0 | 4/4/2017 | 1 |
9 | 0 | 4/20/2017 | 1 |
This process would need to be repeated for all of the people in the dataset. One thing I should mention is someone could have a case of the condition more than once, so we would need to take out the observations that are within 3 weeks of both cases even though they are for the same person.
Thank you!!
data have; informat obsid Case_ind best. Dateofvisit mmddyy10. Personid best.; format Dateofvisit mmddyy10. ; input obsid Case_ind Dateofvisit Personid ; datalines; 1 0 1/1/2017 1 2 0 1/4/2017 1 3 0 2/1/2017 1 4 0 3/20/2017 1 5 0 3/24/2017 1 6 1 4/3/2017 1 7 0 4/3/2017 1 8 0 4/4/2017 1 9 0 4/20/2017 1 10 0 5/11/2017 1 11 0 5/31/2017 1 12 0 7/25/2017 1 13 0 9/1/2017 1 14 0 9/2/2017 1 ; run; proc sql; create table want as select b.* from (select * from have where case_ind=1) as a left join have as b on a.Personid=b.Personid where b.dateofvisit between intnx('week',a.dateofvisit,-3,'s') and intnx('week',a.dateofvisit,+3,'s') order by b.personid,b.Dateofvisit ; quit;
Should do the basic time request.
You will have to provide an exampled data set, in the form of a data step similar to above, with an example of that "One thing I should mention is someone could have a case of the condition more than once, so we would need to take out the observations that are within 3 weeks of both cases even though they are for the same person." AND exactly what the result should look like.
There are so many ways that dates could overlap that you may be providing a number of examples. I'm envisioning Case_ind at two week intervals two or more times.
It may be that all that would be needed is Select Distinct b.* in the code above to remove duplicates depending on what your overlap rules actually entail.
I'm hoping you can answer a post-solution question. Is there a way to add an indicatior to that code? I was hoping to have another indicator that tells me if the office visit was before or after the visit with the diagnosis (i.e. if the office visit occured before I would have place=-1 and if it happened after place=1). I have been messing around with codes but i cant get anything to work!
Thank you!
Since you said you have the 3 week time period after the condition of interest handled, you can do a similar process for the 3 weeks before.
As of now, the data is sorted by PersonID and Dateofvisit. You need to sort by PersonID and descending Dateofvisit and repeat the methodology you used to pull of the visits within 3 weeks. This will get the 3 weeks prior to the condition of interest.
I.e.
proc data data = test out = reverse_order;
by PersonID descending Dateofvisit;
run;
*Add the code you used to pull out the other visits within the 3 weeks time period backward;
data filter_3weeks_prior_condition;
set reverse_order;
/*Filter dataset here*/
run;
*Then, stack the 2 pulled out datasets, 3 weeks forward and 3 weeks backward;
data entire_condition_period;
set filter_3weeks_prior_condition
filter_3weeks_after_condition /*Your 3 weeks after condition dataset*/;
run;
*Sort the data again to get into proper order (ascending);
proc sort data = entire_condition_period;
by PersonID Dateofvisit;
run;
You will now have all of the visits 3 weeks prior and after for the IDs with the condition of interest.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.