Hello,
I am trying to figure out how to get an aggregate count of the number of times a subject has more than one visit within any 7 day window for my entire data set (using SAS 9.4). Subjects can have varying numbers of visits during the study time period, and have a single, unique visit date associated with each visit. Essentially, I need to count the number of times two visits occur within seven days of each other for a given patient, and then aggregate this for all patients. The seven day period can be at any point during the study. A manufactured example of my data is as follows:
Observation Subject Visit Date Visit #
1 1 18765 1
2 1 18789 2
3 2 18759 1
4 2 18792 2
5 2 18760 3
For the hypothetical data above, there would only be one instance where two visits occur within 7 days of each other for a given patient (observations 3 & 5), and this is what I am attempting to quantify. I have tried a few different approaches (e.g. transposing the data, using arrays, using the DIF function) but nothing complete enough to post that resulted in any progress, and I am not yet a skilled enough programmer to solve this on my own. If anyone could offer any advice, it would be greatly appreciated! Forgive me if this answer is already posted somewhere; I researched this issue for several hours and have so far come up empty. Thank you!
proc sort data=have; by subject visit_date; run; data counts; set have; by subject visit_date; counter=dif(Visit_date)<=7 and not first.subject; run; proc summary data=counts; var counter; class subject; output out=sum sum=; run;
It may help to provide slightly more data, some where you have the same subject with 2 different intervals that meet your rule AND show what the resulting data set should look like.
Since you seem to need to look at many same subject pairs then Proc Sql may come into play.
It will help if you provide the data as a datastep. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
If a subject has visits on 18765, 18766 and 18767 how much do you want the count to increase by?
Art, CEO, AnalystFinder.com
art297,
Thanks for your reply and great question. If a subject had the proposed dates for visits, I would want the count to increase by 2: once for the period from 18765-18771 (3 visits in 7-day period), and once for the period from 18766-18772 (2 visits in 7 day period). The count would not need to increase by 2 for the period from 19765-18771, despite the fact that there are more visits within that period. I've adjusted the sample data I provided to reflect sorting by subject, then visit date, and also provided a sample of what I want the data to look like (note that the 7-day window can be prospective or retrospective, but I believe sorting by visit dates takes care of this):
Observation Subject Visit Date Visit #
1 1 18765 1
2 1 18789 2
3 2 18759 1
4 2 18760 2
5 2 18792 3
6 3 18765 1
7 3 18766 2
8 3 19767 3
Desired Data:
Observation Subject Visit Date Visit # Multiple Visit Count
1 1 18765 1 0
2 1 18789 2 0
3 2 18759 1 1
4 2 18760 2 1
5 2 18792 3 1
6 3 18765 1 2
7 3 18766 2 3
8 3 19767 3 3
Again, the multiple visit count would be cumulative for the entire data set, by subject. Thank you!
You'll have to explain why the multiple visits are cumulative across subjects. That seems to contradict your requirements.
Art, CEO, AnalystFinder.com
Correct, they are cumulative across patients in the example I provided because our objective is to sum this value to indicate all instances in which multiple visits occurred within a given 7-day window for our entire sample, but the Multiple Visit Count variable does not need to be set up this way if it will simplify the process. I could always sum the maximum value for this variable across all subjects. Hope this helps to clarify. Thanks.
Given your value of multiple visit count in the last row of desirec output, I think you meant the last value of visit_date to be 18767, not 19767. If so, I see no need for a PROC summary, since all you wnat is a global count of visit dates within 7 days of same-subject next visit date, which is done here:
data have;
input Observation Subject Visit_Date Visit_num;
format visit_date date9.;
put (_all_) (=);
datalines;
1 1 18765 1
2 1 18789 2
3 2 18759 1
4 2 18760 2
5 2 18792 3
6 3 18765 1
7 3 18766 2
8 3 18767 3
run;
data _null_;
merge have
have (firstobs=2 keep=visit_date visit_num
rename=(visit_date=next_date visit_num=next_vnum))
end=end_of_have;
if end_of_have=0 then mult_visit+(next_date <= visit_date+7) and (next_vnum^=1);
else put mult_visit=;
run;
What happens if you have a visit on day one, day 3, day 5 and day 8.
Day 1, Day 3, and Day 5 are within 7 days.
Day 3, 5, 8 are within a different 7 days.
proc sort data=have; by subject visit_date; run; data counts; set have; by subject visit_date; counter=dif(Visit_date)<=7 and not first.subject; run; proc summary data=counts; var counter; class subject; output out=sum sum=; run;
Since you only need the overall total, I think @s_lassen's approach does what you need. The only thing I might change is the proc summary statement:
proc summary data=counts; var counter; class subject; output out=sum (where=(_type_ eq 0)) sum=; run;
That way you'll only get the one record that provides the total number of instances where your condition is met.
Art, CEO, AnalystFinder.com
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.