SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
weatherm
Calcite | Level 5

Hello,

 

I am trying to figure out how to get an aggregate count of the number of times a subject has more than one visit within any 7 day window for my entire data set (using SAS 9.4). Subjects can have varying numbers of visits during the study time period, and have a single, unique visit date associated with each visit. Essentially, I need to count the number of times two visits occur within seven days of each other for a given patient, and then aggregate this for all patients. The seven day period can be at any point during the study. A manufactured example of my data is as follows:

 

Observation    Subject    Visit Date    Visit #

 1                       1              18765         1

 2                       1              18789         2

 3                       2              18759         1

 4                       2              18792         2

 5                       2              18760         3

 

For the hypothetical data above, there would only be one instance where two visits occur within 7 days of each other for a given patient (observations 3 & 5), and this is what I am attempting to quantify. I have tried a few different approaches (e.g. transposing the data, using arrays, using the DIF function) but nothing complete enough to post that resulted in any progress, and I am not yet a skilled enough programmer to solve this on my own. If anyone could offer any advice, it would be greatly appreciated! Forgive me if this answer is already posted somewhere; I researched this issue for several hours and have so far come up empty. Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14
proc sort data=have;
  by subject visit_date;
run;
 
data counts;
  set have;
  by subject visit_date;
  counter=dif(Visit_date)<=7 and not first.subject;
run;

proc summary data=counts;
  var counter;
  class subject;
  output out=sum sum=;
run;  

View solution in original post

13 REPLIES 13
ballardw
Super User

It may help to provide slightly more data, some where you have the same subject with 2 different intervals that meet your rule AND show what the resulting data set should look like.

 

Since you seem to need to look at many same subject pairs then Proc Sql may come into play.

 

It will help if you provide the data as a datastep. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

weatherm
Calcite | Level 5
ballardw,

Thank you for the resource. I hope what I have posted in my updated comment is sufficient.
art297
Opal | Level 21

If a subject has visits on 18765, 18766 and 18767 how much do you want the count to increase by?

 

Art, CEO, AnalystFinder.com

 

weatherm
Calcite | Level 5

art297,

 

Thanks for your reply and great question. If a subject had the proposed dates for visits, I would want the count to increase by 2: once for the period from 18765-18771 (3 visits in 7-day period), and once for the period from 18766-18772 (2 visits in 7 day period). The count would not need to increase by 2 for the period from 19765-18771, despite the fact that there are more visits within that period. I've adjusted the sample data I provided to reflect sorting by subject, then visit date, and also provided a sample of what I want the data to look like (note that the 7-day window can be prospective or retrospective, but I believe sorting by visit dates takes care of this):

 

Observation    Subject    Visit Date    Visit #

 1                       1              18765         1

 2                       1              18789         2

 3                       2              18759         1

 4                       2              18760         2

 5                       2              18792         3

 6                       3              18765         1

 7                       3              18766         2

 8                       3              19767         3

 

Desired Data:

 

Observation    Subject    Visit Date    Visit #    Multiple Visit Count

 1                       1              18765         1          0

 2                       1              18789         2          0

 3                       2              18759         1          1

 4                       2              18760         2          1

 5                       2              18792         3          1

 6                       3              18765         1          2

 7                       3              18766         2          3

 8                       3              19767         3          3

 

Again, the multiple visit count would be cumulative for the entire data set, by subject. Thank you!

art297
Opal | Level 21

You'll have to explain why the multiple visits are cumulative across subjects. That seems to contradict your requirements.

 

Art, CEO, AnalystFinder.com

weatherm
Calcite | Level 5

Correct, they are cumulative across patients in the example I provided because our objective is to sum this value to indicate all instances in which multiple visits occurred within a given 7-day window for our entire sample, but the Multiple Visit Count variable does not need to be set up this way if it will simplify the process. I could always sum the maximum value for this variable across all subjects. Hope this helps to clarify. Thanks.

mkeintz
PROC Star

Given your value of multiple visit count in the last row of desirec output, I think you meant the last value of visit_date to be 18767, not 19767.  If so, I see no need for a PROC summary, since all you wnat is a global count of visit dates within 7 days of same-subject next visit date, which is done here:

 

data have;
input Observation    Subject    Visit_Date    Visit_num;
  format visit_date date9.;
put (_all_) (=);
datalines;
 1 1 18765 1
 2 1 18789 2
 3 2 18759 1
 4 2 18760 2
 5 2 18792 3
 6 3 18765 1
 7 3 18766 2
 8 3 18767 3
run;

data _null_;
  merge have 
        have (firstobs=2 keep=visit_date visit_num
              rename=(visit_date=next_date visit_num=next_vnum))
        end=end_of_have;

  if end_of_have=0 then mult_visit+(next_date <= visit_date+7) and (next_vnum^=1);
  else put mult_visit=;
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User

What happens if you have a visit on day one, day 3, day 5 and day 8. 

 

Day 1, Day 3, and Day 5 are within 7 days. 

Day 3, 5, 8 are within a different 7 days. 

weatherm
Calcite | Level 5
Reeza,

Thanks for your reply. Your comment seemed similar to art297's above, so I hope the additional comments I've left help to answer your question as well.
s_lassen
Meteorite | Level 14
proc sort data=have;
  by subject visit_date;
run;
 
data counts;
  set have;
  by subject visit_date;
  counter=dif(Visit_date)<=7 and not first.subject;
run;

proc summary data=counts;
  var counter;
  class subject;
  output out=sum sum=;
run;  
weatherm
Calcite | Level 5
s_lassen,

Thank you for your reply and sample code. I will give your code a try!
art297
Opal | Level 21

Since you only need the overall total, I think @s_lassen's approach does what you need. The only thing I might change is the proc summary statement:

proc summary data=counts;
  var counter;
  class subject;
  output out=sum (where=(_type_ eq 0)) sum=;
run;  

That way you'll only get the one record that provides the total number of instances where your condition is met.

 

Art, CEO, AnalystFinder.com

 

weatherm
Calcite | Level 5

Thank you very much to @s_lassen and @art297 for a great solution to this question. Thank you also to everyone else who commented and helped.

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 6575 views
  • 1 like
  • 6 in conversation