BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Xinxin
Obsidian | Level 7

Hello, I need to find the number of repeat visits within a year, starting from the initial visit. Ex., data is as below. A has 0 repeat visits within a year, B has 6, C has 1. I tried the LAG and INTCK functions, but couldn't get the correct answer. 

ID TIME_IN
A 5/8/2015
A 10/25/2016
B 6/3/2016
B 7/6/2016
B 8/22/2016
B 10/31/2016
B 1/9/2017
B 2/24/2017
B 4/7/2017
C 7/7/2014
C 7/29/2014
C 8/24/2015
C 10/28/2015
C 1/4/2016
1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Here's a simple way:

 

data want;

set have;

by id;

if first.id then do;

   repeat_visits=0;

   baseline_date = time_in;

end;

else if time_in <= baseline_date + 365 then repeat_visits + 1;

if last.id;

retain baseline_date;

keep id baseline_date repeat_visits;

run;

 

There are fancier ways (other than adding 365) to define "within a year" including ways that account for leap years.  So it depends on how exact you want to be.

 

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

And what defines a repeat visit?  Post test data in the form of a datastep and show what you want to have output.  Then explain the logic of why.  As for your problem, simply:

data want;
  set have;
  retain repeat;
  by id;
  if first.id then repeat=0;
  else repeat=repeat+1;
run;
PeterClemmensen
Tourmaline | Level 20

I'm not sure I follow your logic? How does B have 6 repeat visits? What is the exact definition for a repeat visit?

Astounding
PROC Star

Here's a simple way:

 

data want;

set have;

by id;

if first.id then do;

   repeat_visits=0;

   baseline_date = time_in;

end;

else if time_in <= baseline_date + 365 then repeat_visits + 1;

if last.id;

retain baseline_date;

keep id baseline_date repeat_visits;

run;

 

There are fancier ways (other than adding 365) to define "within a year" including ways that account for leap years.  So it depends on how exact you want to be.

 

Oligolas
Barite | Level 11

Hi,

 

there are several ways to perform.

I have 4 in mind:

- use the classic retain

- use proc SQL update

- use call execute

- use the new function dosubl for a more straight forward approach which blows my mind 🙂

 

I've described the retain and dosubl in the code below:

data test;
informat ID $1. TIME_IN mmddyy10.;
input id time_in;
format TIME_IN mmddyy10.;
datalines;
A 5/8/2015 
A 10/25/2016 
B 6/3/2016 
B 7/6/2016 
B 8/22/2016 
B 10/31/2016 
B 1/9/2017 
B 2/24/2017 
B 4/7/2017 
C 7/7/2014 
C 7/29/2014 
C 8/24/2015 
C 10/28/2015 
C 1/4/2016 
;
run;
/* RETAIN approach */
data test1;
   set test;
   by id time_in;
   retain min max h_repeatVisits .;
   if first.id then do;
      min=intnx("day", time_in, 1, "sameday");
      max=intnx("year", time_in, 1, "sameday");
      h_repeatVisits=0;
   end;
   if min<time_in<=max then h_repeatVisits=h_repeatVisits+1;
run;
proc sort data=test1; by id descending time_in;run; 

data test2;
   set test1;
   by id descending time_in;
   retain repeatVisits .;
   if first.id then repeatVisits=h_repeatVisits;
   drop min max h_repeatVisits;
run;
proc sort data=test2; by id time_in; run;
/* DOSUBL approach */
data test3;
   set test;
   by id time_in;
   retain repeatVisits 0;
   if first.id then do;
      rc=dosubl('%LET n=0; 
                 PROC SQL noprint; 
                    SELECT count(*) INTO :n 
                    FROM test 
                    WHERE id EQ "'||strip(id)||'" 
                    AND '||put(time_in,best8.)||'<time_in<= '||put(intnx("year", time_in, 1, "sameday"),best8.)
             ||';QUIT;');
      repeatVisits=input(symget('n'),best8.);
   end;
   drop rc;
run;
________________________

- Cheers -

Xinxin
Obsidian | Level 7

Thank you for the various methods and increasing my knowledge ... I really appreciate it !!

ballardw
Super User

Count the occurrence of year within each ID and subtract 1?

proc summary data=have nway;
   class id date;
   format date year4.;
   output out=yearcount (drop=_type_);
run;

will have a variable _freq_ with the count.

 

Assumes that your date is a SAS date value and NOT a character variable.

A data step could subtract the 1.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 997 views
  • 0 likes
  • 6 in conversation