I'm trying to assess what percent of callers' issues are resolved within the caller's first call. Currently, a repeat call is defined as a call placed within 14 days of a customer's first call (i.e. callback window = 14 days). I'm trying to determine if this 14-day callback window can be justifiably shortened (i.e. determine what a proper cutoff value for a callback is).
I'm thinking of collecting a sample of calls and running logistic regression to model the probability that a repeat call has the same call intent as the first call, as a function of time. (The thinking is that the more time passes after the first call, the less likely the caller's subsequent call is related to the first call.)
Would it be valid to specify 15 logistic models in which I first define callback window as 0 days and increment by 1 day up to 14? So callback window would be defined as 0 days in first model, 0-1 days in second, 0-2 days in third, and so on. Is it a valid approach to choose the model with the highest area under the curve to determine the cut-off value for my callback window?
Another suggestion along the lines of "get an idea of how your data looks":
If you have something that indicates the general topic of each call, you might store that information in a variable and use it for BY processing with @Reeza's suggestion, to see whether different topics behave differently.
Consider: if 95% of calls related to topic XX are resolved with one call and never generate a callback, you might not want them influencing the callback-period analysis of topic YY, which typically takes 3 or 4 callbacks to resolve.
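As a rough sketch of the BY-processing idea: the suggestion referred to above isn't reproduced here, so this assumes it amounts to looking at the distribution of days until a callback. The data set name work.callbacks and the variables topic and callback_days are hypothetical placeholders for whatever your data actually contains.

```sas
/* Hypothetical input: work.callbacks, one row per callback, with
   topic (call topic) and callback_days (days since the first call). */
proc sort data=work.callbacks out=work.callbacks_sorted;
  by topic;
run;

/* Frequency and cumulative percent of days-to-callback,
   reported separately for each topic. */
proc freq data=work.callbacks_sorted;
  by topic;
  tables callback_days;
run;
```

If the cumulative percent levels off at very different day counts across topics, that would suggest a single cutoff may not suit every topic.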
You can conceivably create a data set with one observation per callback, with one variable giving the number of days until the callback and another indicating whether the callback was a repeat call. Given that, you can simply fit a single logistic model such as the following, where REPEAT=1 if the call is a repeat call and 0 otherwise. The EFFECTPLOT statement shows how the predicted probability of being a repeat call changes as the number of callback days increases. Once you decide what probability you want to satisfy, you can use that plot to pick a cutoff on the number of days.
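proc logistic data=<your_data>;
  /* model the probability that REPEAT=1 as a function of days since the first call */
  model repeat(event="1")=callback_days;
  /* plot the predicted probability against callback_days */
  effectplot;
run;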
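If it helps, here is one possible way to build the one-observation-per-callback data set described above. It is only a sketch under assumptions: the input data set work.calls and its variables customer_id, call_date (a SAS date) and intent (character) are hypothetical names, and "first call" is taken to mean each customer's earliest call in the data.

```sas
/* Hypothetical input: work.calls with customer_id, call_date, intent. */
proc sort data=work.calls out=work.calls_sorted;
  by customer_id call_date;
run;

data work.callbacks;
  set work.calls_sorted;
  by customer_id;
  length first_intent $40;          /* assumes intent fits in 40 characters */
  retain first_date first_intent;   /* carry first-call values down the rows */
  if first.customer_id then do;
    /* earliest call for this customer: remember it, but do not output it */
    first_date = call_date;
    first_intent = intent;
  end;
  else do;
    callback_days = call_date - first_date;   /* days since the first call */
    repeat = (intent = first_intent);         /* 1 if same intent, else 0 */
    output;
  end;
  format first_date date9.;
run;
```

The resulting work.callbacks could then be passed to the PROC LOGISTIC step above in place of <your_data>.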