Re: Building Customer Satisfaction Measurement System

mjheever · Posted 01-28-2015 04:10 AM

Hi everyone,

I'm building a Customer Satisfaction Measurement System from scratch and thought it might be interesting/fun/beneficial for the community to be part of it.

The purpose is to get some fresh ideas on how to measure satisfaction from a set of observations, so feel free to join in and as usual any input from you guys are much appreciated.

Intro:

I got the following data set (DataSet1):

Case_No	Task_ID	Case_Arrival_Date	Case_Complete_Date	Cust_ID	Task_Pool	Task_Desc	TH_time	Transf_Ind
123	123a	16SEP2014:17:09:13	17SEP2014:18:20:13	c12	Task_help	Access_Denied	12.345	1

Just a basic explanation:

Case_No : Every time a customer contact us with an issue(s) it generates a Case_No.
Task_ID(unique) : Essentially a sub category of Case_No- If a customer calls with an existing/open Case then the Task_ID will be connected to to current Case_No until all the issues in this case is resolved.
Case_Arrival_Date : Date-Time of first contact from customer.
Case_Complete_Data : Date-Time when all issues in Case_No has been resolved.
Cust_ID : Uniquely identifies a customer (one customer can log multiple Tasks/Cases).
Task_Pool : Category of the issue which needs to resolved.
Task_Desc : Sub category of Task_Pool which specifies the issue with a brief description.
TH_Time : Total Holding (Waiting) Time of customer on a specific task (in Minutes).
Transf_Ind : If customer has been transferred from one agent to an other agent.

There is also an additional table (DataSet2) with the details of customer's demographics and purchases done on my website which I can link from Cust_ID.

The Program:

This is what I have at this point:

data t01;

set Dataset1;

format three_days date9.;

format month DATE9.;

If index(Task_Desc,'Delivery')>0 then Delivery_Ind=1; /*Delayed or undelivered deliveries of goods are a serious concern and gets special indicator*/

else Delivery_Ind=0;

month=intnx("month",DATEPART(Case_Arrival_Date),0,"end"); /*Month-end variable generated, analysis conducted on a monthly basis for now, we can change the date-interval at a later stage*/

run;

proc sql;

create table Summary_ind /*I'm doing doing a monthly analysis, so this will indicate if the customer has had any delivery-related issues in that month when the case was logged*/

as select distinct

Cust_ID,

max(Delivery_Ind) as Del_Ind,

month

from t01 group by Cust_ID,month;

quit;

proc sql;

create table summary_Cust_month

as select distinct

a.Cust_ID,

a.month,

b.Del_Ind,

sum(TH_time) as Total_TH_time label="Total Holding Time this month", /*this will calculate the customer's total waiting time for the case to be resolved for the month-end*/

count(distinct(a.Case_No)) as total_number_calls label ="Total Number Of Calls Received this month" /*this will calculate the total calls received from a customer at month-end (please correct me if I'm wrong )*/

from t01 a left join Summary_ind b on a.Cust_ID=b.Cust_ID group by 1,2;

quit;

/*I for one, hate calling companies multiple times in regards with the same issues, so I'm trying incorporate this element into the measurement system*/

/*Also, since delivery-issues are taken very seriously i gave it a special (high) Frustration score*/

data Customer_month_score; /*Here I just assign some basic values to measure the potential frustration of the customer from calls received (based on historical observations, please give some constructive criticisms if possible*/

set summary_Cust_month;

Frustration_score=0;

if 1 < total_number_calls <=2 then Frustration_score=1;

if 2 < total_number_calls <=4 then Frustration_score=2;

if 4 < total_number_calls then Frustration_score=3;

if Del_Ind= 1 then Frustration_score=Frustration_score+2;

run;

proc sort data=T01;

by Task_Desc ;

run;

/*It's from here I specifically need some fresh ideas in assigning scores regarding the waiting time of each customer per month-end*/

proc univariate data=t01 noprint; /*This calculates the percentiles (Cumulative Distribution of customers) regarding the Holding Time (or waiting time) of the customers relative to the specific issue*/

var TH_time;

by Task_Desc;

output out=percentiles pctlpre=P_ pctlpts= 5 to 90 by 5;

run;

proc sql;

create table Benchmark_score /*I want to get an average or benchmark waiting time for each specific issue (Delivery issue, LogIn issues, Payment issues, etc...)*/

as select a.*, /*P_xx is generated from previous data step which indicates the percentile-number*/

avg(case when TH_time<=P_90 then TH_time end) as Bench_score /*There were system failures which caused certain waiting times to be 100000+ which is not realistic at all, so I exclude them here and calculate the average*/

from percentiles a join t01 b on a.Task_Desc=b.Task_Desc

group by a.Task_desc;

quit;

proc sort data=Benchmark_score nodupkey; /*simplifies the Benchmark_score table, since Task_desc isn't unique and I want to only get the average waiting time of each specific Task*/

by Task_desc;

run;

proc sql;

create table Customer_vs_Bench as select distinct

a.*,

P_65,

P_75,

Bench_score

from t01 a left join Benchmark_score b on a.Task_Desc=b.Task_Desc;

quit;

/*Here, again, is here I need some fresh ideas to incorporate the average task waiting time and the percentiles of the task waiting time into the measuring system*/

data Customer_vs_Bench

set Customer_vs_Bench;

Frustration_score2a=0;

if TH_time<=P_65 then Frustration_score2a=1;

if P_75 >= TH_time> P_65 then Frustration_score2a=2;

if TH_time>= P_75 then Frustration_score2a=3;

run;

/*at a later stage I want to combine all the scores I have generated and create a final score to measure the potential frustration of the customer*/

proc sql;

create table combined as select distinct

a.*,

sum(Frustration_score,max(b.Frustration_score2a)) as Total_Score label="Total out of 8"

from Customer_month_score a join Customer_vs_Bench b

on a.Cust_ID=b.Cust_IDgroup by a.Cust_ID,a.month;

quit;

/****** This is still a Work In Progress ******/

Please let me know what you guys think of this concept and where I can improve and build on the program.

As this is my first time of attempting to build a measuring system, I would very much appreciate any advice/tips and input.

Thank you !!

LinusH · Posted 01-28-2015 05:07 AM

Please, the forums is not intended for getting free of charge help to make system design etc.

If you need some extensive help on designing you solution, hire!

Try to boil this down to be specific, and limit the scope for each question.

Data never sleeps

mjheever · Posted 01-28-2015 08:47 AM

Hi Linus,

It is not my intention to get free of charge help. If this seems like the case I would much rather take it down completely.

I want to start a thread where users can give their input/ideas and I will put it to the test with my current data. In other words I want to create a thread where users can be part of the starting process and develop the concept with me in such a way that they can apply this concept/program to similar cases of their own.

I apologize if the general feeling around the initial post is similar to that of 's reply.

If my initial post does not convey this purpose, please reply on this thread and I will gladly remove it. Alternatively if you want to follow the process of the development of this program, please let me know and I will post updates during the course of time (even if you don't contribute actively to this thread).

Once again, sorry and thank you .

ballardw · Posted 01-28-2015 10:57 AM

Having worked with some similar projects here are a few thoughts:

The number of calls may not be an indicator of frustration. Number of calls could relate to just larger volume of transactions especially if your customer codes belong to largish organizations with multiple locations/offices or people likely to contact you. Even for a specific case depending on complexity. The customer may be making multiple calls due to 1) interruptions on his end, 2) the time it takes to attempt a resolution (image having to take down as server, apply a software patch, restart and test condition), 3) the fix to one part of project breaks something else and 4) the ever popular call back to say "never mind" because they realized they contacted you when that wasn't the actual problem (operator headspace and timing issues for example).

And time to reach a person, which may not match time on hold, may be a better time to examine. Lengthy multilevel phone trees are a very common complaint. If you can get time from dial including any of the time spent in those trees you'll have a better start.

And before spending too much time on creating measures I would be very tempted to survey some customers about their overall experience/satisfaction and then create a model from the results.

mjheever · Posted 01-29-2015 01:51 AM

Thank you for the input Ballardw.

I also have a data set where the comments of customers are logged which I will use to test if the frustration score of the customers are actually correlated with the frustrated comments (complaints). I will later incorporate this data into the model, but first I want to the test the results of the scoring system.

Regarding the number of phone calls, I figured that usually a customer will only call once if something is wrong then we can sort it out while they are holding the line, this happens the majority of the times.

So if a customer calls multiple times regarding the same case_id and hence also increase the holding time (TH_time) then I assume the customer is potentially frustrated with us since this quite abnormal, although I do agree that interruptions of the line do indeed play a role - this will also depend a lot the country's connectivity abilities. So I'm thinking do a few adjustments regarding the score in this section. We will mainly look at customers with highest scores first, so if someone gets a score of say 1, then it wouldn't really be regarded as frustrated customer, this is where the cumulative scoring system come into play.

If you can get time from dial including any of the time spent in those trees you'll have a better start.

I will definitely look into this - it does sound like a pretty good measurement of frustration.

*More updates coming *

Reeza · Posted 01-29-2015 02:19 AM

@Linus comment about this not being a discussion is correct, but it would be appropriate for an Article instead. If you truly want collaboration on such a topic though I think you'd have to release some sample data as well. Otherwise it's all conjecture on our part, not to mention pointless to review the code.

For sharing something like this I'd also recommend splitting code/methodology to make it easier to follow.

My thoughts only

mjheever · Posted 01-29-2015 02:40 AM

Hi Reeza,

I agree with what you are saying. I will need to modify the sample data a bit for privacy and security reasons obviously but I will definitely upload a sample of the data during the course of the day.

Would it be appropriate to move this post/discussion directly into the article section as it is now or rather when this project is finalized and then only upload it to the article section ?

Also I will modify the layout and methodology of the initial post, just make it a bit easier to follow.

Thank you for sharing your thoughts on this matter Reeza .

mjheever · Posted 02-02-2015 04:02 AM

Edit: Uploaded some sample data if you guys feel like having a look.

I've decided that once this project is done (2-3weeks), I will edit and update the initial post to make it more readable and move it to the article section .

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!