Hi everyone. I have been tasked with determining the number of times a customer has been phoned until one of the following effort metrics are recorded:
Note that a PTP implies an RPC, and an RPC implies a connect.
The dataset I have available is as follows:
Account number | Time of contact | Connect flag | RPC flag | PTP flag |
12345 | 3NOV21 15:15:00 | 0 | 0 | 0 |
12345 | 6NOV21 11:36:56 | 0 | 0 | 0 |
12345 | 7NOV21 09:17:08 | 1 | 0 | 0 |
12345 | 8NOV21 14:08:11 | 1 | 1 | 0 |
12345 | 9NOV21 09:11:42 | 1 | 1 | 1 |
23456 | 2NOV21 08:37:15 | 0 | 0 | 0 |
23456 | 3NOV21 09:55:12 | 1 | 1 | 1 |
34567 | 1NOV21 09:10:10 | 0 | 0 | 0 |
34567 | 2NOV21 10:25:26 | 1 | 0 | 0 |
34567 | 3NOV21 11:55:55 | 1 | 1 | 0 |
I want to reproduce the same table as above but with three new fields, to count the attempts (i.e. rows) needed until each of the effort metrics were recorded. In the example above, for account 12345 that would be 2 attempts until a connect, 3 attempts until an RPC, and 4 attempts until a PTP; for account 23456, 1 attempt until each of connect, RPC, PTP, etc. The resulting dataset should look as follows:
Account number | Time of contact | Connect flag | RPC flag | PTP flag | Attempts to connect | Attempts to RPC | Attempts to PTP |
12345 | 3NOV21 15:15:00 | 0 | 0 | 0 | 2 | 3 | 4 |
12345 | 6NOV21 11:36:56 | 0 | 0 | 0 | 2 | 3 | 4 |
12345 | 7NOV21 09:17:08 | 1 | 0 | 0 | 2 | 3 | 4 |
12345 | 8NOV21 14:08:11 | 1 | 1 | 0 | 2 | 3 | 4 |
12345 | 9NOV21 09:11:42 | 1 | 1 | 1 | 2 | 3 | 4 |
23456 | 2NOV21 08:37:15 | 0 | 0 | 0 | 1 | 1 | 1 |
23456 | 3NOV21 09:55:12 | 1 | 1 | 1 | 1 | 1 | 1 |
23456 | 1NOV21 09:10:10 | 0 | 0 | 0 | 1 | 2 | 0 |
34567 | 2NOV21 10:25:26 | 1 | 0 | 0 | 1 | 2 | 0 |
34567 | 3NOV21 11:55:55 | 1 | 1 | 0 | 1 | 2 | 0 |
I've tried the following (on SAS EG 8.2.0.1201), but I can't seem to group the results for each account number such that a total attempt count for each effort metric is the result:
proc sort data=dialer out=dialer2;
by account_number time_of_contact;
run;
data dialer3;
set dialer2;
by account_number;
if FIRST.account_number then do;
attempt_count = 1;
attempts_to_connect = 1;
attempts_to_rpc = 1;
attempts_to_ptp = 1;
end;
else do;
attempt_count + 1;
if connect = 0 then attempts_to_connect + 1;
if RPC = 0 then attempts_to_rpc + 1;
if PTP = 0 then attempts_to_ptp + 1;
end;
run;
The code above artificially highlights the point at which the first connect/RPC/PTP has been recorded: the first attempts_to_connect/attempts_to_RPC/attempts_to_PTP that duplicates from one record to the following is the correct answer, but I'm not really sure how to go about formalizing that logic so that a single summary of the effort metric for the account is the result.
I'm sure someone has asked a similar question before, please point me in the direction of such threads.
RETAIN the attempt counter, and the three target variables.
Set the counter to 1 at first., and the other three to 0. Increment the counter in the ELSE.
Then, if a flag is set and the corresponding target is still zero, set the target from the counter.
OUTPUT at last., KEEP only account number and the targets, and merge that back to the original dataset.
RETAIN the attempt counter, and the three target variables.
Set the counter to 1 at first., and the other three to 0. Increment the counter in the ELSE.
Then, if a flag is set and the corresponding target is still zero, set the target from the counter.
OUTPUT at last., KEEP only account number and the targets, and merge that back to the original dataset.
I think you can try something like this:
proc sort data=dialer out=dialer2;
by account_number time_of_contact;
run;
data dialer3;
if 0 then set dialer2; /* Just to get variables in right order, not strictly necessary */
Done_Connect=0;
Done_RPC=0;
Done_PTP=0;
Attempts_to_connect=0;
Attempts_to_RPC=0;
Attempts_to_PTP=0;
do until(last.account_number);
set dialer2;
by account_number;
array Flags(*) Connnect_flag RPC_flag PTP_flag;
array attempts(*) Attempts_:;
array done(*) Done_:;
do _N_=1 to dim(flags);
if Done(_N_)=0 then do;
if Flags(_N_)=0 then
Attempts(_N_)+1;
else
Done(_N_)=1;
end;
end;
output;
end;
drop Done_:;
run;
data have; infile cards expandtabs truncover; input Account_number Time_of_contact : datetime32. Connect_flag RPC_flag PTP_flag; format Time_of_contact datetime32.; cards; 12345 3NOV21:15:15:00 0 0 0 12345 6NOV21:11:36:56 0 0 0 12345 7NOV21:09:17:08 1 0 0 12345 8NOV21:14:08:11 1 1 0 12345 9NOV21:09:11:42 1 1 1 23456 2NOV21:08:37:15 0 0 0 23456 3NOV21:09:55:12 1 1 1 34567 1NOV21:09:10:10 0 0 0 34567 2NOV21:10:25:26 1 0 0 34567 3NOV21:11:55:55 1 1 0 ; proc sql; create table want as select *, ifn(count(*)=sum(Connect_flag=0),0,sum(Connect_flag=0)) as Attempts_to_connect , ifn(count(*)=sum(RPC_flag=0),0,sum(RPC_flag=0)) as Attempts_to_RPC , ifn(count(*)=sum(PTP_flag=0),0,sum(PTP_flag=0)) as Attempts_to_PTP from have group by Account_number order by Account_number,Time_of_contact ; quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.