Hi, all. I am pretty new to SAS and am looking for help with some research on a system that blocks risky transactions. I have a data set that looks like this:
Account Name  Date      Time   Amount  Risky  Blocked
------------  --------  -----  ------  -----  -------
Bill          1/1/2011  02:30    1.00  1      0
Bill          1/1/2011  02:33  100.00  1      1
Bill          1/2/2011  02:15  300.00  1      0
Bill          1/3/2011  02:20  200.00  1      1
Bill          1/3/2011  03:30  100.00  1      1
Bill          1/3/2011  09:45  150.00  0      0
Bill          1/3/2011  10:00  200.00  1      1
Jane          1/1/2011  01:00   50.00  1      1
Jane          1/1/2011  01:30   50.00  1      1
What I want from the data is this: I am looking for the following totals for the risky transactions that were or were not blocked within 24 hours of the first blocked risky transaction:
Sum(Amount of Risky that are not blocked after the first trxn):
Count(Num transactions not risky)
Count(num transactions risky)
Count(Number of customers that have had risky transactions not blocked after the first block).
So, for Bill, the 2nd transaction marks the start of the first 24-hour clock. Then I want to start a second 24-hour clock for the transaction in Row 5. The totals should look like this:
Sum(Amount of Risky that are not blocked within the 24 hr window): 500.00 (Row 3 + Row 7)
Count(num transactions risky not blocked after the first trxn): 2 (count of Row 3 and Row 7)
Count(Number of customers that have had risky transactions not blocked after the first block): 1 (Bill Only)
There are thousands of accounts in this data set, so it is not feasible to do this by hand.
Thanks for the help!!
I'm not sure whether I met all of your requirements, but this should give you enough direction to fill in the blanks.
proc format;
  /* display the 0/1 flags as false/true */
  value my_tf_fmt
    0 = 'false'
    1 = 'true'
    other = '***ERROR***';
run;
data have;
  format dhms datetime. date mmddyy10. time time5. amount dollar8.2 risky blocked my_tf_fmt.;
  infile cards dsd dlm=',';
  input name :$4. date :mmddyy10. time :time5. amount :6.2 risky blocked;
  /* combine date and time into one datetime value (seconds passed as numeric 0) */
  dhms=dhms(date,hour(time),minute(time),0);
cards;
Bill,1/1/2011,02:30,1.00,1,0
Bill,1/1/2011,02:33,100.00,1,1
Bill,1/2/2011,02:15,300.00,1,0
Bill,1/3/2011,02:20,200.00,1,1
Bill,1/3/2011,03:30,100.00,1,1
Bill,1/3/2011,09:45,150.00,0,0
Bill,1/3/2011,10:00,200.00,1,1
Jane,1/1/2011,01:00,50.00,1,1
Jane,1/1/2011,01:30,50.00,1,1
;
run;
proc sort data=have; by name dhms; run;
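data want;
  set have;
  by name;
  /* d = start of the current 24-hour window                     */
  /* cb/tb = count and total amount of blocked transactions      */
  /* cnb/tnb = count and total amount of non-blocked ones        */
  retain cb cnb tb tnb d;
  if first.name then
    do;
      cycles=0;
      /* reset so nothing carries over from the previous account */
      d=.; cb=0; cnb=0; tb=0; tnb=0;
    end;
  /* A blocked transaction with no open window starts a new cycle.   */
  /* tb and cb start at -amount and -1 because this same row is      */
  /* added back by the accumulation block that follows.              */
  cycle: if blocked=1 and d=. then
    do;
      d=dhms;
      tnb=0;
      tb=-amount;
      cnb=0;
      cb=-1;
      cycles+1;
    end;
  /* Inside the window (or d is missing, which also compares as <=24): accumulate */
  if intck('hour',d,dhms)<=24 then
    do;
      if blocked=1 then
        do;
          tb=tb+amount;
          cb+1;
        end;
      else
        do;
          tnb=tnb+amount;
          cnb+1;
        end;
    end;
  else
    do;
      /* window expired: close it and re-test this row as a possible new start */
      d=.;
      go to cycle;
    end;
  /* one row per account; totals are reset whenever a new cycle starts, */
  /* so they describe the most recent cycle only                        */
  if last.name then
    do;
      d=.;
      output;
    end;
  keep name tnb cnb tb cb cycles;
  format tnb tb dollar8.2 cnb cb cycles comma8.;
  label name='Account Name'
        tnb='Total Amount - Non-Blocked'
        cnb='Count - Non-Blocked'
        tb='Total - Blocked'
        cb='Count - Blocked'
        cycles='Nbr of 24hour Cycles Since 1st Blocked';
run;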
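One caveat about the window test intck('hour',d,dhms)<=24 in the step above, offered as a sketch rather than a correction: as far as I know, intck('hour', ...) counts the hour boundaries crossed between the two datetimes, not the elapsed time, so a transaction slightly more than 24 hours after the window start can still pass the <=24 test. The step below compares the two approaches on a pair of made-up datetimes (start and later are illustrative values, not rows from your data):

```sas
/* Illustrative check: interval count vs. exact elapsed time.      */
/* The two datetimes are made-up values, not taken from the data.  */
data _null_;
  start = '01JAN2011:02:01:00'dt;
  later = '02JAN2011:02:59:00'dt;           /* 24 h 58 min after start */

  n_bounds = intck('hour', start, later);   /* hour boundaries crossed */
  elapsed  = later - start;                 /* elapsed time in seconds */

  in_window_intck = (n_bounds <= 24);       /* test used in the step above */
  in_window_exact = (elapsed <= 24*60*60);  /* strict 24-hour test */

  put n_bounds= elapsed= in_window_intck= in_window_exact=;
run;
```

In the log this should show n_bounds=24 and elapsed=89880, so the intck test says "inside the window" while the exact test says "outside". If you need strict 24-hour windows, comparing dhms - d against 24*60*60 seconds may be the safer test.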
Result for the sample data (data set want):
Obs  name  cb  cnb  tb       tnb      cycles
1    Bill  2   1    $300.00  $150.00  2
2    Jane  1   0    $50.00   $0.00    1
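The data want step above totals all transactions, risky or not, and keeps only the most recent cycle per account. If you want the overall totals from the question (risky transactions that were not blocked inside a 24-hour window opened by a blocked risky transaction), something like the following may be closer. It is only a sketch under my reading of the requirements: a window opens at a blocked risky transaction when no window is open, and it closes once more than 24*60*60 seconds have elapsed. The names per_account, win_start, amt_nb and cnt_nb are made up for this example.

```sas
/* Sketch: risky transactions NOT blocked inside a 24-hour window.  */
/* Assumes data set HAVE from above, already sorted by name, dhms.  */
/* per_account, win_start, amt_nb, cnt_nb are hypothetical names.   */
data per_account;
  set have;
  by name;
  retain win_start amt_nb cnt_nb;

  if first.name then do;
    win_start = .;   /* no window open yet for this account */
    amt_nb = 0;
    cnt_nb = 0;
  end;

  /* close the window once more than 24 hours have elapsed */
  if win_start ne . and dhms - win_start > 24*60*60 then win_start = .;

  if risky = 1 then do;
    if win_start = . then do;
      /* a blocked risky transaction opens a new window */
      if blocked = 1 then win_start = dhms;
    end;
    else if blocked = 0 then do;
      /* risky, not blocked, inside an open window */
      amt_nb = amt_nb + amount;
      cnt_nb = cnt_nb + 1;
    end;
  end;

  if last.name then output;   /* one row per account */
  keep name amt_nb cnt_nb;
run;

/* roll the per-account rows up into the three overall totals */
proc sql;
  select sum(amt_nb)     as total_amt_not_blocked format=dollar12.2,
         sum(cnt_nb)     as n_risky_not_blocked,
         sum(cnt_nb > 0) as n_accounts_affected
  from per_account;
quit;
```

Tracing the nine sample rows by hand, I get $300.00, 1 transaction and 1 account (Bill), because Row 7 is flagged as blocked in the posted data; the 500.00 in the question would need Row 7 to be unblocked.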
Thanks! I think this is enough to at least get me started!