Help using Base SAS procedures

Hash table to match transactions to outcomes

Reply
Contributor
Posts: 54

Hash table to match transactions to outcomes

On 9.1.3 but will soon be on 9.2

KSharp and Art297 helped me on my last question similar to this one but the business rules changed and I haven't been able to adjust the code to work for it. If anyone can help correct what I have or even offer a completely different solution, I’d appreciate it. Thank you.

Summary

The program attempts to match the Queued transactions to those that were worked.

A transaction can queue into multiple queues. A transaction can queue but never be worked. In which case, we just leave the worked indicator as null.

A queued record’s worked indicator is marked as “Worked” when there is a match in both tables on acct_num and queuename and the worked record happened after the queued record.

All other Queued transactions happening at the same time or before the Worked record should be marked as “Removed” because when one record is worked, all others are removed from the system.

If the customer calls in (Inbound call), the Queuename is “Inbound”. There will not be a matching queue name in the Queued table. In this case, all non-Worked and non-Removed records that occur before the Inbound call record should be marked as “Inbound”.

The results I want to get are in the table below. I added highlighting to help see the distinct transactions:

ACCT_NUM

TRAN_DT

QueueName

TRAN_TM

WorkedInd

workedid

1000000000801180

3/4/2012

C-BXX-N-AE-XXXXXXXXXX

170458

Removed

11895835

1000000000801180

3/4/2012

N-2XX-N-KE-XXXXXXX

170458

Worked

11895835

2000000000885230

3/3/2012

N-PXX-N-CP-XXXXXXXXXXXXX

213205

Worked

11886991

3000000000064278

3/6/2012

C-BXX-D-CP-XXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-BXX-N-AE-XXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-RXX-N-AE-XXXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-VXX-D-CP-XXXXXXX-XXXX

202257

Worked

11918651

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-BXX-D-CP-XXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-BXX-N-AE-XXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-RXX-N-AE-XXXXXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-VXX-D-CP-XXXXXXX-XXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/8/2012

N-BXX-N-CP-XXXXXXXXXXXX

233701

Program:

Data Queued;

  informat TRAN_DT mmddyy8.;

  format TRAN_DT mmddyy8.

             ACCT_NUM 16.;

  input QueueName $24.  ACCT_NUM TRAN_DT TRAN_TM;

  cards;

C-BXX-N-AE-XXXXXXXXXX         1000000000801180  3/4/2012    170458

N-2XX-N-KE-XXXXXXX                  1000000000801180  3/4/2012    170458

N-PXX-N-CP-XXXXXXXXXXXXX      2000000000885230  3/3/2012    213205

C-BXX-D-CP-XXXXXXX                  3000000000064278  3/6/2012    202257

C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202257

C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257

C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202257

N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257

N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/7/2012    202257

C-BXX-D-CP-XXXXXXX                  3000000000064278  3/7/2012    202312

C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202312

C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312

C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202312

N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312

N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/8/2012    233701

;

run;

Data Worked;

  informat _TRAN_DT mmddyy8.;

  format _TRAN_DT mmddyy8.

             ACCT_NUM 16.

             _ACCT_NUM 16.;

  input _QueueName $24. _tran_dt    _tran_tm    acct_num    workedkey _acct_num;

  cards;

Inbound                                   3/8/2012    172000      3000000000064278  11925138      3000000000064278

C-VXX-D-CP-XXXXXXX-XXXX       3/7/2012    202400      3000000000064278  11918651      3000000000064278

N-PXX-N-CP-XXXXXXXXXXXXX      3/3/2012    213400      2000000000885230  11886991      2000000000885230

N-2XX-N-KE-XXXXXXX                  3/4/2012    170800      1000000000801180  11895835      1000000000801180

;

run;

proc sort data = queued;

by acct_num tran_dt tran_tm;

run;

data Output(drop=_: found rc);*Mark all Worked and Removed;

if _n_ eq 1 then do;

  if 0 then set worked;

  declare hash ha(hashexp:16,dataset:'worked'/*,ordered:'A'*/);

  declare hiter hi('ha');

   ha.definekey('_tran_dt','_tran_tm','_acct_num');

   ha.definedata('_tran_dt','_tran_tm','_acct_num','_queuename','workedkey');

   ha.definedone();

end;

set Queued;

length WorkedInd $ 10;

_rc=hi.first();

do while(_rc=0);

if queuename=_queuename and acct_num=_acct_num then do;

                                             found=1;

                                             WorkedInd='Worked';

                                                                   workedid = workedkey;

                                             _k1=_tran_dt;

                                             _k2=_tran_tm;

                                                                   _k3=_acct_num;

                                             output;

                                           end;

else if  acct_num eq _acct_num

       and ( _tran_dt <= tran_dt or (tran_dt eq _tran_dt and _tran_tm lt tran_tm)) then do;

                                             found=1;

                                             WorkedInd='Removed';

                                                                   workedid = workedkey;

                                             output;

                                           end;

else if  acct_num eq _acct_num

      and ( _tran_dt <= tran_dt or (tran_dt eq _tran_dt and _tran_tm lt tran_tm))

      and upcase(_Queuename) in('Inbound') then do;

                                                                   found=1;

                                             WorkedInd='Inbound';

                                                                   workedid = workedkey;

                                             output;

                                           end;

           

_rc=hi.next();

if found then do; rc=ha.remove(key:_k1,key:_k2,key:_k3); leave;end;

end;

if not found then do;

                                   call missing(WorkedInd,workedkey);

                                   output;

                                 end;

run;

proc sort data = Output (keep= acct_num tran_dt tran_tm queuename workedind workedid);

by acct_num tran_dt tran_tm queuename;

run;

Trusted Advisor
Posts: 1,301

Re: Hash table to match transactions to outcomes

Given your data and rules I cannot tell why your last expected result is NULL instead of Inbound?  I am also not sure how you are assigning all of the workedid values?

data queued;

input QueueName $24. acct_num tran_dt :mmddyy8. tran_tm;

format tran_dt date9. acct_num 16.;

cards;

C-BXX-N-AE-XXXXXXXXXX         1000000000801180  3/4/2012    170458

N-2XX-N-KE-XXXXXXX            1000000000801180  3/4/2012    170458

N-PXX-N-CP-XXXXXXXXXXXXX      2000000000885230  3/3/2012    213205

C-BXX-D-CP-XXXXXXX            3000000000064278  3/6/2012    202257

C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202257

C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257

C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202257

N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257

N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/7/2012    202257

C-BXX-D-CP-XXXXXXX            3000000000064278  3/7/2012    202312

C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202312

C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312

C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202312

N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312

N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/8/2012    233701

;

run;

data worked;

input _QueueName $24. _tran_dt :mmddyy8. _tran_tm acct_num workedkey _acct_num;

format _tran_dt mmddyy8. acct_num 16.;

cards;

Inbound                       3/8/2012    172000      3000000000064278  11925138      3000000000064278

C-VXX-D-CP-XXXXXXX-XXXX       3/7/2012    202400      3000000000064278  11918651      3000000000064278

N-PXX-N-CP-XXXXXXXXXXXXX      3/3/2012    213400      2000000000885230  11886991      2000000000885230

N-2XX-N-KE-XXXXXXX            3/4/2012    170800      1000000000801180  11895835      1000000000801180

;

run;

proc sql;

select a.acct_num,

        a.tran_dt,

        a.QueueName,

        a.tran_tm,

                    case

                     when a.tran_dt=b.max_tran_dt and a.tran_tm=b.max_tran_tm and n>1 then 'Inbound'

                     when a.acct_num=c.acct_num and a.QueueName=c._QueueName then 'Worked'

                     when a.tran_dt<b.max_tran_dt or (a.tran_dt=b.max_tran_dt and a.tran_tm<b.max_tran_tm) then 'Removed'

                    end as WorkedInd,

                    c.workedkey as workedid

   from queued a

   left join ( select QueueName,

                      max(tran_dt) as max_tran_dt,

                                                    max(tran_tm) as max_tran_tm,

                                                    count(1) as n

                 from queued

                group by QueueName ) b on a.QueueName=b.QueueName

   left join worked c on a.acct_num=c.acct_num and a.QueueName=c._QueueName

  order by a.acct_num, a.tran_tm, a.tran_dt, a.QueueName;

quit;

NOTE: PROCEDURE SQL used (Total process time):

      real time           0.01 seconds

      cpu time            0.00 seconds

acct_numtran_dtQueueNametran_tmWorkedIndworkedid
100000000080118004MAR2012C-BXX-N-AE-XXXXXXXXXX170458Removed.
100000000080118004MAR2012N-2XX-N-KE-XXXXXXX170458Worked11895835
200000000088523003MAR2012N-PXX-N-CP-XXXXXXXXXXXXX213205Worked11886991
300000000006427806MAR2012C-BXX-D-CP-XXXXXXX202257Removed.
300000000006427807MAR2012C-BXX-N-AE-XXXXXXXXXX202257Removed.
300000000006427807MAR2012C-RXX-N-AE-XXXXXXXXXXXXX202257Removed.
300000000006427807MAR2012C-VXX-D-CP-XXXXXXX-XXXX202257Worked11918651
300000000006427807MAR2012N-BXX-N-CP-XXXXXXXXXXXX202257Removed.
300000000006427807MAR2012N-BXX-N-CP-XXXXXXXXXXXXX202257Removed.
300000000006427807MAR2012C-BXX-D-CP-XXXXXXX202312Inbound.
300000000006427807MAR2012C-BXX-N-AE-XXXXXXXXXX202312Inbound.
300000000006427807MAR2012C-RXX-N-AE-XXXXXXXXXXXXX202312Inbound.
300000000006427807MAR2012C-VXX-D-CP-XXXXXXX-XXXX202312Inbound11918651
300000000006427807MAR2012N-BXX-N-CP-XXXXXXXXXXXXX202312Inbound.
300000000006427808MAR2012N-BXX-N-CP-XXXXXXXXXXXX233701Inbound.
Contributor
Posts: 54

Hash table to match transactions to outcomes

FriedEgg, I never thought to try it with Proc SQL. My actual data sets are huge, but if I can get it to work playing with what you started with here I'll take the performance loss over something that works any day. Thanks for making the effort! Hopefully someone will be able to get it.

Trusted Advisor
Posts: 1,301

Re: Hash table to match transactions to outcomes

What size is the data you are working with?

data queued;

input QueueName $24. acct_num tran_dt :mmddyy8. __tran_tm $;

tran_tm=input(prxchange('s/(\d{2})(\d{2})(\d{2})/\1:\2:\3/o',1,__tran_tm),time8.);

format tran_dt date9. acct_num 16. tran_tm time.;

drop __tran_tm;

cards;

C-BXX-N-AE-XXXXXXXXXX         1000000000801180  3/4/2012    170458

N-2XX-N-KE-XXXXXXX            1000000000801180  3/4/2012    170458

N-PXX-N-CP-XXXXXXXXXXXXX      2000000000885230  3/3/2012    213205

C-BXX-D-CP-XXXXXXX            3000000000064278  3/6/2012    202257

C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202257

C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257

C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202257

N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257

N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/7/2012    202257

C-BXX-D-CP-XXXXXXX            3000000000064278  3/7/2012    202312

C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202312

C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312

C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202312

N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312

N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/8/2012    233701

;

run;

data worked;

input _QueueName $24. _tran_dt :mmddyy8. __tran_tm $ acct_num workedkey _acct_num;

_tran_tm=input(prxchange('s/(\d{2})(\d{2})(\d{2})/\1:\2:\3/o',1,__tran_tm),time8.);

format _tran_dt mmddyy8. acct_num 16. _tran_tm time8.;

drop __tran_tm;

cards;

Inbound                       3/8/2012    172000      3000000000064278  11925138      3000000000064278

C-VXX-D-CP-XXXXXXX-XXXX       3/7/2012    202400      3000000000064278  11918651      3000000000064278

N-PXX-N-CP-XXXXXXXXXXXXX      3/3/2012    213400      2000000000885230  11886991      2000000000885230

N-2XX-N-KE-XXXXXXX            3/4/2012    170800      1000000000801180  11895835      1000000000801180

;

run;

proc sql;

select acct_num,

       tran_dt,

             QueueName,

             tran_tm,

             WorkedInd,

             case

              when _QueueName=WorkedInd then max_workedkey

                    else min_workedkey

             end

  from (

select a.acct_num,

       a.tran_dt,

       a.QueueName,

       a.tran_tm,

       case

                    when d.workedkey is null then ''

        when a.tran_dt=b.max_tran_dt and a.tran_tm=b.max_tran_tm and b.n>1 then 'Inbound'

        when a.acct_num=c.acct_num and a.QueueName=c._QueueName then 'Worked'

        when a.tran_dt<b.max_tran_dt or (a.tran_dt=b.max_tran_dt and a.tran_tm<b.max_tran_tm) then 'Removed'

       end as WorkedInd,

             min(d.workedkey) as min_workedkey,

             max(d.workedkey) as max_workedkey,

             d._QueueName

   from queued a

   left join ( select QueueName,

                      max(tran_dt) as max_tran_dt,

                      max(tran_tm) as max_tran_tm,

                      count(1) as n

                 from queued

                group by QueueName ) b on a.QueueName=b.QueueName

   left join worked c on a.acct_num=c.acct_num and a.QueueName=c._QueueName

   left join worked d on a.acct_num=d.acct_num and dhms(a.tran_dt,0,0,a.tran_tm)<dhms(d._tran_dt,0,0,d._tran_tm)

   group by a.acct_num, a.tran_dt, a.tran_tm, a.QueueName,calculated WorkedInd

   having (d._QueueName eq 'Inbound' and calculated WorkedInd eq 'Inbound') or (calculated WorkedInd ne 'Inbound' and d._QueueName ne 'Inbound')

)

order by a.acct_num, a.tran_dt, a.tran_tm, a.QueueName;

quit;

acct_numtran_dtQueueNametran_tmWorkedInd
100000000080118004MAR2012C-BXX-N-AE-XXXXXXXXXX170458Removed11895835
100000000080118004MAR2012N-2XX-N-KE-XXXXXXX170458Worked11895835
200000000088523003MAR2012N-PXX-N-CP-XXXXXXXXXXXXX213205Worked11886991
300000000006427806MAR2012C-BXX-D-CP-XXXXXXX202257Removed11918651
300000000006427807MAR2012C-BXX-N-AE-XXXXXXXXXX202257Removed11918651
300000000006427807MAR2012C-RXX-N-AE-XXXXXXXXXXXXX202257Removed11918651
300000000006427807MAR2012C-VXX-D-CP-XXXXXXX-XXXX202257Worked11918651
300000000006427807MAR2012N-BXX-N-CP-XXXXXXXXXXXX202257Removed11918651
300000000006427807MAR2012N-BXX-N-CP-XXXXXXXXXXXXX202257Removed11918651
300000000006427807MAR2012C-BXX-D-CP-XXXXXXX202312Inbound11925138
300000000006427807MAR2012C-BXX-N-AE-XXXXXXXXXX202312Inbound11925138
300000000006427807MAR2012C-RXX-N-AE-XXXXXXXXXXXXX202312Inbound11925138
300000000006427807MAR2012C-VXX-D-CP-XXXXXXX-XXXX202312Inbound11925138
300000000006427807MAR2012N-BXX-N-CP-XXXXXXXXXXXXX202312Inbound11925138
300000000006427808MAR2012N-BXX-N-CP-XXXXXXXXXXXX233701
.
Contributor
Posts: 54

Re: Hash table to match transactions to outcomes

The table is about 300,000 records by 20 deep for Queued and the Worked is about 35,000.

The last record is not marked as Inbound because it happens a few hours after the inbound call. Therefore the record couldn't have been marked inbound when the inbound call came in.

Super User
Posts: 10,041

Hash table to match transactions to outcomes

I have a problem.

ACCT_NUM

TRAN_DT

QueueName

TRAN_TM

WorkedInd

workedid

1000000000801180

3/4/2012

C-BXX-N-AE-XXXXXXXXXX

170458

Removed

11895835

1000000000801180

3/4/2012

N-2XX-N-KE-XXXXXXX

170458

Worked

11895835

2000000000885230

3/3/2012

N-PXX-N-CP-XXXXXXXXXXXXX

213205

Worked

11886991

3000000000064278

3/6/2012

C-BXX-D-CP-XXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-BXX-N-AE-XXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-RXX-N-AE-XXXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-VXX-D-CP-XXXXXXX-XXXX

202257

Worked

11918651

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-BXX-D-CP-XXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-BXX-N-AE-XXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-RXX-N-AE-XXXXXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-VXX-D-CP-XXXXXXX-XXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/8/2012

N-BXX-N-CP-XXXXXXXXXXXX

233701

There two C-VXX-D-CP-XXXXXXX-XXXX and they are all before than worked datetime C-VXX-D-CP-XXXXXXX-XXXX       3/7/2012    202400 . But  you flag the second as Inbound. Is it right?

Ksharp


ACCT_NUM

TRAN_DT

QueueName

TRAN_TM

WorkedInd

workedid

1000000000801180

3/4/2012

C-BXX-N-AE-XXXXXXXXXX

170458

Removed

11895835

1000000000801180

3/4/2012

N-2XX-N-KE-XXXXXXX

170458

Worked

11895835

2000000000885230

3/3/2012

N-PXX-N-CP-XXXXXXXXXXXXX

213205

Worked

11886991

3000000000064278

3/6/2012

C-BXX-D-CP-XXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-BXX-N-AE-XXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-RXX-N-AE-XXXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-VXX-D-CP-XXXXXXX-XXXX

202257

Worked

11918651

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-BXX-D-CP-XXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-BXX-N-AE-XXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-RXX-N-AE-XXXXXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-VXX-D-CP-XXXXXXX-XXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/8/2012

N-BXX-N-CP-XXXXXXXXXXXX

233701

Super User
Posts: 10,041

Hash table to match transactions to outcomes

Data Queued;
  informat TRAN_DT mmddyy8.;
  format TRAN_DT mmddyy8.
             ACCT_NUM 16.;
  input QueueName $24.  ACCT_NUM TRAN_DT TRAN_TM;
  datetime=dhms(tran_dt,0,0,tran_tm);
  format datetime datetime12.;
  cards;
C-BXX-N-AE-XXXXXXXXXX         1000000000801180  3/4/2012    170458
N-2XX-N-KE-XXXXXXX                  1000000000801180  3/4/2012    170458
N-PXX-N-CP-XXXXXXXXXXXXX      2000000000885230  3/3/2012    213205
C-BXX-D-CP-XXXXXXX                  3000000000064278  3/6/2012    202257
C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202257
C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257
C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202257
N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257
N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/7/2012    202257
C-BXX-D-CP-XXXXXXX                  3000000000064278  3/7/2012    202312
C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202312
C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312
C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202312
N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312
N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/8/2012    233701
;
run;

Data Worked;
  informat _TRAN_DT mmddyy8.;
  format _TRAN_DT mmddyy8.
             ACCT_NUM 16.
             _ACCT_NUM 16.;
  input _QueueName $24. _tran_dt    _tran_tm    acct_num    workedkey _acct_num;
  _datetime=dhms(_tran_dt,0,0,_tran_tm);
  format _datetime datetime12.;
  cards;
Inbound                                   3/8/2012    172000      3000000000064278  11925138      3000000000064278
C-VXX-D-CP-XXXXXXX-XXXX       3/7/2012    202400      3000000000064278  11918651      3000000000064278
N-PXX-N-CP-XXXXXXXXXXXXX      3/3/2012    213400      2000000000885230  11886991      2000000000885230
N-2XX-N-KE-XXXXXXX                  3/4/2012    170800      1000000000801180  11895835      1000000000801180
;
run;

data Queued(drop=_: rc);
 if _n_ eq 1 then do;
  if 0 then set worked;
  declare hash ha(hashexp:16,dataset:'worked');
   ha.definekey('_acct_num','_queuename');
   ha.definedone();

  declare hash _ha(hashexp:16,dataset:'worked');
   _ha.definekey('_acct_num');
   _ha.definedata('workedkey');
   _ha.definedone();
end;
set Queued;
call missing(workedkey);
rc=_ha.find(key:acct_num);
if ha.check(key:acct_num,key:queuename)=0 then flag=0;
 else flag=1;
run;

proc sort data = queued;
by acct_num  descending datetime flag;
run;


data Output(drop=_: found flag);*Mark all Worked and Removed;
if _n_ eq 1 then do;
  if 0 then set worked;
  declare hash ha(hashexp:16,dataset:'worked');
   ha.definekey('_acct_num','_queuename');
   ha.definedata('_datetime','_acct_num','_queuename');
   ha.definedone();
end;
set Queued;
length WorkedInd $ 10;
retain found;
if acct_num ne lag(acct_num) then found=0;
if found then WorkedInd='Removed';
 else if ha.find(key:acct_num,key:queuename)=0 and _datetime gt datetime then do;WorkedInd='Worked'; found=1;end;
run;

data output(drop=_:) ;
 if _n_ eq 1 then do;
  if 0 then set worked;
  declare hash _ha(hashexp:16,dataset:'worked(where=(_QueueName="Inbound"))');
   _ha.definekey('_acct_num');
   _ha.definedata('_datetime');
   _ha.definedone();
end;
set output;
if _ha.find(key:acct_num)=0 and _datetime gt datetime and missing(workedind) then workedind='Inbound';
run;


Ksharp

Contributor
Posts: 54

Hash table to match transactions to outcomes

KSharp. Thanks!

Your assumption above is correct. Even though the queuenames are the same, one transaction happened after the worked transaction. The one that happened after the worked transaction happened before the Inbound action so it should be marked as Inbound.

The problem I am having now is that for data Queued, the tran_tm is not a SAS time.

So datetime=dhms(tran_dt,0,0,tran_tm); does not work.

I've tried:

  datetime=dhms(tran_dt,

            substr(left(put(tran_tm,best12.)),1,2),

            substr(left(put(tran_tm,best12.)),3,2),

            substr(left(put(tran_tm,best12.)),5,2));

But because the tran_tm could be 1, for example, if the tranasction occured on the first second of the day, I'm not sure how to handle it.

Do you know how to get a number in this format converted to SAS time?

Contributor
Posts: 54

Hash table to match transactions to outcomes

Ignore that last question. I got it using the format to add leading zeros. SAS can do everything!

  datetime=dhms(tran_dt,

                  substr(left(put(put(tran_tm,z6.),$6.)),1,2),

                  substr(left(put(put(tran_tm,z6.),$6.)),3,2),

                  substr(left(put(put(tran_tm,z6.),$6.)),5,2)

            );

  format datetime datetime23.;

Contributor
Posts: 54

Hash table to match transactions to outcomes

The last data step throws an error. If I remove the (where=(_QueueName="Inbound")) it runs. Is it a syntax error?

177
178  data output(drop=_Smiley Happy ;
179   if _n_ eq 1 then do;
180    if 0 then set worked;
181    declare hash _ha(hashexp:16,dataset:'worked(where=(_QueueName="Inbound"))');
182     _ha.definekey('_acct_num');
183     _ha.definedata('_datetime');
184     _ha.definedone();
185  end;
186  set output;
187  if _ha.find(key:acct_num)=0 and _datetime gt datetime and missing(workedind) then
187! workedind='Inbound';
188  run;

ERROR: Invalid data set name at line 184 column 4.
ERROR: Hash data set load failed at line 184 column 4.
ERROR: DATA STEP Component Object failure.  Aborted during the EXECUTION phase.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.OUTPUT may be incomplete.  When this step was stopped there were 0
         observations and 7 variables.

Respected Advisor
Posts: 3,156

Re: Hash table to match transactions to outcomes

I have to confess, this query is one of the most complicated queries I have ever encountered. I have try to apply all of the rules (complicated by themself already) addressed by OP's first post. Here is what I can come up with:

data queued;

input QueueName $24. acct_num tran_dt :mmddyy8. tran_tm : $8. ;

format tran_dt mmddyy10. acct_num 16.;

tran_tm1=input(tran_tm,    b8601tm.);

tran_dtm=dhms(tran_dt,0,0, tran_tm1);

format tran_dtm datetime20.;

cards;

C-BXX-N-AE-XXXXXXXXXX         1000000000801180  3/4/2012    170458

N-2XX-N-KE-XXXXXXX            1000000000801180  3/4/2012    170458

N-PXX-N-CP-XXXXXXXXXXXXX      2000000000885230  3/3/2012    213205

C-BXX-D-CP-XXXXXXX            3000000000064278  3/6/2012    202257

C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202257

C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257

C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202257

N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202257

N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/7/2012    202257

C-BXX-D-CP-XXXXXXX            3000000000064278  3/7/2012    202312

C-BXX-N-AE-XXXXXXXXXX         3000000000064278  3/7/2012    202312

C-RXX-N-AE-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312

C-VXX-D-CP-XXXXXXX-XXXX       3000000000064278  3/7/2012    202312

N-BXX-N-CP-XXXXXXXXXXXXX      3000000000064278  3/7/2012    202312

N-BXX-N-CP-XXXXXXXXXXXX       3000000000064278  3/8/2012    233701

;

run;

data worked;

input _QueueName $24. _tran_dt :mmddyy8. _tran_tm : $8.  acct_num workedkey _acct_num;

format _tran_dt mmddyy8. acct_num 16.;

_tran_tm1=input(_tran_tm,  b8601tm.);

_tran_dtm=dhms(_tran_dt,0,0, _tran_tm1);

QueueName=_QueueName;

format _tran_dtm datetime20.;

cards;

Inbound                       3/8/2012    172000      3000000000064278  11925138      3000000000064278

C-VXX-D-CP-XXXXXXX-XXXX       3/7/2012    202400      3000000000064278  11918651      3000000000064278

N-PXX-N-CP-XXXXXXXXXXXXX      3/3/2012    213400      2000000000885230  11886991      2000000000885230

N-2XX-N-KE-XXXXXXX            3/4/2012    170800      1000000000801180  11895835      1000000000801180

;

run;

proc sort data=queued;

by acct_num tran_dtm;

run;

proc sort data=worked (rename=workedkey=workedid);

by acct_num descending _tran_dtm;

run;

data _null_ ;

  if _n_=1 then do;

     set queued (obs=1);

      dcl hash q1(dataset:'queued', ordered: 'a');

          q1.definekey('acct_num','QueueName');

          q1.definedata('acct_num','QueueName','tran_dt','tran_tm','tran_dtm');

          q1.definedone();

      dcl hash q2(dataset:'queued', multidata:'yes');

          q2.definekey('acct_num');

          q2.definedata('acct_num','QueueName','tran_dt','tran_tm','tran_dtm');

          q2.definedone();

      dcl hash q3(multidata: 'yes', ordered: 'a');

          q3.definekey('acct_num','tran_dtm','QueueName');

          q3.definedata('acct_num','tran_dt','QueueName','tran_tm','WorkedInd', 'WorkedId');

          q3.definedone();

  end;

  do until (last.acct_num);

        set worked end=done;

        length workedind $7.;

        by     acct_num;

        retain Inb_dtm inb_workedid pre_workedid;

        if _QueueName='Inbound' then

            do;

               Inb_dtm=_tran_dtm;

               Inb_workedid=Workedid;

            end;

          _rc=q1.find();

        if _rc ne 0 then return;

        else 

           do;

                 _tran_dt=tran_dt; 

                 _tran_tm=tran_tm;

                 _tran_dtm=tran_dtm; 

                 pre_workedid=workedid;

           end;

        do _rc=q2.find() by 0 while (_rc=0);

              if  QueueName eq _QueueName and tran_dtm <=_tran_dtm then

                  do;

                     WorkedInd='Worked';

                     workedid=pre_workedid;

                     q3.add();

                  end;      

              else if  QueueName ne _QueueName and tran_dtm <=_tran_dtm then

                  do;

                     WorkedInd='Removed';

                     workedid=pre_workedid;

                     q3.add();

                  end; 

              else if Inb_dtm >= tran_dtm > _tran_dtm then

                  do;

                     WorkedInd='Inbound';

                     WorkedId= inb_workedid;

                     q3.add();

                  end;

              else if tran_dtm > inb_dtm then

                  do;

                     call missing(WorkedInd, workedid);

                     q3.add();

                  end;

              _rc=q2.find_next();

        end;

  end;

  call missing (Inb_dtm, inb_workedid, pre_workedid);

  if (done) then q3.output(dataset: 'want');

run;

proc print;run;

Kindly Regards,

Haikuo

Update: Oops, has to be SAS 9.2

Contributor
Posts: 54

Re: Hash table to match transactions to outcomes

@Haikuo, Thanks for the reply. When I move to 9.2 in the next few months, I'll be sure to give this a try.

Contributor
Posts: 54

Re: Hash table to match transactions to outcomes

@KSharp. So it's not a syntax error. Not sure what was wrong, so I just threw a data step before this one to create it with just the Inbound records and removed the Where option and it worked.

Super User
Posts: 10,041

Re: Hash table to match transactions to outcomes

Mishka1,

Sorry. It is not your fault . It is my fault. This type statement only can be used at SAS9.2.

What you are doing is right(i.e. create a new table before it).

What I confused is your data , when I change

C-VXX-D-CP-XXXXXXX-XXXX       3/7/2012    202400

into

C-VXX-D-CP-XXXXXXX-XXXX       3/7/2012    202300

then, I get the result you posted.  202257 < 202300<202312

C-VXX-D-CP-XXXXXXX-XXXX

202257

Worked

11918651


3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

N-BXX-N-CP-XXXXXXXXXXXXX

202257

Removed

11918651

3000000000064278

3/7/2012

C-BXX-D-CP-XXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-BXX-N-AE-XXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-RXX-N-AE-XXXXXXXXXXXXX

202312

Inbound

11925138

3000000000064278

3/7/2012

C-VXX-D-CP-XXXXXXX-XXXX

202312


Ksharp


Ask a Question
Discussion stats
  • 13 replies
  • 213 views
  • 0 likes
  • 4 in conversation