BookmarkSubscribeRSS Feed
somu19tec
Calcite | Level 5

Hello experts,

i am doing sas aml data step scenario code in Visual investigator in Viya 3.5.

below is the sample data i am using to create the code.

data input_dataset_filtered_fvel;
input Party_number $ transaction_cdi_code $ transaction_dttm datetime26. currency_amount 7.2 transaction_reference_number $30. transaction_key 8.;
format transaction_dttm datetime26.;
datalines;
A D 20MAR2020:00:00:00.000000 100.97 201-1914974-148_15022021CEUR 1234
A C 19MAR2020:00:00:00.000000 101.97 201-1914974-77_15022021DEUR 1235

A D 13MAR2020:00:00:00.000000 100.97 201-1914974-140_15022021CEUR 1212
A C 12MAR2020:00:00:00.000000 101.97 201-1914974-70_15022021DEUR 1213
B D 20MAR2020:00:00:00.000000 100.97 201-1914974-35_15022021DEUR 1236
B C 19MAR2020:00:00:00.000000 55.97 201-1914974-29_15022021CEUR 1226
B C 18MAR2020:00:00:00.000000 50.97 201-1914974-26_15022021DEUR 1237
E D 20MAR2020:00:00:00.000000 100.97 201-1914974-209_15022021CEUR 1238
E C 19MAR2020:00:00:00.000000 100.97 201-1914974-208_15022021CEUR 1239
E C 18MAR2020:00:00:00.000000 101.97 201-1914974-206_15022021CEUR 1231
F C 20MAR2020:00:00:00.000000 1000.97 201-1914974-205_15022021CEUR 1232
F D 20MAR2020:00:00:00.000000 50.97 201-1914751-81_12022021DEUR 1233
F D 19MAR2020:00:00:00.000000 1005.97 201-1914751-82_12022021DEUR 1244
F C 13MAR2020:00:00:00.000000 50.97 201-1914751-83_12022021DEUR 1245
G C 20MAR2020:00:00:00.000000 1000.97 201-1914751-84_12022021DEUR 1246
G D 19MAR2020:00:00:00.000000 1000.97 201-1914974-147_15022021CEUR 1247
G C 18MAR2020:00:00:00.000000 1005.97 201-1914974-79_15022021DEUR 1248
I D 19MAR2020:00:00:00.000000 100.97 201-1914974-141_15022021CEUR 1221
I C 19MAR2020:00:00:00.000000 101.97 201-1914974-71_15022021DEUR 1222
J C 13MAR2020:00:00:00.000000 100.97 201-1914974-143_15022021CEUR 1224
J D 13MAR2020:00:00:00.000000 100.97 201-1914974-72_15022021DEUR 1225
;
run;

 

for each credit transaction, i need to look for a debit transaction in the previous 3 days(it could be 2days or 5 days, as this is a parameter value) and for each debit transaction i need to look for a credit transaction in the previous 3 days. I need to pair the transactions which match for a customer with onehit for each pair. if there is more than one transaction for the same debit/credit transaction then also the hit should be one but both transactions should be outputted. see example party number E.

 

this needs to be done in data step code. no procedures can be used.

 

Any help is appreciated.

2 REPLIES 2
ballardw
Super User

A question about dates before attempting this. Does every single datetime value have the time component as :00:00:00.000000?

I realize there are a number of data sources that default to such when there actually isn't any time component involved. If indeed that is the case I would be strongly tempted to make the transaction value a date with a date format instead of leaving it as datetime.

 

Some other details: How do we identify a "debit" or "credit". You really don't want us guessing a critical information. And should this be within some identification group, possibly within Party_number? Is that a "customer"?

 

I do not understand at all what this means:"I need to pair the transactions which match for a customer with onehit for each pair." You need to walk us through an example and then show what the expected output looks like.

Similar with " if there is more than one transaction for the same debit/credit transaction then also the hit should be one but both transactions should be outputted" as well as a clear definition of "same debit/credit transaction". A very clear definition is needed because of the possibility of multiples per "customer" and how they need to be shown in the output.

 

 

 

Your informat with currency_amount creates invalid data errors for some records. The short values, when copy and pasting from your posted code, allow the start of the next value to be read as part of currency. Use of the colon modifier helps prevent that. This could be a side effect of columns moving when pasting code into the main message window. The software will reformat text pasted into the main window. Strongly recommend pasting code and log entries into a text box opened using the </> icon above the main window.

Consider this as one way to use the DATEPART.

data input_dataset_filtered_fvel;
   input Party_number $ transaction_cdi_code $ transaction_dttm datetime26. currency_amount :8.2   transaction_reference_number $30. transaction_key 8.;
   transaction_dttm= datepart(transaction_dttm);
   format transaction_dttm date9.;
datalines;
A D 20MAR2020:00:00:00.000000 100.97 201-1914974-148_15022021CEUR 1234
A C 19MAR2020:00:00:00.000000 101.97 201-1914974-77_15022021DEUR 1235
A D 13MAR2020:00:00:00.000000 100.97 201-1914974-140_15022021CEUR 1212
A C 12MAR2020:00:00:00.000000 101.97 201-1914974-70_15022021DEUR 1213
B D 20MAR2020:00:00:00.000000 100.97 201-1914974-35_15022021DEUR 1236
B C 19MAR2020:00:00:00.000000 55.97 201-1914974-29_15022021CEUR 1226
B C 18MAR2020:00:00:00.000000 50.97 201-1914974-26_15022021DEUR 1237
E D 20MAR2020:00:00:00.000000 100.97 201-1914974-209_15022021CEUR 1238
E C 19MAR2020:00:00:00.000000 100.97 201-1914974-208_15022021CEUR 1239
E C 18MAR2020:00:00:00.000000 101.97 201-1914974-206_15022021CEUR 1231
F C 20MAR2020:00:00:00.000000 1000.97 201-1914974-205_15022021CEUR 1232
F D 20MAR2020:00:00:00.000000 50.97 201-1914751-81_12022021DEUR 1233
F D 19MAR2020:00:00:00.000000 1005.97 201-1914751-82_12022021DEUR 1244
F C 13MAR2020:00:00:00.000000 50.97 201-1914751-83_12022021DEUR 1245
G C 20MAR2020:00:00:00.000000 1000.97 201-1914751-84_12022021DEUR 1246
G D 19MAR2020:00:00:00.000000 1000.97 201-1914974-147_15022021CEUR 1247
G C 18MAR2020:00:00:00.000000 1005.97 201-1914974-79_15022021DEUR 1248
I D 19MAR2020:00:00:00.000000 100.97 201-1914974-141_15022021CEUR 1221
I C 19MAR2020:00:00:00.000000 101.97 201-1914974-71_15022021DEUR 1222
J C 13MAR2020:00:00:00.000000 100.97 201-1914974-143_15022021CEUR 1224
J D 13MAR2020:00:00:00.000000 100.97 201-1914974-72_15022021DEUR 1225
;

You do want to be careful with use of an informat like 8.2 as it may do things to integer values you don't expect. Run this and see:

data example;
  input x 8.2;
datalines;
1
12
123
1234
123.4
123.45
;

From ancient days use of a decimal point on an informat to read data means "implied decimal" and SAS will put one in the value if there wasn't one read.

somu19tec
Calcite | Level 5
Hi Ballard,
Thanks for your response.
please find my below answers:
1) transaction_cdi_code variable is used to identify the credit transactions as C and debit transactions as D. customer is the party_number variable.
2) Let us consider the transactions of the Party_number A, it has 2 debit transactions(D) with the transaction_dttm of 20mar2020 and 13mar2020. it also has 2 credit transactions(C) with transaction_dttm of 19mar2020 and 12mar2020.
now the code needs to compare the 1st debit transaction(D) of party_number A of 20mar2020 with the 2 credit transactions (C) with transaction_dttm 19mar2020 and 12mar2020. the output should be the pair of debit transaction(D) of 20mar2020 with credit transaction(C) of 19mar2020 because the look back period is 3 days and 19mar2020 falls within the 3 days period where as 12mar2020 doesnot fall within the 3 days lookback period.
simillarly the 2nd debit transaction(D) of party number A of 13mar2020 has to be compared with the 2 credit transactions(C) with the transaction_dttm 19mar2020 and 12mar2020. the output should be the pair of debit transaction(D) of 13mar2020 with credit transaction(C) of 12mar2020 because the look back period is 3 days and 12mar2020 falls within the 3 days period where as 19mar2020 doesnot fall within the 3 days lookback period.
here the pairing can be a flag variable or a counter variable to denote the matching.
3) now for the multpiles, let us consider the example of party number E. the party_number has 1 Debit transaction(D) as of 20mar2020 and 2 credit transactions (C) as of 19mar2020 and 18mar2020.
here the debit transaction(D) of 20mar2020 has to be compared with credit transactions (C) of 19mar2020 and 18mar2020. now here both the credit transactions falls within the 3 days look back period. however we pair(flag) the first match of 19mar2020 and output the debit transaction(D) and credit transaction of 19mar2020 with a flag and also output the 18mar2020 without pairing or flag.

the sample output for the party_number A and E is as below

Party_number transaction_cdi_code transaction_dttm currency_amount transaction_key flag Matched_key
A D 20MAR2020:00:00:00.000000 100.97 1234 Y 1234,1235
A C 19MAR2020:00:00:00.000000 101.97 1235
A D 13MAR2020:00:00:00.000000 100.97 1212 Y 1212,1213
A C 12MAR2020:00:00:00.000000 101.97 1213
E D 20MAR2020:00:00:00.000000 100.97 1238 Y 1238,1239,1231
E C 19MAR2020:00:00:00.000000 100.97 1239
E C 18MAR2020:00:00:00.000000 101.97 1231