Hello experts,
i am doing sas aml data step scenario code in Visual investigator in Viya 3.5.
below is the sample data i am using to create the code.
data input_dataset_filtered_fvel;
input Party_number $ transaction_cdi_code $ transaction_dttm datetime26. currency_amount 7.2 transaction_reference_number $30. transaction_key 8.;
format transaction_dttm datetime26.;
datalines;
A D 20MAR2020:00:00:00.000000 100.97 201-1914974-148_15022021CEUR 1234
A C 19MAR2020:00:00:00.000000 101.97 201-1914974-77_15022021DEUR 1235
A D 13MAR2020:00:00:00.000000 100.97 201-1914974-140_15022021CEUR 1212
A C 12MAR2020:00:00:00.000000 101.97 201-1914974-70_15022021DEUR 1213
B D 20MAR2020:00:00:00.000000 100.97 201-1914974-35_15022021DEUR 1236
B C 19MAR2020:00:00:00.000000 55.97 201-1914974-29_15022021CEUR 1226
B C 18MAR2020:00:00:00.000000 50.97 201-1914974-26_15022021DEUR 1237
E D 20MAR2020:00:00:00.000000 100.97 201-1914974-209_15022021CEUR 1238
E C 19MAR2020:00:00:00.000000 100.97 201-1914974-208_15022021CEUR 1239
E C 18MAR2020:00:00:00.000000 101.97 201-1914974-206_15022021CEUR 1231
F C 20MAR2020:00:00:00.000000 1000.97 201-1914974-205_15022021CEUR 1232
F D 20MAR2020:00:00:00.000000 50.97 201-1914751-81_12022021DEUR 1233
F D 19MAR2020:00:00:00.000000 1005.97 201-1914751-82_12022021DEUR 1244
F C 13MAR2020:00:00:00.000000 50.97 201-1914751-83_12022021DEUR 1245
G C 20MAR2020:00:00:00.000000 1000.97 201-1914751-84_12022021DEUR 1246
G D 19MAR2020:00:00:00.000000 1000.97 201-1914974-147_15022021CEUR 1247
G C 18MAR2020:00:00:00.000000 1005.97 201-1914974-79_15022021DEUR 1248
I D 19MAR2020:00:00:00.000000 100.97 201-1914974-141_15022021CEUR 1221
I C 19MAR2020:00:00:00.000000 101.97 201-1914974-71_15022021DEUR 1222
J C 13MAR2020:00:00:00.000000 100.97 201-1914974-143_15022021CEUR 1224
J D 13MAR2020:00:00:00.000000 100.97 201-1914974-72_15022021DEUR 1225
;
run;
for each credit transaction, i need to look for a debit transaction in the previous 3 days(it could be 2days or 5 days, as this is a parameter value) and for each debit transaction i need to look for a credit transaction in the previous 3 days. I need to pair the transactions which match for a customer with onehit for each pair. if there is more than one transaction for the same debit/credit transaction then also the hit should be one but both transactions should be outputted. see example party number E.
this needs to be done in data step code. no procedures can be used.
Any help is appreciated.
A question about dates before attempting this. Does every single datetime value have the time component as :00:00:00.000000?
I realize there are a number of data sources that default to such when there actually isn't any time component involved. If indeed that is the case I would be strongly tempted to make the transaction value a date with a date format instead of leaving it as datetime.
Some other details: How do we identify a "debit" or "credit". You really don't want us guessing a critical information. And should this be within some identification group, possibly within Party_number? Is that a "customer"?
I do not understand at all what this means:"I need to pair the transactions which match for a customer with onehit for each pair." You need to walk us through an example and then show what the expected output looks like.
Similar with " if there is more than one transaction for the same debit/credit transaction then also the hit should be one but both transactions should be outputted" as well as a clear definition of "same debit/credit transaction". A very clear definition is needed because of the possibility of multiples per "customer" and how they need to be shown in the output.
Your informat with currency_amount creates invalid data errors for some records. The short values, when copy and pasting from your posted code, allow the start of the next value to be read as part of currency. Use of the colon modifier helps prevent that. This could be a side effect of columns moving when pasting code into the main message window. The software will reformat text pasted into the main window. Strongly recommend pasting code and log entries into a text box opened using the </> icon above the main window.
Consider this as one way to use the DATEPART.
data input_dataset_filtered_fvel; input Party_number $ transaction_cdi_code $ transaction_dttm datetime26. currency_amount :8.2 transaction_reference_number $30. transaction_key 8.; transaction_dttm= datepart(transaction_dttm); format transaction_dttm date9.; datalines; A D 20MAR2020:00:00:00.000000 100.97 201-1914974-148_15022021CEUR 1234 A C 19MAR2020:00:00:00.000000 101.97 201-1914974-77_15022021DEUR 1235 A D 13MAR2020:00:00:00.000000 100.97 201-1914974-140_15022021CEUR 1212 A C 12MAR2020:00:00:00.000000 101.97 201-1914974-70_15022021DEUR 1213 B D 20MAR2020:00:00:00.000000 100.97 201-1914974-35_15022021DEUR 1236 B C 19MAR2020:00:00:00.000000 55.97 201-1914974-29_15022021CEUR 1226 B C 18MAR2020:00:00:00.000000 50.97 201-1914974-26_15022021DEUR 1237 E D 20MAR2020:00:00:00.000000 100.97 201-1914974-209_15022021CEUR 1238 E C 19MAR2020:00:00:00.000000 100.97 201-1914974-208_15022021CEUR 1239 E C 18MAR2020:00:00:00.000000 101.97 201-1914974-206_15022021CEUR 1231 F C 20MAR2020:00:00:00.000000 1000.97 201-1914974-205_15022021CEUR 1232 F D 20MAR2020:00:00:00.000000 50.97 201-1914751-81_12022021DEUR 1233 F D 19MAR2020:00:00:00.000000 1005.97 201-1914751-82_12022021DEUR 1244 F C 13MAR2020:00:00:00.000000 50.97 201-1914751-83_12022021DEUR 1245 G C 20MAR2020:00:00:00.000000 1000.97 201-1914751-84_12022021DEUR 1246 G D 19MAR2020:00:00:00.000000 1000.97 201-1914974-147_15022021CEUR 1247 G C 18MAR2020:00:00:00.000000 1005.97 201-1914974-79_15022021DEUR 1248 I D 19MAR2020:00:00:00.000000 100.97 201-1914974-141_15022021CEUR 1221 I C 19MAR2020:00:00:00.000000 101.97 201-1914974-71_15022021DEUR 1222 J C 13MAR2020:00:00:00.000000 100.97 201-1914974-143_15022021CEUR 1224 J D 13MAR2020:00:00:00.000000 100.97 201-1914974-72_15022021DEUR 1225 ;
You do want to be careful with use of an informat like 8.2 as it may do things to integer values you don't expect. Run this and see:
data example; input x 8.2; datalines; 1 12 123 1234 123.4 123.45 ;
From ancient days use of a decimal point on an informat to read data means "implied decimal" and SAS will put one in the value if there wasn't one read.
Chartis names SAS a leader in both Model Risk Governance and Model Validation