I have a credit card transactiondataset with 50 million observations and
Below is the scenario for example.
account_number amount_due interst&charges category_code amount_paid info_date
12456326 2000 13% good 3000 30jul2008
12456327 3000 18% good 3000 31aug2009
12456323 4000 19% bad 1000 30apr2010
12456326 3000 13% bad 1000 30jul2009
12456327 3000 18% bad 1000 31jun2010
12456323 4100 19% good 4150 30jun2012
12456326 3200 13% good 3210 30may2010
12456327 4000 18% good 5000 31may2011
There are millions of observations with multiple account_id transaction and
info_date updated on last of every month.
Now what i want is mentioned below:
To flag those account id which are going from bad to good in terms of category_code in separate column and how much extra
amount they have paid during that month or period when they are going from bad to good.(for example acc_number "12456326" was
first good then bad and then good.so i want to extract information as when it
going from bad to good and how much amount was paid during that period
Calculate the sum of extra amount of that account_number which was paid during bad period.
I hope yo all understand well. Please tell me the flagging then i will take help from that note. .
This is not exactly what you are looking for but I think it will get you the desired solution:
data have;
infile cards;
informat date date9.;
format date date9.;
input account due charges code $ paid date;
cards;
12456326 2000 .13 good 3000 30jul2008
12456327 3000 .18 good 3000 31aug2009
12456323 4000 .19 bad 1000 30apr2010
12456326 3000 .13 bad 1000 30jul2009
12456327 3000 .18 bad 1000 30jun2010
12456323 4100 .19 good 4150 30jun2012
12456326 3200 .13 good 3210 30may2010
12456327 4000 .18 good 5000 31may2011
;
run;
proc sort data=have;by account date code;
data want;
set have;
by account;
lag_code = lag(code);
lag_paid = lag(paid);
if not first.account then do;
if lag_code = 'bad' and code = 'good' then do;
diff_paid=paid-lag_paid;
end;
end;
run;
You didn't post the output you want yet ?
Does it will appear bad->bad->bad->good ?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.