BookmarkSubscribeRSS Feed
pankak
Calcite | Level 5

I have  a credit card transactiondataset with 50 million observations and

Below is the scenario for example.

account_number amount_due interst&charges category_code  amount_paid  info_date 

12456326       2000         13%             good         3000          30jul2008

12456327       3000         18%             good         3000          31aug2009

12456323       4000         19%             bad          1000          30apr2010

12456326       3000         13%             bad          1000          30jul2009

12456327       3000         18%             bad          1000          31jun2010

12456323       4100         19%             good         4150          30jun2012

12456326       3200         13%             good         3210          30may2010

12456327       4000         18%             good         5000          31may2011

There are millions of observations with multiple account_id transaction and

info_date updated on last of every month.

Now what i want is mentioned below:

To flag those account id which are going from bad to good in terms of category_code in separate column and how much extra

amount they have paid during that month or period when they are going from bad to good.(for example acc_number "12456326" was

first good then bad and then good.so i want to extract information as when it

going from bad to good and how much amount was paid during that period

Calculate the sum of extra amount of that account_number which was paid during bad period.

I hope yo all understand well. Please tell me the flagging then i will take help from that note. .

2 REPLIES 2
Steelers_In_DC
Barite | Level 11

This is not exactly what you are looking for but I think it will get you the desired solution:

data have;

infile cards;

informat date date9.;

format date date9.;

input  account due charges code $ paid date;

cards;

12456326 2000 .13 good 3000 30jul2008

12456327 3000 .18 good 3000 31aug2009

12456323 4000 .19 bad  1000 30apr2010

12456326 3000 .13 bad  1000 30jul2009

12456327 3000 .18 bad  1000 30jun2010

12456323 4100 .19 good 4150 30jun2012

12456326 3200 .13 good 3210 30may2010

12456327 4000 .18 good 5000 31may2011

;

run;

proc sort data=have;by account date code;

data want;

set have;

by account;

lag_code = lag(code);

lag_paid = lag(paid);

if not first.account then do;

    if lag_code = 'bad' and code = 'good' then do;

        diff_paid=paid-lag_paid;

    end;

end;

run;

Ksharp
Super User

You didn't post the output you want yet ?

Does it will appear  bad->bad->bad->good ?

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 876 views
  • 0 likes
  • 3 in conversation