BookmarkSubscribeRSS Feed
Ksharp
Super User

Take three months for example.

data have ;
input id spend  date :date9.  ;
format date date9.;
CARDS;
1     20      01JAN2010
1     10      01FEB2010
1      0      01Mar2010
1     10      01Apr2010
1     40      01May2010       
2     20      01Mar2009
2     10      01Apr2009   
2     10      01May2009
2     10      01jun2009
2      5      01jul2009
2     15      01Aug2009  
;;;;
run;
proc sql;
create table want as
 select *,case when (select sum(spend) from have where id=a.id and date between intnx('month',a.date,-2,'s') and a.date) eq 50 then 'Y' else 'N' end as flag
  from have as a;
quit;

Xia Keshan

gauravsood19
Calcite | Level 5

It does work, thanks for that but the self join is killing the server when i run it for 12 and 24 months on the complete table containing 700000 rows.

Do you think we can somehow use the Lag logic I was working with?

Thanks

Gaurav

gergely_batho
SAS Employee

Look at this thread:

https://communities.sas.com/thread/61456?start=0&tstart=0

There are data step based solutions in it.

gergely_batho
SAS Employee

Hi Gaurav,

I had some time to work on this problem. Still calculating the DPD_XX_730 variable is an excercise for you Smiley Happy (you will need to create one more view)

I hope AccountingPeriod is a SAS date variable. If not, you need to change the intnx() function.

proc sort data=have;

by customerid accountingperiod;

run;

data have_exp6 / view=have_exp6;

  set have;

  AccountingPeriod=intnx("month",AccountingPeriod,6,'s');

run;

data have_exp12 / view=have_exp12;

  set have;

  AccountingPeriod=intnx("month",AccountingPeriod,12,'s');

run;

data want;

  set have_exp6(in=in6) have_exp12(in=in12) have(in=inT);/*interleaving original and expiry datasets Gaurav: maybe here you need to adust the order of the datasets! Maybe have comes first. Experiment!*/

  by Firm CustomerID AccountingPeriod;

  if first.Firm then do;/*initializing counters and sums at beginning of a group*/

       DPD_30_180=0; DPD_60_180=0; DPD_90_180=0;

       DPD_30_360=0; DPD_60_360=0; DPD_90_360=0;

  end;

  if inT then do;/*if transaction, then first calculate, then output*/

       DPD_30_180+DPD30_flag; DPD_60_180+DPD60_flag; DPD_90_180+DPD90_flag;

       DPD_30_360+DPD30_flag; DPD_60_360+DPD60_flag; DPD_90_360+DPD90_flag;

       output;

  end;

  if in6 do;/*if 6 month expiry, then calculate*/

       DPD_30_180+(-DPD30_flag); DPD_60_180+(-DPD60_flag); DPD_90_180+(-DPD90_flag);

  end;

  if in12 do;/*if 12 month expiry, then calculate*/

       DPD_30_360+(-DPD30_flag); DPD_60_360+(-DPD60_flag); DPD_90_360+(-DPD90_flag);

  end;

run;

Message was edited by: Gergely Bathó

gauravsood19
Calcite | Level 5

Hey GB, Well In the input table I already have calculated 2 variables accountingperiod_6 (containing the month going back 6 months in time from current value of accountingperiod) and accountingperiod_12 (same logic but 12 months in time).

So I have everything in 1 input table now. I dont have to create seperate tables as you have shown.

I am working now with your code to see if it solves my problem.

Thanks

Gaurav

gergely_batho
SAS Employee

You are right! You don't need the additional views then.

Just change the set statement:

set have_exp6(in=in6 rename=(Accountingperiod_6=AccountingPeriod) drop=AccountingPeriod) have_exp12(rename=(Accountingperiod_12=AccountingPeriod) drop=AccountingPeriod in=in12) have(in=inT);

Good luck!

gauravsood19
Calcite | Level 5

Thanks GB I got your code to work to suit my requirements.

I had a question although, why are you using the following expression

DPD-30-180+(-DPD30_flag), why are you using the negative '-' sign?

Gaurav

gergely_batho
SAS Employee

DPD_30-180+(-DPD30_flag);   is equvivalent to

DPD_30_180=DPD_30_180-DPD30_flag;   ( plus:  DPD_30_180 becomes retained )

I just wanted to make that line very similar to the previous sum statment (and I also copy-pasted it from there):

DPD_30_180+180_DPD30_flag;

But this one would be syntactically wrong:

DPD_30-180-DPD30_flag;    - it is not a sum statement.

My code interleaves the original dataset and 2 shifted datasets.

When I read a record from the original dataset, I need to add to the cummulative variables.

When I read from the shifted dataset, I need to substract. It is because after 6 (or 12) months the effect of the DPD_XX variable "expires".

Ksharp
Super User

700000 rows is not a big deal .  I guess it will take you about six hours to get it . and faster version code ? I think you can use Array or Hash Table.




Xia Keshan

gauravsood19
Calcite | Level 5

Ksharp, unfortunately time is th critcal factor here since I have to create 6,12 and 24 month flags and that too for 3 different variables (which comes to 9 self joins) and its been running now for the last 4 hours and i dont know when it is going to end.

Gergely, the example you gave isnt accumulating the data for the last 6 or 12 months as i wanted. It is just copying the data from one variable into the sum variable.

I have been playing around with an array and have got it to work as well, but again something is missing since I am not able to reinitialise the sum value once the BY group value changes. Here is the code

%let period = 6;

Data want_6;

set have;

by customerid;

array summed[&period] _temporary;

If X = &period then X=1;

else X+1;

summed = DPD30_flag;

if accountingperiod > accountingperiod_6 then do;

     if _n_ >= &period then do;

          DPD30_flag_sum = sum(of summed

  • );
  •      end;

    end;

    run;

    Accountingperiod is the column holding the yyyymm value for the current observation and accountingperiod_6 is the varaible holding the yyyymm value for the 6th previous month.

    Can you guys see what i am missing?

    Gaurav


    Ksharp
    Super User

    There are a couple of questions, Is there a missing month like 01FEB2012  01JUN2012  , Is there some other day in the same month like 02FEB2012 04FEB2012 ?

    Still take three months for example :

    
    
    
     
    data have ;
    input id spend  date :date9.  ;
    format date date9.;
    CARDS;
    1     20      01JAN2010
    1     10      01FEB2010
    1      0      01Mar2010
    1     10      01Apr2010
    1     40      01May2010       
    2     20      01Mar2009
    2     10      01Apr2009   
    2     10      01May2009
    2     10      01jun2009
    2      5      01jul2009
    2     15      01Aug2009  
    ;;;;
    run;
    data want(drop=_date sum);
     if _n_ eq 1 then do;
      if 0 then set have(rename=(date=_date));
      declare hash ha(dataset:'have(rename=(date=_date))',hashexp:20);
       ha.definekey('id','_date');
       ha.definedata('spend');
       ha.definedone();
    end;
    set have;
    sum=0;
    do _date=intnx('month',date,-2,'s') to date;
     if ha.find()=0 then sum+spend;
    end;
    flag=ifc( sum eq 50 , 'Y' , 'N');
    run;
    

    Xia Keshan

    sas-innovate-2024.png

    Don't miss out on SAS Innovate - Register now for the FREE Livestream!

    Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

     

    Register now!

    How to Concatenate Values

    Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

    Find more tutorials on the SAS Users YouTube channel.

    Click image to register for webinarClick image to register for webinar

    Classroom Training Available!

    Select SAS Training centers are offering in-person courses. View upcoming courses for:

    View all other training opportunities.

    Discussion stats
    • 25 replies
    • 8436 views
    • 1 like
    • 11 in conversation