Calculations and counters with time-series data in SAS EG

Accepted Solution Solved
Reply
Contributor
Posts: 26
Accepted Solution

Calculations and counters with time-series data in SAS EG

I'm trying to calculate a value, let's call it lambda, that is an average of the inter-arrival times of sales within each customer in each region in my time-series dataset.  I am using SAS Enterprise Guide.  Here is an example of how I'm trying to calculate this value:

photo.JPG

Here is a sample of a smaller dataset that is similar to my larger one that I've been working on:

data sample;

  input customer region qtr year sales;

  informat qtr yyq6.;

  format qtr yyq6.;

  datalines;

  1 2 2003Q1 2003 0

  1 2 2003Q2 2003 0

  1 2 2003Q3 2003 3

  1 2 2003Q4 2003 5

  1 2 2004Q1 2004 0

  1 2 2004Q2 2004 0

  1 2 2004Q3 2004 6

  1 2 2004Q4 2004 0

  1 2 2005Q1 2005 2

  1 2 2005Q2 2005 0

  1 2 2005Q3 2005 3

  1 2 2005Q4 2005 0

  2 6 2003Q1 2003 0

  2 6 2003Q2 2003 0

  2 6 2003Q3 2003 3

  2 6 2003Q4 2003 5

  2 6 2004Q1 2004 0

  2 6 2004Q2 2004 0

  2 6 2004Q3 2004 0

  2 6 2004Q4 2004 2

  2 6 2005Q1 2005 0

  2 6 2005Q2 2005 4

  2 6 2005Q3 2005 0

  2 6 2005Q4 2005 0

  ;

run;

proc sort data=sample;

  by customer region;

run;

I've attached an image below that shows the method I've been attempting and what the output would look like if I could get the correct results, but I'm not sure how to get this done in SAS.  I am open to any suggestions if there is an easier or more effective method.

Anyway, I've been trying to create a variable count_sales that counts the number of sales within each customer and region. Then I was trying to calculate a variable count_gap that would count the number of quarters between sales (starting at the first sale). I was then going to have a variable sum_gaps that would keep track of the sum of these inter-arrival times (gaps between sales), so it would basically add the value of count_gap at each sale.  I then thought I could calculate lambda by taking sum_gaps and dividing that by (count_sales - 1).  I'm not sure if this is the most effective method, but it's all I could come up with at the time, and I'm still not sure how to do the calculations in SAS. Any help is greatly appreciated.

Capture.PNG


Accepted Solutions
Solution
‎06-03-2014 11:38 AM
Super User
Posts: 10,041

Re: Calculations and counters with time-series data in SAS EG

Posted in reply to klappy711

If I understand what you mean.


data sample;
  input customer region qtr year sales;
  informat qtr yyq6.;
  format qtr yyq6.;
  datalines;
  1 2 2003Q1 2003 0
  1 2 2003Q2 2003 0
  1 2 2003Q3 2003 3
  1 2 2003Q4 2003 5
  1 2 2004Q1 2004 0
  1 2 2004Q2 2004 0
  1 2 2004Q3 2004 6
  1 2 2004Q4 2004 0
  1 2 2005Q1 2005 2
  1 2 2005Q2 2005 0
  1 2 2005Q3 2005 3
  1 2 2005Q4 2005 0
  2 6 2003Q1 2003 0
  2 6 2003Q2 2003 0
  2 6 2003Q3 2003 3
  2 6 2003Q4 2003 5
  2 6 2004Q1 2004 0
  2 6 2004Q2 2004 0
  2 6 2004Q3 2004 0
  2 6 2004Q4 2004 2
  2 6 2005Q1 2005 0
  2 6 2005Q2 2005 4
  2 6 2005Q3 2005 0
  2 6 2005Q4 2005 0
  ;
run;
 
proc sort data=sample;
  by customer region;
run;
data sample;
 set sample;
 by customer region;
 if first.region then do; count_sales=0; n=0;end;
 if sales ne 0 then count_sales+1; 
 if (lag(sales) ne 0 and sales eq 0) or (lag(customer) ne customer) or (lag(region) ne region)  then n+1;
run;
data sample;
 set sample;
 by customer region n;
 retain start 0;
 if first.n then do;count_gap=0; start=0;end;
 if n=1 then do;
               if  start and sales ne 0 then count_gap+1;
                  if  sales ne 0 then start=1;
                 end;
  else count_gap+1;
  drop start;
run;
data sample;
 set sample;
 by customer region n;
 retain temp sum_gaps;
 if first.region then temp=0;
 if last.n and sales ne 0 then temp+count_gap;
 sum_gaps=temp;
  drop n temp;
run;
data sample;
 set sample;
 by customer region ;
 if last.region then lambda=sum_gaps/(count_sales-1);
run;




Xia Keshan

Message was edited by: xia keshan

View solution in original post


All Replies
Super User
Super User
Posts: 7,977

Re: Calculations and counters with time-series data in SAS EG

Posted in reply to klappy711

Count sales seems pretty straightforward:

proc sql;

     select count(*) as count_sales from have group by customer,region having sales >0;

quit;

Not sure how you derive gaps though, what is this gap?

Contributor
Posts: 26

Re: Calculations and counters with time-series data in SAS EG

The gaps are the number of quarters between each sale. So in my example above, the count_gaps variable would be zero until the first sale of each customer, then it would start counting the number of quarters between each sale.

Super User
Posts: 7,832

Re: Calculations and counters with time-series data in SAS EG

Posted in reply to klappy711

data want1;

set have;

by cust;

retain counter;

if first.cust then counter = -1;

if sales ne 0

then do;

  if counter > 0 then output;

  counter = 1;

end;

else if counter > 0 then counter + 1;

run;

proc summary data=want1 noprint;

by cust;

var counter;

output

  out=want (drop=_type_ _freq_)

  mean=

;

run;

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 26

Re: Calculations and counters with time-series data in SAS EG

Posted in reply to KurtBremser

KurtBremser,

This appears to work on my smaller sample dataset, but for some reason is not working on my larger one.  It seems that on my larger dataset, the counter is not starting over at the beginning of each customer, and I'm not sure why.

Super User
Posts: 7,832

Re: Calculations and counters with time-series data in SAS EG

Posted in reply to klappy711

I see that you are sorting by region. If that is an additional criterion, then you should sort by customer, region and quarter, and use first.region instead of first.cust in my data step.

If you add additional output statements in the data step so that all records are written to want1, you can follow the development of the counter and see where the logic goes astray.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 26

Re: Calculations and counters with time-series data in SAS EG

Posted in reply to klappy711

Are there any other approaches that may be more efficient than mine?

Solution
‎06-03-2014 11:38 AM
Super User
Posts: 10,041

Re: Calculations and counters with time-series data in SAS EG

Posted in reply to klappy711

If I understand what you mean.


data sample;
  input customer region qtr year sales;
  informat qtr yyq6.;
  format qtr yyq6.;
  datalines;
  1 2 2003Q1 2003 0
  1 2 2003Q2 2003 0
  1 2 2003Q3 2003 3
  1 2 2003Q4 2003 5
  1 2 2004Q1 2004 0
  1 2 2004Q2 2004 0
  1 2 2004Q3 2004 6
  1 2 2004Q4 2004 0
  1 2 2005Q1 2005 2
  1 2 2005Q2 2005 0
  1 2 2005Q3 2005 3
  1 2 2005Q4 2005 0
  2 6 2003Q1 2003 0
  2 6 2003Q2 2003 0
  2 6 2003Q3 2003 3
  2 6 2003Q4 2003 5
  2 6 2004Q1 2004 0
  2 6 2004Q2 2004 0
  2 6 2004Q3 2004 0
  2 6 2004Q4 2004 2
  2 6 2005Q1 2005 0
  2 6 2005Q2 2005 4
  2 6 2005Q3 2005 0
  2 6 2005Q4 2005 0
  ;
run;
 
proc sort data=sample;
  by customer region;
run;
data sample;
 set sample;
 by customer region;
 if first.region then do; count_sales=0; n=0;end;
 if sales ne 0 then count_sales+1; 
 if (lag(sales) ne 0 and sales eq 0) or (lag(customer) ne customer) or (lag(region) ne region)  then n+1;
run;
data sample;
 set sample;
 by customer region n;
 retain start 0;
 if first.n then do;count_gap=0; start=0;end;
 if n=1 then do;
               if  start and sales ne 0 then count_gap+1;
                  if  sales ne 0 then start=1;
                 end;
  else count_gap+1;
  drop start;
run;
data sample;
 set sample;
 by customer region n;
 retain temp sum_gaps;
 if first.region then temp=0;
 if last.n and sales ne 0 then temp+count_gap;
 sum_gaps=temp;
  drop n temp;
run;
data sample;
 set sample;
 by customer region ;
 if last.region then lambda=sum_gaps/(count_sales-1);
run;




Xia Keshan

Message was edited by: xia keshan

Contributor
Posts: 26

Re: Calculations and counters with time-series data in SAS EG

Ksharp,

Exactly! Thanks so much.

Respected Advisor
Posts: 3,156

Re: Calculations and counters with time-series data in SAS EG

Posted in reply to klappy711

I must have missed something, do we have to be so complicated? Proc SQL seems to be much more straightforward?

data sample;

input customer region qtr year sales;

informat qtr yyq6.;

format qtr yyq6.;

datalines;

1 2 2003Q1 2003 0

1 2 2003Q2 2003 0

1 2 2003Q3 2003 3

1 2 2003Q4 2003 5

1 2 2004Q1 2004 0

1 2 2004Q2 2004 0

1 2 2004Q3 2004 6

1 2 2004Q4 2004 0

1 2 2005Q1 2005 2

1 2 2005Q2 2005 0

1 2 2005Q3 2005 3

1 2 2005Q4 2005 0

2 6 2003Q1 2003 0

2 6 2003Q2 2003 0

2 6 2003Q3 2003 3

2 6 2003Q4 2003 5

2 6 2004Q1 2004 0

2 6 2004Q2 2004 0

2 6 2004Q3 2004 0

2 6 2004Q4 2004 2

2 6 2005Q1 2005 0

2 6 2005Q2 2005 4

2 6 2005Q3 2005 0

2 6 2005Q4 2005 0

;

run;

proc sql;

select customer,intck('qtr',min(qtr),max(qtr))/(count(*)-1) as lamda  from sample where sales >0 group by customer;

quit;

Now you have your lamda, merge it back if needed.

Regards,

Haikuo

Contributor
Posts: 26

Re: Calculations and counters with time-series data in SAS EG

Hai.kuo,

Yes this seems to be very effective! Thanks

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 410 views
  • 6 likes
  • 5 in conversation