Text mining and content categorization

How to generate Time Series Data, based on this variable's lag numbers

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

How to generate Time Series Data, based on this variable's lag numbers

Good afternoon,

I may have a problem to generate new data.   As we can see that after Jan-01-2014, Rev is 0. And I would like to fill Rev after Jan-01-2014 by calculating: Rev=lag2(Rev)*Exp(Num). However, there is no data for Rev after Jan-01-2014, so SAS cannot calculate Rev based on the lagged Rev. I tried Retain but it didn't workSmiley Sad

NumRevDate
345454Jan-01-2013
4354353Feb-01-2013
725425Mar-01-2013
3254235Jan-01-2014
8Feb-01-2014
9Mar-01-2014
12Jan-01-2015
45Feb-01-2015

Accepted Solutions
Solution
‎03-17-2014 09:09 AM
Respected Advisor
Posts: 4,755

Re: How to generate Time Series Data, based on this variable's lag numbers

An easy way:

data want;

   back2 = lag(rev);

   set have;

   if rev=. then rev=back2 * exp(num);

run;

By placing the LAG function before the SET statement, it retrieves 2 observations prior, and also reflects changes that were made by the data manipulation statements.

Good luck.

...

To go back 4 observations, the equivalent of lag4, use

back4 = lag3(rev);

View solution in original post


All Replies
Grand Advisor
Posts: 16,850

Re: How to generate Time Series Data, based on this variable's lag numbers

What does your data look like versus what you want?

Is the above data set what you want, or what you have?

Occasional Contributor
Posts: 6

Re: How to generate Time Series Data, based on this variable's lag numbers

thanks Reeza,

what I have is:

NumRevDate
345454Jan-01-2013
4354353Feb-01-2013
725425Mar-01-2013
3254235Jan-01-2014
8Feb-01-2014
9Mar-01-2014
12Jan-01-2015
45Feb-01-2015

And what I want is following, with conditions: 1) When Date<=Jan-01-2014, Rev_r=Rev; 2) When Date>Jan-01-2014, Rev_r=lag2(Rev_r) * exp(Num)

NumRevDateRev_r
345454Jan-01-201345454
4354353Feb-01-2013354353
725425Mar-01-201325425
3254235Jan-01-2014254235
8Feb-01-201475790856.82
9Mar-01-20142060087542
12Jan-01-20151.23353E+13
45Feb-01-20157.19677E+28
Respected Advisor
Posts: 4,606

Re: How to generate Time Series Data, based on this variable's lag numbers

The trick is to use the proper lag at the proper place in the datastep (i.e. after Rev gets its value):

data have;

input Num Rev Date :ANYDTDTE11.;

format date date9.;

datalines;

3 45454 Jan-01-2013

4 354353 Feb-01-2013

7 25425 Mar-01-2013

3 254235 Jan-01-2014

8 . Feb-01-2014

9 . Mar-01-2014

12 . Jan-01-2015

45 . Feb-01-2015

;

data want;

retain lag2Rev;

set have;

if missing(Rev) then Rev = lag2Rev * exp(Num);

lag2Rev = lag(Rev);

drop lag2Rev;

run;

proc print data=want noobs; run;

PG

PG
Respected Advisor
Posts: 3,124

Re: How to generate Time Series Data, based on this variable's lag numbers

Here is an idea, seems working for your present data:

data have;

input Num   Rev   Date :anydtdte20.;

format date date9.;

cards;

3     45454 Jan-01-2013

4     354353      Feb-01-2013

7     25425 Mar-01-2013

3     254235      Jan-01-2014

8     .     Feb-01-2014

9     .     Mar-01-2014

12    .     Jan-01-2015

45    .     Feb-01-2015

;

data want;

  set have;

   array rv(0:1) _temporary_;

   array rvr(0:1) _temporary_;

   if _n_ >= 3 then rvr(mod(_n_,2))=rv(mod(_n_,2)) * exp(num);

   rv(mod(_n_,2)) = coalesce(rev, rvr(mod(_n_,2)));

   if date < = '01jan2014'd then Rev_r = rv(mod(_n_,2));

   else Rev_r=rvr(mod(_n_,2));

run;


Haikuo

Occasional Contributor
Posts: 6

Re: How to generate Time Series Data, based on this variable's lag numbers

thank you Hai Kuo, but could you do me a favor to tell me what is condition statement: if _n_>3 here means? and what if Rev_r=lag4(Rev_r) * exp(Num) instead of lag2, how shall I change the code?

Respected Advisor
Posts: 3,124

Re: How to generate Time Series Data, based on this variable's lag numbers

An easy tweak will do:

data want;

  set have;

  array rv(0:3) _temporary_;

  array rvr(0:3) _temporary_;

  if _n_ >= 5 then rvr(mod(_n_,4))=rv(mod(_n_,4)) * exp(num);

  rv(mod(_n_,4)) = coalesce(rev, rvr(mod(_n_,4)));

  if date < = '01jan2014'd then Rev_r = rv(mod(_n_,4));

  else Rev_r=rvr(mod(_n_,4));

run;

However, I would recommend those solutions suggested by PG or Astounding, for the sake of easier coding.

Good Luck,

Haikuo

Solution
‎03-17-2014 09:09 AM
Respected Advisor
Posts: 4,755

Re: How to generate Time Series Data, based on this variable's lag numbers

An easy way:

data want;

   back2 = lag(rev);

   set have;

   if rev=. then rev=back2 * exp(num);

run;

By placing the LAG function before the SET statement, it retrieves 2 observations prior, and also reflects changes that were made by the data manipulation statements.

Good luck.

...

To go back 4 observations, the equivalent of lag4, use

back4 = lag3(rev);

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 867 views
  • 9 likes
  • 5 in conversation