# How to simulate a recursive formula

05-16-2018 02:42 PM

Hi all,

I have a data table that I created using the following code:

```
data simulation;
   a1 = 0;
   x1 = 50;
   seed = -2000;
   do i = -50 to 100;
      a = rannor(seed);
      x = x1 + a - .6*x1;
      if i > 0 then output;
      x1 = x;
      a1 = a;
   end;
run;
quit;

proc arima data = simulation;
   identify var = x nlag = 20;
   estimate p = 1 noint printall;
   /* identify var = y crosscorr = x nlag = 20; */
   forecast lead = 5 out = boxj;
run;
quit;
```

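I want to simulate a new variable y that depends on its own past values and on x, as follows:

$$y_t = 0.95\,y_{t-1} - 0.225\,y_{t-2} + 1.0234\,x_{t-1} - 0.9\,x_{t-2} + a_t - 0.3\,a_{t-1}$$

Is there a way to simulate this variable in SAS? It is recursive and uses values of the already simulated variable x.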

Thank you

Accepted Solutions

Solution

05-19-2018 06:58 PM

Posted in reply to PGStats

05-17-2018 03:22 PM

Hi,

I think you will need to use the RETAIN statement and explicitly create your recursive lag variables for Y and A in order to simulate your data for Y. Please see if the following allows you to obtain your desired result:

```
data boxj(keep=x xl1 xl2);
   set boxj;                    /* output data set from PROC ARIMA step */
   if x=. then x=forecast;      /* fill in missing X values with forecast */
   xl1=lag1(x); xl2=lag2(x);    /* create lag1 and lag2 variables for X */
run;

/* simulate y(t) = .95*y(t-1) - .225*y(t-2) + 1.0234*x(t-1) - .9*x(t-2) + a(t) - .3*a(t-1) */
data simy;
   set boxj(firstobs=3);        /* omit first 2 obs with missing Lag X values */
   retain yl1 yl2 20 al1 0;     /* specify starting values for yl1, yl2, al1 */
   call streaminit(907356);     /* specify positive seed to reproduce results */
   a=rand('normal');
   y = .95*yl1 - .225*yl2 + 1.0234*xl1 - .9*xl2 - .3*al1 + a;
   output;
   yl2=yl1; yl1=y; al1=a;       /* generate recursive lag variables for y and a */
run;
```

I hope this helps!

DW

All Replies

Posted in reply to Dids

05-16-2018 05:23 PM

Use the LAG1() and LAG2() functions.

PG
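
For readers who, like the original poster, have not used the LAG functions before, here is a minimal sketch on a made-up series. The data set name `lagdemo` and the four values are hypothetical and not from this thread. `LAG1(x)` returns the value of `x` stored at the previous call and `LAG2(x)` the one stored two calls back, so the first one or two observations come out missing.

```sas
/* Hypothetical four-value series, for illustration only */
data lagdemo;
   input x @@;
   xl1 = lag1(x);   /* x from the previous observation; missing on obs 1 */
   xl2 = lag2(x);   /* x from two observations back; missing on obs 1-2 */
   datalines;
10 20 30 40
;
run;

proc print data=lagdemo;
run;
```

For this input, `xl1` is ., 10, 20, 30 and `xl2` is ., ., 10, 20. Two different series (say x and y) each get their own `LAG1`/`LAG2` calls and their own lag variables.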

Posted in reply to PGStats

05-16-2018 06:06 PM

Thank you for your reply,

I have never used the LAG functions. I am looking up examples and trying to fit them to my specific data.

Would this be along the lines of creating four new variables for $y_{t-1}$, $y_{t-2}$, $x_{t-1}$, and $x_{t-2}$, then plugging them into my $y_t$ formula?

Would you know of any good examples, especially since I have both $y_{t-k}$ and $x_{t-k}$ (two different variables)?

Thank you

Posted in reply to Dids

05-17-2018 12:09 AM

This *should* work:

```
data sim2;
retain y 0;
set boxj;
y = 123*lag1(y) - 976*lag2(y) + 0.879*lag1(x) - 7654*lag2(x) + 456*a - 234*lag1(a);
run;
```

(untested)

PG

Posted in reply to DW_SAS

05-18-2018 06:00 PM

Hi DW,

Thank you, this does seem to do the job. I am going to run it with different ARIMA models and see if I am getting close to my goal.

This is greatly appreciated.

Posted in reply to PGStats

05-17-2018 06:16 PM

Hi PG,

Thank you for the help,

Unfortunately, it gives null values for the created y variable. I am basically simulating a transfer function model (dynamic regression model), and I have the $y_t$ formula as expanded by Box-Jenkins. I wonder why it gives null values.

Again Thank you

Posted in reply to Dids

05-17-2018 07:12 PM

@Dids wrote:

> Hi PG,
>
> Thank you for the help,
>
> Unfortunately it gives null values for the created y variable. I am basically simulating a transfer function model (dynamic regressive model) and I have the Yt formula as it is expanded by BoxJenkins...I wonder why it gives null values..
>
> Again Thank you

Probably because you do not have any value of Y for the first record or two. If the formula is to look at two prior periods and they are missing then expect the result to be missing. "Recursion" has to start with something.

Example: the **Fibonacci sequence** is characterized by the fact that every number **after the first two** is the sum of the two preceding ones.

Values are defined as:

$$F_n = F_{n-1} + F_{n-2}$$

with seed values

$$F_1 = 1, \quad F_2 = 1$$

So you need a seed or starting values of Y for the first two records in your data. Since you have never mentioned if Y is one of the results shown that's about as far as I can get on this particular issue.
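
The point about seeds can be sketched in a short DATA step. This is only an illustration: the data set name `fib` and the variables `f`, `g`, `fl1`, `fl2`, `gl1`, `gl2` are made up here. `f` starts from the seed values 1 and 1, while `g` uses the same recursion with no seeds.

```sas
/* Same recursion twice: f has seed values, g does not */
data fib(keep=n f g);
   fl1 = 1; fl2 = 1;            /* seeds for the two preceding terms of f */
   gl1 = .; gl2 = .;            /* no seeds for g */
   do n = 1 to 10;
      if n <= 2 then f = 1;     /* the first two terms are the seeds */
      else f = fl1 + fl2;       /* later terms: sum of the two before */
      g = gl1 + gl2;            /* missing + missing stays missing */
      output;
      fl2 = fl1; fl1 = f;       /* shift the lags for the next term */
      gl2 = gl1; gl1 = g;
   end;
run;
```

Here `f` takes the values 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, while `g` is missing on every row, which is the behavior described for y.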

Posted in reply to Dids

05-18-2018 12:59 AM

As soon as $y_t$ is missing, $y_{t+1}$ will be missing too, and so on. The only way out is to prevent y from being missing. For example:

```
data simulation;
   call streaminit(7556);
   x = 50;
   do i = -50 to 100;
      a = rand("normal");
      x = x + a - .6*x;
      output;
   end;
run;

data sim2;
   retain y 0;
   set simulation;
   y = coalesce (.95*lag1(y) - .225*lag2(y) + 1.0234*lag1(x) - .9*lag2(x) - .3*lag1(a) + a, 0);
   if i > 0 then output;
   drop i;
run;
```

PG
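
One thing worth checking in the `sim2` step: `LAG1(y)` returns whatever value of `y` was stored at the previous call. Because `y` is retained and the call happens before `y` is reassigned, the value it returns can be $y_{t-2}$ rather than $y_{t-1}$. Below is a sketch that sidesteps the question by carrying the lags in retained variables, in the style of the accepted solution, applied to the `simulation` data set created above. The data set name `sim3`, the lag variable names, and the zero starting values are assumptions made for illustration.

```sas
data sim3(drop=i);
   set simulation;                 /* x and a from the DATA step above */
   retain yl1 yl2 xl1 xl2 al1 0;   /* assumed starting values: all zero */
   y = .95*yl1 - .225*yl2 + 1.0234*xl1 - .9*xl2 - .3*al1 + a;
   if i > 0 then output;           /* discard the burn-in observations */
   /* shift the lags after computing y, ready for the next observation */
   yl2 = yl1;  yl1 = y;
   xl2 = xl1;  xl1 = x;
   al1 = a;
run;
```

Since every lag variable starts at 0, y is never missing, so `COALESCE` is not needed, and the burn-in observations (`i <= 0`) give the starting values time to wash out before any rows are written.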