Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- moving standard deviation except t period.

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 11-12-2020 02:44 AM
(1081 views)

Hello!

Simpl question.

I want to obtain standard deviations of previous five-year sales. In other words, I need to calculate the std from **t-1 to t-5.**

As long as I know, the code below means the period between **t and t-4**. I have tried to find an exact code for this command but could not find but assume I may use other than Proc expand.

Could anyone help me with how to write the code for t-1 to t-n period?

Thank you so much!

```
PROC EXPAND DATA=comp1 OUT=sale;
id fyear;
CONVERT sale=stdsale / TRANSFORMOUT=(MOVSTD 5);
by gvkey;
RUN;
```

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Sorry, the Std Function calculates the sample std and it seems you want the population std.

Just edited the program a little. See if this gives you the desired result. If so, I'll explain the logic 🙂

```
data have;
input gvkey fyear sale;
datalines;
1 2010 100
1 2011 200
1 2012 300
1 2013 400
1 2014 500
2 2010 100
2 2011 200
2 2012 300
2 2013 400
2 2014 500
;
data want(drop=obs i);
array x[0:3] _temporary_;
array y[0:3] _temporary_;
call stdize('replace', 'mult=', 0, of x[*], _N_);
do obs=1 by 1 until (last.gvkey);
do i = 0 to dim(x)-1;
y [i] = (x [i] - mean(of x[*])) ** 2;
end;
std = sqrt(divide(sum(of y[*]), 4));
set have;
by gvkey;
x[mod(obs, 4)] = sale;
output;
end;
run;
```

Result:

std gvkey fyear sale 0.000 1 2010 100 43.301 1 2011 200 82.916 1 2012 300 111.803 1 2013 400 111.803 1 2014 500 0.000 2 2010 100 43.301 2 2011 200 82.916 2 2012 300 111.803 2 2013 400 111.803 2 2014 500

10 REPLIES 10

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

How about

```
data comp1;
input gvkey fyear sale;
datalines;
1 2010 100
1 2011 100
1 2012 100
1 2013 100
1 2014 200
2 2010 100
2 2011 100
2 2012 100
2 2013 100
2 2014 200
;
PROC EXPAND DATA=comp1 OUT=sale;
id fyear;
CONVERT sale=stdsale / TRANSFORMOUT=(MOVSTD (1 1 1 1 0));
by gvkey;
RUN;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you for your reply.

It seems that using

`TRANSFORMOUT=(MOVSTD (1 1 1 1 0))`

means x multiplied by(X) 0, x-1 X 1, and so on to x-4 X 1... which actually results in the period from t-1 to t-4 while the denominator is still 5. I just tried to test with the code below.

```
data work;
input gvkey fyear sale;
datalines;
1 2010 100
1 2011 100
1 2012 100
1 2013 200
1 2014 200
2 2010 100
2 2011 100
2 2012 100
2 2013 200
2 2014 200
;
```

What I obtained from it are just weird values.

The standard deviation of 21.65 looks out of nowhere. I can guess the variance there is sth like 468.7 so if I multiply 4 with this variance, then I get close to 1875, which is the nominator of the definition of STD, though I have no clue where this number comes from.

my calculations are... (As the most recent observations are not used)

mean= (100+100+100+200)/4 = 175

variance = {(100-175)^2 +(100-175)^2+(100-175)^2+(200-175)^2} / 4= 4375 (not like 1875)

Std = 66.1438

This looks not quite fit with what I seek now standard deviations from t-5 to t-1, which conceptually needs observations at the t-1 period as well.

Any help?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I think you're right. Is it a requirement to use Proc Expand?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Not at all, I am just working on replicating a journal paper to use their codes in my research as well. A variable in this paper is defined as standards deviations of the previous five years' sale. I thought PROC EXPAND can handle but facing this difficulty...

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Ok. I'm sure Proc Expand can handle this somehow, but I'm no expert.

Here is a data step approach that gives you what you want:

```
data have;
input gvkey fyear sale;
datalines;
1 2010 100
1 2011 200
1 2012 300
1 2013 400
1 2014 500
2 2010 100
2 2011 200
2 2012 300
2 2013 400
2 2014 500
;
data want(drop=obs);
array x[0:3] _temporary_;
call missing(of x[*]);
do obs=1 by 1 until (last.gvkey);
std = std(of x[*]);
set have;
by gvkey;
x[mod(obs, 4)]=sale;
output;
end;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you for your help.

but the output value by the above code does not provide correct standard deviations as well.

I could not understand "x[mod(obs, 4)]=sale;" part, and whether this code is forward-looking or backward is not sure.

I think I should use other than PROC EXPAND.

Thanks for your helps again 🙂

but the output value by the above code does not provide correct standard deviations as well.

I could not understand "x[mod(obs, 4)]=sale;" part, and whether this code is forward-looking or backward is not sure.

I think I should use other than PROC EXPAND.

Thanks for your helps again 🙂

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Sorry, the Std Function calculates the sample std and it seems you want the population std.

Just edited the program a little. See if this gives you the desired result. If so, I'll explain the logic 🙂

```
data have;
input gvkey fyear sale;
datalines;
1 2010 100
1 2011 200
1 2012 300
1 2013 400
1 2014 500
2 2010 100
2 2011 200
2 2012 300
2 2013 400
2 2014 500
;
data want(drop=obs i);
array x[0:3] _temporary_;
array y[0:3] _temporary_;
call stdize('replace', 'mult=', 0, of x[*], _N_);
do obs=1 by 1 until (last.gvkey);
do i = 0 to dim(x)-1;
y [i] = (x [i] - mean(of x[*])) ** 2;
end;
std = sqrt(divide(sum(of y[*]), 4));
set have;
by gvkey;
x[mod(obs, 4)] = sale;
output;
end;
run;
```

Result:

std gvkey fyear sale 0.000 1 2010 100 43.301 1 2011 200 82.916 1 2012 300 111.803 1 2013 400 111.803 1 2014 500 0.000 2 2010 100 43.301 2 2011 200 82.916 2 2012 300 111.803 2 2013 400 111.803 2 2014 500

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you for your ongoing efforts! I actually came up with easier programming but your code certainly works as well! Thank you so much!

Have a good day!

Have a good day!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Btw, here mean = (100 + 100 + 100 + 200) / 4 = 125 😉

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

lol... I made a huge but simple mistake! thank you for letting me know that 🙂

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. **Registration is now open through August 30th**. Visit the SAS Hackathon homepage.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.