Solved
Contributor
Posts: 38

# rolling standard deviation calculation

Hi all,

Please find attached a example out of a large dataset, in which I want to calculate the 3-year rolling standard deviation of variable x, from 1982 to 1989.

For example, the standard deviation in 1982 is the standard deviation of x in 1980, 1981 and 1982.

The dataset is a panel, but there are missing values in variable x. For the second company code as an example, I want the standard deviation in 1982 to be the rolling standard deviation of 5 (in 1980) and 7 (in 1981), although x is missing in 1982 (is it reasonable?).

I also found there is similar question in the forum (like ), but mine is a little bit different from that and I did not know how to modify based on that since I am not familiar with array.

Can anybody do me a favor? Thanks.

Accepted Solutions
Solution
‎02-03-2014 09:13 PM
Super User
Posts: 10,770

## Re: rolling standard deviation calculation

HaHa, Actually Patrick's solution is not good . It is very suitable for SQL's sub-query which is a big advantage for SQL.

```data have;
infile cards dlm=',' truncover;
input code year x;
cards;
255956,1980,
255956,1981,
255956,1982,
255956,1983,
255956,1984,5
255956,1985,7
255956,1986,4
255956,1987,6
255956,1988,2
255956,1989,1
255964,1980,5
255964,1981,7
255964,1982,
255964,1983,
255964,1984,5
255964,1985,3
255964,1986,7
255964,1987,3
255964,1988,8
255964,1989,7
;
run;
proc sql;
create table want as
select *,(select std(x) from have where year between a.year-2 and a.year and code=a.code) as rolling_std
from have as a;
quit;

```

Message was edited byverlook the code variable . Fixed.     xia keshan

All Replies
Solution
‎02-03-2014 09:13 PM
Super User
Posts: 10,770

## Re: rolling standard deviation calculation

HaHa, Actually Patrick's solution is not good . It is very suitable for SQL's sub-query which is a big advantage for SQL.

```data have;
infile cards dlm=',' truncover;
input code year x;
cards;
255956,1980,
255956,1981,
255956,1982,
255956,1983,
255956,1984,5
255956,1985,7
255956,1986,4
255956,1987,6
255956,1988,2
255956,1989,1
255964,1980,5
255964,1981,7
255964,1982,
255964,1983,
255964,1984,5
255964,1985,3
255964,1986,7
255964,1987,3
255964,1988,8
255964,1989,7
;
run;
proc sql;
create table want as
select *,(select std(x) from have where year between a.year-2 and a.year and code=a.code) as rolling_std
from have as a;
quit;

```

Message was edited byverlook the code variable . Fixed.     xia keshan

Posts: 4,736

## Re: rolling standard deviation calculation

Hey Ksharp

A bit late but: Happy New Year. Hope you're well.

And nice that you haven't forgotten us here (I guess you're now spending more time in Java related forums).

Cheers

Patrick

Posts: 3,167

## Re: rolling standard deviation calculation

Resurrection of the Ksharp! . Please don't be a stranger, I have always enjoyed reading your posts and learning from them. Hope Java is treating you as well as SAS.

Haikuo

Contributor
Posts: 52

## Re: rolling standard deviation calculation

Hi Ksharp

I need something similar to the above requirement - standard deviation(std) of monthly stock returns(ret) for the twenty-four months through the end of the last month of year (t-1). I have attached a sample file.

I tried to modify your code as:

proc sql

create table std as

select *, (std(ret) from have where month between month-36 and month-12 and cusip=a.cusip) as rolling_std

from have as a;

quit;

This code did not return values against rolling_std. Is it a problem with (month-12)?

I read that truncover allows reading variable-length records; I do not think that should be a problem here.

Any help would be greatly appreciated!

Thanks

Super User
Posts: 10,770

## Re: rolling standard deviation calculation

Not sure. I noticed that there is a date variable. you can use it to judge ?

But maybe you need a lot of time to run it .

```proc import datafile='c:\temp\sample.csv' out=temp dbms=csv replace;
guessingrows=32767;
run;
data have;
set temp;
d=input(put(date,best12.),yymmdd12.);
r=input(ret,?? best32.);
drop date ret;
format d date9.;
run;

proc sql ;
create table std as
select *, (select std(r) from have where d between intnx('month',a.d,-36,'s') and intnx('month',a.d,-12,'s') and cusip=a.cusip) as rolling_std
from have as a;
quit;
```

Xia Keshan

Contributor
Posts: 52

## Re: rolling standard deviation calculation

Thank you, Ksharp.

I shall try this code and let you know it it works.

I had actually tried your code [select *,(select std(x) from have where year between a.month-2 and a.month and cusip=a.cusip) as rolling_std] on my data.. For some reason, the std deviations in my output were wrong.

I had simply used month=month(date). Not sure if that is an issue. I just checked month is a numeric variable.

Contributor
Posts: 52

## Re: rolling standard deviation calculation

Ksharp

I tried the code but it yields a constant std dev value for each cusip. Not sure where it goes wrong.

I tried using proc expand and I think this will work:

Proc expand date=temp1 OUT=rollingstd ;

by cusip;

convert ret=std/method=none transformout=(nomiss movstd=24 trimleft=23);

run;

* I intend to download prior period's data for each firm so that I do not have missing ret for the first 23 months for the relevant time period.

* Next, I shall take the lag std so that once I merge this monthly dataset with the yearly dataset, I can have standard deviation(std) of monthly stock returns(ret) for the twenty-four months through the end of the last month of year (t-1). I hope I am making sense here

Super User
Posts: 10,770

## Re: rolling standard deviation calculation

Hi, For my opinion , I would like to use a date variable to judge the range of MONTH , YEAR ... , therefore when I saw a variable DATE in your data , I decide to use it instead of your variable MONTH , I don't know if it was suitable .

d=input(put(date,best12.),yymmdd12.);

And in your original data, I noticed there are some character in RET , so I force to change it into numeric variable :

r=input(ret,?? best32.);

"I tried the code but it yields a constant std dev value for each cusip. "

After that, I get std by sub-sql . and don't get a constand std for each cusip .

If you want MONTH as the condition , try this code :

proc import datafile='c:\temp\sample.csv' out=temp dbms=csv replace;

guessingrows=32767;

run;

data have;

set temp;

d=mdy(month,1,year);

r=input(ret,?? best32.);

drop date ret;

format d date9.;

run;

proc sql ;

create table std as

select *, (select std(r) from have where d between intnx('month',a.d,-36,'s') and intnx('month',a.d,-12,'s') and cusip=a.cusip) as rolling_std

from have as a;

quit;

Xia Keshan

Contributor
Posts: 52

## Re: rolling standard deviation calculation

Hi Ksharp

This code works Thank you!

I can always download data for 24 months prior to 1990 when my data period starts. However in the current output, I have standard deviations for the year 1991, based on 12-month returns which is incorrect. In PROC EXPAND, trimleft=23 took care of it. Do we have an equivalent function in proc sql?

Posts: 4,736

## Re: rolling standard deviation calculation

Hi namrata

Can you please in the future start a new post for a new question? Simply copy/past a cross-reference (hyperlink) to the old and already answered question.

This way we're getting posts where the question and the most suitable answer is on the very top (once the question is answered and the OP also does the job and marks the most suitable answer as correct). This allows then to search through past posts and decide very fast if something is as needed.

Contributor
Posts: 52

## Re: rolling standard deviation calculation

Sure, Patrick. I will keep that in mind.

I had thought that because I am using the earlier code in this thread, I should continue from thereon

Sorry for the inconvenience!

Super User
Posts: 10,770

## Re: rolling standard deviation calculation

I don't understand what you mean. Can you make an example to explain your problem more ?

🔒 This topic is solved and locked.