BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
comeon2012
Fluorite | Level 6

Hi all,

Please find attached a example out of a large dataset, in which I want to calculate the 3-year rolling standard deviation of variable x, from 1982 to 1989.

For example, the standard deviation in 1982 is the standard deviation of x in 1980, 1981 and 1982.

The dataset is a panel, but there are missing values in variable x. For the second company code as an example, I want the standard deviation in 1982 to be the rolling standard deviation of 5 (in 1980) and 7 (in 1981), although x is missing in 1982 (is it reasonable?).

I also found there is similar question in the forum (like ), but mine is a little bit different from that and I did not know how to modify based on that since I am not familiar with array.

Can anybody do me a favor? Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

HaHa, Actually Patrick's solution is not good . It is very suitable for SQL's sub-query which is a big advantage for SQL.

data have;
infile cards dlm=',' truncover;
input code year x;
cards;
255956,1980,
255956,1981,
255956,1982,
255956,1983,
255956,1984,5
255956,1985,7
255956,1986,4
255956,1987,6
255956,1988,2
255956,1989,1
255964,1980,5
255964,1981,7
255964,1982,
255964,1983,
255964,1984,5
255964,1985,3
255964,1986,7
255964,1987,3
255964,1988,8
255964,1989,7
;
run;
proc sql;
create table want as
 select *,(select std(x) from have where year between a.year-2 and a.year and code=a.code) as rolling_std
  from have as a;
quit;

Message was edited by:Overlook the code variable . Fixed.     xia keshan

View solution in original post

12 REPLIES 12
Ksharp
Super User

HaHa, Actually Patrick's solution is not good . It is very suitable for SQL's sub-query which is a big advantage for SQL.

data have;
infile cards dlm=',' truncover;
input code year x;
cards;
255956,1980,
255956,1981,
255956,1982,
255956,1983,
255956,1984,5
255956,1985,7
255956,1986,4
255956,1987,6
255956,1988,2
255956,1989,1
255964,1980,5
255964,1981,7
255964,1982,
255964,1983,
255964,1984,5
255964,1985,3
255964,1986,7
255964,1987,3
255964,1988,8
255964,1989,7
;
run;
proc sql;
create table want as
 select *,(select std(x) from have where year between a.year-2 and a.year and code=a.code) as rolling_std
  from have as a;
quit;

Message was edited by:Overlook the code variable . Fixed.     xia keshan

Patrick
Opal | Level 21

Hey Ksharp

A bit late but: Happy New Year. Hope you're well.

And nice that you haven't forgotten us here (I guess you're now spending more time in Java related forums).

Cheers

Patrick

Haikuo
Onyx | Level 15

Resurrection of the Ksharp! Smiley Happy. Please don't be a stranger, I have always enjoyed reading your posts and learning from them. Hope Java is treating you as well as SAS.

Haikuo

namrata
Fluorite | Level 6

Hi Ksharp

I need something similar to the above requirement - standard deviation(std) of monthly stock returns(ret) for the twenty-four months through the end of the last month of year (t-1). I have attached a sample file.


I tried to modify your code as:


proc sql

create table std as

select *, (std(ret) from have where month between month-36 and month-12 and cusip=a.cusip) as rolling_std

from have as a;

quit;

This code did not return values against rolling_std. Is it a problem with (month-12)?

I read that truncover allows reading variable-length records; I do not think that should be a problem here.

Any help would be greatly appreciated!

Thanks Smiley Happy

Ksharp
Super User

Not sure. I noticed that there is a date variable. you can use it to judge ?

But maybe you need a lot of time to run it .

proc import datafile='c:\temp\sample.csv' out=temp dbms=csv replace;
guessingrows=32767;
run;
data have;
 set temp;
 d=input(put(date,best12.),yymmdd12.);
 r=input(ret,?? best32.);
 drop date ret;
 format d date9.;
run;


proc sql ;
create table std as
select *, (select std(r) from have where d between intnx('month',a.d,-36,'s') and intnx('month',a.d,-12,'s') and cusip=a.cusip) as rolling_std
from have as a;
quit;

Xia Keshan

namrata
Fluorite | Level 6

Thank you, Ksharp.

I shall try this code and let you know it it works.

I had actually tried your code [select *,(select std(x) from have where year between a.month-2 and a.month and cusip=a.cusip) as rolling_std] on my data.. For some reason, the std deviations in my output were wrong.

I had simply used month=month(date). Not sure if that is an issue. I just checked month is a numeric variable.

namrata
Fluorite | Level 6

Ksharp

I tried the code but it yields a constant std dev value for each cusip. Not sure where it goes wrong.

I tried using proc expand and I think this will work:

Proc expand date=temp1 OUT=rollingstd ;

by cusip;

convert ret=std/method=none transformout=(nomiss movstd=24 trimleft=23);

run;

* I intend to download prior period's data for each firm so that I do not have missing ret for the first 23 months for the relevant time period.

* Next, I shall take the lag std so that once I merge this monthly dataset with the yearly dataset, I can have standard deviation(std) of monthly stock returns(ret) for the twenty-four months through the end of the last month of year (t-1). I hope I am making sense hereSmiley Wink

Ksharp
Super User

Hi, For my opinion , I would like to use a date variable to judge the range of MONTH , YEAR ... , therefore when I saw a variable DATE in your data , I decide to use it instead of your variable MONTH , I don't know if it was suitable .

d=input(put(date,best12.),yymmdd12.);

And in your original data, I noticed there are some character in RET , so I force to change it into numeric variable :

r=input(ret,?? best32.);

"I tried the code but it yields a constant std dev value for each cusip. "

After that, I get std by sub-sql . and don't get a constand std for each cusip .

If you want MONTH as the condition , try this code :

proc import datafile='c:\temp\sample.csv' out=temp dbms=csv replace;

guessingrows=32767;

run;

data have;

set temp;

d=mdy(month,1,year);

r=input(ret,?? best32.);

drop date ret;

format d date9.;

run;

proc sql ;

create table std as

select *, (select std(r) from have where d between intnx('month',a.d,-36,'s') and intnx('month',a.d,-12,'s') and cusip=a.cusip) as rolling_std

from have as a;

quit;

Xia Keshan


1.png
namrata
Fluorite | Level 6

Hi Ksharp

This code works Smiley Happy Thank you!

I can always download data for 24 months prior to 1990 when my data period starts. However in the current output, I have standard deviations for the year 1991, based on 12-month returns which is incorrect. In PROC EXPAND, trimleft=23 took care of it. Do we have an equivalent function in proc sql?

Patrick
Opal | Level 21

Hi namrata

Can you please in the future start a new post for a new question? Simply copy/past a cross-reference (hyperlink) to the old and already answered question.

This way we're getting posts where the question and the most suitable answer is on the very top (once the question is answered and the OP also does the job and marks the most suitable answer as correct). This allows then to search through past posts and decide very fast if something is as needed.

namrata
Fluorite | Level 6

Sure, Patrick. I will keep that in mind.

I had thought that because I am using the earlier code in this thread, I should continue from thereon Smiley Happy

Sorry for the inconvenience!

Ksharp
Super User

I don't understand what you mean. Can you make an example to explain your problem more ?

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 9987 views
  • 2 likes
  • 5 in conversation