Hi all,
Please find attached a example out of a large dataset, in which I want to calculate the 3-year rolling standard deviation of variable x, from 1982 to 1989.
For example, the standard deviation in 1982 is the standard deviation of x in 1980, 1981 and 1982.
The dataset is a panel, but there are missing values in variable x. For the second company code as an example, I want the standard deviation in 1982 to be the rolling standard deviation of 5 (in 1980) and 7 (in 1981), although x is missing in 1982 (is it reasonable?).
I also found there is similar question in the forum (like ), but mine is a little bit different from that and I did not know how to modify based on that since I am not familiar with array.
Can anybody do me a favor? Thanks.
HaHa, Actually Patrick's solution is not good . It is very suitable for SQL's sub-query which is a big advantage for SQL.
data have; infile cards dlm=',' truncover; input code year x; cards; 255956,1980, 255956,1981, 255956,1982, 255956,1983, 255956,1984,5 255956,1985,7 255956,1986,4 255956,1987,6 255956,1988,2 255956,1989,1 255964,1980,5 255964,1981,7 255964,1982, 255964,1983, 255964,1984,5 255964,1985,3 255964,1986,7 255964,1987,3 255964,1988,8 255964,1989,7 ; run; proc sql; create table want as select *,(select std(x) from have where year between a.year-2 and a.year and code=a.code) as rolling_std from have as a; quit;
Message was edited by:Overlook the code variable . Fixed. xia keshan
HaHa, Actually Patrick's solution is not good . It is very suitable for SQL's sub-query which is a big advantage for SQL.
data have; infile cards dlm=',' truncover; input code year x; cards; 255956,1980, 255956,1981, 255956,1982, 255956,1983, 255956,1984,5 255956,1985,7 255956,1986,4 255956,1987,6 255956,1988,2 255956,1989,1 255964,1980,5 255964,1981,7 255964,1982, 255964,1983, 255964,1984,5 255964,1985,3 255964,1986,7 255964,1987,3 255964,1988,8 255964,1989,7 ; run; proc sql; create table want as select *,(select std(x) from have where year between a.year-2 and a.year and code=a.code) as rolling_std from have as a; quit;
Message was edited by:Overlook the code variable . Fixed. xia keshan
Hey Ksharp
A bit late but: Happy New Year. Hope you're well.
And nice that you haven't forgotten us here (I guess you're now spending more time in Java related forums).
Cheers
Patrick
Resurrection of the Ksharp! . Please don't be a stranger, I have always enjoyed reading your posts and learning from them. Hope Java is treating you as well as SAS.
Haikuo
Hi Ksharp
I need something similar to the above requirement - standard deviation(std) of monthly stock returns(ret) for the twenty-four months through the end of the last month of year (t-1). I have attached a sample file.
I tried to modify your code as:
proc sql
create table std as
select *, (std(ret) from have where month between month-36 and month-12 and cusip=a.cusip) as rolling_std
from have as a;
quit;
This code did not return values against rolling_std. Is it a problem with (month-12)?
I read that truncover allows reading variable-length records; I do not think that should be a problem here.
Any help would be greatly appreciated!
Thanks
Not sure. I noticed that there is a date variable. you can use it to judge ?
But maybe you need a lot of time to run it .
proc import datafile='c:\temp\sample.csv' out=temp dbms=csv replace; guessingrows=32767; run; data have; set temp; d=input(put(date,best12.),yymmdd12.); r=input(ret,?? best32.); drop date ret; format d date9.; run; proc sql ; create table std as select *, (select std(r) from have where d between intnx('month',a.d,-36,'s') and intnx('month',a.d,-12,'s') and cusip=a.cusip) as rolling_std from have as a; quit;
Xia Keshan
Thank you, Ksharp.
I shall try this code and let you know it it works.
I had actually tried your code [select *,(select std(x) from have where year between a.month-2 and a.month and cusip=a.cusip) as rolling_std] on my data.. For some reason, the std deviations in my output were wrong.
I had simply used month=month(date). Not sure if that is an issue. I just checked month is a numeric variable.
Ksharp
I tried the code but it yields a constant std dev value for each cusip. Not sure where it goes wrong.
I tried using proc expand and I think this will work:
Proc expand date=temp1 OUT=rollingstd ;
by cusip;
convert ret=std/method=none transformout=(nomiss movstd=24 trimleft=23);
run;
* I intend to download prior period's data for each firm so that I do not have missing ret for the first 23 months for the relevant time period.
* Next, I shall take the lag std so that once I merge this monthly dataset with the yearly dataset, I can have standard deviation(std) of monthly stock returns(ret) for the twenty-four months through the end of the last month of year (t-1). I hope I am making sense here
Hi, For my opinion , I would like to use a date variable to judge the range of MONTH , YEAR ... , therefore when I saw a variable DATE in your data , I decide to use it instead of your variable MONTH , I don't know if it was suitable .
d=input(put(date,best12.),yymmdd12.);
And in your original data, I noticed there are some character in RET , so I force to change it into numeric variable :
r=input(ret,?? best32.);
"I tried the code but it yields a constant std dev value for each cusip. "
After that, I get std by sub-sql . and don't get a constand std for each cusip .
If you want MONTH as the condition , try this code :
proc import datafile='c:\temp\sample.csv' out=temp dbms=csv replace;
guessingrows=32767;
run;
data have;
set temp;
d=mdy(month,1,year);
r=input(ret,?? best32.);
drop date ret;
format d date9.;
run;
proc sql ;
create table std as
select *, (select std(r) from have where d between intnx('month',a.d,-36,'s') and intnx('month',a.d,-12,'s') and cusip=a.cusip) as rolling_std
from have as a;
quit;
Xia Keshan
Hi Ksharp
This code works Thank you!
I can always download data for 24 months prior to 1990 when my data period starts. However in the current output, I have standard deviations for the year 1991, based on 12-month returns which is incorrect. In PROC EXPAND, trimleft=23 took care of it. Do we have an equivalent function in proc sql?
Hi namrata
Can you please in the future start a new post for a new question? Simply copy/past a cross-reference (hyperlink) to the old and already answered question.
This way we're getting posts where the question and the most suitable answer is on the very top (once the question is answered and the OP also does the job and marks the most suitable answer as correct). This allows then to search through past posts and decide very fast if something is as needed.
Sure, Patrick. I will keep that in mind.
I had thought that because I am using the earlier code in this thread, I should continue from thereon
Sorry for the inconvenience!
I don't understand what you mean. Can you make an example to explain your problem more ?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.