I have data of daily returns for several stocks. This is what my data looks like (simplified):
PERMNO | DATE | RETURN |
10078 | 2010JAN02 | 0.0500 |
10104 | 2010JAN02 | -0.0190 |
10107 | 2010JAN02 | 0.0020 |
10078 | 2010JAN03 | 0.0040 |
10104 | 2010JAN03 | -0.0400 |
10107 | 2010JAN03 | 0.0500 |
... | ... | ... |
10078 | 2015JAN02 | -0.0190 |
10104 | 2015JAN02 | 0.0100 |
10107 | 2015JAN02 | 0.0700 |
10078 | 2015JAN05 | 0.0500 |
10104 | 2010JAN03 | -0.0190 |
10107 | 2010JAN03 | 0.0020 |
PERMNO identifies a stock, DATE identifies a date (yyyymmmdd), RETURN is the daily stock return.
I have simplified this example for only three stocks (10078, 10104, 10107).
Goal: I am trying to calculate rolling skewness for each stock i in a given month t.
I want to calculate the monthly skewness measure for each stock using the previous 6 months (i.e. months t-6 to t-1) of daily returns data. Therefore, for a stock in e.g. July 2010, I want the skewness measure for that month to be based on its daily returns from January 2010 to June 2010.
I want the output data to include PERMNO, a month ID, and the monthly skewness measure for that month (based on the prior 6 months of data). Here is an illustration of the desired output:
PERMNO | DATE | 6MONTH_SKEWNESS |
10078 | 2010JUL30 | 0.7257 |
10104 | 2010JUL30 | -0.7056 |
10107 | 2010JUL30 | -0.6781 |
10078 | 2010AUG31 | 0.9999 |
10104 | 2010AUG31 | -0.6719 |
10107 | 2010AUG31 | -0.7056 |
... | ... | ... |
10078 | 2015JUL30 | -0.1651 |
10104 | 2015JUL30 | 0.1056 |
10107 | 2015JUL30 | 0.6181 |
10078 | 2015AUG31 | -0.8886 |
10104 | 2015AUG31 | 0.6119 |
10107 | 2015AUG31 | 0.1056 |
I have searched the web extensively and tried this myself, but I feel really stuck on this problem. Thank you in advance for anyone who is able to help in any way.
If your data is continuous you can use something like the following:
https://gist.github.com/statgeek/07a3708dee1225ceb9d4aa75daab2c52
If you don't have continuous data or PROC EXPAND here is another method:
You'll have to modify either approach to compute skewness. Note that skewness has no single standard definition, so make sure the version SAS computes is the one you want.
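For reference, here are two common definitions written out. As far as I know, the SKEWNESS function and the default (VARDEF=DF) skewness in PROC MEANS use the bias-adjusted form $G_1$, but please check the SAS documentation for your release before relying on that. The symbols $n$, $x_i$, $\bar{x}$ and $s$ are my own notation, not taken from the thread.

```latex
% Bias-adjusted sample skewness (s uses the n-1 divisor)
G_1 = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^{3},
\qquad
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2

% Unadjusted (population-style) skewness
g_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3}
           {\left( \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^{3/2}}
```

With roughly 120 daily returns per six-month window the two differ only slightly, but it matters if you are replicating a paper that states which one it used.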
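data want;
  set have;
  by permno;                          /* HAVE must be sorted by PERMNO and DATE */
  obs+1;                              /* running observation count within PERMNO */
  if first.permno then obs=1;
  array ret_hist{0:59} _temporary_;   /* circular buffer of the last 60 returns */
  if obs>=61 then do;                 /* buffer now holds a full 60-return window */
    sk=skewness(of ret_hist{*});
    output;
  end;
  ret_hist{mod(obs,60)}=return;       /* overwrite the oldest return with the current one */
run;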
This calculates lagged skewness for every window of 60 observations by applying the SKEWNESS function to a 60-element history of returns. When observation i is read, the array RET_HIST contains the returns for date{i-60} through date{i-1}. Only after skewness is calculated is the history updated with the return for date{i} (replacing the return for date{i-60}). The array then holds date{i-59} through date{i}, ready for the next incoming obs.
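A fixed 60-observation window only approximates the six calendar months asked for in the question. If the window has to follow calendar months exactly, one possible sketch is to copy each daily return to the six months that follow it and let PROC MEANS do the work. This assumes DATE is a numeric SAS date (if it is stored as text like 2010JAN02 it needs converting first); the names EXPANDED, WANT_MONTHLY, MONTH_END, N_DAYS and SKEW_6M are made up for illustration.

```sas
/* Sketch: each daily return contributes to the six months after its own month */
data expanded;
  set have;
  do k = 1 to 6;
    month_end = intnx('month', date, k, 'end');  /* last calendar day of target month */
    output;
  end;
  format month_end date9.;
  keep permno month_end return;
run;

proc sort data=expanded;
  by permno month_end;
run;

/* One row per stock and target month, using returns from months t-6 to t-1 */
proc means data=expanded noprint;
  by permno month_end;
  var return;
  output out=want_monthly (drop=_type_ _freq_) n=n_days skewness=skew_6m;
run;
```

The first months of the sample will be based on fewer than six months of data, and the output runs up to six months past the last date in HAVE, so you will probably want to filter on N_DAYS and on MONTH_END afterwards. MONTH_END is the calendar month end, not the last trading day. The intermediate data set is six times the size of HAVE.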
The 60-observation DATA step is slightly inefficient since each skewness is calculated from scratch. One could instead update each window's skewness from (a) the statistics of the previous window, (b) the new return entering the window (actually the return for date{i-1}), and (c) the return dropped from it (for date{i-61}). Come back to us if you need that efficiency, although I doubt it would make much difference.
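For what it's worth, a rough sketch of that incremental idea is to keep running sums of the return, its square and its cube, and adjust them as returns enter and leave the window. It assumes HAVE is sorted by PERMNO and DATE, that RETURN is never missing, and that the bias-adjusted skewness formula is the one wanted. WANT_FAST and the working variables (S1, S2, S3, CSS, M3, SLOT, OLD) are hypothetical names, and I have not benchmarked it.

```sas
data want_fast;
  set have;
  by permno;
  array ret_hist{0:59} _temporary_;
  retain s1 s2 s3;                    /* running sums of x, x**2, x**3 over the window */
  if first.permno then do;
    obs = 0; s1 = 0; s2 = 0; s3 = 0;
  end;
  obs + 1;
  if obs >= 61 then do;               /* sums cover date{i-60} through date{i-1} */
    n    = 60;
    mean = s1 / n;
    css  = s2 - n*mean**2;                    /* sum of squared deviations */
    m3   = s3 - 3*mean*s2 + 2*n*mean**3;      /* sum of cubed deviations   */
    if css > 0 then sk = (n / ((n-1)*(n-2))) * m3 / (css/(n-1))**1.5;
    else sk = .;
    output;
  end;
  slot = mod(obs, 60);
  if obs > 60 then do;                /* remove the return leaving the window */
    old = ret_hist{slot};
    s1 = s1 - old; s2 = s2 - old**2; s3 = s3 - old**3;
  end;
  s1 = s1 + return; s2 = s2 + return**2; s3 = s3 + return**3;
  ret_hist{slot} = return;            /* store the current return for later removal */
  drop n mean css m3 slot old s1 s2 s3;
run;
```

Because the sums are updated thousands of times per stock, rounding error can build up, so it is worth comparing SK against the from-scratch SKEWNESS version on a sample before trusting it.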