Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- standard deviation for quarterly data

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-26-2012 08:09 PM

Good days to all,

I have a variable, return on assets (ROA) financial years 2003 to 2010 for US firms. I would like to calculate performance volatility (=standard deviation of ROA for 3 years) using quarterly data. It does mean that there will be (4 qtrs x 3 years =) maximum ROA of 12 quarters for the calculation of standard deviation.

fyearq = financial year

fqtr = quarter of the financial year (there are 4 quarters per annum)

tic = ticker number (company identification codes)

datafqtr = financial year + quarter

ROA = return on asset

I would like to calculate performance volatility for the following years:

year 2003 = standard deviation of ROA in 3 years (overlapping period 2003 - 2005)

year 2004 = standard deviation of ROA in 3 years (overlapping period 2004 - 2006)

year 2005 = standard deviation of ROA in 3 years (overlapping period 2005 - 2007)

year 2006 = standard deviation of ROA in 3 years (overlapping period 2006 - 2008)

year 2007 = standard deviation of ROA in 3 years (overlapping period 2007 - 2009)

year 2008 = standard deviation of ROA in 3 years (overlapping period 2008 - 2010)

additional rules:

1. if there is missing value is found in between/at the middle of the overlapping period, the standard deviation calculation shouldn't be affected (ie. the standard deviation value should be derived).

2. If missing value is found at the begining/ending of the year, the standard deviation calculation should be ignored (ie. should not be derived).

Thank you and hope to get some valuable response.

mspak

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to mspak

02-26-2012 08:21 PM

You can always use the intnx function to get the starting and ending dates (see: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212700.htm )

and set the value to missing if you don't have a non-missing value for those dates.

proc summary might be the easiest way to do the calculations and will ignore missing values.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

02-26-2012 08:27 PM

Thanks,

Can proc expand helpful? Can SAS read the id as YYYYQQ?

I used proc expand for calculation of lag values. I know this function can calculate the lead values too.

I also think about another way - using FIRST. and LAST. function. But I do not know how this function apply to both tic and quarter?

Thank you.

mspak

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to mspak

02-26-2012 08:28 PM

Why would you need it? If you are going to exclude missing values, why would you want to replace them with interpolated values?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

02-26-2012 08:37 PM

Hi again,

I wish to calculate the standard deviation of ROA for a given firm for 3 years period. As all the ROA are arranged in a column, I must identify the first and last ROA used to calculate the standard deviation. Therefore, I feel that FIRST. and LAST. might be useful for data cleaning before the SD calculation.

Or I also consider to arrange the ROA in (8 years x 4 qtrs) 32 columns using proc expand for easy calculation of SD.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to mspak

02-26-2012 09:01 PM

or simply select records between x and y. According to your specs, if you don't have a record for both x and y, then you are not supposed to calculate the SD.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to mspak

02-26-2012 09:08 PM

This query does the trick: it excludes all cases where at least one of the missing ROA values occurs on a first quarter (fqtr=1) . Otherwise, it calculates the standard deviations from all non-missing ROA values. There are 33661 std values returned.

data beginnings;

input begYear @@;

datalines;

2003 2004 2005 2006 2007 2008

;

proc sql;

create table Roa2Std as

select tic, begYear, std(ROA) as RoaStd

from fPath.Roa2, beginnings

where fyearq between begYear and begYear+2

group by tic, begYear

having sum(ROA is not missing or fqtr ne 1) = 12;

PG

PG

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

2 weeks ago

proc sql;

create table exe_ccn_lancmpcrsp as

select a.*,std(b.RET)*sqrt(250) as lstkvol "lag annualized stock volitality"

from exe_ccn_lancmpllq a left join crspa.DSF b

on a.lpermno=b.permno and b.DATE between a.lfybegdt and a.lfyenddt

group by a.lpermno,a.fyear

having sum(b.RET is not missing)>1;

quit;

I use the same method as suggest to obtain stock volitality from crsp daily stock file. But my problem is that the observations go from 36 thousand to more than 1 billion. And there are many same results, with same lpermno, same fyear and same lstkvol. I couldn't find the reason. Thanks!