turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Challenging command -Return volatilities

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-01-2012 09:02 PM

Hi all

I am having difficulty engineering a code on SAS. I am new to SAS and finding this extremely difficult. I would greatly appreciative if you could help me here.

I have outlined my objectives below:

-For each PERMNO (company identifier) with DISTCD 1262 or 1272 (t=0), calculate the daily standard deviation in RET for two years before (t= -1,-2) and two years after (t= +1,+2) surrounding the day of this event.

-Then create a column called BEFORE. This is an average daily standard deviation across the years (–1,–2).

-Then create a column called AFTER. This is an average daily standard deviation across the years (+1, +2).

-Then subtract BEFORE-AFTER, to produce a difference (DIFF).

A successful code should let me refer back to a DISTCD of 1272 or 1262 and see the DIFF at that specific date.

I have been stuck for a while on this and find it challenging. I highly encourage help and useful code on this problem (please see data attachment

Thanks,

Katy

Accepted Solutions

Solution

11-02-2012
02:46 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-02-2012 02:46 AM

That is more easy. Remove the case clause in SQL.

proc sql; create table temp as select h.*, (select std(ret) from test_vol where permno = h.permno and date between intnx('year',h.date,-1,'s') and h.date) as std_one_year_before , (select std(ret) from test_vol where permno = h.permno and date between intnx('year',h.date,-2,'s') and h.date) as std_two_year_before , (select std(ret) from test_vol where permno = h.permno and date between h.date and intnx('year',h.date,1,'s')) as std_one_year_after , (select std(ret) from test_vol where permno = h.permno and date between h.date and intnx('year',h.date,2,'s')) as std_two_year_after from test_vol as h;quit;

Ksharp

Message was edited by: xia keshan

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-01-2012 10:16 PM

You should post some small sample data and the **result** you want . Your attachment is to huge ,I am not going to work with it.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-01-2012 11:00 PM

Ksharp

I am sorry about that, forgive me, I am still famailarising myself with SAS communities and did not realise that this database size is too large. Thank you for your comments, I have since followed your instructions and reduced the sample size by 2/3. Please see the attached data.

In response to the result I want, the table produced should look something like this (see the attached picture). Please note this is a hypothetical representation as I still need to create the code to find my results.

You will note that SDEV should be calculated for each observation. SDEV is just the standard deviation between todays return and yesterday’s return. For example, the standard deviation of RET 0.0021 and 0.0033 is 0.000849.

While BEFORE, AFTER, and DIFF are just calculated for those observations with either 1262 or 1272 in the DISTCD. I hope my explanations of the variables attached to the earlier post are adequate. If not, please let me know and I will provide further detail. BEFORE and AFTER are just averages of the PERMNOS previous two year SDEV and post two year SDEV.

Once again, I thank you for your response to my query. I very much hope you will be able to help me in solving my problem.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-02-2012 12:28 AM

I'm afraid your test_vol data table seems to be corrupt. You have lots of null values and repeated rows, and the data is disordered. I ran the following code on your second data and got some results but I don't think that is what you expected.

Proc import

datafile = 'C:\Users\Public\Temp\test_vol.dbf'

Out = test_vol

DBMS = DBF

Replace

;

Run ;

Data Stdcalc (Rename = (DateRet = Date DclrRet = DclrDt DistRet = Distcd PrcRet = Prc Check = Ret)) ;

Length Permno DateRet DclrRet DistRet PrcRet Check before after diff 8 ;

Set test_vol ;

Format DateRet DclrRet Date9. ;

By Permno Notsorted ;

DateRet = Lag (Date) ;

DistRet = Lag (Distcd) ;

DclrRet = Lag (Dclrdt) ;

PrcRet = Lag (Prc) ;

Check = Lag (Ret) ;

If First.Permno

Then Check = . ;

Prior = Lag (Check) ;

Before = Std (Prior, check) ;

After = Std (Check, Ret) ;

Diff = Before - After ;

If Prior = . Or Diff = . then delete ;

If DistRet > . then output ;

Keep Permno DateRet DclrRet DistRet PrcRet Check before after diff ;

Run ;

Data Want ;

Set Stdcalc (Where = (Distcd IN (1262, 1272))) ;

Run ;

Permno Date DclrDt Distcd Prc Ret before after diff

11648 29DEC2010 21DEC2010 1262 20.1755 0.023271 0 0.0302903332 -0.030290333

10239 16NOV2007 07NOV2007 1272 27.41 0.015362 0 0 0

10239 16NOV2007 07NOV2007 1272 27.41 0.015362 0 0 0

10239 16NOV2007 07NOV2007 1272 27.41 0.015362 0 0 0

etc

Richard in Oz

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-02-2012 12:57 AM

Hi Richard in Oz

First, thank you very much for your assistance. I greatly appreciate your help with this!

-I noticed you mentioned there seems to be a lot of duplicates in my data.Maybe we could use the following code to prevent this?

Data test_vol; set test; vol; if Permno=lag(permno) and date=lag(date) and distcd=lag(distcd) then delete; run;

-Also yes there should be a few blanks for distcd, this is OK. However if there are blanks in RET it is worthwhile deleting these also.

Data test_vol; set test; vol; if RET=. then delete; run;

-Ordering the data can be obtained by inserting the code:

PROC sort data=test_vol; by PERMNO DATE; run;

I am a bit troubled by the results generated your code above. I notice there is no SDEV column. Do you know why there are so many zeros in before, after and diff? I imagine there should be at least some SDEV two years before to populate the BEFORE estimate. Likewise, there should be at least some SDEV two years after, to populate the AFTER estimate. Though I'm very happy we are getting somewhere! Progress is being made! Thank you very much for your contribution hopefully we can tackle this problem together!

-Katy

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-02-2012 07:16 AM

My big concern with the data is that there is no unique key combination. Initially I thought that missing values for DISTCD (not just a few - most are missing) did not matter because I could find the corresponding value of PERMNO but that did not work because it proved to be a many to many relationship. It seemed at first from your data that maybe there was an implicit order, PERMNO and then DATE but even that failed in the second data set. It would worry me that you are trying to calculate a fairly complex result from data that is so shaky. I urge you to make sure the data is clean before you calculate and use these results.

Richard in Oz

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-02-2012 07:57 PM

Thank you for your concern Richard, do you have any suggestions to improving the PERMNO and DATE order? Do you mean that these are not consecutive? This is not too much of a problem because the focus here is on average volatiles (standard deviations). My main objective is just to compare the typical size of the standard deviation two years before with two years after.

Ksharp

I am having difficulty with the 1st code produced actually. It appears that:

"mean_two_year_after" has the same value as "mean_two_year_before". This cannot be correct. I suspect there is a problem with the code, maybe at the standard devation calculations:

proc sql;

create table temp as

select h.*,

case when h.distcd in (1262 1272) then (select std(ret) from have where permno = h.permno and date between intnx('year',-1,h.date,'s') and h.date) else . end as std_one_year_before ,

case when h.distcd in (1262 1272) then (select std(ret) from have where permno = h.permno and date between intnx('year',-2,h.date,'s') and h.date) else . end as std_two_year_before ,

case when h.distcd in (1262 1272) then (select std(ret) from have where permno = h.permno and date between h.date and intnx('year',1,h.date,'s')) else . end as std_one_year_after ,

case when h.distcd in (1262 1272) then (select std(ret) from have where permno = h.permno and date between h.date and intnx('year',2,h.date,'s')) else . end as std_two_year_after

from test_vol as h;

I see that it contains: " case when h.distcd in (1262 1272)" this worries me.

I shall provide an example of what I am looking to do, if permno 10008 had 1262 on 03/07/2010... I would like to calculate permos 10008 average standard deviation for 03/07/2008 to 03/07/2010, this is known as before (note the daily standard deviations used in the mean computation do not have to be associated with their own 1262 and 1272 distcd). Then, following the same pattern, I would also like to calculate permnos 10008 average standard deviation from 03/07/2010 to 03/07/2012. If my data doesn't give me the full two years before, and after, that is OK just utilize the observations that I have. Are you sure the code above accurately does this?

I have attempted to use your second code without the 1262 and 1272 but it appears that this is taking too long with my data.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-03-2012 11:33 AM

I'm just looking at your question, and the responses, for the first time thus forgive me if I ask some dumb questions.

Your example wasn't very helpful as your data doesn't have any records for permno 10008.

Is DCLRDT supposed to be used for anything? I noticed that it wasn't used in any of the suggested solutions.

Permno 10239, has a number of records with DISTCD codes that meet your criteria. And, of those, the date ranges would include the same records regardless of whether one uses the date or the DCLRDT field as the base date. Are you expecting to treat each one of those separately?

Regardless of which date is supposed to be used, should the data for that particular date for a particular permno be used for the std devs and, if so, for the years before or the years after or both? Plus, regarding the specification one year before, is that 1 day to one year before, or one year to two years minus 1 day before?

KSharp's code refers to a file called have. Is there another file I didn't see in your post?

Also, KSharp had a slight error (I think) in his use of the intnx function which is why you were getting such odd results.

That said, the following may be incorrect as it depends upon your answers to the above questions, whether I correctly understand the problem, and whether my logic is actually correct in accomplishing the task. The code is definitely NOT guaranteed or fully tested:

proc sql;

create table met_criterion as

select distinct permno,DCLRDT

from test_vol

where DISTCD in (1262,1272)

;

create table want as

select h.*,

(select std(ret) from test_vol where permno = h.permno and date between intnx('year',h.DCLRDT,-1,'s') and h.DCLRDT) as std_one_year_before ,

(select std(ret) from test_vol where permno = h.permno and date between intnx('year',h.DCLRDT,-2,'s') and h.DCLRDT) as std_two_year_before ,

(select std(ret) from test_vol where permno = h.permno and date between h.DCLRDT and intnx('year',h.DCLRDT,1,'s')) as std_one_year_after ,

(select std(ret) from test_vol where permno = h.permno and date between h.DCLRDT and intnx('year',h.DCLRDT,2,'s')) as std_two_year_after

from met_criterion as h;

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-04-2012 12:18 PM

Hi art

Thank you very much for your questions and helpfulness I was beginning to worry this problem may not get resolved.

In response to your questions:

DCLRDT is the date that the distribution code (1262 or 1272) is declared to the market. This being said, this date should act like the premise for the two years before and two years after calculation. I.e.: if the DCLRDT was 01-01-2010 then two years before should extend to 01-01-2008 for DATE and two years after should extend to 01-01-2012 for DATE, with respect to the averages for standard deviation. However the DCLRDT and the DATE for 1262 and 1272 should not be vastly different from one another. Therefore for simplicities sake I encourage using the DATE as the basis or starting point.

If Permno 10239 has a number of 1262s and 1272s which overlap in the +/-2 year date ranges this is ok. The overlap of SDEV information for the same permno is fine. But I would like to treat each 1262 and 1272 separately.

I have supplied no file called “have”. To be frank, I am only interested in 2 before and after analysis, not 1yr. I am not sure why this has been included.

Does your code supplied satisfy the conditions discussed above?

As there are concerns with the earlier posts. I would also like to ask you: how would your code change if we remove the 1262 and 1272 restriction?

That is, for each PERMNO calculate the daily standard deviation (SDEV) for RET two years before (t= -1,-2) and two years after (t= +1,+2) surrounding each date recorded, or each observation.

So the column called BEFORE, the average daily standard deviation across the years (–1,–2) is calculated for all observations.

And the column called AFTER. This is an average daily standard deviation across the years (+1, +2) is also calculated for all observations.

-I then subtract BEFORE-AFTER, to produce a difference column (DIFF) which should be produced for all the observations.

If is is possible to see code for:

(1) With 1262 and 1272 restriction

(2) And without 1262 and 1272 restriction

I would be unduly grateful. Thank you very much once again.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-04-2012 03:32 PM

No, my code didn't meet all of your requirements and I discovered some additional complications in your data. When the distcd codes were added numerous duplicate records were introduced.

The following is an attempt to get rid of the duplicates without losing any of the critical information. You will have to check if it does what I had intended for it to do.

That said, the following does (I think at least) meet your various requirements. However, as for your question about expanding your data records to include calculations for each day, I'll have to leave that for someone more familiar with analyzing this type of data.

Here is the approach I took:

PROC IMPORT OUT= WORK.test_vol_in

DATAFILE= "C:\art\test_vol.dbf"

DBMS=DBF REPLACE;

GETDELETED=NO;

RUN;

data contains_distcd;

set test_vol_in;

if not missing(distcd);

run;

data does_not_contain_distcd;

set test_vol_in;

if missing(distcd);

run;

proc sort data=contains_distcd;

by permno date ret;

run;

data contains_distcd_cleaner;

set contains_distcd;

retain contains_1262 contains_1272 contains_other;

by permno date;

if first.date then do;

call missing(contains_1262);

call missing(contains_1272);

call missing(contains_other);

end;

if distcd eq 1262 then contains_1262=1;

else if distcd eq 1272 then contains_1272=1;

else contains_other=1;

if last.date then output;

run;

data both;

set does_not_contain_distcd contains_distcd_cleaner;

run;

proc sql;

create table met_criterion_1262 as

select distinct permno,date

from both

where contains_1262=1

;

create table met_criterion_1272 as

select distinct permno,date

from both

where contains_1272=1

;

create table want_1262 as

select h.*,

(select std(ret) from both where permno = h.permno and date between intnx('year',h.date,-2,'s') and h.date) as before ,

(select std(ret) from both where permno = h.permno and date between h.date and intnx('year',h.date,2,'s')) as after,

calculated before-calculated after as diff

from met_criterion_1262 as h

;

create table want_1272 as

select h.*,

(select std(ret) from both where permno = h.permno and date between intnx('year',h.date,-2,'s') and h.date) as before ,

(select std(ret) from both where permno = h.permno and date between h.date and intnx('year',h.date,2,'s')) as after,

calculated before-calculated after as diff

from met_criterion_1272 as h

;

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-04-2012 08:18 PM

This code is thorough and works swimmingly! Thank you thank you thank you, these were the kind of results I was expecting!!

Just a quick question. As I also want to find the +/- 2 yr average SDEV for all stocks (regardless of 1262 and 1272 or not) how should I tweak the code? You will notice Ksharp provides a solution, but due to past concerns, is this an incorrect approach to take? I am just thinking surely this approach must be easier than that discussed above. Hmm, there must be a solution for this!

Btw, if duplicate data arises this can be easily solved using: data test_vol_in; set test_vol_in; if permno=lag(permno) and date=lag(date) then delete; run;

I am pleased progress is being made, you provide a very valuable contribution art! Almost there!

-Katy

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-04-2012 11:07 PM

Katy, Glad to have been able to be of help but, like I said, this is not my area. I am a psychologist, not a statistician or a economist. I'm sure that others can provide more correct guidance, at this point, than I can.

If you are looking for before and after effects, I would think that you have to decide what constitutes the border between before and after. If it is just ANY of the codes in the field from which you selected the other two, then it is an easy problem to solve. But, not knowing what any of those codes represent, I am hesitant to suggest an approach.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-05-2012 01:45 AM

No worries at all, you have been exceptionally helpful. I will test out Ksharps code and keep playing with this data!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-04-2012 08:49 PM

Arthur.T

Thank you to point out my mistake.

I think I am too long to learn SAS .

Best Regard.

Ksharp

proc sql; create table temp as select h.*, (select std(ret) from test_vol where permno = h.permno and date between intnx('year',h.date,-1,'s') and h.date) as std_one_year_before , (select std(ret) from test_vol where permno = h.permno and date between intnx('year',h.date,-2,'s') and h.date) as std_two_year_before , (select std(ret) from test_vol where permno = h.permno and date between h.date and intnx('year',h.date,1,'s')) as std_one_year_after , (select std(ret) from test_vol where permno = h.permno and date between h.date and intnx('year',h.date,2,'s')) as std_two_year_after from test_vol as h;quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-02-2012 01:12 AM

OK. If I understand what you mean.

proc sql; create table temp as select h.*, case when h.distcd in (1262 1272) then (select std(ret) from have where permno = h.permno and date between intnx('year',-1,h.date,'s') and h.date) else . end as std_one_year_before , case when h.distcd in (1262 1272) then (select std(ret) from have where permno = h.permno and date between intnx('year',-2,h.date,'s') and h.date) else . end as std_two_year_before , case when h.distcd in (1262 1272) then (select std(ret) from have where permno = h.permno and date between h.date and intnx('year',1,h.date,'s')) else . end as std_one_year_after , case when h.distcd in (1262 1272) then (select std(ret) from have where permno = h.permno and date between h.date and intnx('year',2,h.date,'s')) else . end as std_two_year_after from test_vol as h; create table want as select *, mean(std_one_year_before) as mean_one_year_before , mean(std_two_year_before) as mean_two_year_before , calculated mean_one_year_before - calculated mean_two_year_before as diff_before , mean(std_one_year_after) as mean_one_year_after , mean(std_two_year_after) as mean_two_year_after , calculated mean_one_year_after - calculated mean_two_year_after as diff_after from temp group by permno; quit;

Ksharp