Solved
Contributor
Posts: 44

# Daily return filters

[ Edited ]

Hi everyone,

I have a daily data as following

Permno    Date (t)       Return (r)

10006    31/12/1926   0.00322

10006    01/01/1927   0.00322

10006    02/01/1927   0.00322

10006    03/01/1927   0.00322

20010    31/12/1926   0.00322

20010    01/01/1927   0.00322

20010    02/01/1927   0.00322

20010    03/01/1927   0.00322

78900    31/12/1926   0.00322

78900    01/01/1927   0.00322

78900    02/01/1927   0.00322

78900   03/01/1927    0.00322

I have to filter the data return if rt or rt-1 > 100% and (1 + rt-1)(1+ rt) - 1 < 20%, then both rt and rt-1 are set equal to a missing value. Additionally, any daily return greater than 200% is set to missing.

I really have no idea how I can run if ... then when 2 variables in the same column (rt and rt-1) and combine "or" and " and". Maybe the question is stupid but could you please help me with this.

Thank you very much,

Ha

Accepted Solutions
Solution
‎05-01-2017 04:55 AM
Posts: 1,399

## Re: Daily return filters

Instead of using lag, you might be better off using a merge with a firstobs=2 parameter to look ahead at the next obs.  If the current and next obs fullfill the criterion, set N_MISS to 2.  Before decrementing N_MISS by 1, check if it is positive.  If so set the current r to missing.

``````data have;
input permno date ddmmyy10. r percent7. ;
format date date9.  r percent8.2;
datalines;
10006 31/12/1926 101%
10006 01/01/1927 -50%
10006 02/01/1927 0.32%
10006 03/01/1927 200%
run;

data want (drop=next_: n_miss);
merge have
have (firstobs=2 keep=permno r rename=(permno=next_p r=next_r));

if permno=next_p and (r>1 or next_r>1) and (((1+r)*(1+next_r)-1)<.2) then n_miss=2;
if n_miss>0 or r>=2 then r=.;
n_miss+(-1);
run;``````

All Replies
PROC Star
Posts: 1,410

## Re: Daily return filters

rt-1 is a variable here? What does rt-1 represent?

Contributor
Posts: 44

## Re: Daily return filters

Hi daycut,

rt-1 is the return of the date right before t.

For example: For 10006, return of 1/1/1927 is rt and return of 31/12/1926 is r t-1.

Super User
Posts: 13,947

## Re: Daily return filters

It will help if you can also show what the result of this would look like.

"Filter" often means to subset the data but the way you say "set to missing" makes it seem that you may not actually want to subset the data.

Also since you have a requirement involving rt > 100% and 200% you should show an example in the input and what the output looks like with those.

Contributor
Posts: 44

## Re: Daily return filters

[ Edited ]

Hi @ballardw,

Maybe it was a mistake of origin, I try to replicate this for screening data. Let me explain what I want to do here with one ID.

ID          Date (t)       Return (r) in percentage

10006    31/12/1926   101% (rt-1)

10006    01/01/1927   -50% (rt)

10006    02/01/1927   0.32%

10006    03/01/1927   200%

Then rt-1=101% > 100% and (1 + rt-1)(1+ rt) - 1=0.5%< 20% => rt and rt-1 are set to be missing.

Another is return at date 03/01/1927 = 200% --> set to be missing.

The result I expect is:

ID          Date (t)       Return (r) in percentage

10006    31/12/1926   .(rt-1)

10006    01/01/1927   . (rt)

10006    02/01/1927   0.32%

10006    03/01/1927  .

Valued Guide
Posts: 653

## Re: Daily return filters

[ Edited ]

@yotsuba88 in your example data R is a constant (always=.0032), i assume that this is not actually the case.  perhaps if it varied a bit you could include an actual calculation of how you arrive at the 101%. I think that you want to look at the use of the LAG function to 'remember' the previous value of R.

is the percentage calculated something like this????

``value=(1 + lagr)*(1+ r);``

I assume not since adding 1 to these small will tend to yield a value over 1.  Help us with the calculation please.

Contributor
Posts: 44

## Re: Daily return filters

@ArtC Thank you for your suggestion of using lag.

1. For R, my mistake when I typed all R is 0.0032., but I change a little bit as below. 101% is an example for my table (it is already in data), my task is to screen the outlier as formula or rt is greater than 200%.

ID          Date (t)       Return (r) in percentage

10006    31/12/1926   101% (rt-1)

10006    01/01/1927   -50% (rt)

10006    02/01/1927   0.32%

10006    03/01/1927   200%

2. And the value is

value= (1+lagr)*(1+r)-1, so it can not be more than 1.

However, I still dont know how to set r and rt-1 missing if they met the condition.

Thank you very much.

Solution
‎05-01-2017 04:55 AM
Posts: 1,399

## Re: Daily return filters

Instead of using lag, you might be better off using a merge with a firstobs=2 parameter to look ahead at the next obs.  If the current and next obs fullfill the criterion, set N_MISS to 2.  Before decrementing N_MISS by 1, check if it is positive.  If so set the current r to missing.

``````data have;
input permno date ddmmyy10. r percent7. ;
format date date9.  r percent8.2;
datalines;
10006 31/12/1926 101%
10006 01/01/1927 -50%
10006 02/01/1927 0.32%
10006 03/01/1927 200%
run;

data want (drop=next_: n_miss);
merge have
have (firstobs=2 keep=permno r rename=(permno=next_p r=next_r));

if permno=next_p and (r>1 or next_r>1) and (((1+r)*(1+next_r)-1)<.2) then n_miss=2;
if n_miss>0 or r>=2 then r=.;
n_miss+(-1);
run;``````
Valued Guide
Posts: 653

## Re: Daily return filters

[ Edited ]

I came up with an array solution, but I like the one by @mkeintz much better.

Contributor
Posts: 44

## Re: Daily return filters

Thank you for your consideration, I also tried @mkeintz solution and it can run well.

Contributor
Posts: 44

## Re: Daily return filters

@mkeintz I can run screening for my data now. I really appreciate it. Best regards.

☑ This topic is solved.