04-06-2014 07:53 PM
Hello, one more question for tonight.
I have a large dataset in which I would like to create a variable for the lagged return of a fund. I have lots of data for many funds across a very large date range.
The data takes the following form:
ID QTR SHARES
001 2001Q1 100
001 2001Q2 101
001 2001Q3 103
001 2001Q4 103
002 2001Q1 10
002 2001Q2 10
002 2001Q3 11
002 2001Q4 11
So I am currently using the lag function to get the lagged value for SHARES, which for ID number 001 on date 2001Q2 would return a value of 100. However, when I do this, the value returned for ID number 002 on 2001Q1 is 103, which is incorrect.
How do I stop the incorrect values when the ID variable switches over? As a note, I do not have consistent date ranges for all of the ID variables. Some have 30 years of data, some have a few years, starting very recently.
I am hoping this is an easy fix, so thank you in advance!
04-06-2014 08:13 PM
You don't prevent it from happening. You merely re-set the value. For example:
prior_shares = lag(shares);
if first.id then prior_shares=.;