01-27-2012 03:26 PM
Hi, all suppose I have folowing data set:
I want to generate a new variable NewVar for each observation such that NewVar(of observation n) = Var( of observation n) - Var(of observation n-1). The new dataset should look like:
how can I do this?
01-27-2012 03:45 PM
One approach could be:
data want (drop=_;
merge have have(firstobs=2 rename=var=_var);
Have a Nice Weekend!
01-27-2012 04:12 PM
Thank you, Hai.kuo
I know SAS has functions such as RETAIN, LAG that can return value from previous records; is there any similar fucntions that can return value from forward records?
01-27-2012 04:39 PM
I do not know of a look-ahead function.
If you sort your data in reverse order, you can use the lag function to obtain this functionality. But then you will need to re-sort after doing look-aheads. I think that may be a more general solution because it would let you use group by processing in situations more complicated than your example.
01-27-2012 04:52 PM
Not a "look ahead" function, but using two set statements where you start reading on the second observation in the second set will do:
set my_data(firstobs=2 rename=(my_var = next_var));
dif_ahead = next_var - my_var;
Be careful with the last observation: the second set statement hits end-of-file before the first observation. If you need that last observation of the first set, you have to take some measures.
By the way: in your original post you wrote the formula: NewVar(of observation n) = Var( of observation n) - Var(of observation n-1).
I assume that you mean: NewVar(of observation n) = Var( of observation n+1) - Var(of observation n)
01-27-2012 05:07 PM
Eric, FYI, if you use the two set statement approach, you need to nest the file as a dummy within the second set statement. e.g.:
set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_ );
That way you won't get the error when you read the last observation.
01-27-2012 05:23 PM
That is an elegant way to do it. Another way to do it, is by adding a NOBS option:
do n=1 to 10;
if _n_ le nobs then set test(firstobs=2 rename=( n=m) nobs=nobs);
01-28-2012 02:28 PM
The data step will stop looping when the first SET statement reads the EOF so it does not matter if you add 1 or N extra records to the second SET statement by using no variable dataset. This eliminates the need for the OBS=1 dataset option. This really could make a difference if you want to look ahead more than 1 observation.
01-27-2012 10:22 PM
I know this is getting silly (I mean my solution), if you want to use LAG() without sorting, here you go:
data want (drop=_;
set have(rename=var=_var) end=last;
if _n_>1 then output;
if last then
01-28-2012 09:24 AM
Or if you have ETS and don't care about sign, here is another way:
proc expand data=have out=want;
convert var = newvar / transformout=(reverse MOVRANGE 2 reverse);
Or you need one more step to have your sign,
proc expand data=have out=want1;
convert var = _lead / transformout=(lead 1);
data want (drop=time _lead);
As you have noted, proc expand does have sort of LEAD function.