Hi, all suppose I have folowing data set:
var |
---|
1 |
3 |
4 |
8 |
3 |
7 |
I want to generate a new variable NewVar for each observation such that NewVar(of observation n) = Var( of observation n) - Var(of observation n-1). The new dataset should look like:
var | NewVar |
---|---|
1 | 2 |
3 | 1 |
4 | 4 |
8 | -5 |
3 | 4 |
7 |
how can I do this?
One approach could be:
data have;
infile cards;
input var;
cards;
1
3
4
8
3
7
;
data want (drop=_:);
merge have have(firstobs=2 rename=var=_var);
newvar=_var-var;
run;
proc print;run;
Have a Nice Weekend!
Haikuo
Thank you, Hai.kuo
I know SAS has functions such as RETAIN, LAG that can return value from previous records; is there any similar fucntions that can return value from forward records?
I do not know of a look-ahead function.
If you sort your data in reverse order, you can use the lag function to obtain this functionality. But then you will need to re-sort after doing look-aheads. I think that may be a more general solution because it would let you use group by processing in situations more complicated than your example.
Not a "look ahead" function, but using two set statements where you start reading on the second observation in the second set will do:
data look_ahead;
set my_data;
set my_data(firstobs=2 rename=(my_var = next_var));
dif_ahead = next_var - my_var;
run;
Be careful with the last observation: the second set statement hits end-of-file before the first observation. If you need that last observation of the first set, you have to take some measures.
By the way: in your original post you wrote the formula: NewVar(of observation n) = Var( of observation n) - Var(of observation n-1).
I assume that you mean: NewVar(of observation n) = Var( of observation n+1) - Var(of observation n)
Eric, FYI, if you use the two set statement approach, you need to nest the file as a dummy within the second set statement. e.g.:
set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_ );
That way you won't get the error when you read the last observation.
Art,
That is an elegant way to do it. Another way to do it, is by adding a NOBS option:
data test;
do n=1 to 10;
output;
end;
run;
data lookahead;
set test;
if _n_ le nobs then set test(firstobs=2 rename=( n=m) nobs=nobs);
run;
proc print;run;
The data step will stop looping when the first SET statement reads the EOF so it does not matter if you add 1 or N extra records to the second SET statement by using no variable dataset. This eliminates the need for the OBS=1 dataset option. This really could make a difference if you want to look ahead more than 1 observation.
I know this is getting silly (I mean my solution), if you want to use LAG() without sorting, here you go:
data want (drop=_:);
set have(rename=var=_var) end=last;
newvar=_var-lag(_var);
var=lag(_var);
if _n_>1 then output;
if last then
do;
var=_var;
call missing(newvar);
output;
end;
run;
Kindly Regards,
Haikuo
Or if you have ETS and don't care about sign, here is another way:
proc expand data=have out=want;
convert var = newvar / transformout=(reverse MOVRANGE 2 reverse);
run;
Or you need one more step to have your sign,
proc expand data=have out=want1;
convert var = _lead / transformout=(lead 1);
run;
data want (drop=time _lead);
set want1;
newvar=_lead-var;
run;
As you have noted, proc expand does have sort of LEAD function.
Kindly Regards,
Haikuo
Hi, All
I apologize for late response. And Thank you all very much for great help.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.