- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Right now the best thing I can find is reversing the order of my dataset and using LAG. Is there an alternative to this?
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If not, here's some approaches.
https://communities.sas.com/t5/SAS-Communities-Library/How-to-simulate-the-LEAD-function-opposite-of...
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If not, here's some approaches.
https://communities.sas.com/t5/SAS-Communities-Library/How-to-simulate-the-LEAD-function-opposite-of...
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
There're many. IMO, the simplest (which I don't find in the link by @Reeza, nor in the old link within the link) is reading the Kth record downstream from the current using SET POINT=K.
Here's an example for the most common scenario of creating K=1 lead values in every BY group:
data have ;
input id var ;
cards ;
1 1
2 2
2 3
3 4
3 5
3 6
;
run ;
data lead ;
set have ;
by id ;
_n_ + 1 ;
if not last.id then set have (keep=var rename=var=lead_var) point = _n_ ;
else call missing (lead_var) ;
run ;
Result:
id var lead_var --------------------- 1 1 . 2 2 3 2 3 . 3 4 5 3 5 6 3 6 .
If the leads are to be created throughout the file with no regard to BY grouping, the idea is the same, and the code is virtually identical:
data lead ;
set have end = z ;
_n_ + 1 ;
if not z then set have (keep=var rename=var=lead_var) point = _n_ ;
else call missing (lead_var) ;
run ;
Kind regards
Paul D.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
To me the simplest (and fastest) is simply to use :
data T2;
merge T1
T1 (firstobs=2 keep=VAR rename=(VAR=LEAD_VAR) );
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It's simplest and fastest in this simple case. In the more common BY case it gets less simple since the records in each BY group have to be enumerated first, i.e., for example:
data have ;
input id var ;
cards ;
1 1
2 2
2 3
3 4
3 5
3 6
;
run ;
data v (keep=id var seq) / view = v ;
set have ;
by id ;
if first.id then seq = 1 ;
else seq + 1 ;
run ;
data lead (drop = seq) ;
merge have v (rename=var=lead_var where=(seq > 1)) ;
by id ;
if last.id then call missing (lead_var) ;
run ;
As far as the fastest goes, with this method the file ends up being read twice even when generating a lead is needed only for certain records upon a specific condition. E.g., picture a situation where one needs to replace VAR with its lead only when VAR is missing and such records constitute but a tiny fraction of the whole file. In such a case, it would be faster to grab the needed leads via POINT= only for those specific records, would it not?
Kind regards
Paul D.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@hashman For a BY group, I'd probably do something like this:
data LEAD2;
set HAVE;
by ID;
if ^LASTOBS then
set HAVE(firstobs=2 keep=VAR rename=(VAR=LEAD_VAR)) end=LASTOBS;
if last.ID then call missing (LEAD_VAR) ;
run;
If a small fraction of the values has to be retrieved, the losses due to direct POINT= access would be small, and as you point out this method would probably be preferable.
In any case, since we are reading the next observation, losses of either method are probably limited since we are always reading from the cache. One of the reads will be made from RAM, but . I random access logic has some overheads that sequential read does not.
data HAVE ; do VAR=1 to 1e7; output; end;
data WANT;
merge HAVE
HAVE (firstobs=2 keep=VAR rename=(VAR=LEAD_VAR) );
run;
/* real time 2.48 seconds
user cpu time 1.95 seconds */
data lead ;
set have end = z ;
_n_ + 1 ;
if not z then set have (keep=var rename=var=lead_var) point = _n_ ;
else call missing (lead_var) ;
run ;
/* real time 3.54 seconds
user cpu time 2.71 seconds */
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@ChrisNZ: Amen to all that.