Hello,
I would like to get the following observations of a data set by groups. The - working (?) - solution would be this:
Data Have;
Input N $ X @@;
Datalines ;
A 1 A 3 A 6 B 2 B 9 B 9 C 1 C 5 C 6 C 7 C 8
;
Run;
Data Want;
Obs=_N_+1;
Set Have End=Eof;
By N; /* Can I do this? */
If not EoF Then Set Have (Keep=X Rename=(X=X_Next)) Point=Obs;
Else Call Missing (X_Next);
If Last.N Then Call Missing (X_Next);
Run;
However, in the documentation it says (SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition😞
The solution is required to be reliable. My question is: Is this combination of point and by o.k.?
Thanks&kind regards
Yes your coding is a valid approach. It is thinking about:
- what happens with the PDV
- automatic advancing of records
- at what level the internal housekeeping is done
Your by step is working with the first set statement, there is no point defined.
You second set statement is having different options it acts sepeartely. You could use a table lookup (example 8 Set statement) SAS(R) 9.4 Statements: Reference, Third Edition
Using an index is an binary search on some values and than jumping to the complete record using a point / direct access. Using the point is thinking in an own indexed way.
A very nice paper is: http://support.sas.com/resources/papers/proceedings09/038-2009.pdf The double dow loop is using the different internal householding to restart calculations within an ordered group.
Why would you use these kind of constructs?
They can be much faster in execution time and resource usage common known constructs including SQL. It the reason of the hyping of NOSQL.
The disadvantage is that you have analyze your problem very well to understand what should be done and than the coding part takes more effort and verifying.
Hello,
Why not to use a merge ?
data want;
merge have have(firstobs=2 rename=(x=x_next n=n_next));
If n ne n_next then Call Missing (X_Next);
drop n_next;
run;
Your datastep is OK.
There are other methods to implement look ahead in a data step, but I like your version as well.
see:
Yes your coding is a valid approach. It is thinking about:
- what happens with the PDV
- automatic advancing of records
- at what level the internal housekeeping is done
Your by step is working with the first set statement, there is no point defined.
You second set statement is having different options it acts sepeartely. You could use a table lookup (example 8 Set statement) SAS(R) 9.4 Statements: Reference, Third Edition
Using an index is an binary search on some values and than jumping to the complete record using a point / direct access. Using the point is thinking in an own indexed way.
A very nice paper is: http://support.sas.com/resources/papers/proceedings09/038-2009.pdf The double dow loop is using the different internal householding to restart calculations within an ordered group.
Why would you use these kind of constructs?
They can be much faster in execution time and resource usage common known constructs including SQL. It the reason of the hyping of NOSQL.
The disadvantage is that you have analyze your problem very well to understand what should be done and than the coding part takes more effort and verifying.
Here is two solution . one is using routine CALL SET () . another is making a flag variable.
data want; set have; by n; next=.; retain dsid; if _n_ eq 1 then dsid=open('have(keep=x rename=(x=next))','i'); call set(dsid); if not last.n then rc=fetchobs(dsid,_n_+1); run;
Data Have; Input N $ X @@; Datalines ; A 1 A 3 A 6 B 2 B 9 B 9 C 1 C 5 C 6 C 7 C 8 ; Run; proc sort data=have;by n;run; data want; set have nobs=nobs end=last; if _n_ ne nobs then do; obs=_n_+1; set have(Rename=(n=_n X=X_Next)) point=obs ; end; If n ne _n or last Then Call Missing (X_Next); run;
Xia Keshan
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.