DATA Step, Macro, Functions and more

DATASTEP

Reply
N/A
Posts: 0

DATASTEP

Hi,
How can i create a new dataset which contains only the last 10 observations from an old dataset which has N number of observations.
N/A
Posts: 0

Re: DATASTEP

I think you can sort as DESCENDING by date and output first 10 records (obs=10) ...
Contributor
Posts: 30

Re: DATASTEP

Hi,

you could use something like:

data _null_;
set OLD_TABLE;
call symput('lines',_N_);
run;

data last_ten;
set OLD_TABLE;
if _n_ ge (&lines - 10);
run;

There is a way to "enter" the dataset at the exact line that you want but I can't remember the correct syntax. And there is a "nicer" way to evaluate the rows of a table...anyone? cynthia? :-)

Hope that helps,
Thomas
Frequent Contributor
Posts: 139

Re: DATASTEP

Try using the FIRSTOBS and OBS options in the set statement. FIRSTOBS has SAS go to the record # that you want. OBS is the total number of records to be read. Eventhough you want the last 10 you still need to read all of them but ignore all but the last 10, so have set OBS= your N # of records. See my example below.


data one;
do i=1 to 1000;
output;
end;
run;


data two;
set one (firstobs=991 obs=1000);
run;

Hope this helps

-Darryl
N/A
Posts: 0

Re: DATASTEP

2-ways:

1) reads a total of 11 observations, 1 first, then 10 second
[pre]
data _null_;
set indata (obs=1) nobs=N;
first = N-9;
call symput('first',first);
call symput('last',N);
run;
quit;
%put first = &first ;
%put last = &last ;
data trial;
set indata (firstobs=&first obs=&last);
run;
quit;
[/pre]

2) reads a total of 10 observations
[pre]
data trial;
do i = (N-9) to N;
set indata nobs=N point=i;
output;
end;
stop;
run;
quit;
[/pre]
Super Contributor
Posts: 260

Re: DATASTEP

I suggest using the NOBS option in the SET statement.
[pre]
DATA last10 ;
SET dataset NOBS=totalN ;
IF _N_> totalN-10 THEN OUTPUT ;
RUN ;
[/pre]

Regards.
Olivier
Valued Guide
Posts: 2,175

Re: DATASTEP

as it is coming from a base SAS data set, you can vary Oliver's program, just a little to obtain top performance from v.large tables.[pre]
DATA last10 ;
do _n_ = total_n - 9 to totalN ;
SET dataset NOBS=totalN POINT=_n_ ;
output ;
end ;
RUN ; [/pre]
Using POINT access ensures you read only the last 10 rows, even if your table has a billion rows.
Total_n is acquired at compile time, even before the SET statement is executed.

Of course, that information may not be available if the input dataset does not support the "point access" that base SAS tables allows. Examples of that limitation occur with data views and non-SAS tables like tables accessed through a SAS/Access to rdbms.

PeterC
N/A
Posts: 0

Re: DATASTEP

Pete, look at my previous post, option #2.

Also, people should be careful with the use of _N_.

_N_ is an automatic SAS variable for a SAS Data Step, and represents the number of iterations of the DATA step, not necessarily the number of items read from a set command.
N/A
Posts: 0

Re: DATASTEP

Chuck

Thanks and Sorry for wasting bandwidth repeating your solution (which I had missed).

As to _n_:
I find _N_ convenient because it is not added to the output dataset, so I need no drop statement (for variable I in your example), nor do I need to take care to select a variable name for the loop control that is not already on the input data set.

PeterC
Ask a Question
Discussion stats
  • 8 replies
  • 154 views
  • 0 likes
  • 5 in conversation