BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi,
How can i create a new dataset which contains only the last 10 observations from an old dataset which has N number of observations.
8 REPLIES 8
deleted_user
Not applicable
I think you can sort as DESCENDING by date and output first 10 records (obs=10) ...
tbatliner
Calcite | Level 5
Hi,

you could use something like:

data _null_;
set OLD_TABLE;
call symput('lines',_N_);
run;

data last_ten;
set OLD_TABLE;
if _n_ ge (&lines - 10);
run;

There is a way to "enter" the dataset at the exact line that you want but I can't remember the correct syntax. And there is a "nicer" way to evaluate the rows of a table...anyone? cynthia? 🙂

Hope that helps,
Thomas
darrylovia
Quartz | Level 8
Try using the FIRSTOBS and OBS options in the set statement. FIRSTOBS has SAS go to the record # that you want. OBS is the total number of records to be read. Eventhough you want the last 10 you still need to read all of them but ignore all but the last 10, so have set OBS= your N # of records. See my example below.


data one;
do i=1 to 1000;
output;
end;
run;


data two;
set one (firstobs=991 obs=1000);
run;

Hope this helps

-Darryl
deleted_user
Not applicable
2-ways:

1) reads a total of 11 observations, 1 first, then 10 second
[pre]
data _null_;
set indata (obs=1) nobs=N;
first = N-9;
call symput('first',first);
call symput('last',N);
run;
quit;
%put first = &first ;
%put last = &last ;
data trial;
set indata (firstobs=&first obs=&last);
run;
quit;
[/pre]

2) reads a total of 10 observations
[pre]
data trial;
do i = (N-9) to N;
set indata nobs=N point=i;
output;
end;
stop;
run;
quit;
[/pre]
Olivier
Pyrite | Level 9
I suggest using the NOBS option in the SET statement.
[pre]
DATA last10 ;
SET dataset NOBS=totalN ;
IF _N_> totalN-10 THEN OUTPUT ;
RUN ;
[/pre]

Regards.
Olivier
Peter_C
Rhodochrosite | Level 12
as it is coming from a base SAS data set, you can vary Oliver's program, just a little to obtain top performance from v.large tables.[pre]
DATA last10 ;
do _n_ = total_n - 9 to totalN ;
SET dataset NOBS=totalN POINT=_n_ ;
output ;
end ;
RUN ; [/pre]
Using POINT access ensures you read only the last 10 rows, even if your table has a billion rows.
Total_n is acquired at compile time, even before the SET statement is executed.

Of course, that information may not be available if the input dataset does not support the "point access" that base SAS tables allows. Examples of that limitation occur with data views and non-SAS tables like tables accessed through a SAS/Access to rdbms.

PeterC
deleted_user
Not applicable
Pete, look at my previous post, option #2.

Also, people should be careful with the use of _N_.

_N_ is an automatic SAS variable for a SAS Data Step, and represents the number of iterations of the DATA step, not necessarily the number of items read from a set command.
deleted_user
Not applicable
Chuck

Thanks and Sorry for wasting bandwidth repeating your solution (which I had missed).

As to _n_:
I find _N_ convenient because it is not added to the output dataset, so I need no drop statement (for variable I in your example), nor do I need to take care to select a variable name for the loop control that is not already on the input data set.

PeterC

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 854 views
  • 0 likes
  • 5 in conversation