BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DeepakSwain
Pyrite | Level 9

Hi there,

Can somebody guide me to control the number of output following proc sort. Number of input can be controlled by firstobs and obs in proc sort. I want to control the number of record in my output. 

data test;
input id weight;
datalines;
1 10
2 20
3 30
4 20
5 30
;
run;

proc sort data =test (firstobs=1 obs=4) out =test1; by weight; run;

Swain
1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

The only way I know to do this is with the undocumented function MONOTONIC. MONOTONIC() returns 1 on the first call and increases by 1 every time it is called. So, to keep the first two obs from proc sort (or any other proc) output:

 


proc sort data=test out=test1(where=(monotonic()<=2)); by weight; run;

 

 

PG

View solution in original post

11 REPLIES 11
jklaverstijn
Rhodochrosite | Level 12

Proc sort will have to sort the entire table before determining the first n rows. So a subsequent datastep to achive that would not be much overhead.

 

proc sort data =test out =test1; 
   by weight; 
run;

data test2;
   set test1 (firstobs=4 obs=7);
run;

 Alternatively, if you just want the top n rows:

 

proc sql outobs=10;
   create table test2
     as select * from test
     order by weight;
quit;

Hope this helps,

- Jan.

DeepakSwain
Pyrite | Level 9

Hi Jklaverstijn,

 

 

 

Swain
LinusH
Tourmaline | Level 20

Why not SQL?

Why not use a succeeding data step?

Motivate!

Data never sleeps
DeepakSwain
Pyrite | Level 9

Hi there,

Just desire to explore alternative and keep enthusiasm.

Regards,

Deepak

Swain
jklaverstijn
Rhodochrosite | Level 12

Hi Deepak,

 

I'm afraid you cannot do that in proc sort. I checked and obs= and firstobs= just don't work. Surprisingly the options are silently ignored.

 

As I said, sort will have to create the entire table before s subset base on rownumbercan be determined. So it is not much of a  matter of efficiency to have a datastep as follow-up.

 

- Jan.

ballardw
Super User

@jklaverstijn wrote:

Hi Deepak,

 

I'm afraid you cannot do that in proc sort. I checked and obs= and firstobs= just don't work. Surprisingly the options are silently ignored.

 

 

- Jan.


 

Not at all "silent", from the documentation:

Specifies the last observation that SAS processes in a data set.

OBS= Data Set Option



Specifies the last observation that SAS processes in a data set.

Valid in: DATA step and PROC steps
Category: Observation Control
Default: MAX
Restriction: Use with input data sets only
Restriction:

Cannot use with PROC SQL views


PGStats
Opal | Level 21

I would prefer to get something like

 

NOTE: Dataset option OBS= is ignored for output datasets.

PG
DeepakSwain
Pyrite | Level 9

Hi Jan,

Thanks for your kind reply. It is helpful to me.

Regards,

Deepak

Swain
PGStats
Opal | Level 21

The only way I know to do this is with the undocumented function MONOTONIC. MONOTONIC() returns 1 on the first call and increases by 1 every time it is called. So, to keep the first two obs from proc sort (or any other proc) output:

 


proc sort data=test out=test1(where=(monotonic()<=2)); by weight; run;

 

 

PG
jklaverstijn
Rhodochrosite | Level 12



proc sort data=test out=test1(where=(monotonic()<=2)); by weight; run;

 

 


What I like is the cleverness. What I like less is the obscure character of monotonic(). It is undocumented for a reason. Eg, I am uncertain if it works reliably in a threaded environment. So your mileage may vary.

 

- Jan.

FreelanceReinh
Jade | Level 19

Maybe a WHERE= dataset option on the output dataset is even better than enforcing a fixed number of observations, because in case of ties the selection might be a bit arbitrary otherwise:

proc sort data=test out=test1(where=(weight>=30));
by weight; 
run;

Drawback: You have to know a reasonable threshold value.

 

Or check PROC UNIVARIATE with the NEXTROBS= option (I learned about this from one of Ksharp's posts😞

ods select none;
ods output ExtremeObs=check;
proc univariate data=test nextrobs=2;
id id;
var weight;
run;
ods select all;

Another approach would be PROC SUMMARY with the IDGROUP option of the OUTPUT statement.

 

If you want to select the n largest/smallest values (as opposed to extreme observations), you can use PROC UNIVARIATE with the NEXTRVAL= option.

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 4890 views
  • 11 likes
  • 6 in conversation