Help using Base SAS procedures

How to select the top and bottom 100 observations for 5 variables each in a dataset?

Contributor
Posts: 35

Hi,

I have searched online for the fastest way, in a dataset of around a million rows, to select the top and bottom observations for each of a few selected variables. The selection could be either the top/bottom 100 obs or based on some criterion (for example, all observations above/below a certain value, or more than 1 standard deviation above/below the mean).

To clarify, I want to create a new dataset for each of the variables I want the top/bottom values from. Does anyone have some code examples to share?

Best regards,

Hank

Super User
Posts: 5,426

Re: How to select the top and bottom 100 observations for 5 variables each in a dataset?

PROC RANK?

Occasional Contributor
Posts: 7

Re: How to select the top and bottom 100 observations for 5 variables each in a dataset?

PROC SURVEYSELECT?

Contributor
Posts: 35

Re: How to select the top and bottom 100 observations for 5 variables each in a dataset?

I have tried PROC RANK and maybe my coding is poor, but it only gives you the ranks of the variables. To proceed, I would then have to select the top and bottom of the ranks myself, which adds no value in my dataset. I am not familiar with PROC SURVEYSELECT, but will check it out.
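For reference, the follow-up step described here (keeping only the top and bottom ranks after PROC RANK) could look roughly like the sketch below. The dataset name `have`, the variable `x`, and the output dataset names are placeholders and not from the thread; `TIES=LOW` is one possible tie-handling choice among several.

```sas
/* Sketch only: HAVE and X are hypothetical names.                    */
/* Missing values of X are dropped first so every row gets a rank.    */
proc rank data=have(where=(not missing(x))) out=ranked_low ties=low;
  var x;
  ranks x_rank_low;            /* 1 = smallest value of X */
run;

/* Second pass adds a descending rank to the same rows. */
proc rank data=ranked_low out=ranked ties=low descending;
  var x;
  ranks x_rank_high;           /* 1 = largest value of X */
run;

/* Split into one dataset per tail. With TIES=LOW, observations tied  */
/* at the cutoff all share the same rank, so a tail can hold more     */
/* than 100 observations.                                             */
data bottom100_by_rank top100_by_rank;
  set ranked;
  if x_rank_low  <= 100 then output bottom100_by_rank;
  if x_rank_high <= 100 then output top100_by_rank;
run;
```

A threshold-based selection (for example, above or below a fixed value) would only need the final DATA step, with the rank conditions replaced by conditions on `x` itself.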

Respected Advisor
Posts: 3,799

Re: How to select the top and bottom 100 observations for 5 variables each in a dataset?

I don't see how SURVEYSELECT will help.

How do you define top and bottom?

  • Order by var and take 100 observations from the top and 100 from the end (could be fewer than 100 distinct values if there are ties).
  • Rank by var and take the top 100 ranks and the bottom 100 ranks (could be more than 100 obs if there are ties).
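The first definition (sort, then take 100 observations from each end) can be sketched as a small macro that loops over the variables and writes one pair of datasets per variable, as asked in the original question. The macro name `top_bottom`, the dataset `have`, and the variables `x1`-`x5` are hypothetical. This is a straightforward sketch under those assumptions, not necessarily the fastest option for a million rows, since it sorts once per variable.

```sas
/* Sketch only: macro, dataset, and variable names are placeholders.  */
%macro top_bottom(data=, vars=, n=100);
  %local i var;
  %do i = 1 %to %sysfunc(countw(&vars, %str( )));
    %let var = %scan(&vars, &i, %str( ));

    /* Missing values sort first in SAS, so exclude them up front. */
    proc sort data=&data(where=(not missing(&var))) out=_sorted;
      by &var;
    run;

    /* NOBS= gives the row count of _SORTED, so the last &n rows can  */
    /* be identified in the same pass as the first &n rows.           */
    data bottom_&var top_&var;
      set _sorted nobs=total;
      if _n_ <= &n         then output bottom_&var;
      if _n_ >  total - &n then output top_&var;
    run;
  %end;
%mend top_bottom;

/* Example call with five hypothetical variables */
%top_bottom(data=have, vars=x1 x2 x3 x4 x5);
```

With this approach each output dataset holds exactly `n` observations (when the input has at least that many), but observations tied at the cutoff are split arbitrarily. Output names such as `bottom_x1` must stay within the 32-character limit for SAS dataset names.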