Hank
Fluorite | Level 6

Hi,

I have searched online for the fastest way, in a dataset of around a million rows, to select the top and bottom observations for each of a few selected variables. The selection could be either the top/bottom 100 observations or based on some criterion (for example, all observations above/below a certain value, or more than 1 standard deviation above/below the mean).

To clarify, I want to create a new dataset for each of the variables I want the top/bottom values from. Does anyone have some code examples to share?

Best regards,

Hank

4 REPLIES
Hank
Fluorite | Level 6

I have tried it, and maybe my coding is poor, but it only gives you the ranking of the values. To proceed, I would then have to select the top and bottom ranks, which adds no value in my dataset. I am not familiar with PROC SURVEYSELECT, but I will check it out.

data_null__
Jade | Level 19

I don't see how SURVEYSELECT will help.

How do you define top and bottom?

  • Order by var, then take 100 observations from the top and 100 from the end (could be fewer than 100 values if there are ties).
  • Rank by var, then take the top 100 ranks and the bottom 100 ranks (could be more than 100 observations if there are ties).
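The two definitions above, plus the "more than 1 standard deviation from the mean" criterion from the original question, could be sketched roughly as follows. This is only an illustration, not tested against your data: the table name HAVE, the variable name X, and all output dataset names are assumptions made up for the example.

```sas
/* Assumed names: WORK.HAVE is the input table, X is the numeric variable. */

/* Definition 1: sort by X, then take the first and last 100 rows.     */
/* Missing values sort first in SAS, so they are filtered out here.    */
proc sort data=have(where=(x is not missing)) out=sorted;
  by x;
run;

data bottom100 top100;
  set sorted nobs=n;                        /* n = number of rows in SORTED */
  if _n_ <= 100 then output bottom100;      /* 100 smallest values          */
  if _n_ > n - 100 then output top100;      /* 100 largest values           */
run;

/* Definition 2: rank X in both directions and keep ranks 1 to 100.    */
/* TIES=LOW gives tied values the lowest rank of the tie group, so a   */
/* tie at the cutoff can return more than 100 observations.            */
proc rank data=have out=ranked_asc ties=low;
  var x;
  ranks rank_asc;
run;

proc rank data=ranked_asc out=ranked_both ties=low descending;
  var x;
  ranks rank_desc;
run;

data bottom_by_rank top_by_rank;
  set ranked_both;
  /* Rows with missing X have missing ranks and are excluded. */
  if . < rank_asc  <= 100 then output bottom_by_rank;
  if . < rank_desc <= 100 then output top_by_rank;
run;

/* Criterion-based selection: more than 1 standard deviation above the */
/* mean. PROC SQL remerges the summary statistics with the detail rows */
/* and writes a note about that to the log.                            */
proc sql;
  create table above_1sd as
  select *
  from have
  having x > mean(x) + std(x);
quit;
```

To get one dataset per variable, the same steps could be repeated for each variable, for example from a small macro that takes the variable name as a parameter. With around a million rows, the sort-based version needs one sort per variable, so keeping only the columns you need on the input may reduce the run time.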
