Hank
Fluorite | Level 6

Hi,

I have searched online for the fastest way, in a dataset of around a million rows, to select the top and bottom observations for each of a few selected variables. The selection could be either the top/bottom 100 observations or based on some criterion (for example, all observations above/below a certain value, or more than 1 standard deviation above/below the mean).

To clarify, I want to create a new dataset for each variable I want the top/bottom values from. Does anyone have some code examples to share?

Best regards,

Hank

4 REPLIES
Hank
Fluorite | Level 6

I have tried it, and maybe my coding is poor, but it only gives you the ranking of the variables. To proceed, I would then have to select the top and bottom of the ranks, which in my dataset adds no value. I am not familiar with PROC SURVEYSELECT, but I will check it out.

data_null__
Jade | Level 19

I don't see how SURVEYSELECT will help.

How do you define top and bottom?

  • order by var; take 100 from the top and 100 from the end (could be fewer than 100 distinct values if there are ties)
  • rank by var; take the top 100 ranks and the bottom 100 ranks (could be more than 100 obs if there are ties)
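
The two definitions above could be sketched roughly as follows. This is only an illustration under assumptions: the input dataset work.have, the numeric variable x, and all output dataset names are hypothetical and do not come from the original post. The cutoff of 100 is taken from the question.

```sas
/* Hypothetical names: work.have is the input dataset, x is the variable. */

/* Definition 1: order by x, then take 100 rows from each end.    */
/* Missing values sort lowest in SAS, so they are excluded first. */
proc sort data=work.have(where=(not missing(x))) out=work.sorted;
    by x;
run;

data work.bottom_rows work.top_rows;
    set work.sorted nobs=n;   /* n = number of rows in work.sorted */
    if _n_ <= 100 then output work.bottom_rows;
    if _n_ > n - 100 then output work.top_rows;
run;

/* Definition 2: rank x in both directions and keep ranks 1 to 100. */
/* TIES=LOW gives tied values the same (lowest) rank, so a tie at   */
/* the cutoff can return more than 100 rows.                        */
proc rank data=work.have out=work.rank_asc ties=low;
    var x;
    ranks x_rank_asc;
run;

proc rank data=work.rank_asc out=work.ranked ties=low descending;
    var x;
    ranks x_rank_desc;
run;

data work.bottom_ranks work.top_ranks;
    set work.ranked;
    /* A missing x gets a missing rank, which the lower bound excludes. */
    if 1 <= x_rank_asc <= 100 then output work.bottom_ranks;
    if 1 <= x_rank_desc <= 100 then output work.top_ranks;
run;
```

With the first definition each output dataset has exactly 100 rows (given at least 100 non-missing values), but a value tied across the cutoff is split arbitrarily. With the second, every row sharing a rank inside the cutoff is kept. To get one pair of datasets per variable, the same steps would be repeated with a different variable in place of x.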
