BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MarkWik
Quartz | Level 8

Hey guys,

How to rank a variable without touching proc rank. I would like datastep logic in ranking a variable, For example if i have a dateset that has studentid, Grades( in percentile).it is kinda challenging to rank them when the students grades are in percentiles for the reason many students will fall into a percentile Should we bucket or group into category like for example students falling into 75th percentile? or use something to work around the TIES in ranks.

I prefer the solution in datastep. I know proc rank enough i guess. Thanks

Good day,

Mark

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Similarly,

data want;

set have;

  by studentid descending grade ;

order+1-order*first.studentid;

rank+ifn(grade=lag(grade),0,1)-rank*first.studentid;

run;

Haikuo

View solution in original post

6 REPLIES 6
ballardw
Super User

I think a small example of input and desired output might be helpful to answering your question.

Astounding
PROC Star

Here's one variation, where tied scores get the same rank:

proc sort data=have;

   by descending score;

run;

data want;

   set have;

   by descending score;

   if first.score then rank=_n_;

   retain rank;

run;

If you are looking to assign buckets, rather than individual ranks, that can be done too but it's a little more complex.  It requires using the total number of observations in the rank formula:

   set have nobs=_nobs_;

So if this isn't exactly what you need, just clarify.  Good luck.

MarkWik
Quartz | Level 8

Astounding,

Thank you for the response. I guess that would rank all student, for example if there are 300 students there would 300 ranks.

In my case, I have several grades for one student and for many students. I want to rank the grades obtained by each student. The purpose is know how well the student(he/she) has shown any improvment. So again for example,  if one student id has 300 grades there would have to be 300 grades for each studentid and the same applies to all other student ids. So basically I am trying to rank the grades per studentid.

The input dataset i have is:

Studentid        Date     Grades

and The desired output shoud look like

Studentid           Date                         Grades         Ranked_grades
student1         mm/dd//yy                     53                      1

student1         mm/dd/yy                      50                      2

student1              -                               40                      3

student1              -                               40                      3

student2                                              69                      4

student2                                              70                      3

student2                                              75                      1

student2                                              75                      1

/*Ranked_grades- You would notice ties in the ranks for student1 in 3 rd rank and for student2 in first rank. So far this how my output data i managed. I want to remove the ties so where 3 follows previous 3, i want that as 4 and likewise for all scenarios where ties apply. Thanks*/

Patrick
Opal | Level 21

Are you after something simple like below:

data have;
  input StudentID $ Grade;
  retain date 18000;
  date+1;
  format date date9.;
  datalines;
1 53
1 50
1 40
1 40
2 69
2 70
2 75
2 75
;
run;

proc sort data=have;
  by StudentID descending Grade date;
run;

data want;
  set have;
  by StudentID;
  if first.StudentID then Ranked_Grade=0;
  Ranked_Grade+1;
run;
 

Tom
Super User Tom
Super User

You need to retain two variables. An ORDER variable that increased by one for each record within a student. And RANK which is set to ORDER when a new value of grade is hit.

proc sort ;

  by studentid descending grade ;

run;

data want;

  set grades;

  by studentid descending grade ;

  if first.studentid then order=0;

  order+1;

  if first.grade then rank=order;

  retain order rank;

run;

Haikuo
Onyx | Level 15

Similarly,

data want;

set have;

  by studentid descending grade ;

order+1-order*first.studentid;

rank+ifn(grade=lag(grade),0,1)-rank*first.studentid;

run;

Haikuo

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 15275 views
  • 8 likes
  • 6 in conversation