Programming the statistical procedures from SAS

How to rank a variable in a data step without using PROC RANK?

Hey guys,

How can I rank a variable without using PROC RANK? I would like data step logic for ranking a variable. For example, suppose I have a dataset that has studentid and Grades (in percentiles). It is challenging to rank students when their grades are percentiles, because many students fall into the same percentile. Should we bucket or group them into categories, for example all students falling into the 75th percentile, or use something to work around the ties in ranks?

I prefer a data step solution; I think I know PROC RANK well enough. Thanks

Good day,

Mark


All Replies

Re: How to rank a variable in a data step without using PROC RANK?

I think a small example of input and desired output would help in answering your question.


Re: How to rank a variable in a data step without using PROC RANK?

Here's one variation, where tied scores get the same rank:

proc sort data=have;
   by descending score;
run;

data want;
   set have;
   by descending score;
   /* the first row of each distinct score takes its position as the rank; tied rows keep it */
   if first.score then rank=_n_;
   retain rank;
run;

If you are looking to assign buckets, rather than individual ranks, that can be done too but it's a little more complex.  It requires using the total number of observations in the rank formula:

   set have nobs=_nobs_;
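A minimal sketch of that bucketing idea, assuming four equal-sized buckets and the same `have` dataset and `score` variable as above (the names `sorted`, `buckets`, and `bucket` are made up for illustration). Note that with this simple formula tied scores can land in different buckets when they straddle a boundary:

```sas
/* Sort so that the highest score comes first */
proc sort data=have out=sorted;
   by descending score;
run;

data buckets;
   /* nobs= puts the total row count in _nobs_; it is not written to the output */
   set sorted nobs=_nobs_;
   /* _n_ is the row's position in the sorted data (1 = highest score);
      scale it onto buckets 1..4, with bucket 1 holding the top quarter */
   bucket = ceil(_n_ * 4 / _nobs_);
run;
```

With 8 rows, positions 1-2 would fall in bucket 1, 3-4 in bucket 2, and so on. Change the 4 to get a different number of buckets.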

So if this isn't exactly what you need, just clarify.  Good luck.


Re: How to rank a variable in a data step without using PROC RANK?

Astounding,

Thank you for the response. I guess that would rank all students; for example, if there are 300 students there would be 300 ranks.

In my case, I have several grades for each student, and many students. I want to rank the grades obtained by each student. The purpose is to know whether the student has shown any improvement. So again, for example, if one studentid has 300 grades, there would have to be 300 ranks for that studentid, and the same applies to all other studentids. So basically I am trying to rank the grades per studentid.

The input dataset I have is:

Studentid        Date     Grades

and the desired output should look like:

Studentid   Date       Grades   Ranked_grades
student1    mm/dd/yy   53       1
student1    mm/dd/yy   50       2
student1    -          40       3
student1    -          40       3
student2               69       4
student2               70       3
student2               75       1
student2               75       1

/* Ranked_grades: notice the ties in the ranks, at rank 3 for student1 and at rank 1 for student2. This is the output I have managed to produce so far. I want to remove the ties: where a 3 follows a previous 3, I want it to be 4, and likewise wherever ties occur. Thanks */


Re: How to rank a variable in a data step without using PROC RANK?

Are you after something simple like the code below?

data have;
  input StudentID $ Grade;
  retain date 18000;
  date+1;
  format date date9.;
  datalines;
1 53
1 50
1 40
1 40
2 69
2 70
2 75
2 75
;
run;

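/* Within each student: highest grade first, ties ordered by date */
proc sort data=have;
  by StudentID descending Grade date;
run;

data want;
  set have;
  by StudentID;
  /* restart the counter for each student, then number the rows 1, 2, 3, ... */
  if first.StudentID then Ranked_Grade=0;
  Ranked_Grade+1;
run;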
 


Re: How to rank a variable in a data step without using PROC RANK?

You need to retain two variables: ORDER, which increases by one for each record within a student, and RANK, which is set to ORDER whenever a new value of grade is reached.

proc sort data=grades;
  by studentid descending grade;
run;

data want;
  set grades;
  by studentid descending grade;
  /* ORDER counts the records within each student */
  if first.studentid then order=0;
  order+1;
  /* RANK changes only when the grade changes, so tied grades share a rank */
  if first.grade then rank=order;
  retain order rank;
run;

Solution
‎10-17-2012 11:10 PM

Re: How to rank a variable in a data step without using PROC RANK?

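Similarly,

data want;
  set have;
  /* have must already be sorted by studentid and descending grade */
  by studentid descending grade;
  /* order restarts at 1 for each student */
  order+1-order*first.studentid;
  /* rank goes up by 1 at each new grade and restarts at 1 for each student;
     first.grade is used in place of the lag() comparison so that a grade equal
     to the previous student's last grade still starts at rank 1 */
  rank+first.grade-rank*first.studentid;
run;

Haikuo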
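To compare the approaches in this thread side by side, here is a sketch that computes three kinds of rank in one data step, using the sample grades posted above. The names `sorted`, `compare`, `seq_rank`, `comp_rank`, and `dense_rank` are made up for illustration, and which variant is "right" depends on how ties should be treated:

```sas
/* Sample data from the thread: grades per student */
data have;
   input StudentID $ Grade;
   datalines;
1 53
1 50
1 40
1 40
2 69
2 70
2 75
2 75
;
run;

/* Highest grade first within each student */
proc sort data=have out=sorted;
   by StudentID descending Grade;
run;

data compare;
   set sorted;
   by StudentID descending Grade;
   /* restart the counters for each student */
   if first.StudentID then do;
      seq_rank   = 0;
      dense_rank = 0;
   end;
   /* sequential: every row gets its own rank, ties broken by sort order */
   seq_rank + 1;
   /* competition: tied grades share the position of the first tied row */
   if first.Grade then comp_rank = seq_rank;
   /* dense: the rank goes up by 1 only when the grade changes */
   dense_rank + first.Grade;
   retain comp_rank;
run;

proc print data=compare;
run;
```

For student 2 (grades 75, 75, 70, 69) this should give seq_rank 1, 2, 3, 4; comp_rank 1, 1, 3, 4; and dense_rank 1, 1, 2, 3. The sequential variant is the one that removes ties, as asked for in the question.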
