Hey guys,
How to rank a variable without touching proc rank. I would like datastep logic in ranking a variable, For example if i have a dateset that has studentid, Grades( in percentile).it is kinda challenging to rank them when the students grades are in percentiles for the reason many students will fall into a percentile Should we bucket or group into category like for example students falling into 75th percentile? or use something to work around the TIES in ranks.
I prefer the solution in datastep. I know proc rank enough i guess. Thanks
Good day,
Mark
Similarly,
data want;
set have;
by studentid descending grade ;
order+1-order*first.studentid;
rank+ifn(grade=lag(grade),0,1)-rank*first.studentid;
run;
Haikuo
I think a small example of input and desired output might be helpful to answering your question.
Here's one variation, where tied scores get the same rank:
proc sort data=have;
by descending score;
run;
data want;
set have;
by descending score;
if first.score then rank=_n_;
retain rank;
run;
If you are looking to assign buckets, rather than individual ranks, that can be done too but it's a little more complex. It requires using the total number of observations in the rank formula:
set have nobs=_nobs_;
So if this isn't exactly what you need, just clarify. Good luck.
Astounding,
Thank you for the response. I guess that would rank all student, for example if there are 300 students there would 300 ranks.
In my case, I have several grades for one student and for many students. I want to rank the grades obtained by each student. The purpose is know how well the student(he/she) has shown any improvment. So again for example, if one student id has 300 grades there would have to be 300 grades for each studentid and the same applies to all other student ids. So basically I am trying to rank the grades per studentid.
The input dataset i have is:
Studentid Date Grades
and The desired output shoud look like
Studentid Date Grades Ranked_grades
student1 mm/dd//yy 53 1
student1 mm/dd/yy 50 2
student1 - 40 3
student1 - 40 3
student2 69 4
student2 70 3
student2 75 1
student2 75 1
/*Ranked_grades- You would notice ties in the ranks for student1 in 3 rd rank and for student2 in first rank. So far this how my output data i managed. I want to remove the ties so where 3 follows previous 3, i want that as 4 and likewise for all scenarios where ties apply. Thanks*/
Are you after something simple like below:
data have;
input StudentID $ Grade;
retain date 18000;
date+1;
format date date9.;
datalines;
1 53
1 50
1 40
1 40
2 69
2 70
2 75
2 75
;
run;
proc sort data=have;
by StudentID descending Grade date;
run;
data want;
set have;
by StudentID;
if first.StudentID then Ranked_Grade=0;
Ranked_Grade+1;
run;
You need to retain two variables. An ORDER variable that increased by one for each record within a student. And RANK which is set to ORDER when a new value of grade is hit.
proc sort ;
by studentid descending grade ;
run;
data want;
set grades;
by studentid descending grade ;
if first.studentid then order=0;
order+1;
if first.grade then rank=order;
retain order rank;
run;
Similarly,
data want;
set have;
by studentid descending grade ;
order+1-order*first.studentid;
rank+ifn(grade=lag(grade),0,1)-rank*first.studentid;
run;
Haikuo
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.