Hey guys,
How to rank a variable without touching proc rank. I would like datastep logic in ranking a variable, For example if i have a dateset that has studentid, Grades( in percentile).it is kinda challenging to rank them when the students grades are in percentiles for the reason many students will fall into a percentile Should we bucket or group into category like for example students falling into 75th percentile? or use something to work around the TIES in ranks.
I prefer the solution in datastep. I know proc rank enough i guess. Thanks
Good day,
Mark
Similarly,
data want;
set have;
by studentid descending grade ;
order+1-order*first.studentid;
rank+ifn(grade=lag(grade),0,1)-rank*first.studentid;
run;
Haikuo
I think a small example of input and desired output might be helpful to answering your question.
Here's one variation, where tied scores get the same rank:
proc sort data=have;
by descending score;
run;
data want;
set have;
by descending score;
if first.score then rank=_n_;
retain rank;
run;
If you are looking to assign buckets, rather than individual ranks, that can be done too but it's a little more complex. It requires using the total number of observations in the rank formula:
set have nobs=_nobs_;
So if this isn't exactly what you need, just clarify. Good luck.
Astounding,
Thank you for the response. I guess that would rank all student, for example if there are 300 students there would 300 ranks.
In my case, I have several grades for one student and for many students. I want to rank the grades obtained by each student. The purpose is know how well the student(he/she) has shown any improvment. So again for example, if one student id has 300 grades there would have to be 300 grades for each studentid and the same applies to all other student ids. So basically I am trying to rank the grades per studentid.
The input dataset i have is:
Studentid Date Grades
and The desired output shoud look like
Studentid Date Grades Ranked_grades
student1 mm/dd//yy 53 1
student1 mm/dd/yy 50 2
student1 - 40 3
student1 - 40 3
student2 69 4
student2 70 3
student2 75 1
student2 75 1
/*Ranked_grades- You would notice ties in the ranks for student1 in 3 rd rank and for student2 in first rank. So far this how my output data i managed. I want to remove the ties so where 3 follows previous 3, i want that as 4 and likewise for all scenarios where ties apply. Thanks*/
Are you after something simple like below:
data have;
input StudentID $ Grade;
retain date 18000;
date+1;
format date date9.;
datalines;
1 53
1 50
1 40
1 40
2 69
2 70
2 75
2 75
;
run;
proc sort data=have;
by StudentID descending Grade date;
run;
data want;
set have;
by StudentID;
if first.StudentID then Ranked_Grade=0;
Ranked_Grade+1;
run;
You need to retain two variables. An ORDER variable that increased by one for each record within a student. And RANK which is set to ORDER when a new value of grade is hit.
proc sort ;
by studentid descending grade ;
run;
data want;
set grades;
by studentid descending grade ;
if first.studentid then order=0;
order+1;
if first.grade then rank=order;
retain order rank;
run;
Similarly,
data want;
set have;
by studentid descending grade ;
order+1-order*first.studentid;
rank+ifn(grade=lag(grade),0,1)-rank*first.studentid;
run;
Haikuo
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.