I want to identify the three highest values in a column of total grades and create a new variable containing "Top one", "Second" and "Third" for those rows only, then print it. I tried the following code, but it put "." in both the Total_Grades and Top_three variables.
data Grades_Addeed_top3;
  set work.Grades;
  if Total_Grades = max then Top_three=1;
  else if (Total_Grades = (max-1)) then Top_three=2;
  else if (Total_Grades = (max-2)) then Top_three=3;
  else Top_three=.;
run;
proc print data=proc print data=work.diving;
run;
Any advice will be greatly appreciated.
M
PROC UNIVARIATE would give you the max; use an ODS OUTPUT statement to write the value to a dataset. To get the three largest values of a column (i.e., a variable), use PROC RANK: http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#rank-overview.htm
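A minimal sketch of the PROC RANK route, assuming the input is work.Grades with a numeric Total_Grades variable as in the question. The names Grades_ranked, Grades_top3 and grade_rank are made up for illustration. DESCENDING makes rank 1 the largest value, and TIES=DENSE is one possible choice for ties (tied grades share a rank and no rank is skipped); pick a different TIES= setting if you want ties handled another way.

```sas
/* Rank grades from highest to lowest; tied grades share the same rank */
proc rank data=work.Grades out=work.Grades_ranked descending ties=dense;
    var Total_Grades;
    ranks grade_rank;          /* hypothetical name for the rank variable */
run;

/* Turn ranks 1-3 into labels; everything else stays blank */
data work.Grades_top3;
    set work.Grades_ranked;
    length Top_three $ 8;
    select (grade_rank);
        when (1) Top_three = 'Top one';
        when (2) Top_three = 'Second';
        when (3) Top_three = 'Third';
        otherwise Top_three = ' ';
    end;
run;

proc print data=work.Grades_top3;
run;
```

With TIES=DENSE, more than three observations can end up labelled if grades are tied.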
Hi @mrahouma ,
Maybe try this step-by-step approach?
All the best
Bart
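/* Create sample data: 10 observations with random grades */
data work.Grades;
  call streaminit(11);
  do _N_ = 1 to 10;
    Total_Grades = int(rand('uniform')*100);
    some_other_variable = rand('uniform');
    output;
  end;
run;
proc print;
run;
/* Keep only the grade and store each observation's original position */
data GradesV;
  set Grades(keep=Total_Grades) curobs=co; /* to keep original order */
  curobs=co;
run;
/* Sort so that the highest grades come first */
proc sort data = GradesV out = GradesV;
  by descending Total_Grades;
run;
/* Number the distinct grade values; only the first three get a Top_three value */
data GradesV;
  set GradesV;
  by descending Total_Grades;
  if first.Total_Grades then _TEMP_ + 1;
  drop _TEMP_;
  if _TEMP_ < 4 then Top_three = _TEMP_;
run;
/* Restore the original order */
proc sort data = GradesV out = GradesV(drop = curobs);
  by curobs;
run;
/* Combine the original data with Top_three, observation by observation */
data Total_Grades / view = Total_Grades;
  set Grades ;
  set GradesV;
run;
proc print data=Total_Grades;
run;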
There's a PROC for that 🙂
Try PROC RANK, which will create the ranks for the variable. If you want just the top 3, filter on the rank; however, you should consider how you'll handle ties. By default, tied values get the average of their ranks, so that can be a slightly harder problem. If you need to account for that, filter on groups of distinct ranks instead, which I've done at the bottom of the program. Adapt as needed.
*Create ranks for variables value;
*DESCENDING makes rank 1 the largest value (the default is ascending);
proc rank data=sashelp.class out=class_ranked descending
  /* (where= (rank_weight < 4)) */
  ;
  var weight;
  ranks rank_weight;
run;
*display output;
title 'Unsorted but ranked';
proc print data=class_ranked;
run;
*sort for display;
proc sort data=class_ranked;
  by rank_weight;
run;
title 'Sorted and ranked';
proc print data=class_ranked;
run;
*filter out the top 3 regardless of the number of ties;
data final;
  set class_ranked;
  by rank_weight;
  if first.rank_weight then group + 1;
  if group <= 3;
run;
Hello @mrahouma and welcome to the SAS Support Communities!
PROC SUMMARY allows for another fairly short solution, provided that your Grades dataset contains a variable which uniquely identifies an observation, such as NAME in SASHELP.CLASS:
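/* Store the names of the three observations with the largest weight in __1-__3 */
proc summary data=sashelp.class;
  output out=top3 idgrp(max(weight) out[3] (name)=_);
run;
data want(drop=_:);
  if _n_=1 then set top3;
  set sashelp.class;
  /* Position of this name among the top three (0 if not found) */
  _w=whichc(name, of __:);
  Top_three=ifn(_w,_w,.);
run;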
Edit: Admittedly, with PROC RANK (as suggested by @Reeza) it could still be shorter. The DESCENDING option of the PROC RANK statement would help in selecting the largest values. The two procedures differ in their options for handling tied values. For example, in PROC SUMMARY you could specify one or more variables to serve as "tie-breakers."
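To illustrate the tie-breaker idea, here is a rough sketch on SASHELP.CLASS. Using HEIGHT as the tie-breaker is purely an assumption for the example, and the names top3_tb, top_name, top_weight and top_height are hypothetical. Listing a second variable inside MAX( ) means that observations with equal WEIGHT are ordered by HEIGHT.

```sas
/* Top three by weight; ties in weight are broken by height */
proc summary data=sashelp.class;
    output out=top3_tb
        idgrp(max(weight height) out[3]
              (name weight height)=top_name top_weight top_height);
run;

/* One row holding top_name_1-3, top_weight_1-3 and top_height_1-3 */
proc print data=top3_tb;
run;
```

The result is a single observation, so it still needs to be matched back to the detail data (for example with WHICHC as above) if you want a Top_three flag on every row.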