Hello All,
How can I find the 3 most frequently occurring values in my dataset using SAS?
Regards,
Aleksandra
If you don't need a dataset, Proc Freq with the order=freq option will generate a table with the most frequent values at the top and provide the counts at the same time. It also has the advantage that you can easily add multiple variables, or even all variables in a data set, in one pass through the data, though that can create a lot of output.
Proc freq data=have order=freq;
tables variable;
run;
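As a rough sketch of the multi-variable case mentioned above: the TABLES statement accepts a list of variables or the special name _ALL_. The data set name `have` and the variable names `var1` and `var2` are placeholders, not names from the original question.

```sas
/* Several variables in one pass through the data; each variable gets */
/* its own table, with the most frequent value listed first.          */
/* have, var1 and var2 are placeholder names.                         */
proc freq data=have order=freq;
  tables var1 var2;
run;

/* Every variable in the data set; this can produce a lot of output. */
proc freq data=have order=freq;
  tables _all_;
run;
```

The top 3 values are then simply the first three rows of each printed table.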
Is this for one variable or more? You could probably use the count() function in proc sql to get the counts of your values, then sort them.
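A minimal sketch of that proc sql idea, assuming a single variable. The names `have`, `variable` and `n_obs` are placeholders. Note that OUTOBS=3 simply stops after three rows, so values tied with the third most frequent one are cut off, and SAS writes a warning to the log saying the statement terminated early because of the OUTOBS option.

```sas
/* Count each distinct value, sort by descending count, print the first 3 rows. */
/* have, variable and n_obs are placeholder names.                              */
proc sql outobs=3;
  select variable,
         count(*) as n_obs
    from have
    group by variable
    order by n_obs desc;
quit;
```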
Assuming your data are in mydat and the variable whose most frequent values you want is myvar, you could use:
/* get frequency of myvar in your data set */
proc freq data=mydat;
tables myvar / noprint out=tmp (keep=myvar count);
run;
/* sort in descending order by frequency */
proc sort data=tmp;
by descending count;
run;
/* get the frequency of the third most frequent item; this is needed because multiple values of myvar might be tied at that frequency. Note that
this code will also work if there are fewer than 3 distinct values of myvar in your data set. */
data _NULL_;
set tmp;
if _n_ <= 3 then call symputx('maxcount', count); /* symputx avoids the numeric-to-character conversion note and strips blanks */
run;
/* only keep observations with a frequency of maxcount or higher */
data tmp;
set tmp (where=(count >= &maxcount));
run;
proc print data=tmp;
var myvar;
run;
Beate
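For comparison, a sketch of the same tie-aware selection using PROC RANK instead of the macro variable. This is a different technique from the code above, not a restatement of it. The names `ranked` and `count_rank` are made up for illustration; `mydat`, `myvar` and `tmp` follow the code above. With DESCENDING and TIES=LOW, tied counts share the smallest rank of their group, so keeping ranks of 3 or less should retain every value tied with the third most frequent one.

```sas
/* Frequency of each value of myvar, as in the code above */
proc freq data=mydat noprint;
  tables myvar / out=tmp (keep=myvar count);
run;

/* Rank the counts from largest to smallest; tied counts all get */
/* the lowest (best) rank of their group.                        */
proc rank data=tmp out=ranked descending ties=low;
  var count;
  ranks count_rank;   /* count_rank is a made-up variable name */
run;

/* Keep everything ranked 3 or better, ties included */
proc print data=ranked;
  where count_rank <= 3;
  var myvar count;
run;
```

The rows print in the original order of tmp (sorted by myvar); add a PROC SORT by count_rank if you want them ordered by frequency.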
And you can use a variation of 's answer to create a bar chart of the top k categories.
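A possible sketch of such a bar chart, assuming the Proc Freq approach with order=freq is the answer being referred to. The macro variable `k` and the data set name `freqout` are made up for illustration; `have` and `variable` are placeholders as before. The OUT= data set follows the ORDER=FREQ ordering, so the first k rows are the k most frequent categories; ties at the cutoff are not handled.

```sas
%let k = 3;   /* number of categories to show (illustrative) */

/* Write the frequencies to a data set, most frequent first */
proc freq data=have order=freq noprint;
  tables variable / out=freqout;
run;

/* Plot only the first k rows, keeping the data set order on the axis */
proc sgplot data=freqout(obs=&k);
  vbar variable / response=count;
  xaxis discreteorder=data;
  yaxis label="Count";
run;
```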