Hi Guys,
I am trying to obtain the median value using median() function in PROC SQL as follows (SAS9.4):
/* select clause with additional columns: Name Height*/
proc sql;
select Name,Sex,Height,median(Height) as Med_Height,mean(Height) as Mean_Height
from sashelp.class
group by sex
having sex="M";
quit;The results are as follows:
However, when removing columns Name, Height in select clause like this:
/* select clause without additional column */
proc sql;
select Sex, median(Height) as Med_Height, mean(Height) as Mean_Height
from sashelp.class
group by sex
having sex="M";
quit;The results are as follows:
Comparing the two tables, the mean value of Height is still 63.91 while the median of Height changes from 64.3 to 64.15 (of course the 64.15 is the right value).
Where did the first median value 64.3 come from? How did this difference (64.3 v.s. 64.15) happen? After all the mean value remains the same. I feel confused.
Can anyone explain this? Thanks a lot guys. Appreciate it.
What SAS version. Mine is 9.4 (TS1M8) and I do not see difference.
Update: Problem Note 66896: Inconsistent results are generated with the PROC SQL Median summary function
proc sql;
select Name,Sex,Height,median(Height) as Med_Height,mean(Height) as Mean_Height
from sashelp.class
group by sex
having sex="M";
quit;
proc means data=sashelp.class n median mean;
class sex;
var height;
run;
proc sql;
select Sex, median(Height) as Med_Height, mean(Height) as Mean_Height
from sashelp.class
group by sex
having sex="M";
quit;
What SAS version. Mine is 9.4 (TS1M8) and I do not see difference.
Update: Problem Note 66896: Inconsistent results are generated with the PROC SQL Median summary function
proc sql;
select Name,Sex,Height,median(Height) as Med_Height,mean(Height) as Mean_Height
from sashelp.class
group by sex
having sex="M";
quit;
proc means data=sashelp.class n median mean;
class sex;
var height;
run;
proc sql;
select Sex, median(Height) as Med_Height, mean(Height) as Mean_Height
from sashelp.class
group by sex
having sex="M";
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.