Kastchei
Pyrite | Level 9
I am trying to compute the coefficient of variation (CV). PROC MEANS calculates it simply as the standard deviation divided by the arithmetic mean, and I can verify that calculation by outputting the std, mean, and cv.

However, I have also been told that PROC TTEST calculates the CV when the option dist=lognormal is used. In the past I have used this option only to get the geometric mean, so I was hesitant to rely on it for the CV. The documentation does show that a CV is calculated. The syntax documentation for TTEST does not say how the CV is calculated, but one example states that the CV displayed with dist=lognormal is in fact the same ratio: "The CV of 0.1676 is the ratio of the standard deviation to the (arithmetic) mean" (http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#/documentation/cdl/en/statug/63033/HTML/default/statug_ttest_sect013.htm).

My problem is that these two PROCs do not give the same result. As a simple test, I used a dataset containing the numbers 1 through 10. Evidently the dist option affects how the CV is calculated. Can anyone explain the difference? Thanks!

data junk;
input x;
datalines;
1
2
3
4
5
6
7
8
9
10
;
run;

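ods listing;
/* CV as reported by PROC TTEST under the lognormal assumption */
proc ttest data = junk dist = lognormal;
  var x;
run;
/* CV as reported by PROC MEANS: standard deviation over arithmetic mean */
proc means data = junk n mean std cv;
  var x;
run;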

Message was edited by: Kastchei (edited, since I had pasted in code that I had altered in an attempt to figure out the difference)


2 REPLIES
SteveDenham
Jade | Level 19
I did a quick search of the PROC TTEST documentation, and learned something new (as I almost always do when I read the manual). For the lognormal distribution, the formula for the CV is NOT the sample standard deviation divided by the sample mean. It is instead sqrt(exp(variance) - 1), where the variance is that of the log-transformed data.

Given that, it doesn't surprise me so much that the CV reported for a lognormal variable differs from the sample standard deviation divided by the sample mean. See also the documentation for PROC GLIMMIX, under the MODEL statement for the DIST= option, where the estimators for the lognormal distribution are given.
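
As a rough check of the formula above, the two versions of the CV can be computed by hand on the junk dataset from the original post. This is only a sketch: the names junk_log, logx, cv_sample, and cv_lognormal are made up for illustration, and it assumes the variance in the formula is the ordinary sample variance of log(x).

```sas
/* Add the natural log of x, since the lognormal CV is based on the log-scale variance */
data junk_log;
  set junk;
  logx = log(x);
run;

proc sql;
  /* cv_sample:    standard deviation of x divided by the arithmetic mean of x */
  /* cv_lognormal: sqrt(exp(variance of log(x)) - 1)                           */
  select std(x) / mean(x)         as cv_sample,
         sqrt(exp(var(logx)) - 1) as cv_lognormal
  from junk_log;
quit;
```

If the assumption holds, cv_lognormal should line up with the CV shown by PROC TTEST with dist=lognormal, and cv_sample with the ratio from PROC MEANS. Keep in mind that PROC MEANS prints its CV as a percentage, so compare it against cv_sample multiplied by 100.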

Steve Denham
Kastchei
Pyrite | Level 9
Thanks for the information! For the folks at SAS, could I suggest that this definition be placed somewhere easier to find in the documentation? The page covering CV (http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/statug_ttest_a0000000123.htm) has a section for the lognormal distribution, but it does not mention that the CV is calculated differently than for the normal distribution, which tends to lead people to believe it is calculated the same way. Thanks!

