I am trying to produce the coefficient of variation (CV). I see that PROC MEANS calculates this simply as the standard deviation divided by the arithmetic mean, and I can verify that calculation if I output the std, mean, and cv.
However, I have also been told that PROC TTEST calculates the CV when the option dist=lognormal is used. In the past, I have only used this option to get the geometric mean, so I was hesitant to use it for the CV. The documentation does show that the CV is calculated. The syntax documentation for TTEST itself does not say how the CV is calculated, but one of the examples states that the CV displayed with dist=lognormal is in fact the same: "The CV of 0.1676 is the ratio of the standard deviation to the (arithmetic) mean" (http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#/documentation/cdl/en/statug/63033/HTML/default/statug_ttest_sect013.htm).
My problem is that these two PROCs do not give the same result. The simple test I did was on a dataset of the numbers 1-10. Obviously, the dist option is affecting the calculation of the CV somehow. Can anyone explain the difference to me? Thanks!
I did a quick search of the PROC TTEST documentation, and learned something new (as I almost always do when I read the manual). For the lognormal distribution, the formula for the CV is NOT the sample standard deviation divided by the sample mean. It is instead sqrt(exp(variance) - 1), where the variance is that of the log-transformed data.
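To see how far apart the two definitions land on the 1-10 test data from the question, here is a small Python sketch (not SAS code). It assumes that "variance" in the lognormal formula means the sample variance (n - 1 denominator) of the natural logs of the data; the variable names are made up for illustration.

```python
import math
import statistics

# The test data from the question: the integers 1 through 10.
data = list(range(1, 11))

# Ordinary CV: sample standard deviation divided by the arithmetic mean.
cv_arith = statistics.stdev(data) / statistics.mean(data)

# Lognormal-style CV: sqrt(exp(s2) - 1), where s2 is assumed to be the
# sample variance (n - 1 denominator) of the log-transformed values.
log_data = [math.log(x) for x in data]
s2_log = statistics.variance(log_data)
cv_lognormal = math.sqrt(math.exp(s2_log) - 1)

print(f"std/mean CV:  {cv_arith:.4f}")
print(f"lognormal CV: {cv_lognormal:.4f}")
```

On this data the two values differ noticeably, which is consistent with the mismatch described in the question. Note that PROC MEANS reports its CV as a percentage, so compare it with 100 times the first value.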
Given that, it doesn't surprise me so much that the CV of a lognormal variable is not the standard deviation divided by the mean. See also the documentation for PROC GLIMMIX, under the MODEL statement for the DIST= option, where the estimators for the lognormal distribution are given.
Thanks for the information!
For the folks at SAS, could I suggest that this definition be placed in the documentation in an easier place to find? The page covering CV (http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/statug_ttest_a0000000123.htm) has a section for the lognormal distribution, but does not mention that it is calculated differently from the normal distribution, which would tend to lead people to believe it is calculated the same way. Thanks!