07-20-2016 10:38 PM - edited 07-20-2016 10:42 PM
Hello! Basic SAS 9.4 user here. My dependent variable is disease (0=absent, 1=present) & my independent variable (vitd) is not normally distributed. I'm trying to use the geometric mean (GM) to describe "vitd". However, I don't have experience with using GM. I would appreciate help on any of these:
1) If I'm describing my data using GM, I should report standard error (SE) and not coefficient of variation, correct?
2) For the t test, I used the following code to get the GM and p-values; however, I only get the coefficient of variation and not the SE. How can I get the SE while using proc ttest? I don't know how to use proc surveymeans to stratify my data by class, so I'm using the following:
proc ttest data=work.run1 alpha=0.05 dist=lognormal; class disease; var vitd; run;
3) While using proc surveymeans, how can I stratify my data so that I get GM and SE for "vitd" for each class of "disease"?
4) Lastly, I'm used to comparing based on the mean and sds (paired with p) to explain clinical significance. How would I effectively interpret the GM between those with disease versus those without disease?
Thanks in advance!!
07-22-2016 08:58 AM
Do you have any real or fake data that you can share? It's hard to know what you are seeing with regard to SE and CV. There is a lognormal example in the doc, but it is for paired data. If you can't post the data, can you show the tables?
Just to clarify, if m is the mean of the log-transformed data, then exp(m) is the geometric mean of the data.
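A minimal sketch of that relationship, assuming the data set work.run1 and the variables vitd and disease from the original post. The names run1_log, ln_vitd, logstats, mean_ln, gm_by_class, and gm are hypothetical and introduced only for illustration:

```sas
/* Log-transform vitd; non-positive values have no log and stay missing */
data work.run1_log;
    set work.run1;
    if vitd > 0 then ln_vitd = log(vitd);
run;

/* Mean of the log-transformed values within each disease class */
proc means data=work.run1_log noprint nway;
    class disease;
    var ln_vitd;
    output out=work.logstats mean=mean_ln;
run;

/* exp(mean of the logs) is the geometric mean */
data work.gm_by_class;
    set work.logstats;
    gm = exp(mean_ln);
run;

proc print data=work.gm_by_class noobs;
    var disease gm;
run;
```

The gm values printed here should agree with the geometric means that PROC TTEST reports under DIST=LOGNORMAL, provided the same observations are used.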
When you use DIST=LOGNORMAL, you are testing whether the ratio mu1/mu2 is significantly different from 1. The doc says:
The DIST=LOGNORMAL analysis is handled by log-transforming the data and null value, performing a DIST=NORMAL analysis, and then transforming the results back to the original scale. See the section Normal Data (DIST=NORMAL) for the one-sample design for details on how the DIST=NORMAL computations for means and standard deviations are transformed into the DIST=LOGNORMAL results for geometric means and CVs. As mentioned in the section Coefficient of Variation, the assumption of equal CVs on the lognormal scale is analogous to the assumption of equal variances on the normal scale.
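For reference, the back-transformation described in that passage can be written out as follows. This is a sketch using the standard lognormal relationships; the symbols x_i, n, ȳ, and s are notation introduced here, with ȳ and s² being the sample mean and variance of the log-transformed values:

```latex
\[
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} \ln x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(\ln x_i - \bar{y}\right)^2,
\]
\[
\mathrm{GM} = e^{\bar{y}}, \qquad
\mathrm{CV} = \sqrt{e^{s^2} - 1}.
\]
```

This is why the lognormal output shows a CV where the normal output shows a standard deviation: the CV is the back-transformed measure of spread, and it depends only on s, not on the mean.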
07-25-2016 10:08 AM - edited 07-25-2016 09:59 PM
Thanks for the response, it was really helpful! Based on what you've told me and what I've read elsewhere, I'm planning on reporting the geometric mean (gm) and the standard deviation (sd) for gm. I derived these two by exponentiating the mean and sd of my log-transformed variable, "ln_vit." Essentially, exponentiating both the mean & sd of "ln_vit" gives the gm & sd (of gm) for "vit":
proc means data=work.run1; var vit ln_vit; class disease; run;
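A note on the back-transformed quantities, with a hedged sketch: exp(sd of "ln_vit") is usually called the geometric standard deviation. It is a multiplicative factor rather than an sd in the units of "vit", so it is typically reported as "GM (GSD)" or replaced by a back-transformed confidence interval. The sketch below assumes ln_vit already exists in work.run1, as in the PROC MEANS call above. The names ln_stats, gm_stats, and the output variables are hypothetical, and the standard error of the GM uses the delta-method approximation GM × SE(mean of logs), which is one common choice rather than the only one:

```sas
/* Log-scale summary statistics for each disease class */
proc means data=work.run1 noprint nway;
    class disease;
    var ln_vit;
    output out=work.ln_stats
        n=n mean=mean_ln std=sd_ln stderr=se_ln lclm=lcl_ln uclm=ucl_ln;
run;

/* Back-transform to the original scale */
data work.gm_stats;
    set work.ln_stats;
    gm     = exp(mean_ln);  /* geometric mean */
    gsd    = exp(sd_ln);    /* geometric SD: a multiplicative factor */
    gm_lcl = exp(lcl_ln);   /* lower 95% confidence limit for the GM */
    gm_ucl = exp(ucl_ln);   /* upper 95% confidence limit for the GM */
    gm_se  = gm * se_ln;    /* delta-method approximation to SE of the GM */
run;

proc print data=work.gm_stats noobs;
    var disease n gm gsd gm_lcl gm_ucl gm_se;
run;
```

The confidence limits are asymmetric around the GM on the original scale, which is expected for a back-transformed interval.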
My last question I would appreciate help with is doing ANOVA for my skewed variable, "vit," by disease severity (i.e. three categories - mild, moderate, & severe). My guess is that I do ANOVA based on "ln_vit" (the log transformed values for "vit"). Correct me if I'm wrong - the p value I get for this ANOVA would also be the p value if I compared gm of vit, by disease severity?
PROC GLM DATA=work.run1 PLOTS=none; CLASS disease_severity; MODEL ln_vit=disease_severity; MEANS disease_severity; RUN;
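One way to see how the ANOVA on "ln_vit" relates to the geometric means, sketched with notation introduced here rather than taken from the thread: let μ_k be the population mean of "ln_vit" in severity group k and ȳ_k the corresponding sample mean. Because the exponential function is strictly increasing, equality of the log-scale means is the same hypothesis as equality of the geometric means, and a difference in log-scale means back-transforms to a ratio of geometric means:

```latex
\[
H_0:\ \mu_{\text{mild}} = \mu_{\text{moderate}} = \mu_{\text{severe}}
\iff
e^{\mu_{\text{mild}}} = e^{\mu_{\text{moderate}}} = e^{\mu_{\text{severe}}},
\]
\[
\frac{\mathrm{GM}_a}{\mathrm{GM}_b} = e^{\bar{y}_a - \bar{y}_b}.
\]
```

Under the usual ANOVA assumptions on the log scale, the F-test p-value can therefore be read as a test of equal geometric means. For interpretation, a ratio of 1.20 between two groups means the geometric mean of "vit" is about 20% higher in group a than in group b.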
Side note - Here are the results you suggested I post, regarding my initial questions: