Dear Community,

DATA: I have data on the campaign expenses of election candidates (N = 10,400). The amounts spent during the campaign are generally continuous and positive, but a large number of candidates did not spend any money at all. The data are therefore semi-continuous (zero-inflated). They are also strongly right-skewed, as only a few candidates spend very large sums of money.

GOAL: I want to fit a model to test whether female candidates spend more or less than male candidates. Gender (as a dummy) is thus the main independent variable, alongside several other variables (incumbency, age, party...). Because the dependent variable is highly skewed, previous studies on this topic have generally log-transformed it. For candidates with no expenses, a small value (e.g. 0.0001) is added so that the logarithm can be calculated. The transformed variable is then used in a simple OLS regression model.

QUESTION: Although this log-linear approach seems quite common, I doubt whether it is fully correct from a statistical point of view, because the p-values fluctuate strongly depending on the small value that is added for candidates with no expenses. I have read that there are alternative approaches (such as mixed-effects mixed-distribution models or two-part latent growth models), but how can I implement them in SAS to run my model and test my case? I have already tried to use Tooze's MIXCORR macro, but it does not seem to work (and I don't know why).

Any help is highly appreciated! (I am using SAS 9.4 on Windows.)
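The sensitivity to the added constant described above can be illustrated with a small simulation. The sketch below is written in Python rather than SAS and uses made-up data, not the actual campaign data: the zero shares, the lognormal parameters, and the names `simulate` and `dummy_ols` are all assumptions chosen for illustration. It regresses log(spend + c) on a gender dummy alone (no other covariates) for several values of c, and then splits the comparison into the two parts that a two-part model would treat separately.

```python
import math
import random


def dummy_ols(y, d):
    """OLS of y on an intercept and a single 0/1 dummy d.

    Returns (slope, t_statistic). With one dummy regressor the slope equals
    the difference in group means, and the t-statistic is the pooled-variance
    two-sample t.
    """
    g1 = [v for v, k in zip(y, d) if k == 1]
    g0 = [v for v, k in zip(y, d) if k == 0]
    m1 = sum(g1) / len(g1)
    m0 = sum(g0) / len(g0)
    # Residual sum of squares around each group's own mean.
    rss = sum((v - m1) ** 2 for v in g1) + sum((v - m0) ** 2 for v in g0)
    sigma2 = rss / (len(y) - 2)
    se = math.sqrt(sigma2 * (1 / len(g1) + 1 / len(g0)))
    return m1 - m0, (m1 - m0) / se


def simulate(n=10_400, seed=1):
    """Hypothetical data: women have more zeros but spend more when they spend."""
    rng = random.Random(seed)
    female, spend = [], []
    for _ in range(n):
        f = 1 if rng.random() < 0.4 else 0
        p_zero = 0.35 if f else 0.25   # assumed share of non-spenders
        mu = 9.8 if f else 8.0         # assumed mean of log spend among spenders
        amount = 0.0 if rng.random() < p_zero else rng.lognormvariate(mu, 1.5)
        female.append(f)
        spend.append(amount)
    return female, spend


female, spend = simulate()

# Log-transform with different added constants and refit the same model.
results = {}
for c in (1.0, 0.01, 0.0001):
    log_spend = [math.log(s + c) for s in spend]
    results[c] = dummy_ols(log_spend, female)
    slope, t_stat = results[c]
    print(f"c={c:<7} coefficient on female={slope:+.3f}  t={t_stat:+.2f}")

# Two-part view: (1) who spends at all, (2) how much, among spenders only.
spender = [1.0 if s > 0 else 0.0 for s in spend]
part1_gap, _ = dummy_ols(spender, female)
positive = [(math.log(s), f) for s, f in zip(spend, female) if s > 0]
part2_gap, part2_t = dummy_ols([p[0] for p in positive], [p[1] for p in positive])
print(f"difference in share of spenders (female - male): {part1_gap:+.3f}")
print(f"difference in mean log spend among spenders:     {part2_gap:+.3f} (t={part2_t:+.2f})")
```

Each zero is mapped to log(c), so whenever the two groups differ in their share of zeros, the gender coefficient shifts by roughly that difference times log(c). The coefficient therefore depends on an arbitrary choice. The second half of the sketch keeps the two questions apart (whether a candidate spends at all, and how much is spent given any spending), which is the idea behind the two-part models mentioned in the question; it does not depend on c.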