Steve, thanks heaps for the detailed reply. I haven't followed those links yet, because they may not be relevant. What I have done instead is write a simulation (shown below) of a controlled trial in which there are individual responses in the experimental group, and in which the change scores in the control and experimental groups are analyzed with two simple mixed models: the first estimates extra variance in the experimental group, representing the individual responses in that group; the second estimates extra variance in the control group, in which case the estimate of the variance is expected to be negative, equal and opposite to the estimate in the first model.

It all hangs together nicely. Both models give exactly the same value for the mean effect of the treatment, and the random-effect variance is identical apart from the change of sign; the residual variance differs between the models by exactly that amount, as it should, because it represents the control-group variance in the first model and the experimental-group variance in the second. So I conclude there is no problem with allowing negative variance when the data need it, and we should allow negative variance anyway, to get unbiased estimates and compatibility limits. (Note that, if the estimate of the variance is positive, allowing only positive variance makes no difference to the estimate and its standard error, but the compatibility limits provided by SAS are then unrealistic, because they are based on an unrealistic assumption about the sampling distribution of the variance; the assumption of normality when the variance is allowed to be negative produces correct coverage of the interval. Bias would arise when only positive variance is allowed, because negative estimates of the variance--whether they arise from sampling variation or because the experimental group truly has less variance--are set to zero.)

However, there is a bit of a problem with the random-effect solution when the corresponding variance is negative: SAS doesn't want to give it a standard error (and therefore compatibility limits). This problem occurred occasionally, with some but not all values of the random-effect solution, in my generalized mixed model with much more data and more complex fixed- and random-effects models, but it occurs every time, with every value of the solution, in this simulation.

I figured that, because of the way the random-effect solution is added to the other effects to get predicted values, there would have to be a negative correlation between the random-effect solution and the residuals when the variance is negative (the second model): the residual variance is then the variance of the experimental group, so the random-effect solution in the control group has to add to the residuals in that group in a way that reduces them, to give the lower SD in the control group. Yes, the correlation was negative all right, but unexpectedly it was perfect, -1.00. In the first model I didn't expect a correlation, but it was also perfect, this time positive, 1.00. These correlations must arise from the way the variance is partitioned between the residuals and the random effect. I conclude that, at least with these simple individual-responses data and this simple model, the random-effect solution for a negative variance has to be interpreted in the light of the residuals and any other random effects it is competing with.
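(In case it is useful, here is where I think the perfect correlations come from. This is only a sketch of the algebra, assuming the standard formula for the BLUP of a single random intercept per subject and using the model's own estimates of the variances. For a subject $i$ in the group carrying the dummy variable, with random-effect variance $\sigma^2_u$ and residual variance $\sigma^2_e$,

$$\hat{u}_i = \frac{\sigma^2_u}{\sigma^2_u+\sigma^2_e}\,(y_i - \hat{\mu}_{\text{group}}), \qquad \hat{e}_i = \frac{\sigma^2_e}{\sigma^2_u+\sigma^2_e}\,(y_i - \hat{\mu}_{\text{group}}),$$

so the random-effect solution and the residual are both fixed multiples of the same deviation from the group mean, and their correlation across subjects is just the sign of $\sigma^2_u$: +1.00 when the extra variance is estimated in the experimental group (the first model) and -1.00 when it is estimated, negatively, in the control group (the second model).)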
When the true extra variance is in the experimental group but the extra variance is estimated in the control group (the second model), the random-effect solution for the negative variance represents individual responses that reduce the residual variance (which represents the variance of the experimental group) rather than increasing it, as it does in the first model. So the random-effect solution for a negative variance has no immediately useful interpretation, and the lack of compatibility limits makes interpreting the values even more problematic. Not shown is the estimate of the variance representing individual responses derived by squaring the SD of the random-effect solution and multiplying by a degrees-of-freedom correction factor; I did that in a spreadsheet, and the correspondence with the covparm estimate is good, so the mismatch between the two estimates is just a degrees-of-freedom issue (a rough version of that check is sketched after the program below). I am still undecided about whether I should use the random-effect solution as is, or whether I should inflate the values so that their simple SD squared is the same as the covparm variance.

Will

/*
Simulation of a controlled trial, with individual responses in an exptal group.
Analysis of the change scores is done with a general linear mixed model allowing for extra variance, first in the exptal group, then in the control group.
*/
%let SDdelta=1; *SD of change scores in control group (error of measurement is 1/sqrt(2) times this value);
%let MeanResp=5; *mean response to a treatment in exptal group;
%let IndResp=2; *SD of individual responses in exptal group;
%let GrpSampSize=10; *sample size in control and exptal groups;
%let alpha=0.1; *alpha for compatibility limits;
%let seed=2; *set to 0 for a new random sample every time;
data dat1;
do SubjectID=1 to &GrpSampSize;
Group="Control";
IndResp=0;
Y=&SDdelta*rannor(&seed);
xVarCont=1; *dummy variable to estimate extra variance representing individual responses in control group;
xVarExpt=0; *dummy variable to estimate extra variance representing individual responses in exptal group;
output;
end;
do SubjectID=&GrpSampSize+1 to 2*&GrpSampSize;
Group="Exptal";
IndResp=&IndResp*rannor(&seed);
Y=&SDdelta*rannor(&seed)+&MeanResp+IndResp;
xVarCont=0;
xVarExpt=1;
output;
end;
format _numeric_ 5.2;
title "The data"; proc print;
run;
proc means maxdec=2;
class Group;
run;
title1 "Individual responses estimated in Exptal group";
ods select none;
proc mixed data=dat1 covtest cl nobound alpha=&alpha;
class SubjectID Group;
model Y=Group/noint outp=pred ddfm=sat alphap=&alpha;
random xVarExpt/subject=SubjectID s cl alpha=&alpha;
estimate "Treatment effect" Group -1 1/cl alpha=&alpha;
ods output covparms=cov;
ods output solutionr=solr;
ods output estimates=est;
run;
ods select all;
title2 "Treatment mean effect (Expt-Control); expected value = &MeanResp";
proc print data=est;
run;
title2 "Covparms, expected values: xVarExpt = &IndResp**2; Residual = &SDdelta**2";
data cov1;
set cov;
DegFree=2*Zvalue**2; *approximate degrees of freedom of the variance estimate (Zvalue=Estimate/StdErr);
proc print data=cov1;
run;
title2 "Random-effect solution for xVarExpt";
proc print data=solr;
run;
title2 "Predicted and residual values";
proc print data=pred;
run;
title2 "Merge residuals and random-effect solution and get correlation";
data residrand;
merge pred solr(rename=(Alpha=AlphaEst Lower=LowerEst Upper=UpperEst));
by SubjectID;
proc print;
var SubjectID Group IndResp Resid Estimate LowerEst UpperEst AlphaEst;
run;
proc corr;
var Resid;
with Estimate;
by Group;
run;
title1 "Individual responses estimated in Control group";
ods select none;
proc mixed data=dat1 covtest cl nobound alpha=&alpha;
class SubjectID Group;
model Y=Group/noint outp=pred ddfm=sat alphap=&alpha;
random xVarCont/subject=SubjectID s cl alpha=&alpha;
estimate "Treatment effect" Group -1 1/cl alpha=&alpha;
ods output covparms=cov;
ods output solutionr=solr;
ods output estimates=est;
run;
ods select all;
title2 "Treatment mean effect (Expt-Control); expected value = &MeanResp";
proc print data=est;
run;
title2 "Covparms, expected values: xVarCont = -&IndResp**2; Residual = &SDdelta**2+&IndResp**2";
data cov1;
set cov;
DegFree=2*Zvalue**2; *approximate degrees of freedom of the variance estimate (Zvalue=Estimate/StdErr);
proc print data=cov1;
run;
title2 "Random-effect solution for xVarCont";
proc print data=solr;
run;
title2 "Predicted and residual values";
proc print data=pred;
run;
title2 "Merge residuals and random-effect solution and get correlation";
data residrand;
merge pred solr(rename=(Alpha=AlphaEst Lower=LowerEst Upper=UpperEst));
by SubjectID;
proc print;
var SubjectID Group IndResp Resid Estimate LowerEst UpperEst AlphaEst;
run;
proc corr;
var Resid;
with Estimate;
by Group;
run;
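For the record, here is a rough SAS version of the spreadsheet check mentioned above. It is only a sketch: solr and cov1 are the datasets left over from whichever of the two analyses was run last, and no degrees-of-freedom correction factor is applied, so the ratio simply shows how big that factor would need to be.
/* Sketch of the spreadsheet check: compare the squared SD of the random-effect
solution with the covparm variance. Only subjects in the group carrying the
dummy variable (non-zero solutions) are included. */
title1 "Check: squared SD of random-effect solution vs covparm variance";
proc means data=solr noprint;
where Estimate ne 0;
var Estimate;
output out=solvar var=SolnVar;
run;
data checkvar;
merge solvar(keep=SolnVar)
cov1(keep=CovParm Estimate DegFree rename=(Estimate=CovParmVar) where=(CovParm ne "Residual"));
Ratio=CovParmVar/SolnVar;
run;
proc print data=checkvar;
run;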