multilevel beta regression errors

emaguin · Posted 10-17-2022 04:09 PM

I am doing multilevel analyses using beta regression because my dependent variable is a proportion. The range of proportion values includes 0 and 1; however, prior to analysis 0 was recoded to 0.005 and 1 was recoded to 0.995. Most of the analyses executed without error but four did not. The attached file shows the syntax for each analysis and the noted error (the same for three of the four analyses). As this is my first experience with beta regression, please pitch your reply to the level of somebody with no experience with beta regression and an incomplete understanding of all GLIMMIX options and keywords and their possible interactions. A low-level slow walk would be appreciated.

Thanks Gene Maguin

Ksharp · Posted 10-18-2022 08:16 AM

Calling @Rick_SAS

Rick_SAS · Posted 10-18-2022 08:24 AM

I don't think we'll be able to reproduce these errors without having the data.

SteveDenham · Posted 10-18-2022 08:31 AM

I am going to follow this closely as this is an interesting question that I don't have an answer for. Some additional information could be helpful, such as what are the numerator and denominator in the calculation of compliance2, how many records you have, and whether you have tried a pseudo-likelihood approach, rather than the quadrature.

SteveDenham.

emaguin · Posted 10-19-2022 03:11 PM

Rick_SAS: I'm not the PI. May not be possible.

All: Ecological Momentary Assessment dataset. Longitudinal. 9 time points. Persons are (attempted to be) assessed by a single yes/no item once each day at wake-up ("Wake-up") and then at four random times during the following 12-hour period ("Random"). Each time point is a 7-day period. Each person has a Wake-up proportion and a Random proportion at each time point. The Wake-up proportion is the count of Yeses at wake-up for the week (0-7) divided by 7 days. The Random proportion is the count of Yeses at the random times for the week (0-28) divided by 4*7=28. The number of persons at week 0 is ~255 and at week 9 is ~230. In the model statement, "slope" is the time point variable and assesstype is the Wake-up vs Random indicator.

Tried pseudo-likelihood? No. (Quadrature was recommended.) I see that there are four PL options (RSPL | MSPL | RMPL | MMPL). Which would you try first (and why, please), and which next?

Thanks, Gene Maguin

Rick_SAS · Posted 10-19-2022 03:27 PM

It doesn't have to be real data. If you can invent an example that shows the problem, we can try to understand what is causing the issue and recommend a solution.

If only the real data shows the issue, you can contact SAS Technical Support. They have procedures to handle proprietary or sensitive data from customers.

emaguin · Posted 10-21-2022 09:16 AM

I don't believe I have the technical skill to create a plausible beta-distributed test dataset.
I want to ask about the meaning of the three error messages that I'm seeing. These are:
ERROR: QUANEW Optimization cannot be completed.
ERROR: The function value of the objective function cannot be computed at the starting point.

ERROR: Infeasible parameter values for evaluation of objective function with 1 quadrature point.

What do these mean in terms of where the estimation process is failing? To me they are pretty opaque and I'd like to understand what is happening.
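As a rough illustration of how messages like the second and third can arise, here is a minimal Python sketch. It is not GLIMMIX's code. QUANEW is the quasi-Newton optimizer, and it needs a finite value of the objective function (here, a log-likelihood) at its starting parameter values. The function name `beta_loglik`, the arguments `eta` and `phi`, and the mean/scale form of the beta density are assumptions made for the example. The sketch shows three ways the beta log-density becomes impossible to evaluate: a response exactly at 0 or 1, a mean outside (0, 1) (possible with a log link, impossible with a logit link), and a non-positive scale.

```python
import math

def beta_loglik(y, eta, phi, link="logit"):
    """Log-density of one response y under a beta model with mean mu and scale phi.

    Hypothetical helper for illustration only. Returns None when the value
    cannot be computed, i.e. when the inputs are infeasible.
    """
    if link == "logit":
        mu = 1.0 / (1.0 + math.exp(-eta))   # always strictly between 0 and 1
    elif link == "log":
        mu = math.exp(eta)                   # exceeds 1 whenever eta > 0
    else:
        raise ValueError("link must be 'logit' or 'log'")

    # The density only exists for 0 < y < 1, 0 < mu < 1 and phi > 0.
    if not (0.0 < y < 1.0) or not (0.0 < mu < 1.0) or phi <= 0.0:
        return None

    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

print(beta_loglik(0.995, 0.0, 2.0))              # feasible: finite value
print(beta_loglik(1.0, 0.0, 2.0))                # y on the boundary: None
print(beta_loglik(0.5, 0.5, 2.0, link="log"))    # mu = exp(0.5) > 1: None
print(beta_loglik(0.5, 0.0, -1.0))               # negative scale: None
```

If any observation produces an uncomputable value at the starting parameter values, the whole objective function is uncomputable there and the optimizer cannot take a first step. Which of these conditions, if any, applies to the failing models cannot be determined without the data and syntax.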

jiltao · Posted 10-19-2022 04:00 PM

I agree with Rick -- we would need to have your data in order to see what might have caused the convergence issue.

In the meantime, try rescaling your time variable (slope). For example, divide that by 10, or 100, to see if that helps.

Thanks,

Jill

SteveDenham · Posted 10-20-2022 10:26 AM

Well, it could be that using a beta distribution with the default log link is the problem. This phrase, "Persons are (attempted to be) assessed by a single yes/no item," indicates to me that a binary/binomial distribution with a logit link may be more appropriate. In each case, the response variable is the proportion of Yeses out of the number of occasions for observing either Yes or No, rather than a proportion defined by a continuous variable divided by a larger, different continuous variable. With that in mind, take a look at Stroup and Claassen's paper:

https://econpapers.repec.org/article/sprjagbes/v_3a25_3ay_3a2020_3ai_3a4_3ad_3a10.1007_5fs13253-020-... .

The full paper may be behind a paywall (Springer) but keep digging and you'll find a pdf copy.

So the pseudo-likelihood method they used was the default for GLIMMIX - RSPL.

Good luck.

SteveDenham

emaguin · Posted 10-21-2022 09:46 AM

I can get a copy from the library. Thanks for the recommendation. I'll try RSPL. I understand your alternative; I read about it in an ecology statistics book. Help me learn something. How would this be implemented? Our data is the computed proportion (because the dataset was originally analyzed assuming a normal distribution). I assume that that data won't work for the alternative method. Would I have to go back to the original data where each record represents an instance where a call was made? I know that the SPSS procedure GENLINMIXED allows the data to be expressed as the number of successes out of the number of attempts, where the number of attempts can be a variable in the dataset. I just assume SAS can do that, why not, but does GLIMMIX have that capability? If so, where/how would it be documented? Lastly, the other problem is proportions of 0.0 or 1.0. In these current analyses, those values are recoded to 0.005 and 0.995, respectively. What happens in a logit formulation? Are they discarded as undefined? Are they given an arbitrarily large or small value, which is what Mplus apparently does? I agree that your proposed model is a better representation of the data. The question is implementation.

SteveDenham · Posted 10-24-2022 09:16 AM

Yes, that data will work. The binomial distribution is a default for models that use the events/trials syntax, but it can be applied to aggregate values as well. See Example 51.4, "Quasi-likelihood Estimation for Proportions with Unknown Distribution": https://documentation.sas.com/doc/en/statug/15.2/statug_glimmix_examples07.htm and the complete code: https://documentation.sas.com/doc/en/pgmsascdc/v_032/statug/statug_code_gmxex04.htm .

One thing that you will want to do is put the extreme values of 0 and 1 back into the analysis dataset, rather than 0.005 and 0.995. The binomial distribution has support on the closed interval that includes zero and one when a logit link is used.
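To make the point about 0 and 1 concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not GLIMMIX code. The helper names `counts_from_proportion` and `binomial_loglik` are hypothetical. The trial counts (7 for Wake-up, 28 for Random) come from the data description earlier in the thread. The first helper recovers the count of Yeses from a stored proportion, and rounding also undoes the 0.005/0.995 recoding. The second shows that the binomial log-likelihood with a logit link is finite when every response is No or every response is Yes.

```python
import math

def counts_from_proportion(proportion, trials):
    """Recover the number of Yes responses from a proportion and its trial count."""
    events = round(proportion * trials)
    if not 0 <= events <= trials:
        raise ValueError("proportion is not consistent with the trial count")
    return events

def binomial_loglik(events, trials, eta):
    """Binomial log-likelihood with a logit link; eta is the linear predictor."""
    log_p = -math.log1p(math.exp(-eta))   # log of P(Yes)
    log_q = -math.log1p(math.exp(eta))    # log of P(No)
    log_choose = (math.lgamma(trials + 1) - math.lgamma(events + 1)
                  - math.lgamma(trials - events + 1))
    return log_choose + events * log_p + (trials - events) * log_q

# A Wake-up proportion of 3/7 and a Random proportion recoded from 1 to 0.995.
wake_events = counts_from_proportion(3 / 7, 7)        # 3 of 7
random_events = counts_from_proportion(0.995, 28)     # 28 of 28

# Boundary outcomes are ordinary data points for the binomial.
print(binomial_loglik(0, 7, 0.0))                # all No: finite
print(binomial_loglik(random_events, 28, 0.0))   # all Yes: finite
```

Nothing is discarded or replaced by an arbitrary value: a week with 0 of 7 or 28 of 28 Yeses simply contributes its probability under the model, which is why the recoding is unnecessary in the binomial formulation.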

SteveDenham

emaguin · Posted 10-26-2022 12:42 PM

Steve, Thank you. I would never have found that. I'll try it out but it certainly looks like it is what I need.
One thing I noticed is that clicking on the complete code link is a dead end. Searching on "sas glimmix" and clicking on the first result (https://support.sas.com/rnd/app/stat/procedures/glimmix.html) gets to a page with "SAS/STAT Software" as the heading. Under the examples is the example you pointed me to, but there it is listed as Example 49.4 (and the link to the complete code works). So, who knows why; it's like going into your familiar grocery store and finding nearly everything has been rearranged.
Again, thank you.
