PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

Contributor
Posts: 28

Hi all,

My response variable is cognitive function (CASI), which is scored from 0 to 100.

I want to model the relationship between CASI and some predictors.

One recommendation is to transform CASI to CASI/100 and fit a logistic model, so that CASI/100 ranges from 0 to 1.

I plan to use PROC GLIMMIX in this case, but I'm not sure which distribution I should specify for this response variable.

Is it the beta distribution? I read somewhere that the beta distribution doesn't accept values of 0 or 1.

I would love to hear about your experience.

Thank you,

Trang


All Replies
SAS Super FREQ
Posts: 3,475

Re: PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

Yes, beta distribution. Unless responses are 0 or 100, you won't need to worry about whether the beta distribution "accepts values 0 or 1."
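
As a minimal sketch of what a beta fit might look like, the step below rescales the score and fits it in PROC GLIMMIX with a logit link. The data set name `work.casi_data` and the predictors `age` and `educ` are hypothetical placeholders, not names from the original question.

```sas
/* Assumed: WORK.CASI_DATA holds CASI (0-100) plus predictors AGE and EDUC. */
/* All three names are placeholders for illustration.                       */
data work.casi_prop;
   set work.casi_data;
   y = casi / 100;              /* proportion scale, 0 to 1 */
run;

proc glimmix data=work.casi_prop;
   /* Beta response with a logit link for the mean */
   model y = age educ / dist=beta link=logit solution;
run;
```

Observations with `y` exactly 0 or 1 fall outside the support of the beta distribution, so this sketch is only suitable as written when all scores lie strictly between 0 and 100.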

Contributor
Posts: 28

Re: PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

Hi Rick,

If I choose the beta distribution and I have some cases where CASI/100 equals 0 or 1, although they are extremely rare, would SAS omit these cases from the analysis?
Thank you for your reply,
Trang
SAS Super FREQ
Posts: 3,475

Re: PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

Yes, you are correct. The procedure will drop observations for which the response is not in (0,1) and will display the NOTE

NOTE: Some observations are not used in the analysis because of: not a
proportion, zero or negative response.

If you have 0 and 1 responses, perhaps beta is not the best model.
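
Before choosing a model, it may help to count how many responses sit on the boundary, since those are the ones a plain beta fit would drop. A rough sketch, assuming a hypothetical data set `work.casi_data` with the raw score in `casi`:

```sas
/* Assumed: WORK.CASI_DATA with raw score CASI (0-100); names are placeholders. */
proc sql;
   select sum(casi = 0)                  as n_zero,
          sum(casi = 100)                as n_hundred,
          sum(casi > 0 and casi < 100)   as n_interior,
          count(casi)                    as n_nonmissing
   from work.casi_data;
quit;
```

If `n_zero` and `n_hundred` are both 0, the beta model uses every nonmissing observation; otherwise the boundary cases need either rescaling or a model that gives them their own probability mass.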

Contributor
Posts: 28

Re: PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

Rick, if I want to model this response variable in the most natural way, i.e., allowing some cases where CASI/100 equals 0 or 1, what would be the most appropriate distribution?
Super User
Posts: 9,671

Re: PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

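I would suggest trying the GAMMA distribution for the CASI variable, and if you have many zeros, try the Tweedie distribution.

OR

Try the Poisson distribution with the OFFSET= option. Make an offset variable based on the maximum score of 100 (with a log link the offset enters on the log scale, i.e., log(100)).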

Contributor
Posts: 28

Re: PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

Hi Ksharp,
Thank you for your suggestion.
However, CASI is skewed, so I'm not sure that the gamma distribution works here.
Maybe that's the reason I need to transform CASI to CASI/100, which ranges from 0 to 1.
I'm thinking of the beta distribution, but it doesn't allow 0 and 1.
I want to find the most appropriate distribution that can accept values of 0 and 1.


Solution
‎06-27-2017 07:56 PM
PROC Star
Posts: 170

Re: PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

One approach is to gently rescale the data to lie strictly in (0, 1); see

https://www.ncbi.nlm.nih.gov/pubmed/16594767
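
A commonly used compression of this kind is y' = (y(n - 1) + 0.5)/n, where n is the number of nonmissing observations; check the linked reference for the exact form it recommends. The sketch below applies that formula under the assumption of a hypothetical data set `work.casi_data` with the raw score in `casi`.

```sas
/* Assumed: WORK.CASI_DATA with raw score CASI (0-100); names are placeholders. */
proc sql noprint;
   /* n = number of nonmissing scores */
   select count(casi) into :n trimmed from work.casi_data;
quit;

data work.casi_scaled;
   set work.casi_data;
   if not missing(casi) then do;
      y     = casi / 100;                      /* in [0, 1]          */
      y_adj = (y * (&n - 1) + 0.5) / &n;       /* strictly in (0, 1) */
   end;
run;
```

`y_adj` can then be used as the response with `dist=beta`. The shift is small for large n, but results can be sensitive to it when many scores sit at 0 or 100.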

 

A more elegant approach is the zero-one inflated beta model; see

http://support.sas.com/resources/papers/proceedings12/325-2012.pdf

and an example

https://www.nefsc.noaa.gov/program_review/2015%20%20Review/BACKGROUND/B1A6SA%20Waring%20beta%20regre...
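
To make the idea concrete, here is a rough sketch of a zero-one inflated beta likelihood written directly in PROC NLMIXED. It is not the macro from the linked paper; the data set `work.casi_data`, the predictor `age`, the parameter names, and the starting values are all assumptions for illustration. The model puts probability `p0` on a score of 0, `p1` on a score of 100, and a beta density with mean `mu` and precision `phi` on everything in between.

```sas
/* Assumed: WORK.CASI_DATA with CASI (0-100) and one predictor AGE.  */
/* Parameter names and starting values are illustrative only.        */
proc nlmixed data=work.casi_data;
   parms a0=-3 a1=-3 b0=0 b1=0 lphi=1;

   /* Multinomial-logit form keeps p0, p1 and 1-p0-p1 inside (0,1) */
   den = 1 + exp(a0) + exp(a1);
   p0  = exp(a0) / den;                    /* P(CASI = 0)   */
   p1  = exp(a1) / den;                    /* P(CASI = 100) */

   mu  = 1 / (1 + exp(-(b0 + b1*age)));    /* beta mean, logit link */
   phi = exp(lphi);                        /* beta precision > 0    */

   if casi = 0 then ll = log(p0);
   else if casi = 100 then ll = log(p1);
   else do;
      y  = casi / 100;
      ll = log(1 - p0 - p1)
           + lgamma(phi) - lgamma(mu*phi) - lgamma((1 - mu)*phi)
           + (mu*phi - 1)*log(y) + ((1 - mu)*phi - 1)*log(1 - y);
   end;

   model casi ~ general(ll);
run;
```

In this sketch only the beta mean depends on a predictor; `p0` and `p1` are constants, which is the simplest choice when boundary scores are rare. Covariates could be added to `a0` and `a1` in the same way as for `mu` if the data support it.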

 

Contributor
Posts: 28

Re: PROC GLIMMIX: How to choose the distribution for response variable with limited range of value?

Hi sld,

Thank you so much for the reference on zero-one inflated beta regression!

I know something about beta regression, but I never thought of the zero-one inflated version.

This is exactly what I want.

The SAS illustration uses a response variable similar to mine (a score from 0 to 100).

They also divided it by 100 and gave detailed guidance on the macro for this analysis.

Best,

☑ This topic is SOLVED.