Minhtrang
Obsidian | Level 7

Hi all,

My response variable is cognitive function (CASI), which is scored from 0 to 100.

I want to model the relationship between CASI and some predictors.

One recommendation is to transform CASI to CASI/100 and fit a logistic model.

So here CASI/100 ranges from 0 to 1.

I plan to use PROC GLIMMIX in this case, but I'm not sure which distribution I should specify for this kind of response variable.

Is it the beta distribution? I read somewhere that the beta distribution doesn't accept the values 0 or 1.

I would love to hear about your experience.

Thank you,

Trang

Rick_SAS
SAS Super FREQ

Yes, the beta distribution. Unless some responses are exactly 0 or 100, you won't need to worry about whether the beta distribution "accepts values 0 or 1."
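A minimal sketch of how a beta model might be specified in PROC GLIMMIX. The data set name casi_data and the predictors age and educ are placeholders, not names from this thread.

```sas
/* Rescale the 0-100 score to a proportion.
   casi_data, age and educ are hypothetical names used for illustration. */
data casi_prop;
   set casi_data;
   casi_p = casi / 100;
run;

/* Beta regression with a logit link on the mean.
   Observations with casi_p equal to 0 or 1 are not used by DIST=BETA. */
proc glimmix data=casi_prop;
   model casi_p = age educ / dist=beta link=logit solution;
run;
```

The SOLUTION option prints the fixed-effect estimates, which are on the logit scale of the mean proportion.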

Minhtrang
Obsidian | Level 7
Hi Rick,

If I choose the beta distribution and some cases have CASI/100 equal to 0 or 1 (they are extremely rare), would SAS omit those cases from the analysis?
Thank you for your reply,
Trang
Rick_SAS
SAS Super FREQ

Yes, you are correct. The procedure will drop observations for which the response is not in (0,1) and will display the NOTE

NOTE: Some observations are not used in the analysis because of: not a
proportion, zero or negative response.

If you have 0 and 1 responses, perhaps beta is not the best model.
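Before choosing a model, it may help to count how many observations sit on the boundaries and would therefore be dropped. A small sketch, assuming a hypothetical data set casi_data with the raw score in a variable named casi:

```sas
/* In PROC SQL a comparison evaluates to 1 (true) or 0 (false),
   so summing it counts the matching rows.
   casi_data and casi are hypothetical names. */
proc sql;
   select sum(casi = 0)   as n_zero,
          sum(casi = 100) as n_hundred,
          count(casi)     as n_nonmissing
   from casi_data;
quit;
```

If n_zero and n_hundred are both 0, the beta model uses every nonmissing observation.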

Minhtrang
Obsidian | Level 7
Rick, so if I want to model this response variable in the most natural way, i.e., allowing some cases with CASI/100 equal to 0 or 1, what would be the most appropriate distribution?
Ksharp
Super User

I would suggest using a gamma distribution for the CASI variable, and if you have many zeros, try the Tweedie distribution.

OR

Try a Poisson distribution with the offset= option. Make an offset variable from the maximum score of 100 (with a log link, the offset is log(100)).
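A rough sketch of the Poisson-with-offset idea, assuming the CASI scores are whole numbers so that they can be treated as counts. The names casi_data, age and educ are placeholders. Because the offset is added to the linear predictor, the value supplied under a log link is log(100) rather than 100.

```sas
/* Build the offset on the log scale.
   casi_data, age and educ are hypothetical names. */
data casi_off;
   set casi_data;
   log_max = log(100);
run;

/* Poisson model for the raw score with log(100) as the offset,
   so exp(linear predictor) describes the expected value of CASI/100. */
proc glimmix data=casi_off;
   model casi = age educ / dist=poisson link=log offset=log_max solution;
run;
```

Note that this treats CASI as a count and does not cap the fitted mean at 100.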

Minhtrang
Obsidian | Level 7
Hi Ksharp,
Thank you for your suggestion.
However, CASI is skewed, so I'm not sure that the gamma distribution works here.
Maybe that is the reason I need to transform CASI to CASI/100, which ranges from 0 to 1.
I'm thinking of the beta distribution, but it doesn't allow 0 and 1.
I want to find the most appropriate distribution that can accept values of 0 and 1.


Minhtrang
Obsidian | Level 7
Hi sld,

Thank you so much for your reference on one-zero inflated beta regression!

I know something about beta regression, but never thought of a one-zero inflated version.

This is exactly what I want.

The SAS illustration used the same type of response variable as mine (a score from 0 to 100).

They also divided it by 100 and gave detailed guidance on the macro for this analysis.

Best,
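For readers unfamiliar with the term, a one-zero inflated beta model is usually written as a mixture of point masses at 0 and 1 and a beta density on (0,1). The form below uses the common mean/precision parameterization with symbols chosen here for illustration (\pi_0, \pi_1, \mu, \phi). The macro referenced above may parameterize it differently.

```latex
f(y) =
\begin{cases}
  \pi_0, & y = 0, \\[4pt]
  \pi_1, & y = 1, \\[4pt]
  (1 - \pi_0 - \pi_1)\,
  \dfrac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma\big((1-\mu)\phi\big)}\,
  y^{\mu\phi - 1}(1 - y)^{(1-\mu)\phi - 1}, & 0 < y < 1,
\end{cases}
```

Here \pi_0 and \pi_1 are the probabilities of observing exactly 0 or 1, \mu is the mean of the beta component, and \phi is its precision. Each of \pi_0, \pi_1 and \mu can be given its own regression on the predictors.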
