buder
Fluorite | Level 6

I am currently running quasi-likelihood regressions using PROC GLIMMIX.

 

Can someone tell me the difference between the following statements: 

 

model y ~ x /link = logit dist = binomial

 

model y ~ x / link = log solution

 

When I ran the code with 'link = logit dist = binomial' I was never able to get a value on the coefficients (which is what I'm looking for).

 
7 REPLIES
Ksharp
Super User

The link function is different.

 

model y ~ x /link = logit dist = binomial :

This is like PROC LOGISTIC: the link function is log(p/(1-p)), and y follows a binomial distribution.

 

 

model y ~ x / link = log solution:

The link function is log(mu), where mu is the mean of y; y is usually a positive value.
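
For comparison, here is a minimal sketch of the two specifications written in GLIMMIX syntax (an equals sign rather than a tilde). The dataset name mydata and the variables y and x are hypothetical placeholders. The SOLUTION option on the MODEL statement is what requests the table of fixed-effect coefficient estimates, so it appears in both calls:

```sas
/* Hypothetical data set and variables: mydata, y, x. */

/* Logit link with a binomial response (y assumed to be coded 0/1). */
proc glimmix data=mydata;
   model y = x / link=logit dist=binomial solution;
run;

/* Log link with no DIST= option: the response distribution
   falls back to the default (normal), as noted later in the thread. */
proc glimmix data=mydata;
   model y = x / link=log solution;
run;
```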

SteveDenham
Jade | Level 19

Logistic regression in GLIMMIX often requires more iterations than the default to reach convergence.  Providing your code and any quotes from the output regarding non-convergence, etc. would be helpful in addressing your problems.

 

Steve Denham

lvm
Rhodochrosite | Level 12

You have several issues/problems. With your code with link=log, you are using a normal distribution (the default). This is likely not what you want. Put in the dist= option. Also, your syntax is wrong for GLIMMIX. One does not use "y ~ x". One uses "y=x". Also, you are using pseudo-likelihood, not quasi-likelihood. Quasi-likelihood is when the likelihood is not defined (or definable). If you put in:

random _residual_;

then you would have quasi-likelihood with the binomial. Another way of getting quasi-likelihood is to specify the mean and variance functions directly. Both ways define a variance:mean structure that does not correspond to a real probability distribution.
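
A rough sketch of both routes, assuming a hypothetical dataset mydata with a response y and a single predictor x. In the first call y is assumed to be binary; in the second it is assumed to be a proportion. The variance function shown is only an illustration, not a recommendation for any particular data:

```sas
/* Route 1: binomial mean/variance relationship plus an extra
   scale (overdispersion) parameter via RANDOM _RESIDUAL_. */
proc glimmix data=mydata;
   model y = x / link=logit dist=binomial solution;
   random _residual_;
run;

/* Route 2: no DIST= option; the variance function is supplied
   directly through the _VARIANCE_ and _MU_ automatic variables. */
proc glimmix data=mydata;
   _variance_ = _mu_ * (1 - _mu_);   /* illustrative variance function */
   model y = x / link=logit solution;
   random _residual_;
run;
```

In both cases the RANDOM _RESIDUAL_ statement adds a multiplicative scale parameter, which is why the fitted variance:mean structure no longer matches a real probability distribution.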

buder
Fluorite | Level 6

Thanks all for the clarification; I am still running into syntax error codes.

 

When I include dist = option, it says: syntax error, expecting one of the following: B, BERNOULLI, BETA, etc.

 

Below is the code I ran: 

 

proc glimmix data = TEST;
model utility_score = AGE INCOME ... (number of independent variables here) / link = log dist = option;
where utility_score gt 0;
run;

 

The reason I'm running quasi-likelihood is that my dependent variable lies between 0 and 1; it cannot take the value 0 but can equal 1. The dependent variable is also left-skewed.

 

lvm
Rhodochrosite | Level 12
You missed the point; maybe I wasn't clear. You need to specify a distribution with link=log, otherwise you will get a normal distribution. If you want a binomial, then use dist=binomial, or name some other distribution (there is no such thing as "dist=option"). If you are using the normal distribution (the default), then you are using pseudo-likelihood (the default).
buder
Fluorite | Level 6

Got it. One (hopefully) last question; I believe that I need dist = beta because my dependent variable is between 0 and 1 (though it cannot take on the value of 0). 

 

The remaining question is that this dependent variable is left-skewed (skewness: -1.97); can that be incorporated anywhere in the distribution specification?

 

If I remember correctly, a positively skewed variable would take on a Poisson distribution, but I haven't come across how to handle a negatively skewed variable.

 

Any thoughts would be greatly appreciated. 

lvm
Rhodochrosite | Level 12
Beta does not allow 0 or 1 (defined only between those values). You would get a missing value for any 0s or 1s. This is a challenge when using the beta distribution. Skewness is not a problem. Beta can be very skewed. If you use beta, then you don't need the random _residual_ term I mentioned (there is a scale parameter automatically). If you want beta with 1s, you need to define (in an ad hoc way) a new variable that is always between 0 and 1. Something like yprime = (y-.005)/1.01. Be careful, results will depend on the constants you use.
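
As a minimal sketch of that suggestion: the dataset TEST and the variables utility_score, AGE, and INCOME come from the earlier post, while test2 and yprime are hypothetical names. The constants are the ad hoc ones from the reply above, and the sketch assumes every utility_score exceeds .005, since smaller values would be mapped to zero or below and dropped by the beta model:

```sas
/* Rescale the response so that a value of 1 falls strictly below 1. */
data test2;                        /* hypothetical output data set */
   set test;
   where utility_score gt 0;
   /* ad hoc constants from the reply; results depend on them */
   yprime = (utility_score - 0.005) / 1.01;
run;

/* Beta regression with a logit link; SOLUTION prints the coefficients.
   Only two of the original predictors are shown here. */
proc glimmix data=test2;
   model yprime = age income / link=logit dist=beta solution;
run;
```

Refitting with a different pair of constants is a simple way to check how sensitive the estimates are to this choice.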
