Reeza
Super User
I suspect that's the one that's causing the quasi separation issue - because it does separate the data quite cleanly.
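One rough way to see this kind of separation is to list the factor-level combinations whose outcomes are all 0 or all 1. The sketch below is only an illustration in Python, not SAS: the records are made up, `infected` is a hypothetical outcome name, and `separated_cells` is a helper written for this example. Cells like these are a common reason a model with the full interaction cannot produce finite estimates.

```python
from collections import defaultdict

def separated_cells(records, factors, outcome):
    """Return factor-level combinations whose outcomes are all 0 or all 1."""
    cells = defaultdict(list)
    for rec in records:
        key = tuple(rec[f] for f in factors)
        cells[key].append(rec[outcome])
    # A cell with a single distinct outcome value has no within-cell variation.
    return {key: ys[0] for key, ys in cells.items() if len(set(ys)) == 1}

# Made-up data for illustration only (not the poster's data).
records = [
    {"Bact": "B", "Host": "B", "infected": 1},
    {"Bact": "B", "Host": "B", "infected": 0},
    {"Bact": "B", "Host": "C", "infected": 0},
    {"Bact": "B", "Host": "C", "infected": 0},
    {"Bact": "C", "Host": "C", "infected": 1},
    {"Bact": "C", "Host": "C", "infected": 1},
    {"Bact": "C", "Host": "B", "infected": 0},
    {"Bact": "C", "Host": "B", "infected": 1},
]

flagged = separated_cells(records, ["Bact", "Host"], "infected")
for cell, value in sorted(flagged.items()):
    print(cell, "-> every outcome is", value)
```

With these toy records, the cells (B, C) and (C, C) are flagged, while the two cells with mixed outcomes are not.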
igforek
Quartz | Level 8

It may be so.
Also, when I try to run PROC GENMOD with the factors Bact and Host and their interaction, I get a warning that the Hessian matrix is not positive definite, which may be caused by the many zeroes.
The model runs ok with no interaction.

igforek
Quartz | Level 8

I took a look at this page: https://support.sas.com/rnd/app/stat/examples/GENMODZIP/roots.htm

but I am not sure if this will apply to my data, because I have categorical independent variables and binomial data.

igforek
Quartz | Level 8
HostB is naturally infected by BactB
HostC is naturally infected by BactC
and so on
HostE is naturally infected by BactE
igforek
Quartz | Level 8

Mr.
Ksharp
Super User

Yes. Unbalanced data means the class ratio is far from 1:1, for example good:bad = 99:1.

You could oversample it to good:bad = 30:70 and build a logistic model on that data.
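As a minimal sketch of that oversampling idea (in Python rather than SAS, with made-up rows and a hypothetical `oversample` helper), the minority class can be resampled with replacement until the requested good:bad ratio is reached. Keep in mind that sampling on the outcome shifts the intercept of a logistic model, so predicted probabilities from the resampled data would need adjusting.

```python
import random
from collections import Counter

def oversample(rows, label_key, majority, minority, ratio, seed=0):
    """Resample the minority class with replacement until
    majority:minority matches `ratio`, given as (majority_part, minority_part)."""
    rng = random.Random(seed)
    major = [r for r in rows if r[label_key] == majority]
    minor = [r for r in rows if r[label_key] == minority]
    if not major or not minor:
        raise ValueError("both classes must be present")
    maj_part, min_part = ratio
    target_minor = round(len(major) * min_part / maj_part)
    extra = max(0, target_minor - len(minor))
    # Original minority rows are kept; only the shortfall is drawn at random.
    resampled = minor + [rng.choice(minor) for _ in range(extra)]
    return major + resampled

# Made-up data with good:bad = 99:1.
rows = [{"y": "good"}] * 99 + [{"y": "bad"}]
balanced = oversample(rows, "y", "good", "bad", ratio=(30, 70))
counts = Counter(r["y"] for r in balanced)
print(counts["good"], counts["bad"])
```

Here the 99 good rows are left alone and the single bad row is repeated until there are 231 bad rows, which gives 99:231 = 30:70.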

 

About your question, I am not able to follow. What does 'special case' mean?

Reeza
Super User
This isn't exactly a case of unbalanced data. It's a case of extreme multicollinearity, where two variables provide almost exactly the same information. In that case, depending on context, one variable is usually dropped.
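A quick way to check how much two categorical variables overlap is Cramér's V, which is 0 for no association and 1 when one variable determines the other. The sketch below is an illustration in Python with made-up values; `cramers_v` is a helper written for this example, not a function from any SAS procedure.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V between two categorical sequences (0 = none, 1 = perfect)."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("need two non-empty sequences of equal length")
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # assumption: report 0 when one variable is constant
    chi2 = 0.0
    for r, nr in row.items():
        for c, nc in col.items():
            expected = nr * nc / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Made-up values: var_b is just a relabeling of var_a, var_c is unrelated.
var_a = ["B", "B", "C", "C", "E", "E"]
var_b = ["x", "x", "y", "y", "z", "z"]
var_c = ["x", "y", "x", "y", "x", "y"]

v_redundant = cramers_v(var_a, var_b)
v_unrelated = cramers_v(var_a, var_c)
print(round(v_redundant, 3), round(v_unrelated, 3))
```

A value near 1 suggests the two variables carry nearly the same information, which is the situation where dropping one of them is usually considered.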
igforek
Quartz | Level 8
What do you think about Dr. Allison's insight into dummy variables?
https://statisticalhorizons.com/multicollinearity
Reeza
Super User

Which of those cases do you think applies here?

 

Have you tried changing your reference level? Are you interested in estimates on the categorical variables? I think you are, so they're not just control variables here.

 





igforek
Quartz | Level 8
Unfortunately, I am right now working on a computer that does not run SAS
or any other statistical software. I will try changing my reference level this
afternoon, after 5 pm Central time.

I am thinking case 3 may apply to my analysis. And yes, I am interested in
estimates on the categorical variables.



Reeza
Super User





Then his comments don't apply, because they only hold when the variable is a control variable or you're interested only in the overall effects. So you need to find some other way to deal with the collinearity, or re-frame your problem/experiment somehow. If you knew this was likely to happen (that host and bacteria would mostly match), a different design should have been chosen. Unfortunately, it's likely too late to change that.

igforek
Quartz | Level 8
To me personally, this looks like a "Hindsight is 20/20" situation.
Other researchers in my field have used this type of analysis successfully.
In my case, there are too many zeroes, and some hosts do not even get infected by their own bacteria ("0" self-infection). Who knew?
Ksharp
Super User
What I mean is the Y variable, not the X variables. Multicollinearity is about X variables.
igforek
Quartz | Level 8
Thank you for your comments.
I am still working on solving the issues with the analysis.
I will let all of you know how it goes.

