bncoxuk
Obsidian | Level 7

Hi,

I used the bootstrap method to create 16 random samples and ran a logistic regression on each one. A variable named occupation did not show consistent significance: in 4 of the 16 trials it was not significant at the 1% level. The sample size of the data is 60245. I am wondering whether I should include this variable in the final model. Can you please give some advice on this question?

Thanks in advance.

Accepted Solutions
Doc_Duke
Rhodochrosite | Level 12

That's one of the reasons we run bootstraps: to see which variables are consistently in or out of a model.
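
As a rough illustration of "consistently in or out," here is a minimal sketch (in Python rather than SAS, for brevity) that counts how often a term is significant across bootstrap replicates. The function name `inclusion_frequency` and the p-values are hypothetical; the p-values are made up only to match the counts in the question (16 replicates, 4 not significant at the 1% level).

```python
def inclusion_frequency(p_values, alpha=0.01):
    """Fraction of bootstrap replicates in which the term is significant at level alpha."""
    if not p_values:
        raise ValueError("need at least one replicate")
    significant = sum(1 for p in p_values if p < alpha)
    return significant / len(p_values)


# Hypothetical p-values: 12 below 0.01 and 4 above, matching the counts in the question.
p_values = [0.001] * 12 + [0.05] * 4
freq = inclusion_frequency(p_values)
print(f"significant in {freq:.0%} of replicates")  # prints "significant in 75% of replicates"
```

A frequency like this is only a summary of stability; it does not by itself decide whether the variable belongs in the model.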

As to whether you need to include occupation in the end, that is more subtle than just looking at statistical significance. It may, for example, be needed for "face validity."

One of the things that you may want to look at is how Occupation is coded. If it is a class variable with lots of levels, then you may just have too many levels with low frequencies, and lumping the codes would provide you with more information and a more stable model.
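
The lumping idea can be sketched as follows (again in Python rather than SAS). This is only one possible approach: the helper `lump_rare_levels`, the `min_count` threshold, the "Other" label, and the toy occupation values are all assumptions for illustration, not anything from the original data.

```python
from collections import Counter


def lump_rare_levels(values, min_count, other_label="Other"):
    """Replace every level seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Toy data: two common occupations and two that each appear only once.
occupation = ["clerk"] * 5 + ["nurse"] * 4 + ["diver"] + ["pilot"]
lumped = lump_rare_levels(occupation, min_count=3)
print(Counter(lumped))
```

In practice the threshold would depend on the sample size, and it is often better to combine codes that are substantively similar than to pool everything rare into a single catch-all level.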

It could also be that you have an interaction term that you have not accounted for (e.g. is education also in the model?).

Doc Muhlbaier

Duke
