bncoxuk
Obsidian | Level 7

Hi,

I used the bootstrapping method to create 16 random samples and ran a logistic regression on each one. A variable named occupation did not show consistent significance: in 4 of the 16 trials it was not significant at the 1% level. The sample size of the data is 60245. Should I include this variable in the final model? Any advice would be appreciated.

Thanks in advance.

Accepted Solutions
Doc_Duke
Rhodochrosite | Level 12

That's one of the reasons that we do bootstraps: to see which variables are consistently in or out of a model.
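As a rough illustration of "consistently in or out," one common summary is the fraction of bootstrap replicates in which a term was significant. The sketch below uses only the counts given in the question (16 replicates, 4 non-significant); the function name `inclusion_frequency` is hypothetical, and any cutoff for "consistent enough" is a judgment call that the thread does not supply.

```python
def inclusion_frequency(significant_flags):
    """Fraction of bootstrap replicates in which the term was significant."""
    if not significant_flags:
        raise ValueError("need at least one bootstrap replicate")
    return sum(significant_flags) / len(significant_flags)


# Counts taken from the question: 16 replicates, occupation was
# not significant (at the 1% level) in 4 of them.
flags = [True] * 12 + [False] * 4

freq = inclusion_frequency(flags)
print(f"occupation significant in {freq:.0%} of replicates")
# prints: occupation significant in 75% of replicates
```

With only 16 replicates this fraction is itself fairly noisy, so it is probably better read as a rough indicator than as a decision rule.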

As to whether you need to include occupation in the end, that is more subtle than just looking at statistical significance. It may be needed for "face validity," for example.

One of the things that you may want to look at is how Occupation is coded. If it is a class variable with lots of levels, then you may just have too many levels with low frequencies, and lumping the codes would give you more information per level and a more stable model.
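A minimal sketch of the lumping idea, assuming a simple frequency cutoff: levels seen fewer than `min_count` times are merged into one catch-all level before the model is fit. The occupation values, the cutoff of 3, the "Other" label, and the helper `lump_rare_levels` are all made up for illustration. In practice, grouping by subject-matter similarity may make more sense than a pure count threshold.

```python
from collections import Counter


def lump_rare_levels(values, min_count, other_label="Other"):
    """Replace levels seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Hypothetical occupation codes: two common levels and two rare ones.
occupation = ["clerk"] * 5 + ["nurse"] * 4 + ["pilot"] + ["diver"]

lumped = lump_rare_levels(occupation, min_count=3)
print(Counter(lumped))
```

Here the two single-observation levels end up in one "Other" level with two observations, while the common levels are left unchanged.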

It could also be that you have an interaction term that you have not accounted for (e.g. is education also in the model?).

Doc Muhlbaier

Duke
