mjkop56
Obsidian | Level 7

I’d like to obtain 95% CIs for a nominal variable "gender" with 3 categories (male, female, and unknown), for which I have proportions over several years. Some of the same individuals are found in multiple years. Below is an example of the proportions I want to calculate the 95% CIs on:

rmfPS.png

Does the "Simultaneous confidence intervals for multinomial proportions" approach (e.g. https://blogs.sas.com/content/iml/2017/02/15/confidence-intervals-multinomial-proportions.html) look like the best way to calculate these CIs?

Can it take into account that the same individuals are found in multiple years? (or maybe that can be ignored here?)
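
For reference, simultaneous intervals of the kind mentioned above can be sketched for a single year with a DATA step. This is only a sketch using Goodman's (1965) formula, not necessarily the method used in the linked blog post. The data set name counts, the variable freq, and the counts themselves are made up for illustration. The calculation treats the observations within a year as independent and handles each year separately, so it does not account for the same individuals appearing in several years.

```sas
/* Hypothetical counts for one year (made-up numbers) */
data counts;
   input gender :$7. freq;
   datalines;
male 480
female 500
unknown 20
;

/* Total sample size n and number of categories k */
proc sql noprint;
   select sum(freq), count(*) into :n, :k from counts;
quit;

/* Goodman simultaneous intervals: chi-square quantile with 1 df,
   Bonferroni-adjusted over the k categories */
data goodman;
   set counts;
   alpha = 0.05;
   n = &n;
   k = &k;
   A = quantile('CHISQ', 1 - alpha/k, 1);
   p = freq / n;                                  /* observed proportion */
   half = sqrt(A * (A + 4*freq*(n - freq)/n));
   lower = (A + 2*freq - half) / (2*(n + A));
   upper = (A + 2*freq + half) / (2*(n + A));
run;

proc print data=goodman noobs;
   var gender freq p lower upper;
run;
```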



1 ACCEPTED SOLUTION

StatDave
SAS Super FREQ

You can use PROC GEE to deal with the repeated measurements and to fit a model to the nominal multinomial response. The LSMEANS statement with the ILINK and CL options provides the estimated probabilities and confidence intervals at each year.

proc gee;
class year subject;
model gender=year / dist=mult link=glogit;
repeated subject=subject;
lsmeans year / ilink cl;
run;
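
The code above assumes the data are in long format, with one row per subject per year and variables named year, subject, and gender. As a rough sketch of that layout, the following simulates such a data set and passes it to the same PROC GEE step. The data set name visits, the simulation settings, and the category probabilities are all invented for illustration.

```sas
/* Simulated long-format data: one row per subject per year observed.
   All names and probabilities here are hypothetical. */
data visits;
   call streaminit(2024);
   length gender $7;
   do subject = 1 to 300;
      /* draw one category per subject: 1=male, 2=female, 3=unknown */
      g = rand('table', 0.48, 0.48, 0.04);
      gender = choosec(g, 'male', 'female', 'unknown');
      do year = 2019 to 2021;
         /* each subject appears in a given year with probability 0.7 */
         if rand('bernoulli', 0.7) then output;
      end;
   end;
   drop g;
run;

/* Same model as in the post, with the input data set named explicitly */
proc gee data=visits;
   class year subject;
   model gender = year / dist=mult link=glogit;
   repeated subject=subject;
   lsmeans year / ilink cl;
run;
```

Naming the data set with DATA= avoids relying on the most recently created data set, which is what PROC GEE uses when DATA= is omitted.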


7 REPLIES 7
Ksharp
Super User

It looks like you want a regression model's CI, not a CI for multinomial proportions.

Try PROC REG or 

proc loess data=sashelp.class;
model weight=height/ clm;
run;

or ask @Rick_SAS.

mjkop56
Obsidian | Level 7

Thank you! I am also running a model. My understanding with the model is that the confidence intervals go around the predicted probabilities, and not the observed proportions.

I was thinking of showing CIs around both the observed proportions and the predicted probabilities, but maybe this is not a good idea? The link below goes to my previous question about this.

https://communities.sas.com/t5/Statistical-Procedures/Question-about-standard-reporting-for-plots-of...

StatDave
SAS Super FREQ

When you say that "the confidence intervals go around the predicted probabilities, and not the observed proportions," I assume you mean that the point estimate is the predicted probability from the fitted model that used all of the data as opposed to the simple proportions computed using just the data in the separate gender-year combinations. It's up to you, but typically one tries to fit an appropriate model to all of the data and use that model to estimate the quantities of interest. That is what the code I showed earlier does.
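
For comparison, the simple observed proportions within each year can be tabulated with PROC FREQ. This is a minimal sketch on a tiny made-up data set; the data set name visits and its values are hypothetical, and the variable names follow the code shown earlier. The row percentages in the output are the proportions of each gender category within a year, computed without any adjustment for subjects who appear in more than one year.

```sas
/* Tiny made-up long-format data set: one row per subject per year */
data visits;
   input subject year gender :$7.;
   datalines;
1 2019 male
1 2020 male
2 2019 female
2 2020 female
2 2021 female
3 2020 unknown
4 2019 female
4 2021 female
5 2021 male
;

/* Row percentages = observed proportion of each gender within a year */
proc freq data=visits;
   tables year*gender / nocol nopercent;
run;
```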

mjkop56
Obsidian | Level 7

Thank you, Dave! I definitely want to go with what is typically done, so I really appreciate your response. By "the confidence intervals go around the predicted probabilities, and not the observed proportions" I meant something like the plot below: the black dots are the observed proportions, the trend line comes from a model (predicted probabilities), and the confidence intervals go around the trend line rather than around each observed proportion.

madeupdata.png

StatDave
SAS Super FREQ
That plot assumes that YEAR is treated as a continuous variable in the model, and since the lines are curved, the model specification does not assume that the effect of YEAR is linear. So, code like this will allow YEAR to have a quadratic effect and the EFFECTPLOT statement produces the plot. See the documentation of the EFFECTPLOT statement for details and more options.
proc gee;
class subject;
model gender=year|year / dist=mult link=glogit;
repeated subject=subject;
effectplot fit(x=year) / obs;
run;
mjkop56
Obsidian | Level 7

Wonderful, thanks so much for the tips, Dave!

