edhuang
Obsidian | Level 7

Hi,

 

I was wondering what the correct regression procedure is when the outcome is a percentage treated as a continuous variable (bounded between 0 and 1). The outcome is the number of procedures with polyps divided by the total number of procedures done. My understanding is that this can be modeled as binomial, either with PROC GENMOD (logit link, binomial distribution) or with PROC LOGISTIC. However, I have also read that PROC GLIMMIX with a beta distribution is the one to use.

 

Which one is the correct one to use? 

 

Thanks.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

You need to use the Events/Trials syntax instead of the raw proportions. That is, instead of ADR, you need to specify the two variables ProceduresWithPolyps and NumberOfProcedures. For example, ADR=0.25 might correspond to ProceduresWithPolyps=2 and NumberOfProcedures=8.

 

Your model statement will look like this:
model ProceduresWithPolyps/NumberOfProcedures = /dist=binomial link=logit;


5 REPLIES
StatDave
SAS Super FREQ

As stated at the start of this note:

When modeling response data consisting of proportions (or percentages), the observed values can be continuous or represent a summarized (or aggregated) binary response. For example, an observed proportion of 0.3 might represent 3 out of 10 subjects responding positively at a particular dose of a drug. At the subject level, the response is binary (positive or negative). If your data are aggregated binary data and you have the numerator and denominator counts making up the proportions, then you can fit a logistic model in procedures such as LOGISTIC, PROBIT, GENMOD, GAM, ADAPTIVEREG and others by using the events/trials syntax in the MODEL statement. These models assume the proportions represent a set of independent Bernoulli trials and have a binomial distribution.
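In symbols, the events/trials model described in the note can be written as below, where $y_i$ is the event count, $n_i$ the number of trials, and $\mathbf{x}_i$ the predictors for observation $i$. This notation is introduced here for illustration and is not taken from the note.

```latex
y_i \sim \mathrm{Binomial}(n_i, \pi_i), \qquad
\log\frac{\pi_i}{1 - \pi_i} = \mathbf{x}_i^{\top}\boldsymbol{\beta}
```

The observed proportion $y_i / n_i$ estimates $\pi_i$, but the model needs $y_i$ and $n_i$ separately because the variance of the proportion depends on $n_i$.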

SteveDenham
Jade | Level 19

Just a bit of expansion on what @StatDave wrote. A ratio of counts will not be beta distributed: the beta distribution applies to continuous proportions (for example, a ratio of continuous variables) bounded on (0,1), with the endpoints excluded. A binomial distribution would be the most logical choice for the example you propose (procedures with polyps/total procedures).
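For contrast, here is a minimal sketch of the case where a beta model would fit: a response that is a continuous proportion strictly inside (0,1), with no underlying counts. The data set work.mydata and the variables prop and x are hypothetical names, not anything from this thread.

```sas
/* Sketch only: work.mydata, prop, and x are hypothetical names.        */
/* prop must be strictly between 0 and 1; values of exactly 0 or 1 are  */
/* outside the support of the beta distribution.                        */
proc glimmix data=work.mydata;
   model prop = x / dist=beta link=logit solution;
run;
```

This does not apply to the polyp data, where both the numerator and the denominator are counts.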

 

SteveDenham

edhuang
Obsidian | Level 7
Hi Steve,

I tried it as

proc genmod data=workdata.all_combined;
model adr=/dist=binomial link=logit;
run;

ADR = percentage (procedures with polyps/total procedures)

I get the following error.

ERROR: The response variable ADR has 150 levels. A binary response must
have two levels.
ERROR: No valid observations due to invalid or missing values in the
response, explanatory, offset, frequency, or weight variable.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE GENMOD used (Total process time):
real time 0.08 seconds

What am I doing wrong here?
Rick_SAS
SAS Super FREQ

You need to use the Events/Trials syntax instead of the raw proportions. That is, instead of ADR, you need to specify the two variables ProceduresWithPolyps and NumberOfProcedures. For example, ADR=0.25 might correspond to ProceduresWithPolyps=2 and NumberOfProcedures=8.

 

Your model statement will look like this:
model ProceduresWithPolyps/NumberOfProcedures = /dist=binomial link=logit;
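A fuller sketch of the whole step, in case it helps. The data set name is the one from the post above; ProceduresWithPolyps and NumberOfProcedures are the placeholder names used in this reply. The DATA step is only needed (and only works) if the data hold the proportion and the denominator but not the numerator. That is an assumption about the data, not something stated in the thread.

```sas
/* Assumption: adr (a proportion on the 0-1 scale) and NumberOfProcedures */
/* both exist in the data set. Skip this step if the count is stored.     */
data work.counts;
   set workdata.all_combined;
   /* Recover the event count; round() removes floating-point noise */
   ProceduresWithPolyps = round(adr * NumberOfProcedures);
run;

/* Events/trials syntax: predictors, if any, go between "=" and "/" */
proc genmod data=work.counts;
   model ProceduresWithPolyps/NumberOfProcedures = / dist=binomial link=logit;
run;
```

With nothing between "=" and "/", this fits an intercept-only model, as in the original attempt.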

StatDave
SAS Super FREQ

PROC LOGISTIC is the best tool for logistic regression and has simpler syntax:

proc logistic;
model polyps/total = <your predictor variables>;
run;
