<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Distribution for percentages in proc genmod in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964388#M48365</link>
    <description>Thanks Ksharp,&lt;BR /&gt;&lt;BR /&gt;So what if it is a continuous percentage like 45.76 and I don't want to convert my data? Then I guess I will use the normal dist in proc genmod or glimmix, as StatDave suggested.</description>
    <pubDate>Wed, 16 Apr 2025 03:40:22 GMT</pubDate>
    <dc:creator>palolix</dc:creator>
    <dc:date>2025-04-16T03:40:22Z</dc:date>
    <item>
      <title>Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964381#M48359</link>
      <description>&lt;P&gt;Dear SAS Community,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using a gamma distribution to analyze my data in proc genmod but I am not so sure if I should use this dist if my outcome variable is a percentage.&amp;nbsp; I compared AIC values between normal and gamma and those for gamma were just slightly lower.&lt;/P&gt;
&lt;P&gt;When I plotted my data, this is what it looks like:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc univariate data=one normal;&lt;BR /&gt;where Season=2022;&lt;BR /&gt;var PercDM;&lt;BR /&gt;histogram PercDM;&lt;BR /&gt;run;&lt;BR /&gt;quit;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="palolix_0-1744763006841.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/106268i70F0F249661EFC2A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="palolix_0-1744763006841.png" alt="palolix_0-1744763006841.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;This is the code I am using:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc genmod data=one;&lt;BR /&gt;where Season=2022;&lt;BR /&gt;class Harvest Variety;&lt;BR /&gt;model PercDM=Harvest*Variety/type3 dist=gamma link= log;&lt;BR /&gt;slice Harvest*Variety/sliceby=Harvest sliceby=Variety adjust=simulate(seed=1);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would greatly appreciate your feedback!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;
&lt;P&gt;Caroline&lt;/P&gt;</description>
      <pubDate>Wed, 16 Apr 2025 00:30:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964381#M48359</guid>
      <dc:creator>palolix</dc:creator>
      <dc:date>2025-04-16T00:30:07Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964382#M48360</link>
      <description>&lt;P&gt;The gamma distribution is not bounded at 1 as is a proportion, so it theoretically does not apply. Is your percentage a ratio of two known counts, a numerator count and a total count, like the number of events that occurred out of some possible total? If so, then the response is actually binomial and can be modeled using the &lt;EM&gt;events/trial&lt;/EM&gt; syntax in PROC LOGISTIC. If the percentage is inherently continuous, like a proportion of a chemical in a mixture, then you could consider the types of models discussed in &lt;A href="http://support.sas.com/kb/56992" target="_self"&gt;this note&lt;/A&gt;&amp;nbsp;using the LOGISTIC or GLIMMIX procedures. However, if the histogram that you show summarizes data in a single population (one setting of both Variety and Harvest), then it appears to be reasonably normal with its mean large enough and its variance small enough to be reasonably symmetric. The penalty in that case with choosing the wrong distribution is small, affecting primarily the size of the standard errors which will, in turn, affect the significance of the tests.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Apr 2025 00:45:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964382#M48360</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2025-04-16T00:45:44Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964383#M48361</link>
      <description>Thank you so much for your reply StatDave. My outcome var is continuous (percentage of dry matter). I am using proc genmod instead of glimmix because I only have fixed effects. In that case I guess I would use a normal distribution as you suggested. In the case of a continuous outcome var that is an average (fruit weight), should I also use normal instead of gamma? &lt;BR /&gt;Thank you StatDave!</description>
      <pubDate>Wed, 16 Apr 2025 01:40:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964383#M48361</guid>
      <dc:creator>palolix</dc:creator>
      <dc:date>2025-04-16T01:40:23Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964384#M48362</link>
      <description>You can still use GLIMMIX if you don't have random effects - just don't include a RANDOM statement. Again, see the usage note I referred to. For your continuous proportions, you can fit the fractional logistic model using either GLIMMIX or LOGISTIC as shown there. In the example there, the RANDOM statement in GLIMMIX is used just to estimate a reasonable scale parameter (it does not add a random effect) in the same way as the SCALE=PEARSON option is used in PROC LOGISTIC in that same note. Both use the binomial distribution. But as I've said, if the data in each separate population exhibits a distribution that is reasonably normal, then an analysis using that distribution might be reasonable. You might want to try both and see what trade-offs there are.</description>
      <pubDate>Wed, 16 Apr 2025 01:49:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964384#M48362</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2025-04-16T01:49:49Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964385#M48363</link>
      <description>&lt;P&gt;You can try beta regression if your dependent variable lies in [0,1]. As you have said, the dependent variable of your model is a percentage, so it should lie in this interval. Beta regression is particularly suitable for handling heteroscedasticity in the model. As &lt;A href="https://support.sas.com/kb/57/480.html" target="_blank"&gt;57480 - Modeling continuous proportions: Normal and Beta Regression Models&lt;/A&gt;&amp;nbsp;says, beta regression is supported in the GLIMMIX procedure. Alternatively, see&amp;nbsp;&lt;A href="https://support.sas.com/resources/papers/proceedings11/335-2011.pdf" target="_blank"&gt;335-2011: Modeling Percentage Outcomes: The %Beta_Regression Macro&lt;/A&gt;&amp;nbsp;for a macro of beta regression.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Apr 2025 05:10:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964385#M48363</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-04-16T05:10:00Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964387#M48364</link>
      <description>You could use dist=binomial and link=identity to directly build a model for a ratio/percent, but you need to restructure your data into a binary response (Y=0 or 1):&lt;BR /&gt;             &lt;A href="https://support.sas.com/kb/37/228.html" target="_blank"&gt;https://support.sas.com/kb/37/228.html&lt;/A&gt;&lt;BR /&gt;  proc genmod data=test descending;&lt;BR /&gt;         class a;&lt;BR /&gt;         model y = a / dist=binomial link=identity;&lt;BR /&gt;         lsmeans a / diff cl;&lt;BR /&gt;         run;&lt;BR /&gt;&lt;BR /&gt;Or try Poisson regression as StatDave said.&lt;BR /&gt;But you also need to change your data structure.&lt;BR /&gt;&lt;A href="https://support.sas.com/kb/24/188.html" target="_blank"&gt;https://support.sas.com/kb/24/188.html&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt; proc genmod data=insure;&lt;BR /&gt;         class car age;&lt;BR /&gt;         model c = car age / dist=poisson link=log offset=ln;&lt;BR /&gt;         run;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 16 Apr 2025 02:43:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964387#M48364</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2025-04-16T02:43:15Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964388#M48365</link>
      <description>Thanks Ksharp,&lt;BR /&gt;&lt;BR /&gt;So what if it is a continuous percentage like 45.76 and I don't want to convert my data? Then I guess I will use the normal dist in proc genmod or glimmix, as StatDave suggested.</description>
      <pubDate>Wed, 16 Apr 2025 03:40:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964388#M48365</guid>
      <dc:creator>palolix</dc:creator>
      <dc:date>2025-04-16T03:40:22Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964390#M48366</link>
      <description>No. You can't expect a percentage to follow a normal distribution when the true percentage is near zero or one; &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt; has discussed this topic elsewhere.&lt;BR /&gt;But from the graph you posted, it is near 0.5 and has a bell shape, so I think using a normal or lognormal dist is decent.</description>
      <pubDate>Wed, 16 Apr 2025 06:35:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964390#M48366</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2025-04-16T06:35:59Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964393#M48367</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/6496"&gt;@palolix&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Thanks Ksharp,&lt;BR /&gt;&lt;BR /&gt;So what if it is a continuous percentage like 45.76 and I don't want to convert my data? Then I guess I will use the normal dist in proc genmod or glimmix, as StatDave suggested.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I think it might be better to take a second look at what &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;said. It seems that &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;was not telling you to build normal regression models with PROC &lt;SPAN&gt;GLIMMIX. Instead, &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;was referring you to PROC GLIMMIX if you chose to build fractional logistic regression models.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;Your goal of keeping the dependent variable as it is (i.e., no transformation) might not be possible if you followed &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;'s suggestion and used the fractional logistic regression model or followed mine and used the beta regression model, as both models involve the logit transformation of the dependent variable.&lt;/P&gt;
&lt;P&gt;It is up to you to decide which model to use. But now that you have raised your preference about whether or not to transform the dependent variable, I think you may take a look at a statistical field that receives relatively little attention: regression on the area under the receiver operating characteristic curve (AUC). The AUC is a well-known measure of diagnostic and predictive accuracy and, similar to the dependent variable of your model, lies in the interval [0,1]. Regression models of AUC (i.e., with the AUC as the dependent variable and variables that correlate with the diagnostic or predictive accuracy as the independent variables) have been developed. See, for instance,&amp;nbsp;&lt;A title="REGRESSION ANALYSIS FOR THE PARTIAL AREA UNDER THE ROC CURVE" href="https://www3.stat.sinica.edu.tw/statistica/oldpdf/A18n31.pdf" target="_blank" rel="noopener"&gt;REGRESSION ANALYSIS FOR THE PARTIAL AREA UNDER THE ROC CURVE&lt;/A&gt;. You may take a look at this field to find out if there is a method that is not only capable of modeling doubly-bounded data that lie in [0,1] but also does not need transformation of the dependent variable, which is the goal you proposed.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Apr 2025 07:40:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964393#M48367</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-04-16T07:40:07Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964448#M48370</link>
      <description>&lt;P&gt;Look up leaf area index (LAI) and see what methods have been used for analyzing that endpoint. LAI is the proportion (or percentage/100) of the area of a given plot or transect that is covered by at least one leaf when viewed perpendicular to the ground. It is defined on the interval (0,1), bounded away from zero and one. When I last looked at the analyses that various folks used, there were a lot of options. Some have been mentioned here (beta regression, fractional logistic regression), but I am going to throw my support to some sort of resampling with replacement. Judging from the histogram you have a lot of observations, so taking samples of a relative size of (for example) total plots/20 and generating 5000 samples should not be difficult. From that, you can appeal to the central limit theorem to get means and confidence intervals. This might be more appropriate for your long right tail and non-unimodal data, which really looks like a mixture of two distributions to me.&lt;/P&gt;
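&lt;P&gt;A minimal sketch of that resampling idea, assuming the data set one and the variable PercDM from the original post. The output data set names (boot, bootmeans, bootci) and the variable names MeanDM and BootMean are hypothetical, and the 5% sampling rate simply mirrors the "total plots/20" example above.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Draw 5000 resamples with replacement, each about 1/20 of the data */
/* boot, bootmeans and bootci are hypothetical data set names */
proc surveyselect data=one(where=(Season=2022)) out=boot
      seed=1 method=urs samprate=0.05 reps=5000 outhits;
run;

/* Mean percentage of dry matter in each resample */
proc means data=boot noprint nway;
   class Replicate;
   var PercDM;
   output out=bootmeans mean=MeanDM;
run;

/* Average of the resample means and a 95% percentile interval */
proc univariate data=bootmeans noprint;
   var MeanDM;
   output out=bootci mean=BootMean pctlpts=2.5 97.5 pctlpre=P_;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This pools all plots in the season. To get intervals for each Harvest by Variety combination, one option would be to sort the data and add a STRATA statement to PROC SURVEYSELECT, then carry those variables through the later steps. Note also that the spread of these means reflects resamples of about 1/20 of the data, not the full sample size.&lt;/P&gt;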
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;No guarantees, no warranty implied.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SteveDenham&lt;/P&gt;</description>
      <pubDate>Wed, 16 Apr 2025 17:27:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964448#M48370</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2025-04-16T17:27:18Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964451#M48371</link>
      <description>&lt;P&gt;Thank you very much for your feedback Steve and the rest of the folks helping me in this post!&lt;/P&gt;
&lt;P&gt;I analyzed the data with genmod using a normal distribution and also in glimmix using a beta dist and got almost identical adjusted p-values&amp;nbsp; for the differences between varieties within each harvest. I am just surprised that when comparing the percentage of dry matter between varieties I get so many significant differences at many harvest points even with adjusted p-values (see graph).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc genmod data=one;&lt;BR /&gt;where Season=2021;&lt;BR /&gt;class Harvest Variety;&lt;BR /&gt;model PercDM=Harvest*Variety/type3 dist=normal link= log;&lt;BR /&gt;slice Harvest*Variety/sliceby=Harvest&amp;nbsp; adjust=simulate(seed=1);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc glimmix data=one;&lt;BR /&gt;where Season=2021;&lt;BR /&gt;PercDMp=PercDM/100;&lt;BR /&gt;class Harvest Variety;&lt;BR /&gt;model PercDMp=Harvest*Variety/ dist=beta ddfm=kr;&lt;BR /&gt;lsmeans Harvest*Variety/slicediff=Harvest adjust=simulate(seed=1);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="palolix_0-1744827251027.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/106288i998DC627B2877097/image-size/medium?v=v2&amp;amp;px=400" role="button" title="palolix_0-1744827251027.png" alt="palolix_0-1744827251027.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Apr 2025 18:16:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964451#M48371</guid>
      <dc:creator>palolix</dc:creator>
      <dc:date>2025-04-16T18:16:01Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964465#M48372</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/6496"&gt;@palolix&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thank you very much for your feedback Steve and the rest of the folks helping me in this post!&lt;/P&gt;
&lt;P&gt;I analyzed the data with genmod using a normal distribution and also in glimmix using a beta dist and got almost identical adjusted p-values&amp;nbsp; for the differences between varieties within each harvest. I am just surprised that when comparing the percentage of dry matter between varieties I get so many significant differences at many harvest points even with adjusted p-values (see graph).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc genmod data=one;&lt;BR /&gt;where Season=2021;&lt;BR /&gt;class Harvest Variety;&lt;BR /&gt;model PercDM=Harvest*Variety/type3 dist=normal link= log;&lt;BR /&gt;slice Harvest*Variety/sliceby=Harvest&amp;nbsp; adjust=simulate(seed=1);&lt;BR /&gt;run;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Interesting discovery. But I am afraid that this code does not fit an ordinary normal regression model, as there is a "link=log" option specified in the MODEL statement. With dist=normal, that option fits a normal model in which the log of the mean is linear in the predictors (this is not the same as a lognormal model, which would assume that the log of the response is normal). The "link=identity" option models the mean of the dependent variable directly and builds an ordinary normal regression model.&lt;/P&gt;
&lt;P&gt;You can also try out other options we suggested in this post. In addition to this, now that you have obtained similar &lt;EM&gt;P&lt;/EM&gt; values from the log-link normal model and the beta model, you can work on their goodness-of-fit and statistical diagnostics, two more advanced aspects of statistical modeling. Goodness-of-fit can be measured by statistics like the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are routinely output by most SAS statistical procedures, including PROC GLIMMIX and PROC GENMOD.&lt;/P&gt;
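&lt;P&gt;As a rough sketch of how the two links could be compared, the fits below differ only in the LINK= option. The data set and variable names are taken from the code quoted above; nothing else is assumed. Because both fits use the same response and the same distribution, their AIC and BIC values are on the same scale.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Normal errors, identity link: the mean itself is linear in the predictors */
proc genmod data=one;
   where Season=2021;
   class Harvest Variety;
   model PercDM = Harvest*Variety / type3 dist=normal link=identity;
run;

/* Normal errors, log link: log(mean) is linear in the predictors */
proc genmod data=one;
   where Season=2021;
   class Harvest Variety;
   model PercDM = Harvest*Variety / type3 dist=normal link=log;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Since the model contains only the Harvest*Variety cell effects, both links may well reproduce the same fitted cell means, in which case the fit statistics would coincide and only the scale on which differences are tested would change.&lt;/P&gt;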
&lt;P&gt;Statistical diagnostics revolve around testing whether the underlying assumptions of the fitted statistical model are tenable. To start with, you can run residual diagnostics, which have been studied so extensively for logistic and beta regression that there is plenty of literature to reference.&lt;/P&gt;
&lt;P&gt;As I have said, statistical diagnostics is a relatively advanced topic in regression modeling. So it is generally not deemed as a must for the time being, even in the non-statistical academic setting. For instance, take a look at medical research papers and you will find out that few of them stated that the researchers had conducted statistical diagnostics in the modeling process. However, if you do conduct it, the robustness of your results is greatly enhanced.&lt;/P&gt;</description>
      <pubDate>Thu, 17 Apr 2025 02:16:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964465#M48372</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-04-17T02:16:18Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964524#M48373</link>
      <description>&lt;P&gt;Just for fun, consider a modeling approach that doesn't assume homogeneity of variance. Working from your GLIMMIX code, try something like:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Changes are in &lt;FONT color="#FF0000"&gt;red&lt;/FONT&gt;  */&lt;BR /&gt;proc glimmix data=one;
where Season=2021;
PercDMp=PercDM/100;
class Harvest Variety;
model PercDMp=Harvest*Variety/ dist=beta ddfm=kr&lt;FONT color="#FF0000"&gt;2&lt;/FONT&gt;;
&lt;FONT color="#FF0000"&gt;random _residual_ / group=Variety;&lt;/FONT&gt;
lsmeans Harvest*Variety/slicediff=Harvest adjust=simulate(seed=1);
&lt;FONT color="#FF0000"&gt;covtest 'common variance' homogeneity;&lt;/FONT&gt; 
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If it turns out that the likelihood ratio test for variance homogeneity for Variety is not significant, try it again grouping by Harvest. I really don't know if either will affect your conclusions, but at least you have dealt with a common assumption (homogeneity) that may not be true for your data.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The other thing to look at is to try a different method than the default RSPL. Consider method=laplace, so that the error variance component is included in the optimization. That may yield standard errors that are more in line with expectations.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SteveDenham&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 17 Apr 2025 17:02:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964524#M48373</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2025-04-17T17:02:55Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964525#M48374</link>
      <description>&lt;P&gt;Thank you so much for your reply Season! Thanks for pointing out the link mistake. I changed to link=identity but I am still getting very similar p-values, and they are still significant. When comparing the AIC and BIC values for the genmod and glimmix models I am getting huge differences:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc genmod data=one;&lt;BR /&gt;where Season=2022;&lt;BR /&gt;class Harvest Variety;&lt;BR /&gt;model PercDM=Harvest*Variety/type3 dist=normal link=identity;&lt;BR /&gt;slice Harvest*Variety/sliceby=Harvest sliceby=Variety adjust=simulate(seed=1);&lt;BR /&gt;run;&lt;BR /&gt;/*AIC 3984, BIC 4287*/&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc glimmix data=one;&lt;BR /&gt;where Season=2022;&lt;BR /&gt;PercDMp=PercDM/100;&lt;BR /&gt;class Harvest Variety;&lt;BR /&gt;model PercDMp=Harvest*Variety/ dist=beta ddfm=kr;&lt;BR /&gt;lsmeans Harvest*Variety/slicediff=Harvest adjust=simulate(seed=1);&lt;BR /&gt;run;&lt;BR /&gt;/*AIC -4342, BIC -4040*/&lt;/P&gt;
&lt;P&gt;Do I have to use something like this to check the residuals?&lt;/P&gt;
&lt;P&gt;proc reg data=one;&lt;BR /&gt;where Season=2022;&lt;BR /&gt;model PercDM=Harvest/dw clb;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you very much&lt;/P&gt;
&lt;P&gt;Caroline&lt;/P&gt;</description>
      <pubDate>Thu, 17 Apr 2025 17:13:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964525#M48374</guid>
      <dc:creator>palolix</dc:creator>
      <dc:date>2025-04-17T17:13:44Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964527#M48375</link>
      <description>&lt;P&gt;Thank you so much Steve for continuing to help me. Ok, I tried that code and the&amp;nbsp;&lt;SPAN&gt;likelihood ratio test for variance homogeneity for Variety was significant so I didn't change the group. I am still getting the same significant p-values. I tried laplace but got the same thing.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I am still surprised that I am getting these significant p-values with whatever model I try (see asterisks in graph; it doesn't look to me as if those differences in perc dry matter between the varieties are significant, especially when adjusting the p-values). 1, 2 and 4 in 2024 are the harvest months.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="palolix_0-1744912194526.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/106311i7CEDB7EF07307450/image-size/medium?v=v2&amp;amp;px=400" role="button" title="palolix_0-1744912194526.png" alt="palolix_0-1744912194526.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 17 Apr 2025 17:54:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964527#M48375</guid>
      <dc:creator>palolix</dc:creator>
      <dc:date>2025-04-17T17:54:08Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964595#M48383</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/6496"&gt;@palolix&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thank you so much for your reply Season! Thanks for pointing out the link mistake. I changed to link=identity but still getting very similar p-values and still significant.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Thank you for sharing your interesting discovery! So it seems what you observe is somewhat different from what&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;thought would happen.&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13633"&gt;@StatDave&lt;/a&gt;&amp;nbsp;said choosing the normal model might lead to altered standard errors and&amp;nbsp;&lt;EM&gt;P&lt;/EM&gt; values. So now the&amp;nbsp;&lt;EM&gt;P&amp;nbsp;&lt;/EM&gt;values are hardly affected by model misspecification. Do you mind sharing what impressions the standard errors of parameter estimates leave on you? You do not need to conduct formal hypothesis testing on the equivalence of standard errors. Simply sharing how you feel about this issue suffices.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/6496"&gt;@palolix&lt;/a&gt;&amp;nbsp;wrote:
&lt;P&gt;proc genmod data=one;&lt;BR /&gt;where Season=2022;&lt;BR /&gt;class Harvest Variety;&lt;BR /&gt;model PercDM=Harvest*Variety/type3 dist=normal link=identity;&lt;BR /&gt;slice Harvest*Variety/sliceby=Harvest sliceby=Variety adjust=simulate(seed=1);&lt;BR /&gt;run;&lt;BR /&gt;/*AIC 3984, BIC 4287*/&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc glimmix data=one;&lt;BR /&gt;where Season=2022;&lt;BR /&gt;PercDMp=PercDM/100;&lt;BR /&gt;class Harvest Variety;&lt;BR /&gt;model PercDMp=Harvest*Variety/ dist=beta ddfm=kr;&lt;BR /&gt;lsmeans Harvest*Variety/slicediff=Harvest adjust=simulate(seed=1);&lt;BR /&gt;run;&lt;BR /&gt;/*AIC -4342, BIC -4040*/&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I wonder if you mistakenly entered the AIC and BIC values of the model produced by PROC GLIMMIX? I have never seen negative AIC and BIC values. I just ran PROC GLIMMIX on my own data to check my suspicion, and the results agreed with it: my AIC and BIC values are positive.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/6496"&gt;@palolix&lt;/a&gt;&amp;nbsp;wrote:
&lt;P&gt;Do I have to use something like this to check the residuals?&lt;/P&gt;
&lt;P&gt;proc reg data=one;&lt;BR /&gt;where Season=2022;&lt;BR /&gt;model PercDM=Harvest/dw clb;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you very much&lt;/P&gt;
&lt;P&gt;Caroline&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;You can use the REG procedure to let SAS compute the residuals, but you do not have to. You can stay in PROC GENMOD and add&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;output out=xxx resraw=r; /* xxx is a placeholder data set name */&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;to your code. This produces a dataset named xxx where raw residuals are stored in the variable named r. Other kinds of residuals (e.g., deviance residuals) can also be output from PROC GENMOD.&lt;/P&gt;
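&lt;P&gt;For illustration only, here is one way the OUTPUT statement could be combined with a couple of residual plots. It assumes the GENMOD model you posted; the data set name diag and the variable names pred, r and rdev are hypothetical.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc genmod data=one;
   where Season=2022;
   class Harvest Variety;
   model PercDM = Harvest*Variety / type3 dist=normal link=identity;
   /* diag, pred, r and rdev are hypothetical names */
   output out=diag pred=pred resraw=r resdev=rdev;
run;

/* Residuals against fitted values: look for funnels or curvature */
proc sgplot data=diag;
   scatter x=pred y=r;
   refline 0 / axis=y;
run;

/* Histogram and normal Q-Q plot of the raw residuals */
proc univariate data=diag normal;
   var r;
   histogram r / normal;
   qqplot r / normal(mu=est sigma=est);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The OUTPUT statement itself only writes the data set; the plots come from the later steps, so some follow-up procedure is needed before anything shows up in the results.&lt;/P&gt;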
&lt;P&gt;In PROC GLIMMIX, add&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;output out=xxx resid=r;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;to your code.&amp;nbsp;This produces a dataset named xxx where residuals are stored in the variable named r. Again, other kinds of residuals can also be output. Take a look at SAS Help to find out more.&lt;/P&gt;</description>
      <pubDate>Fri, 18 Apr 2025 16:20:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964595#M48383</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-04-18T16:20:24Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964597#M48384</link>
      <description>My earlier statement:&lt;BR /&gt;     "The penalty in that case with choosing the wrong distribution is small, affecting primarily the size of the standard errors which will, in turn, affect the significance of the tests." &lt;BR /&gt;does not suggest that using the normal distribution will result in larger standard errors. It merely notes that a different distribution will change the standard errors, but that change could be very small or larger. The nature and size of the change depends on the model and data. I later noted that using the normal distribution might provide a reasonable analysis and encouraged trying more than one approach.</description>
      <pubDate>Fri, 18 Apr 2025 15:53:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964597#M48384</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2025-04-18T15:53:45Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964599#M48385</link>
      <description>&lt;P&gt;Thank you for your correction! Yes, I did not remember your message precisely. Sorry for any inconvenience this might have caused. I have edited my original post accordingly.&lt;/P&gt;</description>
      <pubDate>Fri, 18 Apr 2025 16:21:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964599#M48385</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-04-18T16:21:53Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964654#M48386</link>
      <description>&lt;P&gt;Thank you for your suggestion. Yes, the AIC and BIC values are correct (negative). The estimates and SE from the glimmix model using beta dist are much lower than the ones using genmod and normal dist.&lt;/P&gt;
&lt;P&gt;I added to the glimmix model&amp;nbsp;&lt;/P&gt;
&lt;PRE class="language-sas"&gt;&lt;CODE&gt;output out=xxx resid=r;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;but nothing happened.&amp;nbsp; Maybe I am missing something?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE class="language-sas"&gt;&lt;CODE&gt;proc glimmix data=one;
  where Season=2021;
  PercDMp=PercDM/100;
  class Harvest Variety;
  model PercDMp=Harvest*Variety / dist=beta ddfm=kr;
  output out=xxx resid=r;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you very much Season!&lt;/P&gt;</description>
      <pubDate>Fri, 18 Apr 2025 22:14:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964654#M48386</guid>
      <dc:creator>palolix</dc:creator>
      <dc:date>2025-04-18T22:14:01Z</dc:date>
    </item>
    <item>
      <title>Re: Distribution for percentages in proc genmod</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964666#M48387</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/6496"&gt;@palolix&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thank you for your suggestion. Yes, the AIC and BIC values are correct (negative).&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;That is interesting. You might have noticed the note "(smaller is better)" next to "AIC" and "BIC" in the SAS output. It indicates that, among models fit to the same response, those with smaller AIC or BIC values fit better than those with larger values. Given that the AIC and BIC of the beta regression model are both negative while their counterparts for the normal model are positive, the two statistics suggest that the beta model fits the data much better than the normal model does. This comparison is only valid if both models were fit to the response on the same scale (for example, both to PercDMp rather than one to PercDM and the other to PercDMp), because rescaling the response shifts the log likelihood and hence the AIC and BIC.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/6496"&gt;@palolix&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;The estimates and SE from the glimmix model using beta dist are much lower than the ones using genmod and normal dist.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
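&lt;P&gt;That is another interesting finding. In your data, the normal model, which appears to be the misspecified one, gives larger standard errors. Given that the resulting &lt;EM&gt;P&lt;/EM&gt; values are similar, the regression coefficient estimates of the normal model must be correspondingly larger than those of the beta model, which matches what you report. Keep in mind that the two sets of estimates are not directly comparable: unless you specified otherwise, the beta model in PROC GLIMMIX uses a logit link while the normal model in PROC GENMOD uses an identity link, so the coefficients are on different scales.&lt;/P&gt;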
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/6496"&gt;@palolix&lt;/a&gt;&amp;nbsp;wrote:
&lt;P&gt;I added to the glimmix model&amp;nbsp;&lt;/P&gt;
&lt;PRE class="language-sas"&gt;&lt;CODE&gt;output out=xxx resid=r;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;but nothing happened.&amp;nbsp; Maybe I am missing something?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE class="language-sas"&gt;&lt;CODE&gt;proc glimmix data=one;
  where Season=2021;
  PercDMp=PercDM/100;
  class Harvest Variety;
  model PercDMp=Harvest*Variety / dist=beta ddfm=kr;
  output out=xxx resid=r;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you very much Season!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
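&lt;P&gt;A minimal sketch for checking the result, assuming the PROC GLIMMIX step above ran without errors in the log and using the names xxx and r from your code:&lt;/P&gt;
&lt;PRE class="language-sas"&gt;&lt;CODE&gt;/* List the first few rows of the dataset created by OUTPUT */
proc print data=work.xxx(obs=10);
run;

/* Inspect the residuals stored in the variable r */
proc univariate data=work.xxx normal;
  var r;
  histogram r / normal;
  qqplot r / normal(mu=est sigma=est);
run;&lt;/CODE&gt;&lt;/PRE&gt;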
&lt;P&gt;The OUTPUT statement does not print anything in the results; it writes a dataset to a SAS library. With your code, the dataset named "xxx" is stored in the Work library, and you can open that library to find it.&lt;/P&gt;</description>
      <pubDate>Sat, 19 Apr 2025 07:35:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Distribution-for-percentages-in-proc-genmod/m-p/964666#M48387</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-04-19T07:35:50Z</dc:date>
    </item>
  </channel>
</rss>

