<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: hpgenselect for continuous target variable in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373985#M19589</link>
    <description>&lt;P&gt;As others have correctly pointed out, there are a few ways to fit models to data with a beta distribution. GLIMMIX is the easiest way. However, since the original question dealt with HPGENSELECT, one would assume that they were trying to do variable selection from a large number of potential predictor variables. That cannot be done in an automated way with GLIMMIX or NLMIXED.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One should always be careful with the beta distribution: it is defined for 0 &amp;lt; y &amp;lt; 1. This means that all values of y equal to 0 or 1 will become missing values in GLIMMIX. My experience is that datasets with continuous proportions usually have 0s and 1s.&lt;/P&gt;</description>
    <pubDate>Fri, 07 Jul 2017 14:31:50 GMT</pubDate>
    <dc:creator>lvm</dc:creator>
    <dc:date>2017-07-07T14:31:50Z</dc:date>
    <item>
      <title>hpgenselect for continuous target variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373450#M19539</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am unsure whether PROC HPGENSELECT can be applied when the target is continuous and follows a beta distribution. I do not want to use beta regression. If HPGENSELECT does not work, is there another approach that does?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind Regards&lt;/P&gt;
&lt;P&gt;SK&lt;/P&gt;</description>
      <pubDate>Wed, 05 Jul 2017 22:21:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373450#M19539</guid>
      <dc:creator>Siddharth123</dc:creator>
      <dc:date>2017-07-05T22:21:04Z</dc:date>
    </item>
    <item>
      <title>Re: hpgenselect for continuous target variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373480#M19541</link>
      <description>&lt;P&gt;Unfortunately, PROC HPGENSELECT cannot handle the beta distribution. As an approximation, you could use PROC GLMSELECT, with the WEIGHT statement to account for unequal variances of Y.&lt;/P&gt;</description>
      <pubDate>Thu, 06 Jul 2017 01:41:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373480#M19541</guid>
      <dc:creator>lvm</dc:creator>
      <dc:date>2017-07-06T01:41:13Z</dc:date>
    </item>
    <item>
      <title>Re: hpgenselect for continuous target variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373529#M19545</link>
      <description>&lt;P&gt;Or you can use PROC HPNLMOD. The beta density is quite simple, so you can write the log-likelihood yourself inside HPNLMOD and pass it to the "general" distribution in the MODEL statement.&lt;/P&gt;</description>
      <pubDate>Thu, 06 Jul 2017 08:16:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373529#M19545</guid>
      <dc:creator>JacobSimonsen</dc:creator>
      <dc:date>2017-07-06T08:16:43Z</dc:date>
    </item>
    <item>
      <title>Re: hpgenselect for continuous target variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373602#M19546</link>
      <description>&lt;P&gt;Here is a simple example of how to find the maximum likelihood estimates of the two parameters when all observations are beta-distributed with the same parameters. I think the example can easily be extended to situations where the data contain covariates.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data simulation;
  do i=1 to 1000;
    y=rand('beta',2,3);
    sqy=y**2;
    output;
  end;
run;

*Starting values come from the method of moments, so the means of y and y^2 are computed first.;
proc means data=simulation mean ;
  var y sqy;
  output out=startvalues mean=y sqy;
run;

data _NULL_;
  set startvalues;
  a=y*(y-sqy)/(sqy-y**2);
  b=(y-1)*(sqy-y)/(sqy-y**2);
  put a= b=;
  call symput('starta',put(a,best.));
  call symput('startb',put(b,best.));
run;

*The maximum likelihood estimates are found here.;
*The moment estimators from above are used as starting values.;

proc hpnlmod data=simulation;
  parm a &amp;amp;starta. b &amp;amp;startb.;
  ll=(a-1)*log(y)+(b-1)*log(1-y)-logbeta(a,b);
  model i~general(ll);
run;
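
*Sketch only, not part of the fit above: one way the example might be extended to a covariate.;
*Assumptions: x is a hypothetical covariate, the mean is linked through a logit,;
*and the beta parameters are written as a=mu*phi and b=(1-mu)*phi (mean and precision).;
data simreg;
  call streaminit(1);
  do i=1 to 1000;
    x=rand('normal');
    m=1/(1+exp(-(-0.5+0.8*x)));
    y=rand('beta',m*5,(1-m)*5);
    output;
  end;
  drop m;
run;

*The starting values 0 are arbitrary. logphi keeps the precision positive without bounds.;
proc hpnlmod data=simreg;
  parms b0 0 b1 0 logphi 0;
  mu=1/(1+exp(-(b0+b1*x)));
  phi=exp(logphi);
  ll=(mu*phi-1)*log(y)+((1-mu)*phi-1)*log(1-y)-logbeta(mu*phi,(1-mu)*phi);
  model y~general(ll);
run;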
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 06 Jul 2017 11:46:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373602#M19546</guid>
      <dc:creator>JacobSimonsen</dc:creator>
      <dc:date>2017-07-06T11:46:02Z</dc:date>
    </item>
    <item>
      <title>Re: hpgenselect for continuous target variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373622#M19548</link>
      <description>&lt;P&gt;I like JacobSimonsen's approach.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10078"&gt;@JacobSimonsen&lt;/a&gt;, could you share why you decided to go with PROC HPNLMOD? I would have chosen PROC NLMIXED, like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc nlmixed data=simulation;
  parms a &amp;amp;starta. b &amp;amp;startb.;
  bounds 0 &amp;lt; a,b;
  ll=(a-1)*log(y)+(b-1)*log(1-y)-logbeta(a,b);
  model y ~ general(ll);
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/24118"&gt;@Siddharth123&lt;/a&gt;, if you want to see additional examples of formulating models as MLE problems and solving them with SAS procedures (such as NLMIXED), see&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://blogs.sas.com/content/iml/2017/06/12/log-likelihood-function-in-sas.html" target="_self"&gt;"Two simple ways to construct a log-likelihood function in SAS"&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://blogs.sas.com/content/iml/2017/06/14/maximum-likelihood-estimates-in-sas.html" target="_self"&gt;"Two ways to compute maximum likelihood estimates in SAS"&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Thu, 06 Jul 2017 12:29:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373622#M19548</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2017-07-06T12:29:44Z</dc:date>
    </item>
    <item>
      <title>Re: hpgenselect for continuous target variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373626#M19549</link>
      <description>&lt;P&gt;My simple rule of thumb for choosing between PROC HPNLMOD and PROC NLMIXED is that if I have random effects I use NLMIXED, and otherwise HPNLMOD. That is simply because HPNLMOD is generally faster. In this case I have no strong opinion about which of the two procedures should be used. Why would you choose NLMIXED?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I agree that it is wise to include the BOUNDS statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I find it a bit funny that when the "general" likelihood is used, it doesn't matter which variable is on the left side of "~". Both NLMIXED and HPNLMOD&amp;nbsp;require a variable there.&lt;/P&gt;</description>
      <pubDate>Thu, 06 Jul 2017 12:41:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373626#M19549</guid>
      <dc:creator>JacobSimonsen</dc:creator>
      <dc:date>2017-07-06T12:41:48Z</dc:date>
    </item>
    <item>
      <title>Re: hpgenselect for continuous target variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373976#M19585</link>
      <description>&lt;P&gt;You can fit a beta model using PROC GLIMMIX or PROC FMM. &amp;nbsp;See the DIST=BETA option in the MODEL statement. See &lt;A href="http://support.sas.com/kb/57/480.html" target="_self"&gt;this example&lt;/A&gt; of using the beta distribution in GLIMMIX to model a continuous proportion response.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Jul 2017 13:59:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373976#M19585</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2017-07-07T13:59:49Z</dc:date>
    </item>
    <item>
      <title>Re: hpgenselect for continuous target variable</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373985#M19589</link>
      <description>&lt;P&gt;As others have correctly pointed out, there are a few ways to fit models to data with a beta distribution. GLIMMIX is the easiest way. However, since the original question dealt with HPGENSELECT, one would assume that they were trying to do variable selection from a large number of potential predictor variables. That cannot be done in an automated way with GLIMMIX or NLMIXED.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One should always be careful with the beta distribution: it is defined for 0 &amp;lt; y &amp;lt; 1. This means that all values of y equal to 0 or 1 will become missing values in GLIMMIX. My experience is that datasets with continuous proportions usually have 0s and 1s.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Jul 2017 14:31:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/hpgenselect-for-continuous-target-variable/m-p/373985#M19589</guid>
      <dc:creator>lvm</dc:creator>
      <dc:date>2017-07-07T14:31:50Z</dc:date>
    </item>
  </channel>
</rss>

