Contributor
Posts: 57

# hpgenselect for continuous target variable

Hi,

I am unsure whether PROC HPGENSELECT can be applied when the target is continuous and follows a beta distribution. I do not want to use beta regression. If HPGENSELECT does not work, is there another approach that does?

Kind Regards

SK

Valued Guide
Posts: 684

## Re: hpgenselect for continuous target variable

Unfortunately, this procedure cannot handle the beta distribution. As an approximation, you could use PROC GLMSELECT, with the WEIGHT statement to account for unequal variances of Y.
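
A minimal sketch of that approximation, assuming a hypothetical data set `mydata` with candidate predictors `x1`, `x2`, `x3` and a precomputed weight variable `w` (none of these names come from the original question):

``````
/* Sketch only: mydata, x1-x3 and w are hypothetical names. */
/* w should be roughly proportional to 1/Var(Y) for each observation. */
proc glmselect data=mydata;
model y = x1 x2 x3 / selection=stepwise;
weight w;
run;
``````

How the weights are obtained is left open here; one option is to estimate them from a preliminary unweighted fit.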

Super Contributor
Posts: 298

## Re: hpgenselect for continuous target variable

Or you can use PROC HPNLMOD. The beta distribution is quite simple, so you can write the log-likelihood yourself inside HPNLMOD and use the "general" likelihood in the MODEL statement.

Super Contributor
Posts: 298

## Re: hpgenselect for continuous target variable

Here is a simple example of how you can find the maximum likelihood estimates of the two parameters when all observations are beta-distributed with the same parameters. I think the example can easily be extended to situations where the data include covariates.

``````
data simulation;
do i=1 to 1000;
y=rand('beta',2,3);
sqy=y**2;
output;
end;
run;

*start values are found by the moment method. Therefore, mean of y and y^2 are calculated.;
proc means data=simulation mean ;
var y sqy;
output out=startvalues mean=y sqy;
run;

data _NULL_;
set startvalues;
a=y*(y-sqy)/(sqy-y**2);
b=(y-1)*(sqy-y)/(sqy-y**2);
put a= b=;
call symput('starta',put(a,best.));
call symput('startb',put(b,best.));
run;

*here the likelihood estimates will be found;
*The moment estimators from above are used as starting values;

proc hpnlmod data=simulation;
parm a &starta. b &startb.;
ll=(a-1)*log(y)+(b-1)*log(1-y)-logbeta(a,b);
model i~general(ll);
run;
``````
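
For reference, the moment-based starting values and the `ll` expression in the code above correspond to the formulas below. The symbols $m$ (mean of `y`) and $s$ (mean of `sqy`) are introduced here only for illustration.

$$
\hat a = \frac{m\,(m - s)}{s - m^{2}}, \qquad
\hat b = \frac{(m - 1)\,(s - m)}{s - m^{2}}, \qquad
\log f(y \mid a, b) = (a-1)\log y + (b-1)\log(1-y) - \log B(a,b)
$$

As a check, the simulated distribution with $a = 2$, $b = 3$ has $m = 0.4$ and $s = 0.2$, so $\hat a = 0.4 \cdot 0.2 / 0.04 = 2$ and $\hat b = (-0.6)(-0.2)/0.04 = 3$.
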
SAS Super FREQ
Posts: 3,753

## Re: hpgenselect for continuous target variable

I like JacobSimonsen's approach.

@JacobSimonsen, could you share why you decided to go with PROC HPNLMOD?  I would have chosen PROC NLMIXED, like this:

``````
proc nlmixed data=simulation;
parms a &starta. b &startb.;
bounds 0 < a,b;
ll=(a-1)*log(y)+(b-1)*log(1-y)-logbeta(a,b);
model y ~ general(ll);
run;
``````
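
The extension to covariates mentioned earlier in the thread could look roughly like this. It is only a sketch: the data set `mydata`, the covariate `x`, and the mean/precision parameterization (`mu`, `phi`) are assumptions, not something the earlier posts specify.

``````
/* Sketch only: mydata and x are hypothetical names. */
proc nlmixed data=mydata;
parms b0=0 b1=0 logphi=0;
mu = logistic(b0 + b1*x);   /* mean of y, kept inside (0,1) */
phi = exp(logphi);          /* precision, kept positive */
a = mu*phi;                 /* back to the usual shape parameters */
b = (1-mu)*phi;
ll = (a-1)*log(y) + (b-1)*log(1-y) - logbeta(a,b);
model y ~ general(ll);
run;
``````

Because `mu` and `phi` are transformed to stay in range, this version does not need a BOUNDS statement.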

@Siddharth123, if you want to see additional examples formulating models as MLE problems and using SAS procedures (such as NLMIXED) to solve, see

Super Contributor
Posts: 298

## Re: hpgenselect for continuous target variable

My simple rule of thumb for choosing between PROC HPNLMOD and PROC NLMIXED is that if I have random effects I use NLMIXED, and otherwise HPNLMOD. That is simply because HPNLMOD is in general faster. In this case I have no strong opinion about which of the two procedures should be used. Why would you choose NLMIXED?

I agree that it is wise to include the BOUNDS statement.

I find it a bit funny that when the "general" likelihood is used, it doesn't matter which variable is on the left side of "~". Both NLMIXED and HPNLMOD require a variable there.

SAS Employee
Posts: 282

## Re: hpgenselect for continuous target variable

You can fit a beta model using PROC GLIMMIX or PROC FMM.  See the DIST=BETA option in the MODEL statement. See this example of using the beta distribution in GLIMMIX to model a continuous proportion response.
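
A minimal sketch of both calls, assuming a hypothetical data set `mydata` with a response `y` strictly between 0 and 1 and a single covariate `x`:

``````
/* Sketch only: mydata and x are hypothetical names. */
proc glimmix data=mydata;
model y = x / dist=beta link=logit solution;
run;

proc fmm data=mydata;
model y = x / dist=beta;
run;
``````

Note that GLIMMIX reports the beta model in terms of a mean and a scale parameter, not the shape parameters a and b used in the NLMIXED and HPNLMOD code earlier in the thread.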
