Hi everyone,
I am wondering if anyone has any insight into my problem.
The data come from a four-way design and are unbalanced.
Number of Observations Read: 12852
Number of Observations Used: 8404
Number of Observations Not Used: 4448
Class | Levels |
---|---|
treat | 2 |
location | 18 |
year | 7 |
variety | 51 |
I used PROC HPMIXED with an unstructured (UN) variance-covariance matrix and the results were fine, but now I want to assume a different variance-covariance structure. When I try PROC MIXED or PROC HPLMIXED with, e.g., type=FA(1), it does not work.
| PROC | PROC | PROC HPLMIXED |
---|---|---|---|
UN | ERROR: The SAS System stopped processing this step because of | OK | ERROR: PROC HPLMIXED |
FA(1) | ERROR: The SAS System stopped processing this step because of | - | ERROR: PROC |
I tried different modifications of the model and of the dataset (a smaller number of observations), but the program still generates ERRORs and NOTEs, e.g.:
ERROR: Optimization routine cannot improve the function value.
ERROR: G matrix is not positive definite. HPLMIXED does not support this in the current release.
ERROR: Newton-Raphson with Ridging optimization cannot be completed.
ERROR: Model is too large to be fit by PROC HPMIXED in a reasonable amount of time on this system.
NOTE: At least one element of the gradient is greater than 1e-3. The GCONV= option modifies the relative gradient convergence criterion and lowering its value might help to reduce the gradient.
NOTE: The estimated G matrix is not positive definite.
What can I do to fit a different variance-covariance structure to this dataset?
Example model:
proc HPMIXED data =Y3.year7;
class treat location year variety;
model yield = treat /s;
random location /s;
random variety /s;
random location /solution subject=variety type=un;
random variety /solution subject=treat type=un;
random location /solution subject=treat type=un;
random variety*treat*location /solution;
random year year*variety treat*year location*year treat*year*variety treat*location*year location*year*variety location*year*variety*treat;
run;
Thank you for any help.
Dear KN-L,
When I have a big dataset with a lot of effects to model, as in your case, I usually proceed in two rounds:
1* In the first round I run "hpmixed" without specifying any variance-covariance structure (plain variance components). With the "ods output" statement, I get a good estimate of all the covariance parameters of the model.
2* In the second round I use "mixed" with a "parms" statement to supply good initial values and help the model converge.
Here is some code:
/*Step ONE */
PROC HPMIXED DATA=be1 ;
CLASS site prov bloc fam ;
MODEL &character_i = ;
random site bloc(site) prov prov(site) fam(prov) fam(prov*site) ;
ods listing exclude covparms ;
ods output covparms= Varhp ;
run ; quit ;
/* Here I use the ODS output to store the estimates of all the covariance parameters in macro variables */
proc transpose data= varhp out = varhpt (drop =_NAME_);
id CovParm ;
var _numeric_ ;
run ;
data _null_ ;
set varhpt;
call symput("site",site) ;
call symput("bloc_site_",bloc_site_) ;
call symput("prov",prov) ;
call symput("prov_site_",prov_site_) ;
call symput("fam_prov_",fam_prov_) ;
call symput("fam_site_prov_",fam_site_prov_) ;
call symput("Residual",Residual) ;
run ;
%put &site ;
%put &bloc_site_ ;
%put &prov ;
%put &prov_site_;
%put &fam_prov_ ;
%put &fam_site_prov_ ;
%put &Residual ;
/*Step TWO, the big model */
PROC MIXED DATA=be1 AsyCov covtest IC ;
CLASS site prov bloc fam ;
MODEL &character_i = /S outpred=&character_i COVB residual notest ;
RANDOM site bloc*site prov prov*site fam*prov fam*prov*site /S G;
ESTIMATE 'GrandMean' intercept 1 /DIVISOR= 1 ;
REPEATED / subject = intercept local type = sp(pow)(x y) ;
Parms (&site) (&bloc_site_) (&prov) (&prov_site_) (&fam_prov_) (&fam_site_prov_) (25) (0.8) (&Residual);
run ; quit ;
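If the second round keeps exactly the same covariance parameters as the first, the macro variables can be skipped: the PARMS statement in PROC MIXED has a PDATA= option that reads starting values from a data set with an Estimate (or Est) variable. The following is only a sketch under that assumption; it reuses the data set and macro variable from above, and the rows of Varhp must be in the same order as the covariance parameters of the PROC MIXED model. If step two adds parameters (such as the spatial ones above), rows for them have to be added to the data set first.

```sas
/* Step 1: variance-component fit in HPMIXED; save the estimates */
proc hpmixed data=be1;
  class site prov bloc fam;
  model &character_i = ;
  random site bloc(site) prov prov(site) fam(prov) fam(prov*site);
  ods output covparms=Varhp;   /* contains CovParm and Estimate */
run;

/* Step 2: same random effects in PROC MIXED,                  */
/* starting the optimization from the step 1 estimates         */
proc mixed data=be1 covtest;
  class site prov bloc fam;
  model &character_i = / solution;
  random site bloc(site) prov prov(site) fam(prov) fam(prov*site);
  parms / pdata=Varhp;         /* one row per covariance parameter, residual last */
run;
```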
Cheers,
JB
You are trying to fit some models that are not supported by some of the procedures. I also recommend that you greatly simplify your model before proceeding. Your example model is almost certainly over-parameterized, and probably extremely so. For instance, a random location main effect together with a location×treatment interaction that has an unstructured (UN) matrix is not identifiable (try VC instead). The same comment applies to the other complex terms; I expect that several model terms are not identifiable. Start with a very simple model, perhaps with no random effects. Then add the random effects that are essential for the experimental design. These will mostly be variance-component terms.
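As a rough sketch of that build-up, using the data set and variable names from the original post: the choice of which random terms count as "essential" here is only an assumption for illustration and has to come from the actual experimental design.

```sas
/* Step A: fixed effects only, to check the data and class levels */
proc mixed data=Y3.year7;
  class treat location year variety;
  model yield = treat / solution;
run;

/* Step B: add a few variance-component terms (TYPE=VC is the default). */
/* The terms listed here are illustrative, not a recommended model.     */
proc hpmixed data=Y3.year7;
  class treat location year variety;
  model yield = treat / solution;
  random location year variety location*year;
run;
```

Further terms can then be added one at a time, checking after each fit that every variance component is estimated and that no "G matrix is not positive definite" note appears.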