Dear @SteveDenham,

Thank you again for your response.

1. Stage 1 or 2

My code is for stage 2; stage 1 appeared to work. I've posted both stage 1 and stage 2 of my code below, with comments for clarification. One thing worth noting for stage 1: if I run PROC MIXED without singular=1e-8 on the model statement, I occasionally get the "Infinite Likelihood" error. I never get the error otherwise, or in GLIMMIX. I don't think this is due to duplicates, even though that's normally the cause. I got the idea to use singular=1e-8 from a thread post by @jiltao:
https://communities.sas.com/t5/SAS-Programming/Infinite-Likelihood-error-Proc-Mixed/m-p/803817#M316523

Other than scouring the data by hand, I used PROC TABULATE to make sure I have only one observation per treatment combination per year:

proc tabulate data=dry_full;
class year amend cut rep;
table (amend*cut*rep)*(N), year ALL;
run;

2. REML

You're right, of course. The code I pasted isn't using REML, but I had been using REML until recently. I've entered the "just try anything" phase of the analysis, and changing the method was the latest attempt.

3. The last of the 42 must be the residual variance and is not included in the code (yes/no?)

The 42nd covariance parameter is included in the code and set to 1 (hold=42). It isn't obvious from the code I posted originally because I had imported parmsdata from a previous run; it should be clearer below. The large blocks of PARMS values have 20 values each (20 treatment combinations). The first block of 20 is for the Shukla stability intercepts, and the second block of 20 is for the Shukla stability slopes. There is one additional PARMS value before the blocks (random year) and one after (residual).

4. Is there anything in the Macholdt paper that explains the values that they plug into the stage 1 PARMS statement?

Not to my knowledge, and I don't think it's always necessary to set PARMS values in stage 1. It looks to me as though they wanted to guarantee non-negative AR(1) values, since there isn't a NOBOUND option anywhere in their code. This wasn't a problem in my data. Here's another appendix screenshot, from a different paper. Note that they're also using GLIMMIX in stage 2, but they aren't estimating Shukla slopes.

Another appendix

5. Results from splitting the data for stage 2 into two parts: estimates/treatments and the covariance matrix

The results are identical when the data are split; SAS does not seem to be confused by having the same dataset passed to both "ldata" and "data".

6. Results from specifying PARMS values based on the hacky regression estimates

I re-ran the data with very specific PARMS estimates that were based exactly on (1) a linear model of the per-decade Shukla estimates and (2) a quadratic model of the same estimates.
The linear model results in:

ERROR: Values given in PARMS statement are not feasible.

The quadratic model runs out of memory, so I'll re-run it this weekend (we have SAS Studio with an 8gb limit... I can get 16gb over the weekend if I ask nicely).

7. Full code with comments and explanations of the PARMS bounds

Stage 1

/* I do not use the PARMS statement at all in stage 1. */
/* This is a split-plot experiment (cut main, amend sub). */
/* I used SP(POW) for the autocorrelation in the first two random statements */
/* because we have "missing" (fallow) years (unequal time steps). */
/* I used CSH for the autocorrelation in the third random statement because */
/* I needed to account for heteroscedasticity more so than an AR(1) structure (IMO). */
/* SP(POW) doesn't seem to work nicely with heterogeneous variances. */
/* CSH seemed to be a good compromise, since other structures (UN, ANTE) */
/* resulted in insufficient-memory errors. */
Proc glimmix Data=DRY_full;
ods output covparms=cp2 lsmeans=dryyield_full_csh;
Class rep year amend cut;
Model tr1_yield=cut|amend|year /ddfm=kr ;
random year/subject=rep type=SP(POW)(year_num);
random year/subject=rep*cut type=SP(POW)(year_num);
random year/subject=rep*cut*amend type=csh;
Lsmeans cut*amend*year/cov;
run;
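As a side note on the duplicate check from point 1: beyond PROC TABULATE, a PROC SQL count can directly flag any treatment-combination-by-year cell with more than one row (a sketch, reusing the dataset and class variables from my code above; an empty result means no duplicates):

```sas
/* Flag any year x amend x cut x rep cell with more than one observation */
proc sql;
  select year, amend, cut, rep, count(*) as n_obs
  from dry_full
  group by year, amend, cut, rep
  having count(*) > 1;
quit;
```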
/* The data are modified to add the new variables "t" and "st" and to conform to the lin(1) naming conventions */
Data dryyield_full_csh_changed;
Set dryyield_full_csh;
Row=_N_;
Rename cov1-cov572=col1-col572;
Parm=1;
t=year-1990;
st=sqrt(t);
Run;

Stage 2

Proc glimmix Data=dryyield_full_csh_changed plots=pearsonpanel(conditional marginal) method=rspl;
Class year amend cut cult ;
Model estimate=amend|cut|t cult /s ddfm=kr cl noint;
nloptions maxiter=1000 GCONV=0;
COVTEST /cl;
Random year; /*Results in 1 covariance est*/
Random int/sub=year group=amend*cut; /*20 covariance est*/
Random st/sub=year group=amend*cut; /*20 covariance est*/
random _residual_/subject=int type=lin(1) ldata=dry_full; /*1 covariance est*/
Parms
/* This is the "year" starting value */
(.01)
/* These are the Shukla intercept starting values. */
/* My guesses here came from the "hacky" regression. */
/* The two 0.00000000 values may be an issue; */
/* the regression estimated them as negative. */
/* Removing the lower bound did not help for them. */
/* The quadratic version may actually help if so (all were positive). */
/* There are 20 intercept values, then 20 slope values. */
(0.06804396)(0.00000000)(0.03140187)(0.11450479)
(0.11665396)(0.16136292)(0.00823989)(0.00894313)
(0.02657896)(0.02634313)(0.24657283)(0.05338596)
(0.03234812)(0.03043496)(0.06711858)(0.11232925)
(0.01128913)(0.00000000)(0.03359046)(0.03478354)
/*These are the Shukla slope starting values*/
(-0.00332221)(00.00006651)(-0.00180324)(-0.00668694)
(-0.00559049)(-0.00787263)(-0.00045049)(-0.00053659)
(-0.00149999)(-0.00078679)(-0.01335398)(-0.00303279)
(-0.00172644)(-0.00167246)(-0.00325155)(-0.00501563)
(-0.00021841)(00.00032859)(-0.00184381)(-0.00143606)
/* This is the covariance parameter associated with the ldata covariance matrix. */
/* It is held at unity, as per the papers I've read on this two-stage method */
/* (i.e. hold=42; it is the 42nd value). */
(1)/
hold=42
nobound
lowerb= /*The variances have a lower bound of zero*/ 0,
0,0,0,0,
0,0,0,0,
0,0,0,0,
0,0,0,0,
0,0,0,0, /*The slopes for the Shukla stability estimates can be negative*/
.,0,.,.,
.,.,.,.,
.,.,.,.,
.,.,.,.,
.,0,.,.,
/*This value is held at 1, so a lowerbound probably doesn't matter...*/
0;
lsmeans amend*cut;
Run;

The code above results in "parms not feasible". If we comment out the NOBOUND option and use (0.01) for all PARMS values except the lin(1) parameter, we get:

NOBOUND commented out of the code

If we then import these covariance estimates (parmsdata=cp2) and include NOBOUND, we get:

Generic parms values, NOBOUND active

I'll try the quadratic version over the weekend when I have more memory. Thanks very much for taking a look at this; it's been driving me crazy...
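P.S. For what it's worth, rather than typing the 42 values inline, the starting values can also be supplied via PARMS / PARMSDATA=, which reads an ESTIMATE variable in parameter order. A sketch with a hypothetical dataset name and placeholder values (the ordering matches my model: 1 year variance, 20 intercepts, 20 slopes, then the lin(1) parameter held at 1):

```sas
/* Build a PARMSDATA= dataset: one row per covariance parameter, in order */
data start_vals;  /* hypothetical name; estimates here are placeholders */
  estimate = 0.01;  output;                        /* 1:     random year       */
  do i = 1 to 20; estimate = 0.05;   output; end;  /* 2-21:  Shukla intercepts */
  do i = 1 to 20; estimate = -0.002; output; end;  /* 22-41: Shukla slopes     */
  estimate = 1;     output;                        /* 42:    lin(1), hold=42   */
  drop i;
run;
```

This at least makes it easier to swap in the linear- vs. quadratic-regression guesses without retyping the PARMS statement.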