About Nebulus

Nebulus · ‎01-16-2020

So I'm running an experiment where I am taking counts of insects. I have 3 treatments with 10 replicates per treatment arranged in an RCBD (there are 10 blocks, each containing every treatment). I take repeated counts for 6 consecutive weeks, and I replicate the entire experiment twice. I'm treating week as a repeated measure and block and experimental replicate as random effects. Here is my code. I'm getting the dreaded "Obtaining minimum variance quadratic unbiased estimates as starting values for the covariance parameters failed." error and I'm not sure where the error lies in my coding. For the sake of simplicity I'm just showing the model for the variable "thrips_r" for now, but I will be looking at 8 different insect count variables and the damage variables (peprate and roserate).

(TRT = treatment, EXP = experiment replicate, WEEK = week, CAGE = experimental unit, BLOCK = block)

    proc sort data=greenhouse; by TRT EXP WEEK; RUN;
    PROC PRINT DATA=greenhouse; run; quit;

    /* Means and standard errors per treatment, experiment, and week */
    PROC MEANS data=greenhouse stderr noprint;
      by TRT EXP WEEK;
      var adultm_p eggm_p totm_p trips_p adultm_r eggm_r totm_r thrips_r peprate roserate;
      output out=avg
        mean   = Xadultm_p Xeggm_p Xtotm_p Xtrips_p Xadultm_r Xeggm_r Xtotm_r Xthrips_r Xpeprate Xroserate
        stderr = Sadultm_p Seggm_p Stotm_p Strips_p Sadultm_r Seggm_r Stotm_r Sthrips_r Speprate Sroserate;
    run;
    proc print data=avg; run;

    PROC GLIMMIX DATA=greenhouse;
      CLASS TRT WEEK EXP BLOCK CAGE;
      MODEL thrips_r = TRT|BLOCK|EXP / dist=poi link=log s;
      random intercept / subject=BLOCK;
      random WEEK / type=ar(1) subject=CAGE(TRT*BLOCK) residual;
      LSMEANS TRT / PDIFF adjust=TUKEY;
    RUN;

Nebulus · ‎08-14-2018

Yeah, I was originally not using an offset but using densities as my dependent variable. I will try the normal distribution after seeing if I can normalize the data. I think I was fixated too much on using the Poisson even though my data aren't integers. Thanks again for being so generous with your time; I think this information is more than enough for me to troubleshoot my issues. I'm currently at a smaller UC school but I'll be moving to Cornell to do a post-doc in September, so if I'm still having issues, hopefully the consulting there is superior to where I currently am.

Nebulus · ‎08-13-2018

Here is a sample of the input data for site 9 (which is an inland climate type) at interval 26. The log of leaf area is taken after the datalines step.

    input site leaf climate$ interval count leafarea previous_para;
    if interval = 1 then season="Warm";
    else if interval = 2 then season="Warm";
    else if interval = 3 then season="Warm";
    else if interval = 4 then season="Warm";
    else if interval = 5 then season="Warm";
    else if interval = 6 then season="Warm";
    else if interval = 7 then season="Warm";
    else if interval = 8 then season="Mod";
    else if interval = 9 then season="Cool";
    else if interval = 10 then season="Cool";
    etc. etc.
    ---------------------------------------------------------
    9 6992 Inland 26  0 10.767 0.708
    9 6993 Inland 26 14  8.985 0.708
    9 6994 Inland 26 26 10.001 0.708
    9 6995 Inland 26  0 23.441 0.708
    9 6996 Inland 26 67 24.972 0.708
    9 6997 Inland 26  5 20.839 0.708
    9 6998 Inland 26 23 19.180 0.708
    9 6999 Inland 26 27 17.171 0.708
    9 7000 Inland 26 96 28.089 0.708
    9 7001 Inland 26  0 11.887 0.708
    9 7002 Inland 26 63 18.145 0.708
    9 7003 Inland 26  0 26.068 0.708
    9 7004 Inland 26  0 26.864 0.708
    9 7005 Inland 26  0 13.516 0.708
    9 7006 Inland 26  0 15.965 0.708
    9 7007 Inland 26  0 14.251 0.708
    9 7008 Inland 26 26 15.475 0.708
    9 7009 Inland 26  0 25.854 0.708
    9 7010 Inland 26 57 42.570 0.708
    9 7011 Inland 26  8 19.700 0.708
    9 7012 Inland 26 35 29.558 0.708
    9 7013 Inland 26 54 28.970 0.708
    9 7014 Inland 26  9 20.282 0.708
    9 7015 Inland 26 13 17.331 0.708
    9 7016 Inland 26 42 17.435 0.708
    9 7017 Inland 26  3 20.557 0.708
    9 7018 Inland 26  0 27.391 0.708
    9 7019 Inland 26  0 29.681 0.708
    9 7020 Inland 26  0 24.170 0.708

The mean previous parasitism for each site is already calculated from the preceding interval (in this case 70.8%, from site 9 during interval 25) and included in the data input, so there is no need to average it again in proc means. Yes, the second proc means is unnecessary. Proc Means would spit out this observation for that specific site/interval to run in the model.

    Obs  interval  season  climate  site  previous_para  totalfin  areafin
    232  26        Warm    Inland   9     0.708          18.93     2.95681

This is the current code that works.

    proc glimmix data=final;
      class site interval season climate;
      model totalfin = interval(season) season climate / offset=areafin dist=poi link=log solution;
      random intercept / subject=site(climate);
      random interval(season) / subject=site(climate) residual;
      random _residual_;
      lsmeans season climate / cl pdiff plot=meanplot adjust=tukey;
    run;

P.S. I've tried stats counseling on campus, but all they ever provide me is stats Ph.D. students who I've found don't have the expertise yet to really solve these difficult questions, especially when programming is involved. So this is more progress than I've been able to achieve in the last month 🙂
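The interval-to-season assignment in the data step above could be written more compactly with range conditions. This is only a sketch covering the ten intervals spelled out in the post; the data set name `seasons_demo` is hypothetical, and the assignments for intervals 11 through 26 are not shown in the post, so they are left out here as well.

```sas
/* Hypothetical demo data set: one row per interval, intervals 1-10 only */
data seasons_demo;
  length season $ 4;                 /* long enough for "Warm", "Mod", "Cool" */
  do interval = 1 to 10;
    if 1 <= interval <= 7 then season = "Warm";
    else if interval = 8 then season = "Mod";
    else if interval in (9, 10) then season = "Cool";
    output;
  end;
run;

proc print data=seasons_demo;
run;
```

Declaring the length up front avoids depending on whichever assignment happens to appear first in the step to set the width of `season`.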

Nebulus · ‎08-13-2018

Thanks a lot for your in-depth reply and suggestions.

Sections are determined by canopy height: 0-33%, 34-66%, and 67-100%. I then took 10 leaves randomly selected from each section based on a direction (0-360 degrees). This is part of my dissertation and my PI wasn't too keen on including section, but I feel a lot of the variation in counts can be captured by section, because the insect likes the shaded parts of the plants and densities decline as canopy height increases.

I've been using averages of 7020 observations (30 leaves x 9 sites x 26 intervals) for counts across the study period. This breaks down into 234 means that go into the model and looks like this (subset) for each interval.

    proc sort; by interval season climate site logarea previous_para;
    proc means mean n noprint;
      var total logarea;
      by interval season climate site previous_para;
      output out=new mean=mnqtot mnarea n=n;
    run;
    proc sort data=new; by interval season climate site previous_para; run;
    proc means data=new mean n noprint;
      var mnqtot mnarea;
      by interval season climate site previous_para;
      output out=final mean=totalfin areafin n=n;
    run;
    proc sort data=final; by interval season climate site previous_para; run;
    proc print data=final; run;

    Obs  interval  season  climate  site  previous_para  totalfin  areafin  n
    10   2         Warm    Coastal  1     0.660           10.67    3.09633  1
    11   2         Warm    Coastal  2     0.912          306.97    4.24529  1
    12   2         Warm    Coastal  3     0.495          514.60    3.05256  1
    13   2         Warm    Coastal  4     0.722          352.17    2.79876  1
    14   2         Warm    Coastal  5     0.426          506.83    3.52447  1
    15   2         Warm    Inland   8     0.648          139.67    3.43376  1
    16   2         Warm    Inland   9     0.801           73.43    2.91858  1
    17   2         Warm    Interm   6     0.680          214.57    3.32450  1
    18   2         Warm    Interm   7     0.429           87.10    2.69907  1

@sld wrote:

    Yes, interval is a factor associated with repeated measurements on sites. That would be closer to correct, but you are not using the residual option correctly. Only RANDOM statements that specify elements of the R matrix should include the residual option; RANDOM statements that specify elements of the G matrix should not. Site and repeated measures on sites are specified in G; leaves are specified in R.

This point may well be the issue. I was interpreting the random statement in PROC GLIMMIX to be equivalent to the repeated statement in PROC MIXED, so any time you had a repeated measurement it had to be included. For example, on page 9 of the advanced techniques for fitting mixed models paper you cited, it says:

    "Or, suppose you have the following REPEATED statement in PROC MIXED with a repeated effect of Time:

        repeated Time / subject=Block type=ar(1);

    You can replace that statement with the following RANDOM statement in PROC GLIMMIX:

        random Time / subject=Block type=ar(1) residual;"

Removing the random statement prevents even the most simplified version of this model from converging. So you are saying that if I'm not using leaf counts, but site averages, I have no R-sided effects with my current model?

Yes, climate designations aren't just temperature. They are based on southern California climate zones: coastal (USDA Hardiness Zone 10b; Sunset Climate Zone 24), intermediate (USDA Hardiness Zone 10a; Sunset Climate Zones 20–22), and inland (USDA Hardiness Zones 9a/9b; Sunset Climate Zones 18–19). Counts are of the pest insect (whitefly). Previous parasitism is total parasitism of the whitefly by 3 parasitic wasps, observed in the previous interval, that could be affecting its densities in the subsequent interval.

It looks like I still have a lot of reading to do on adding regressions/splines to a mixed model. Thanks!
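To make the G-side versus R-side distinction discussed above concrete, here is a minimal sketch of the same RANDOM statement written both ways. It reuses the data set and variable names from the post; the `type=ar(1)` structure is an assumption borrowed from the quoted documentation passage, not something the post's own model uses, and neither version is offered as the right model for these data.

```sas
/* R-side: the RESIDUAL option makes this RANDOM statement model the R matrix,
   playing the role of REPEATED in PROC MIXED */
proc glimmix data=final;
  class site interval season climate;
  model totalfin = interval(season) season climate / dist=poi link=log solution;
  random interval / subject=site(climate) type=ar(1) residual;
run;

/* G-side: the same statement without RESIDUAL adds random effects for
   interval within each site, with an AR(1) G matrix */
proc glimmix data=final;
  class site interval season climate;
  model totalfin = interval(season) season climate / dist=poi link=log solution;
  random interval / subject=site(climate) type=ar(1);
run;
```

Fitting both and comparing the "Covariance Parameter Estimates" tables is one way to see which matrix each statement is populating.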

Nebulus · ‎08-13-2018

This is my first time really dealing with complex mixed models, so I appreciate the feedback. It's taken me 100s of hours of reading to get to this point, but I know the model still isn't perfect. Just to add to/answer some of your comments:

Yes, right now I am using random _residual_ in the model, but it still seems rather overdispersed. The chi-square/df goes from 90 to 32 when that term is included, but it only ever gets close to 1 when I square root transform the data. I've tried a negbin model, but I have convergence issues.

30 new leaves are sampled from plants every interval. Site = plant in my case because I only have one study plant per site. I was taking leaves from three separate sections of the canopy, but we decided to eliminate that from the model due to a list of issues, including sections not being true replicates.

1) Yes, I am using the log of leaf area.

2) I've never heard of a structural equation model, so I will do some research into it.

3) Yes, I will remove subject=climate. I originally had it as "random interval / subject=site residual". The intent was to treat interval as a repeated measure of site. I think that way was correct?

4) There is a lot more going on here with seasons that I didn't really get into. I showed that climate types vary by temperature and so do intervals, so if I see seasonal/climate differences they can in part be attributed to temperature differences. I can't directly model temperature because temperature is confounded with site, since a single temperature point corresponds to a site. Intervals are assigned to seasons based on mean monthly temperatures (0-15 degrees, 15.01-20, and 20.01-25). Assigning intervals to arbitrary seasons such as Winter, Fall, Spring, and Summer doesn't help us interpret seasonal effects in this case, and southern CA doesn't have traditional seasons.

5) I got rid of a climate*season interaction a while ago because the p-values are huge for it. I'll have to consider an interval interaction, but I think with 26 intervals it would be difficult to interpret.

Nebulus · ‎08-12-2018

Due to the assumptions of GLMMs, I know that data transformations to meet normality etc. aren't necessary. However, I couldn't find an answer on whether variance stabilizing transformations are inappropriate.

I'm using PROC GLIMMIX for insect count data to look at the effects of Climate and Season on counts. I sampled counts on leaves (leaf area used as an offset) over the course of 26 sample intervals (2 weeks apart). Intervals are categorized into three separate seasons based on mean temperatures. Different sites are used as replicates of Climate type. Sample intervals are a repeated measure. Parasitism rate from the previous interval was used as a covariate for observed host densities in the current interval.

My code looks something like this (let me know if you see an issue here as well; I'm new to GLMM and the syntax was difficult).

    proc glimmix data=final;
      class site interval season climate;
      model counts = interval(season) season climate previous_para / offset=leafarea dist=poi link=log s;
      random intercept / subject=site;
      random interval / subject=climate residual;  /* Using this line and the next partitions variability between interval and sites */
      random site(interval) / subject=climate residual;
      random _residual_;  /* Corrects observed overdispersion */
      lsmeans season climate / cl pdiff plot=meanplot adjust=tukey;
    run;

Using the untransformed data, my model fit looks like this.

    Fit Statistics
    -2 Res Log Pseudo-Likelihood    481.96
    Pseudo-AIC                      487.96
    Pseudo-AICC                     488.13
    Pseudo-BIC                      481.96
    Pseudo-CAIC                     484.96
    Pseudo-HQIC                     481.96
    Generalized Chi-Square         4454.47
    Gener. Chi-Square / DF           31.15

After square root transforming, I get these fit statistics. The Pearson chi-square/df looks much better.

    Fit Statistics
    -2 Res Log Pseudo-Likelihood    362.94
    Pseudo-AIC                      368.94
    Pseudo-AICC                     369.11
    Pseudo-BIC                      362.94
    Pseudo-CAIC                     365.94
    Pseudo-HQIC                     362.94
    Generalized Chi-Square          169.92
    Gener. Chi-Square / DF            1.19

So my question is: is square root transforming acceptable or not, and why? Many thanks!

Nebulus · ‎08-10-2018

Nebulus · ‎03-06-2018

I'm running a GLMM with a binomial error distribution on proportion parasitism data, with values that range from 0 to 1 and are not transformed. I'm having a problem where I'm getting a negative intercept and thus my LSMEANS are negative, which shouldn't be the case. Unless it should be the case due to the link function...? The model code looks like this:

    proc glimmix data=final ic=q;
      class interval climate season site;
      model parasitism = interval(season) season climate tot_nymprev / dist=bin link=logit s;
      random intercept / subject=site;
      random interval / subject=site residual;
      random _residual_;
      lsmeans season / cl diff plot=meanplot adjust=tukey;
      lsmeans climate / cl diff plot=meanplot adjust=tukey;
    run;

Proc means gives me the correct values. Below is the partial output showing the issue with the intercept and LSMEANS. Can anyone explain what is happening? I'm completely perplexed at this point. I would expect to get positive LSMEANS for parasitism rates.
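On the link-function guess in the post: with link=logit, PROC GLIMMIX reports the intercept and the LSMEANS estimates on the logit scale, where any proportion below 0.5 is negative. A minimal sketch follows, assuming the same model as above with only the ILINK option added to the LSMEANS statements so that the means are also reported on the proportion scale. The small data step is purely illustrative (the data set name and the values of `eta` are made up) and shows the back-transformation by hand.

```sas
/* Same model as in the post; ILINK asks for inverse-linked (proportion-scale) means */
proc glimmix data=final ic=q;
  class interval climate season site;
  model parasitism = interval(season) season climate tot_nymprev / dist=bin link=logit s;
  random intercept / subject=site;
  random interval / subject=site residual;
  random _residual_;
  lsmeans season  / cl diff ilink adjust=tukey;
  lsmeans climate / cl diff ilink adjust=tukey;
run;

/* Illustration only: inverse logit of a few made-up linear-predictor values.
   Negative eta maps to a proportion below 0.5; eta = 0 maps to exactly 0.5. */
data backtransform_demo;
  do eta = -2, -0.5, 0, 1;
    p = 1 / (1 + exp(-eta));
    output;
  end;
run;

proc print data=backtransform_demo;
run;
```

With ILINK the LSMEANS table gains a Mean column (and, with CL, confidence limits) on the data scale, which is the column to compare against the proc means values.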

Nebulus · ‎01-25-2018

Thank you!

Nebulus · ‎01-25-2018

I agree with your point about the use of contrasts. Regarding including season in the model, I can do that, but all it does again is compare differences in densities between intervals(season). I need to look at the lsmeans of climate within a specific season (e.g. coastal vs. inland within spring, summer, fall and winter).

    proc glimmix data=final;
      class site climate interval season;
      model totalfin = interval(season) climate previous_para / dist=poi link=log solution;
      random interval(season) / subject=site(climate) residual;
      lsmeans climate / cl pdiff plot=meanplot;
      lsmeans interval(season) / cl pdiff plot=meanplot;
    run;

This model doesn't let me use interval(season)*climate in the model statement. I might have to just end up getting rid of interval after all and do lsmeans of season*climate.
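The fallback mentioned at the end of the post (dropping interval and taking lsmeans of season*climate) might look roughly like the sketch below. It assumes the variable names from the post; the random intercept for site within climate is an assumed stand-in for the repeated-measures structure, not the post's own specification. The SLICEDIFF=season option restricts the pairwise comparisons to climates within the same season.

```sas
/* Sketch only: season-by-climate means, compared across climates within each season */
proc glimmix data=final;
  class site climate season;
  model totalfin = season climate season*climate previous_para
        / dist=poi link=log solution;
  random intercept / subject=site(climate);   /* assumed random structure */
  lsmeans season*climate / slicediff=season cl ilink adjust=tukey;
run;
```

The trade-off raised in the thread still applies: grouping the 26 intervals into seasons gives up the interval-level degrees of freedom in exchange for comparisons that are easier to interpret.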

Nebulus · ‎01-25-2018

I'm running an experiment where I'm interested in the effects of climate, time of year, and parasitism levels on population densities of an insect. I sample 3 climate types 26 times over the course of a year. Climate A has 5 sites, and Climates B and C each have 2 sites, for 9 sites total. My code looks like this:

    proc glimmix data=final;
      class site climate interval;
      model totalfin = interval climate previous_para / dist=poi link=log solution;
      random interval / subject=site(climate) residual;
    run;

The model suggests there are highly significant effects of sample interval and parasitism, but not climate (p=0.11). Looking at the raw data, it appears that climate effects exist, but they vary by season and this is washing out the effects. I want to compare climate effects within each season, so I assign each interval to a season. I want to use contrasts rather than grouping 26 intervals into 4 groups, to maintain my df. I can look at seasonal differences in densities via these contrasts, but I'm having trouble finding a way to code the contrasts so that I can look at site(climate) differences in densities within these seasons.

    contrast "Spring vs. Summer" interval -1 -1 -1 -1 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 -1 -1 ;
    contrast "Spring vs. Fall" interval 0 0 0 0 0 -1 -1 -1 -1 -1 -1 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 ;
    contrast "Spring vs. Winter" interval 0 0 0 0 0 0 0 0 0 0 0 -1 -1 -1 -1 -1 -1 1 1 1 1 1 1 0 0 ;
    contrast "Summer vs. Fall" interval 1 1 1 1 1 -1 -1 -1 -1 -1 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 ;
    contrast "Summer vs. Winter" interval 1 1 1 1 1 0 0 0 0 0 0 0 -1 -1 -1 -1 -1 -1 -1 0 0 0 0 0 0 1 1;
    contrast "Fall vs. Winter" interval 0 0 0 0 0 1 1 1 1 1 1 -1 -1 -1 -1 -1 -1 -1 0 0 0 0 0 0 0 0;

If it comes down to it I can just assign each interval to a categorical season, but I would like to avoid this if possible. Thanks!

Nebulus · ‎08-03-2017

I have a quick question about testing the assumption for paired t-tests that "the distribution of the differences in the dependent variable between the two related groups should be approximately normally distributed". Say I have two groups, one and two. The way I am supposed to do this is to create a new variable "diff=one-two" and then do:

    Proc Univariate data=x normal;
      qqplot / Normal (mu=est sigma=est);
      var diff;
    run;

Is this correct? Or do I use this instead to test normality?

    var one two;

I interpreted it as meaning I am supposed to test the normality of the differences between groups, not of each group. Thanks!
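For completeness, the difference-based check described above can be sketched end to end. The input data set `x` and the variables `one` and `two` come from the post; the output data set name `paired` is made up for the example.

```sas
/* Build the paired differences, then check their normality */
data paired;                 /* hypothetical name for the derived data set */
  set x;
  diff = one - two;
run;

proc univariate data=paired normal;
  var diff;
  qqplot diff / normal(mu=est sigma=est);
run;

/* The paired t-test itself works on the same differences */
proc ttest data=x;
  paired one*two;
run;
```

The NORMAL option adds the table of normality tests for `diff`, and the Q-Q plot gives the visual check against a normal reference line with estimated mean and standard deviation.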

Nebulus · ‎06-25-2015

Thanks all.

Nebulus · ‎06-20-2015

I'm trying to get SAS to spit out t-values for my contrasts in LSMEANS. Is there any way to achieve this? It just reports p-values for each comparison. My code is:

    LSMEANS species*wax / pdiff stderr;
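One option that addresses this in PROC GLM is TDIFF on the LSMEANS statement, which prints the t statistic for each pairwise difference alongside its p-value. A minimal sketch follows; the data set name `mydata`, the response variable `response`, and the model statement are assumptions for illustration, since the post shows only the LSMEANS line.

```sas
/* Hypothetical data set and response; only the LSMEANS line comes from the post */
proc glm data=mydata;
  class species wax;
  model response = species wax species*wax;
  lsmeans species*wax / pdiff tdiff stderr;
run;
quit;
```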

Nebulus · ‎09-19-2014

Thanks so much. You actually hit the nail on the head. The model I originally posted is the Briere-1 model:

    a*T*(T-Tmin)*(Tmax-T)**(1/2)

He also proposed a more generalized model, Briere-2:

    a*T*(T-Tmin)*(Tmax-T)**(1/m)

where m is a 4th parameter to estimate. Using your code and adding m, I estimated m ~ 2.75. Adding this term took the RSS from 0.500 to 0.4096. So yes, modifying the exponent tends to give a better fit to the data.
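A fit of the four-parameter form could be sketched in PROC NLIN as below. This is not the code from the thread, which is not shown here: the data set name `devrate`, the variables `T` and `rate`, and all starting values are hypothetical placeholders that would need to be replaced with values suited to the actual data.

```sas
/* Sketch of a Briere-2 fit; data set, variable names, and starting values are assumed */
proc nlin data=devrate;
  parms a = 0.0001  Tmin = 10  Tmax = 35  m = 2;
  bounds m > 0;
  /* The curve is defined only between the two thresholds; predict 0 outside them */
  if Tmin < T < Tmax then pred = a * T * (T - Tmin) * (Tmax - T)**(1/m);
  else pred = 0;
  model rate = pred;
run;
```

Fixing m = 2 (dropping it from PARMS and writing the exponent as 1/2) gives the Briere-1 fit, so the two residual sums of squares can be compared in the same way as in the post.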

Date Last Visited	‎01-16-2020 07:47 PM

Help With PROC GLIMMIX Error Messages

Re: Are variance stabilizing transformations appropriate for GLMM?

Re: Are variance stabilizing transformations appropriate for GLMM?

Re: Are variance stabilizing transformations appropriate for GLMM?

Re: Are variance stabilizing transformations appropriate for GLMM?

Are variance stabilizing transformations appropriate for GLMM?

Deleted

Issues with Proc GLIMMIX Intercept/LSmeans

Re: Setting up Complicated Contrasts

Re: Setting up Complicated Contrasts

Setting up Complicated Contrasts

Testing Paired t-test Normality Assumptions

Re: Reporting T-values in LSMEANS Statement in PROC GLM

Reporting T-values in LSMEANS Statement in PROC GLM

Re: Issues with Convergence