Hello,
Exp details: 24 pens, 3 feed types (800, 400, 100), 8 replicates, RCBD
The purpose is to evaluate a preference for feed type (800, 400, 100).
To make the comparisons, three "treatments" were created (A = 800 vs 100, B = 800 vs 400, C = 400 vs 100). These A, B, C treatments were used to allocate 2 of the 3 feeds to each pen in the RCBD.
For 5 days, a known amount of feed was added to each feeder/pen, and after 24 h the feed was weighed back to determine the feed disappearance. Preference for a given feed (800, 400, or 100) was defined as (disappearance of that feed / total feed disappearance for the pen, i.e., the combined total of the 2 feeders) * 100, giving the proportion of each feed per pen per day.
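For reference, a minimal sketch of that calculation, assuming a hypothetical data set RAW with one row per pen, day, and feed and the 24-h disappearance in a variable DISAP (these names are my own, not from the experiment):

/* Hypothetical sketch: compute daily preference (%) from raw disappearance. */
proc sql;
   create table pref as
   select pen, day, feed,
          disap / sum(disap) * 100 as preference  /* this feed's share of the pen's daily total */
   from raw
   group by pen, day;
quit;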
In the proposal (as provided to me...), the statistical analysis is supposed to be "...preference test as the fixed effect and replicate as the random effect. LSmeans and means are to be separated using pdiff with Tukey's adjustment, with pen as the EU." IMHO this is somewhat lacking and not very descriptive.
CODE:
Data Exp200;
   infile datalines dlm='|';
   input PEN DIET Treatment Rep Feed pref1 pref2 pref3 pref4 pref5;
   datalines;
1 | 1 | 3 | 1 | 800 | 68.7500 | 57.6471 | 67.8571 | 57.9545 | 78.9157 |
1 | 3 | 3 | 1 | 100 | 31.2500 | 42.3529 | 32.1429 | 42.0455 | 21.0843 |
2 | 1 | 1 | 1 | 800 | 79.7101 | 44.7368 | 44.4444 | 41.3043 | 48.3333 |
2 | 2 | 1 | 1 | 400 | 20.2899 | 55.2632 | 55.5556 | 58.6957 | 51.6667 |
3 | 2 | 2 | 1 | 400 | 100.0000 | 67.5676 | 85.1852 | 80.5825 | 56.0847 |
3 | 3 | 2 | 1 | 100 | 0.0000 | 32.4324 | 14.8148 | 19.4175 | 43.9153 |
4 | 1 | 1 | 2 | 800 | 92.5926 | 57.7465 | 65.5172 | 73.9130 | 59.0278 |
4 | 2 | 1 | 2 | 400 | 7.4074 | 42.2535 | 34.4828 | 26.0870 | 40.9722 |
....
;
Run;
Proc Print data=Exp200;
Run;
%macro model_loop;
   %let yvar1 = pref1;
   %let yvar2 = pref2;
   %let yvar3 = pref3;
   %let yvar4 = pref4;
   %let yvar5 = pref5;
   %do i = 1 %to 5;
      /* Fit each day's preference separately */
      Proc Mixed data=Exp200 NOBOUND ASYCOV method=REML;
         Title "&&yvar&i";
         class Pen Treatment Rep Feed;
         model &&yvar&i = Feed / ddfm=KenwardRoger;
         random Rep / subject=Pen;
         lsmeans Feed / pdiff adjust=tukey;
         store out=work.glm&&yvar&i;
      run;
      /* Re-use the stored model for the LINES display */
      proc plm restore=work.glm&&yvar&i;
         Title "&&yvar&i PLM";
         lsmeans Feed / pdiff lines adjust=tukey;
      run;
   %end;
%mend model_loop;
%model_loop;
quit;
The issue I'm having is a failure to converge. I have a feeling that it has to do with the specification of the subject. I guess I don't have a super specific question; I just don't feel like this is the correct way to analyze the data, and I'm looking for advice.
Regards,
It seems to me that for every Pen, you have two observations. The values for these two observations for each of Pref1-Pref5 always add up to 100. That might be a problem. You might want to reconsider how you want to arrange your data and set up your model.
Thanks,
Jill
One approach might be to just use one observation for each Pen.
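Purely as an illustration of that idea, a sketch that keeps the row for the higher-numbered feed in each pen (the choice is arbitrary, since the two rows for a pen sum to 100):

/* Keep a single observation per pen: the row for the higher feed level. */
proc sort data=Exp200 out=one_per_pen;
   by Pen descending Feed;
run;

data one_per_pen;
   set one_per_pen;
   by Pen;
   if first.Pen;   /* first row within each pen = higher feed level */
run;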
I hope I understand this correctly. Each pen is randomly assigned to one of 3 treatments (A, B, C); thus, a pen should contribute one value. I don't see the advantage of calculating a value for each feed as a proportion. Have you considered fitting the raw consumption values (which should be independent), and then calculating the ratios from the least squares means?
The other point to consider is that ratios probably do not yield normally distributed residuals, as the cut-offs at 0 and 1 may be a problem. If you are going to analyze the proportions as the dependent variable, a beta distribution for the values may be more appropriate. That would mean shifting to GLIMMIX or GEE.
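As a sketch of the raw-consumption route, assuming a hypothetical long data set RAW_LONG with one row per pen, feed, and day and the 24-h disappearance in DISAP (names are mine, not from the post), with each pairing (Treatment A, B, C) analyzed separately:

proc sort data=raw_long;
   by treatment;
run;

proc mixed data=raw_long method=reml;
   by treatment;
   class pen rep feed day;
   model disap = feed day / ddfm=kr;
   random rep;                                   /* within a treatment, rep and pen coincide (one pen per rep) */
   repeated day / subject=pen*feed type=ar(1);   /* daily measures on each feeder */
   lsmeans feed / pdiff;
run;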
And last, failure to converge. That very likely can be fixed by adding MAXITER= and MAXFUNC= options to the PROC MIXED statement, provided there are no other issues. Check both the .log and .lst file for notes, warnings and notifications that indicate that the algorithm has difficulties other than slow convergence.
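For reference, those limits go on the PROC MIXED statement itself; applied to the original call it would look like this (the limit values are arbitrary examples):

Proc Mixed data=Exp200 NOBOUND ASYCOV method=REML maxiter=500 maxfunc=2500;
   class Pen Treatment Rep Feed;
   model pref1 = Feed / ddfm=KenwardRoger;
   random Rep / subject=Pen;
   lsmeans Feed / pdiff adjust=tukey;
run;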
SteveDenham
Hi Kyle @Laser_Taco_,
Just a few things that occurred to me over the weekend. In GLIMMIX, for ratios you should probably specify a distribution, and change the method to LAPLACE with tech=nrridg. So perhaps something like this:
Proc GLIMMIX data=Long_version method=laplace;
   nloptions maxiter=1000 tech=nrridg;
   class Pen Trt diet Block Feed instance;
   model y = Trt Feed instance Feed*instance / dist=beta ddfm=KR;
   random instance / subject=pen*diet type=ar(1);
   random feed / subject=pen type=chol;
   random block;
   lsmeans Feed / pdiff lines adjust=tukey;
run;
The first thing to do is to re-arrange the data into a long format, where there is one dependent value in each observation (here it is y). The other variables in each line would be pen, trt, diet (although this looks completely confounded with feed), block, feed and instance, which is indexed from 1 to 5. So instead of one row per pen-by-diet combination with pref1-pref5 in separate columns, you would have, for the first pen (and all other pens would follow this pattern), ten rows: one for each diet-by-instance combination, each holding a single y value. Note that all of the y values have been rescaled to the (0, 1) interval. So now there are two levels of feed for each pen and five levels of instance for each pen by diet combination.
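A possible sketch of that rearrangement, starting from the Exp200 data set above and assuming that Trt and Block in the GLIMMIX call correspond to Treatment and Rep in the original data (the guard against exact 0 and 1 is my addition, since the beta distribution cannot take those values and pen 3 has both):

/* Wide (pref1-pref5) to long (one y per row), rescaled to (0, 1) for DIST=BETA */
data Long_version;
   set Exp200;
   array p{5} pref1-pref5;
   Trt   = Treatment;   /* names used in the GLIMMIX call above */
   Block = Rep;
   do instance = 1 to 5;
      y = p{instance} / 100;        /* rescale percent to a proportion */
      if y <= 0 then y = 0.0001;    /* beta requires 0 < y < 1 */
      else if y >= 1 then y = 0.9999;
      output;
   end;
   keep Pen Diet Trt Block Feed instance y;
run;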
If I were better informed about how to use it, I might recommend PROC COPULA for the analysis, but that is only based on what the examples look like.
I thought I had a way to address this, but so long as each pref within a treatment is such that the values for the two diets within a treatment sum to 1, all of the above will have some major issues due to lack of independence. If the preference response was converted to a ratio or some other type of value, so that a single value is obtained per pen at each preference measurement (time), you should be able to improve the analysis.
SteveDenham
Hi @SteveDenham ,
I'm not sure if I understand what you mean by "converted to a ratio or some other type of value, so that a single value is obtained per pen at each preference measurement (time), you should be able to improve the analysis.".
I understand that having multiple responses from the same pen is my issue here. As for converting the values to a ratio, each value per diet/pen is already a ratio: disappearance of that diet relative to the total consumed.
Regards,
Rather than a ratio of each component to the total you might consider a ratio of consumption of the feed with the larger feed level to the smaller feed level for each pen. You get a single value that is indexed by the variable 'Treatment'. Or consider the difference in consumption calculated in much the same way. That might even be better as it is much more likely to collectively have residuals that are distributed normally. The lsmeans by treatment are then a measure of how much more was consumed of the higher valued feed - which is a clear measure of preference.
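Purely as an illustration of this suggestion, a sketch assuming a hypothetical long data set RAW_LONG with the raw daily disappearance (DISAP) per pen, feed, and day; the ratio could equally be computed from the posted preference percentages, since the pen total cancels out:

/* One value per pen per day: difference (and ratio) of disappearance,
   higher feed level relative to lower feed level. */
proc sort data=raw_long;
   by pen day descending feed;
run;

data per_pen;
   set raw_long;
   by pen day;
   retain hi_disap;
   if first.day then hi_disap = disap;       /* higher feed level comes first after the sort */
   if last.day then do;
      diff  = hi_disap - disap;              /* preference as a difference in consumption */
      ratio = hi_disap / max(disap, 1e-8);   /* or as a ratio, guarding against zero */
      output;
   end;
   keep pen treatment rep day diff ratio;
run;

proc mixed data=per_pen method=reml;
   class pen treatment rep day;
   model diff = treatment day / ddfm=kr;
   random rep;
   repeated day / subject=pen type=ar(1);     /* repeated daily measures per pen */
   lsmeans treatment / pdiff adjust=tukey;
run;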
SteveDenham