GavinB
Fluorite | Level 6

I am struggling with something that is probably not a real problem, but I cannot wrap my mind around it. The issue is as follows:

Let's assume I run a trial in which I measure the growth of broilers from day 0 to day 42, weighing every 7 days. For each 7-day period, I calculate the daily growth myself, simply by dividing the growth over those 7 days by 7. When I analyze the bodyweight of these birds using the REPEATED statement, the lsmeans of the model align with the actual means that I have measured. When I analyze the daily weight gain, the lsmeans of the individual timepoints align with my actual means, but the lsmeans of the average daily weight gain do not align with the calculated 0-42 weight gain (a difference of about ±10% is not unusual). Is it still valid to use the REPEATED statement with a variable that is not measured, but calculated?
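To make the calculation described above concrete, here is a minimal sketch of how the period daily gain and the cumulative daily gain could be derived and compared per box. The dataset name bw_long and the variables day and BW are assumptions for illustration (only box appears in the posted code), and day is assumed to be numeric with 42 as the last weighing.

```sas
/* Assumed (hypothetical) input: bw_long, one row per box and weighing day,
   with variables box, day (0, 7, ..., 42) and BW */
proc sort data=bw_long out=bw_sorted;
  by box day;
run;

data gain;
  set bw_sorted;
  by box;
  retain bw0 day0;
  prev_bw  = lag(BW);    /* LAG is called on every row so its queue stays in step */
  prev_day = lag(day);
  if first.box then do;  /* first weighing of a box: only store the start values */
    bw0  = BW;
    day0 = day;
  end;
  else do;
    DWG     = (BW - prev_bw) / (day - prev_day);  /* gain per day within the period */
    ADG_cum = (BW - bw0) / (day - day0);          /* gain per day since the first weighing */
    output;
  end;
  drop prev_bw prev_day bw0 day0;
run;

/* Mean of the period DWG values per box */
proc means data=gain noprint nway;
  class box;
  var DWG;
  output out=box_mean mean=mean_DWG n=n_periods;
run;

/* Set the mean of the period values beside the cumulative value at day 42 */
data check;
  merge box_mean gain(where=(day=42) keep=box day ADG_cum);
  by box;
  diff = mean_DWG - ADG_cum;
run;
```

For a box with all weighings present and equal 7-day intervals, diff should be zero apart from rounding, because the six period gains add up to the total gain. Boxes with n_periods below 6 or a non-zero diff would be the first place to look.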

 

 

4 REPLIES
ballardw
Super User

Posting the code you are using is never a waste. It may be that you missed an option.

It is also best practice to provide some example data that goes with the code, so that we can see what actually happens and don't have to make lots of assumptions about what you may be doing.

 

Every time I see something about rates (per day, for example), flags go up about what might go wrong when those rates are combined.

And when you calculate "daily growth", does your data have "daily records"?
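As a small illustration of how combining rates can go wrong, the sketch below uses made-up numbers (not taken from the trial) for two periods of unequal length. The simple average of the two daily rates differs from the rate over the whole period.

```sas
/* Made-up numbers, for illustration only: two periods of unequal length */
data _null_;
  gain1 = 70;  days1 = 7;    /* period 1: 10 g/day */
  gain2 = 420; days2 = 14;   /* period 2: 30 g/day */
  rate1 = gain1 / days1;
  rate2 = gain2 / days2;
  mean_of_rates = mean(rate1, rate2);                 /* unweighted: 20 */
  overall_rate  = (gain1 + gain2) / (days1 + days2);  /* 490/21 = 23.33 */
  put mean_of_rates=8.2 overall_rate=8.2;
run;
```

When every period has the same length and no period is missing, the two numbers coincide. A gap between them therefore points to unequal intervals, missing records, or some form of weighting.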

GavinB
Fluorite | Level 6

An example of the code I would run:

ods graphics / imagemap=on;

/* Repeated-measures model: unstructured covariance across time within box, random block intercept */
proc mixed data=performance2 plots=all;
  class TT time box block;
  model DWG = TT|time / ddfm=kenwardroger s cl
                        influence(est effect=box iter=5) outp=mixout;
  repeated time / subject=box type=un;
  random INT / subject=block s cl;
  store mixmodel;
run;

/* Post-fitting: plots, treatment comparisons within each timepoint, overall treatment lsmeans */
proc plm restore=mixmodel;
  effectplot interaction(x=time sliceby=TT) / ilink clm connect;
  slice TT*time / sliceby=time pdiff=all cl ilink means plots=diffogram lines adjdfe=row adjust=simulate;
  lsmeans TT / pdiff=all adjust=simulate means lines;
run;

My time variable holds the timepoints at which I measured the bodyweight. If I run this piece of code with bodyweight, the lsmeans are quite close to the actual means. If I run it with daily weight gain (as I said before, I calculate this value by subtracting the previous BW from the new BW and dividing by the number of days), the lsmeans for the entire period (not the individual timepoints) are quite far off. Example of the dataset:

[Image: image.png, screenshot of the example dataset]

PaigeMiller
Diamond | Level 26

We don't have your data, so we can't run your code; please provide a representative portion of your data that illustrates the problem following these instructions (and not via an attached file). https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

You also haven't shown us the allegedly incorrect output and the correct output, which I think we need in order to understand the issue. And you are making an incorrect assumption: that "the lsmeans are quite close to the actual means". That does not have to be true in all situations; it depends on a lot of things, including whether the data are balanced. Is there something else allegedly incorrect, or is the only problem that the lsmeans are not close to the actual means?
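One way to check the balance point is to count the records per cell and print the arithmetic means that the lsmeans are being compared against. This is only a sketch; it reuses the dataset and variable names from the posted code (performance2, TT, time, DWG) and assumes nothing else about the data.

```sas
/* Number of non-missing DWG records per treatment and timepoint */
proc freq data=performance2;
  where not missing(DWG);
  tables TT*time / norow nocol nopercent;
run;

/* Arithmetic means per treatment and per treatment*time cell, to set beside the lsmeans */
proc means data=performance2 n nmiss mean;
  class TT time;
  types TT TT*time;
  var DWG;
run;
```

In a model with TT, time and TT*time as CLASS effects, the lsmean for TT is the equally weighted average of the cell estimates across timepoints, whereas the arithmetic TT mean weights each timepoint by its number of records. Unequal cell counts in the PROC FREQ table are therefore enough to make the two differ.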

--
Paige Miller
PaigeMiller
Diamond | Level 26

There is nothing inherently wrong with using calculated variables in a modeling exercise. Thus, my conclusion is that you are doing something wrong somewhere.

--
Paige Miller
