user42

The concentration of a certain nutrient is measured in soil each fall after harvest in a set of research plots. I want to model the change in nutrient concentration over a number of years, but I'm inclined to express the time variable in days since the start of the study, as the number of days between measurements is not constant.

Here's the SAS code of my present approach:

PROC MIXED DATA = soil;
  CLASS year plot;
  MODEL concentration = daysIntoStudy / DDFM = kr;
  REPEATED year / SUB = plot TYPE = ar(1) r;
RUN; 

I want to specify an autoregressive covariance structure for the residuals because the same plots are sampled each year. I know that the repeated effect must be categorical, but I'm not sure this is the right way to do it, given that the variable year does not explicitly appear in the MODEL statement.

Is this correct? Also, should the predictor daysIntoStudy be specified as a random effect?

1 ACCEPTED SOLUTION

sld

It depends 🙂  This is my take:

With TYPE = ar(1), your REPEATED statement specifies two covariance parameters: the residual variance, and the correlation between observations taken in adjacent years on the same subject (plot). The correlation between any two observations on the same plot then decays as a power of the number of years separating them. If year is not included in the MODEL statement as a fixed effect, then you are assuming that year does not affect the mean value of concentration; year enters only through the covariance structure of the residuals.
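
For reference, the AR(1) structure can be written out as below. This is only a sketch: the symbols (sigma^2 for the residual variance, rho for the adjacent-year correlation, e_ij for the residual of plot i in year j) and the choice of four years are my own illustration, not something from the original post.

```latex
\operatorname{Cov}(e_{ij}, e_{ik}) = \sigma^2 \rho^{|j-k|},
\qquad
R_i = \sigma^2
\begin{pmatrix}
1      & \rho   & \rho^2 & \rho^3 \\
\rho   & 1      & \rho   & \rho^2 \\
\rho^2 & \rho   & 1      & \rho   \\
\rho^3 & \rho^2 & \rho   & 1
\end{pmatrix}
```

Because year is a CLASS variable here, the exponent is the number of year levels separating two observations, not the number of days between them, so unequal spacing in days is not reflected in this structure.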

For some of the disciplines I work with (e.g., field work in natural resources), that assumption--that mean response is not a function of year--is patently untenable. But typically we work with data observed over very few years. If we had 100 years and did not believe in global climate change, then that might be a totally different scenario.

Given that your study involves soils, I suspect you, too, have few years. Are you measuring each plot in multiple years? How many years? How many days since harvest do you measure? Have you plotted concentration versus days since harvest each year for each plot, and if so, are the profiles roughly parallel for years within each plot? These are things I would consider as I pondered model structure.

Generally for what I think might be your scenario, I would consider days_since_harvest and year as fixed effects factors, and plot as a random effect factor.
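
A minimal sketch of that structure might look like the following. It assumes the data set soil from the question and a numeric variable named days_since_harvest, which is a hypothetical name that does not appear in the original code. I have treated days_since_harvest as a continuous covariate; if it should be a categorical factor instead, it would also go in the CLASS statement. Whether year and days_since_harvest can both be estimated depends on having more than one sampling date within each year.

```sas
/* Sketch only: days_since_harvest is a hypothetical variable name */
PROC MIXED DATA = soil;
  CLASS year plot;
  /* year (categorical) and days_since_harvest (continuous here) as fixed effects */
  MODEL concentration = year days_since_harvest / DDFM = kr SOLUTION;
  /* plot as a random effect: a random intercept for each plot */
  RANDOM plot;
RUN;
```

The SOLUTION option just prints the fixed-effect estimates; it does not change the fit.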

 
