HortIPM
Calcite | Level 5

I'm having some trouble with a data analysis and would appreciate the assistance of those more skilled than myself. I have a dataset of population measures over time (1 data point each week for 20 weeks). This is a 2x2 factorial design. I can easily compare the sum of the data, or the data within a sample date; however, I'm having difficulty setting this up with repeated measures in Glimmix (SAS 9.4). Any assistance, or a pointer to a good reference, will be greatly appreciated. Some example data points are below. I would include the code I've tried, but at this stage I believe I may need to start over from scratch.

 

 

X  Y  T1-T20

1  1  0 0 0 2 3 0 2 3 12 3 3 4 12 8 11 14 19 8 4 6
1  1  0 0 0 1 2 0 1 1 0 4 1 0 4 0 2 6 1 5 1 2
1  1  0 0 0 0 2 4 1 1 5 5 1 3 3 1 9 6 4 0 3 1
1  1  0 0 0 1 0 3 1 4 5 2 6 1 0 5 2 8 3 0 6 1
1  2  0 0 1 1 0 1 1 3 9 5 1 5 6 14 6 11 16 4 9 2
1  2  0 0 0 2 4 1 0 6 3 1 1 5 1 3 0 3 6 0 1 2
1  2  0 0 0 4 2 0 1 6 2 1 0 0 5 1 3 0 0 2 3 0
1  2  0 0 1 0 2 1 0 4 1 0 0 0 7 3 3 6 9 1 3 1
2  1  0 0 0 0 0 5 0 2 1 6 8 1 9 11 0 15 3 6 4 2
2  1  0 0 0 0 0 3 2 6 0 4 0 2 4 0 1 5 8 4 2 4
2  1  0 0 0 0 0 2 1 0 4 2 1 0 0 3 1 2 0 0 1 0
2  1  0 0 0 0 0 0 1 0 2 0 4 2 6 9 3 6 1 2 3 1
2  2  0 0 0 0 2 1 6 1 4 1 6 2 9 0 8 4 12 3 0 6
2  2  0 0 0 0 5 1 0 0 4 2 4 3 6 1 1 0 0 2 1 0
2  2  0 0 0 0 2 0 1 1 3 0 0 1 4 2 2 2 3 1 0 0
2  2  0 0 0 0 0 0 4 0 0 6 7 0 11 0 3 6 0 2 9 2

 

5 REPLIES
pau13rown
Lapis Lazuli | Level 10

As you suggest, you may consider summarising the data over time using the sum. You could also consider the area under the curve, depending on what the data are. Then you would not need glimmix at all, because you would collapse the repeated measures into a single observation per experimental unit. Your data are interesting though, with a lot of zeros. If you take a nonparametric approach, there is a question about how to handle the interaction for the factorial design, e.g.:

https://faculty.washington.edu/wobbrock/pubs/chi-11.06.pdf

http://www.sciencedirect.com/science/article/pii/S002210311000034X

Either way, your data may be simpler than you realise (if you analyse a summary statistic) or more complicated than you realise (if you treat the data nonparametrically, e.g. as ranks).
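A minimal sketch of the summary-statistic route, assuming the posted rows are read as-is and the weekly counts are one week apart. The dataset names `wide` and `summary` and the variables `unit`, `total`, `peak` and `auc` are made up for illustration; the area under the curve uses the trapezoid rule with unit spacing.

```sas
/* Read the posted data: X and Y are the two factors, T1-T20 the weekly counts */
data wide;
  input X Y T1-T20;
  unit = _n_;                      /* hypothetical id for each experimental unit */
  datalines;
1  1  0 0 0 2 3 0 2 3 12 3 3 4 12 8 11 14 19 8 4 6
1  1  0 0 0 1 2 0 1 1 0 4 1 0 4 0 2 6 1 5 1 2
1  1  0 0 0 0 2 4 1 1 5 5 1 3 3 1 9 6 4 0 3 1
1  1  0 0 0 1 0 3 1 4 5 2 6 1 0 5 2 8 3 0 6 1
1  2  0 0 1 1 0 1 1 3 9 5 1 5 6 14 6 11 16 4 9 2
1  2  0 0 0 2 4 1 0 6 3 1 1 5 1 3 0 3 6 0 1 2
1  2  0 0 0 4 2 0 1 6 2 1 0 0 5 1 3 0 0 2 3 0
1  2  0 0 1 0 2 1 0 4 1 0 0 0 7 3 3 6 9 1 3 1
2  1  0 0 0 0 0 5 0 2 1 6 8 1 9 11 0 15 3 6 4 2
2  1  0 0 0 0 0 3 2 6 0 4 0 2 4 0 1 5 8 4 2 4
2  1  0 0 0 0 0 2 1 0 4 2 1 0 0 3 1 2 0 0 1 0
2  1  0 0 0 0 0 0 1 0 2 0 4 2 6 9 3 6 1 2 3 1
2  2  0 0 0 0 2 1 6 1 4 1 6 2 9 0 8 4 12 3 0 6
2  2  0 0 0 0 5 1 0 0 4 2 4 3 6 1 1 0 0 2 1 0
2  2  0 0 0 0 2 0 1 1 3 0 0 1 4 2 2 2 3 1 0 0
2  2  0 0 0 0 0 0 4 0 0 6 7 0 11 0 3 6 0 2 9 2
;
run;

/* Collapse the 20 repeated measures into one row of summaries per unit */
data summary;
  set wide;
  array t{20} T1-T20;
  total = sum(of T1-T20);          /* season-long sum */
  peak  = max(of T1-T20);          /* largest weekly count */
  auc   = 0;
  do i = 2 to 20;                  /* trapezoid rule, weeks one unit apart */
    auc = auc + (t{i-1} + t{i}) / 2;
  end;
  keep unit X Y total peak auc;
run;

/* Descriptive check of the summaries within each X-by-Y cell */
proc means data=summary mean std min max;
  class X Y;
  var total peak auc;
run;
```

Any one of these summaries then gives a single observation per unit, so the 2x2 comparison no longer involves repeated measures.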

HortIPM
Calcite | Level 5
Thank you for your response.

The data points are insect population counts within the experimental unit, measured on a weekly basis. My general impression is that analyzing the sum at the end of the study period (20 weeks) gives me the output I need. I would appreciate your thoughts, however. In previous work I've used the sum, but I would like to increase my understanding for future reference.


pau13rown
Lapis Lazuli | Level 10

'Numbers' or counts tend to suggest Poisson regression, or negative binomial regression, and that might lead you to proc nlmixed (I tend to use nlmixed over glimmix). It might depend on how far apart the timepoints are: if they are in quick succession, so that the time period is easily characterised by a single value and there is no interest in the time trend, maybe you want to analyse the max instead of the sum. There is a temptation to simplify things here.

Edit: I notice you said weekly observations, 20 weeks in total, so max or sum may seem reasonable. If there is some precedent in the literature, though, I'd use it to preempt any query from a reviewer.
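Since the original question was about Glimmix, here is one hedged sketch of a count model that keeps the repeated measures. It is only one of several reasonable formulations, not a recommended final model. Assumptions not in the thread: the data are reshaped to one row per unit per week (dataset `long`, variables `unit`, `week`, `count` are made-up names), time enters as a linear trend in `week`, and the correlation within a unit is handled with a random intercept. Swapping `dist=poisson` for `dist=negbin` would give the negative binomial version.

```sas
/* Read the posted rows and write one record per unit per week */
data long;
  input X Y T1-T20;
  unit = _n_;                      /* hypothetical id for each experimental unit */
  array t{20} T1-T20;
  do week = 1 to 20;
    count = t{week};
    output;
  end;
  keep unit X Y week count;
  datalines;
1  1  0 0 0 2 3 0 2 3 12 3 3 4 12 8 11 14 19 8 4 6
1  1  0 0 0 1 2 0 1 1 0 4 1 0 4 0 2 6 1 5 1 2
1  1  0 0 0 0 2 4 1 1 5 5 1 3 3 1 9 6 4 0 3 1
1  1  0 0 0 1 0 3 1 4 5 2 6 1 0 5 2 8 3 0 6 1
1  2  0 0 1 1 0 1 1 3 9 5 1 5 6 14 6 11 16 4 9 2
1  2  0 0 0 2 4 1 0 6 3 1 1 5 1 3 0 3 6 0 1 2
1  2  0 0 0 4 2 0 1 6 2 1 0 0 5 1 3 0 0 2 3 0
1  2  0 0 1 0 2 1 0 4 1 0 0 0 7 3 3 6 9 1 3 1
2  1  0 0 0 0 0 5 0 2 1 6 8 1 9 11 0 15 3 6 4 2
2  1  0 0 0 0 0 3 2 6 0 4 0 2 4 0 1 5 8 4 2 4
2  1  0 0 0 0 0 2 1 0 4 2 1 0 0 3 1 2 0 0 1 0
2  1  0 0 0 0 0 0 1 0 2 0 4 2 6 9 3 6 1 2 3 1
2  2  0 0 0 0 2 1 6 1 4 1 6 2 9 0 8 4 12 3 0 6
2  2  0 0 0 0 5 1 0 0 4 2 4 3 6 1 1 0 0 2 1 0
2  2  0 0 0 0 2 0 1 1 3 0 0 1 4 2 2 2 3 1 0 0
2  2  0 0 0 0 0 0 4 0 0 6 7 0 11 0 3 6 0 2 9 2
;
run;

/* Poisson mixed model: 2x2 factorial plus a linear week trend (an assumption), */
/* with a random intercept for each experimental unit                          */
proc glimmix data=long method=laplace;
  class unit X Y;
  model count = X|Y week / dist=poisson link=log solution;
  random intercept / subject=unit;
run;
```

Treating `week` as a class effect instead would estimate a separate mean for each week, but the all-zero early weeks in these data could make those estimates unstable on the log scale.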

HortIPM
Calcite | Level 5
Thanks again for your quick responses. Using the sum is fairly standard for insect population data (at least in my branch of work; there's no accounting for ecologists). I'm curious, however, whether there's a better way to compare effects than looking at that. Randomly distributed insect populations give a Poisson distribution, and aggregated populations give a negative binomial distribution (this particular insect would be random). I'm not familiar with nlmixed and will have to do some reading. If you have a suggestion for a direction, it would be much appreciated.
pau13rown
Lapis Lazuli | Level 10

It seems to me you have the right approach [edit: using sum], although you don't need glimmix (or nlmixed); I guess you'd use proc genmod, and if you decide to include the repeated measures you can fit a GEE model with the 'repeated' statement. Re GEE modelling, this is not the best example, just the first hit in a Google search, but it shows GEE modelling for longitudinal count data, a little analogous to what you have with (I think) fixed timepoints: http://journals.sagepub.com/doi/pdf/10.1177/1094428104263672
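A rough sketch of both genmod options, offered as an illustration rather than a definitive analysis. The dataset and variable names (`wide`, `long`, `unit`, `total`, `week`, `count`) are made up; the negative binomial distribution for the sum, the linear `week` term and the AR(1) working correlation are assumptions, not something settled in this thread.

```sas
/* Read the posted rows; keep the season-long sum for each experimental unit */
data wide;
  input X Y T1-T20;
  unit  = _n_;                     /* hypothetical unit id */
  total = sum(of T1-T20);
  datalines;
1  1  0 0 0 2 3 0 2 3 12 3 3 4 12 8 11 14 19 8 4 6
1  1  0 0 0 1 2 0 1 1 0 4 1 0 4 0 2 6 1 5 1 2
1  1  0 0 0 0 2 4 1 1 5 5 1 3 3 1 9 6 4 0 3 1
1  1  0 0 0 1 0 3 1 4 5 2 6 1 0 5 2 8 3 0 6 1
1  2  0 0 1 1 0 1 1 3 9 5 1 5 6 14 6 11 16 4 9 2
1  2  0 0 0 2 4 1 0 6 3 1 1 5 1 3 0 3 6 0 1 2
1  2  0 0 0 4 2 0 1 6 2 1 0 0 5 1 3 0 0 2 3 0
1  2  0 0 1 0 2 1 0 4 1 0 0 0 7 3 3 6 9 1 3 1
2  1  0 0 0 0 0 5 0 2 1 6 8 1 9 11 0 15 3 6 4 2
2  1  0 0 0 0 0 3 2 6 0 4 0 2 4 0 1 5 8 4 2 4
2  1  0 0 0 0 0 2 1 0 4 2 1 0 0 3 1 2 0 0 1 0
2  1  0 0 0 0 0 0 1 0 2 0 4 2 6 9 3 6 1 2 3 1
2  2  0 0 0 0 2 1 6 1 4 1 6 2 9 0 8 4 12 3 0 6
2  2  0 0 0 0 5 1 0 0 4 2 4 3 6 1 1 0 0 2 1 0
2  2  0 0 0 0 2 0 1 1 3 0 0 1 4 2 2 2 3 1 0 0
2  2  0 0 0 0 0 0 4 0 0 6 7 0 11 0 3 6 0 2 9 2
;
run;

/* Option 1: analyse the sum, one observation per unit, 2x2 factorial */
proc genmod data=wide;
  class X Y;
  model total = X|Y / dist=negbin link=log type3;
run;

/* Option 2: keep the repeated measures; reshape to one row per unit per week */
data long;
  set wide;
  array t{20} T1-T20;
  do week = 1 to 20;
    count = t{week};
    output;
  end;
  keep unit X Y week count;
run;

/* GEE for weekly counts; rows are already in week order within each unit */
proc genmod data=long;
  class unit X Y;
  model count = X|Y week / dist=poisson link=log type3;
  repeated subject=unit / type=ar(1);
run;
```

With only four units per treatment combination, the GEE standard errors rest on few clusters, so the simpler sum analysis may be easier to defend.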
