About sld

sld · ‎02-09-2018

Assuming the current code I described in my previous reply is correct, I start to write up the results. Now I wonder how to write the code for post hoc analysis when the interaction effect is significant. For example, suppose I want a p-value for the effect of one variable (e.g., Trial) at each level of another variable (e.g., SLC). Will these two lines work for such simple main effects?
lsmeans trial1 / diff adjust=simulate(seed=98375) lines;
lsmeans group*slc / plot=meanplot(sliceby=group join cl);
I would say that you do not know yet what the correct statistical model is for your study, and that writing results is premature. In addition, I suspect that you do not fully understand your statistical model or the code that implements it, and understanding is absolutely a prerequisite.
Pairwise mean comparisons are legitimate but, in my opinion, are of little use in interpreting interaction: interactions reflect unequal pairwise comparisons, and so merely looking at pairwise comparisons misses the point of interaction. I recommend using pertinent contrasts instead. In my code suggestion, I have included contrasts that I think could be sensible. Figure out what they do, and see whether you agree that they are pertinent.
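As a starting point for working out what one of those contrasts does, the sketch below reuses the GLIMMIX model from the 02-05-2018 code suggestion in this thread and asks for the coefficient matrix of a single LSMESTIMATE row. It assumes 2 GROUP levels and 3 SLC levels, and it is only an illustration: the model itself may still need to change, as discussed above.

```sas
/* Sketch only: model copied from the 02-05-2018 first-pass suggestion.
   Assumes 2 GROUP levels and 3 SLC levels, so the six group*slc
   LS-means are ordered G1S1 G1S2 G1S3 G2S1 G2S2 G2S3. */
proc glimmix data=dissertation;
  class group subjectid slc trial1;
  model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random trial1 / subject=slc*subjectid(group) type=ar(1) residual;

  /* Coefficients 1 -1 0 -1 1 0 estimate
       (G1S1 - G1S2) - (G2S1 - G2S2),
     i.e. the SLC 1 v 2 difference in group 1 minus the same
     difference in group 2: one piece of the interaction. */
  lsmestimate group*slc
    "Group effect: SLC 1 v 2" 1 -1 0  -1 1 0 / elsm cl;
run;
```

The ELSM option prints the coefficients applied to the LS-means, which makes it easy to check that each row compares the cells you intend.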

sld · ‎02-09-2018

I think the dataset I uploaded in my previous reply contained some errors. I uploaded the right one again.
I think the most recently uploaded dataset may still have problems. Check SubjectID=24 in Group=2.
Why did you have each participant do 10 trials? What research questions do you want to answer about trials?
A: It is to obtain enough number of trials for each target sequence. Just one trial may not work.
Your response implies that the multiple trials are merely subsamples--that you do not have a hypothesis about the effect of trials on the mean of the response. (For example, subjects do not get better or worse over multiple trials, or subjects do not get weary over multiple trials.) If this implication is correct, then you would not include TRIAL1 as a fixed effect in the MODEL statement. If it is not correct, then you have not given enough thought to your study and the research questions you are trying to answer.
A: I am not sure what you are asking here. The variable "Trial1" reflects the order of 10 trials that were produced for each sequence. So trial 1 was produced before trial 2, trial 2 was produced before trial 3, etc. This order does not necessarily match the numbers in the file name though. The 12 sequences were randomly mixed up in each block. Block 1 had a different order of 12 sequences from block 2, block 2 had a different order of 12 sequences from block 3, etc.
Did you use the same order of sequences in all 10 trials?
Before you do another study of this type, please spend some time learning about crossover designs.
If Trial1 levels are in fact subsamples, this is the modeling approach I would consider. I would report the second of the two models below.
/* Import data from CSV */
PROC IMPORT OUT= WORK.dissertation0
    DATAFILE= "Dissertation.csv"
    DBMS=CSV REPLACE;
  GETNAMES=YES;
  DATAROW=2;
RUN;

/* Extract analysis data */
data dissertation;
  set dissertation0;
  where whole_1st=1;
  drop filenames filenames_a subjectid2 trial trial2;
run;

/* Variable transformation */
data dissertation;
  set dissertation;
  gmp_syl_log = log(gmp_syl);
run;

title1 "Pattern of observations";
proc tabulate data=dissertation(where=(gmp_syl ne .));
  class group subjectid syl trial1 block;
  table group*subjectid, block*trial1*syl / misstext="X";
run;

/* I think SubjectID=24 in Group 2 has bad data */
data dissertation;
  set dissertation;
  if subjectid=24 and group=2 then delete;
run;

title1 "Trial1 levels as random subsamples";
proc glimmix data=dissertation plots=(studentpanel boxplot(fixed student));
  class group subjectid syl trial1;
  model gmp_syl_log = group|syl / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random syl / subject=subjectid(group);
  random intercept / subject=trial1(subjectid group);
  lsmeans group*syl / plot=meanplot(sliceby=group join cl);
  lsmestimate group*syl
    "Group effect: SYL 1 v 2" 1 -1  0  -1  1  0,
    "Group effect: SYL 1 v 3" 1  0 -1  -1  0  1,
    "Group effect: SYL 2 v 3" 0  1 -1   0 -1  1
    / adjust=simulate(seed=29847);
run;

/* Variances for SYL levels appear unequal.
   Fit the heterogeneous variances model and compare AICc values.
   The heterogeneous variances model provides better fit
   but does not alter the conclusions. */
title1 "Trial1 levels as random subsamples";
title2 "with heterogeneous SYL variances";
proc glimmix data=dissertation plots=(studentpanel boxplot(fixed student));
  class group subjectid syl trial1;
  model gmp_syl_log = group|syl / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random syl / subject=subjectid(group);
  random intercept / subject=trial1(subjectid group);
  random _residual_ / group=syl;
  lsmeans group*syl / plot=meanplot(sliceby=group join cl);
  lsmestimate group*syl
    "Group effect: SYL 1 v 2" 1 -1  0  -1  1  0,
    "Group effect: SYL 1 v 3" 1  0 -1  -1  0  1,
    "Group effect: SYL 2 v 3" 0  1 -1   0 -1  1
    / adjust=simulate(seed=29847);
run;

For this model, your approach to assessing normality and homogeneity of variance assumptions does not assess the appropriate values because your approach is based on residuals, which are subsamples and not true experimental units.
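The AICc comparison mentioned in the code comment can be done by reading the two Fit Statistics tables by eye. A minimal sketch of collecting them side by side follows; the data set names hom_fit, het_fit and fit_compare are hypothetical, and the diagnostics, LSMEANS and LSMESTIMATE statements are omitted because they do not affect the fit statistics.

```sas
/* Sketch: capture the Fit Statistics table from each fit.
   hom_fit, het_fit and fit_compare are hypothetical names. */
ods output FitStatistics=hom_fit;
proc glimmix data=dissertation;
  class group subjectid syl trial1;
  model gmp_syl_log = group|syl / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random syl / subject=subjectid(group);
  random intercept / subject=trial1(subjectid group);
run;

ods output FitStatistics=het_fit;
proc glimmix data=dissertation;
  class group subjectid syl trial1;
  model gmp_syl_log = group|syl / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random syl / subject=subjectid(group);
  random intercept / subject=trial1(subjectid group);
  random _residual_ / group=syl;   /* one residual variance per SYL level */
run;

/* Stack the two tables and label which model each row came from */
data fit_compare;
  length model $ 16;
  set hom_fit(in=in_hom) het_fit;
  if in_hom then model = "homogeneous";
  else model = "heterogeneous";
run;

/* Show only the AICc rows; smaller is better */
proc print data=fit_compare noobs;
  where index(upcase(Descr), "AICC") > 0;
  var model Descr Value;
run;
```

Both fits use the same fixed effects, which is what makes a comparison of the covariance models by information criteria reasonable here.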

sld · ‎02-08-2018

Why did you have each participant do 10 trials? What research questions do you want to answer about trials? Is SLC the sequence length condition (3-, 6-, or 9-syllable)? Did you use the same order of sequences in all 10 trials? And, to confirm, the same order of sequences with all participants? In other words, there was only one order of sequences for the whole study? Do you think that order of sequences would affect the response? How did you choose which 3 sequences of the 12 to use for analysis?

sld · ‎02-08-2018

Now I'm confused about TRIAL1 and the random mixing of tasks and "experimental blocks" that have not been previously addressed. My confusion is fine, I don't need to have a clear understanding at this point, but given that I am not at all sure about your experimental design, the statistical model that I suggested might well be incorrect. There is no "perfect" model. The treatment and design structure of the model should match the experimental design. But "good enough" could be sufficient when it comes to distributional assumptions, covariance structures, etc. Good luck!

sld · ‎02-07-2018

Regarding the covariance structure, there is no specific interval; the time intervals among trials vary and depend on each individual. This is because each individual was able to move on to the next trial when they completed responding to the current task in each trial. In this type of unequally time-spaced dataset, do you have any recommendation in terms of what code should be used to specify the covariance structure? Can I just use "UN" instead of "AR(1), ARH(1) for heteroscedastic data, or SP(pow) (c-list)"?
I'd say that you need to rethink what TRIAL1 represents. To this point, I have been thinking of it as a repeated measurement through time in response to a consistent treatment. But apparently each level of TRIAL1 represents a level of another factor ("task"), complicated by the fact that the response of a subject to the different tasks may be autocorrelated and that the degree of autocorrelation may vary by subject.
The complexity of this study has now exceeded what we can resolve in this forum. You will want to find a statistical consultant at your institution; this problem needs a lot of discussion, and the context of the study is important and needs to be considered.
If I may ask one more question: I want to report the F statistics in my document. I wonder if I can assume the second degrees of freedom I need to report is the value under the heading Den DF in the second column of the SAS output. So if I write, "a significant group by SLC interaction (F(2,359)=18.14, p<.0001), the main effect of group (F(1,46.2)=48.96, p<.0001), and the main effect of SLC (F(2,359)=148.47, p<.0001) were observed," will this be a correct statement?
Yes, the format for the F statistic is F(numerator df, denominator df).
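For reporting, the numerator and denominator degrees of freedom can also be pulled from the Type III tests table into a data set instead of being copied by hand. The sketch below is only an illustration: type3_tests is a hypothetical data set name, and the MODEL and RANDOM statements are taken from the first-pass model elsewhere in this thread as placeholders for whatever model is finally adopted.

```sas
/* Sketch: save the "Type III Tests of Fixed Effects" table.
   type3_tests is a hypothetical name; the model statements are
   placeholders copied from the first-pass suggestion. */
ods output Tests3=type3_tests;
proc glimmix data=dissertation;
  class group subjectid slc trial1;
  model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random trial1 / subject=slc*subjectid(group) type=ar(1) residual;
run;

/* Report each effect as F(NumDF, DenDF) = FValue, p = ProbF */
proc print data=type3_tests noobs;
  var Effect NumDF DenDF FValue ProbF;
run;
```

With ddfm=kr2 the denominator degrees of freedom are estimated and need not be integers, which is why a value such as 46.2 can appear.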

sld · ‎02-07-2018

The error I received is this: ERROR: Data set WORK.R is not sorted in ascending sequence. The current BY group has SLC = 3 and the next BY group has SLC = 1.
The ERROR tells you what you need to know: when you use a BY statement, the data set must be sorted in advance in the same order specified by the BY statement. Try sorting the dataset R and see if that works.
In addition, I changed the covariance structure to sp(pow)(trial1) because the trials had unequally spaced time intervals. There were different numbers of fillers.
The variable for distance in SP(POW)(c-list) must be numeric and it must reflect the actual "distance" represented by each level of TRIAL1. Read MIXED | REPEATED | TYPE=. Using SP(POW)(TRIAL1) will not work because (1) TRIAL1 is in the CLASS statement and so is not numeric, and (2) the levels of TRIAL1 (1,2,3,...,10) do not reflect the different numbers of "fillers".
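Both points can be sketched in code. The first step sorts WORK.R by the BY variable named in the error message; PROC MEANS stands in for whatever BY-group step produced the error, which is not shown in the thread. The second step shows the shape of an SP(POW) specification, assuming a hypothetical numeric variable elapsed_time that records the actual position in time of each trial for each subject; no such variable is known to exist in the posted data.

```sas
/* 1. Sort before any step that uses BY SLC */
proc sort data=r;
  by slc;
run;

/* Placeholder BY-group step; the original failing step is unknown */
proc means data=r;
  by slc;
run;

/* 2. Sketch of SP(POW) with a numeric distance variable.
      elapsed_time is HYPOTHETICAL: it must be numeric, must not
      appear in the CLASS statement, and must measure the real
      spacing between trials (for example, time or filler count). */
proc mixed data=dissertation;
  class group subjectid slc trial1;
  model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
  random intercept / subject=subjectid(group);
  repeated trial1 / subject=slc*subjectid(group)
                    type=sp(pow)(elapsed_time);
run;
```

TRIAL1 can stay in the CLASS statement as the repeated effect; it is the coordinate inside SP(POW)( ) that has to be numeric.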

sld · ‎02-07-2018

The GLIMMIX documentation (for 14.3 anyway) has more detail. Read GLIMMIX | LSMEANS | ADJUST= then ask again if you still can't make sense of it.

sld · ‎02-06-2018

A paired t-test is a within-subjects design, so you are right on that point. But it has only one fixed effects factor with two levels and so only one comparison. Your scenario is more complicated, and the stat model cannot be a paired t-test.

sld · ‎02-06-2018

I'm glad you've found it useful.
(1) I rarely use formal tests to assess normality and homogeneity of variance, relying instead on graphical assessments of residuals (and lots of analysis experience). Many (even most?) of my applied stat colleagues take the same approach.
(2) I made it up. It is the seed for a random number generator used in the simulate method. Read the documentation for the LSMEANS statement in MIXED (or GLIMMIX) and GLM.
(3) If you run the code, you'll see that interaction in the results. " | " is an expansion operator. Read Specification of Effects.
(4) No, the empirical estimate has nothing to do with whether the data follow a normal distribution. The empirical estimate and the Kenward-Roger method are two different approaches to reducing the bias of standard errors. Read the Stroup paper I linked to in a previous message. You use one or the other; you cannot use both at the same time. Your sample size is too small for the classic empirical estimate to work. I've seen references stating that you need 2 orders of magnitude more observations than times.
No, specification of subject=<> has nothing to do with empirical or KR. It has everything to do with correctly specifying your experimental design. You have 48 subjects, but you have coded them 1, ..., 24 within each of two groups. If you use "subject=subjectid" then you have told the procedure that you have only 24 subjects, which is not correct. To fix that, you could either recode subjectid to have 48 unique values, or you can use "subject=subjectid(group)"; the latter approach works better with the estimation of denominator degrees of freedom and is algorithmically more efficient. Keep reading Littell et al.
(5) Using GLIMMIX requires no more justification than using MIXED. GLIMMIX is capable of accommodating data with distributions other than normal, but it does normal as well and generally duplicates exactly what you would get for the same model using MIXED.
(6) Ideally levels of an experimental factor are randomly assigned to experimental units, but in practice that often is impossible (for example, gender or time or your "group"). Typically we don't fret about that (we "pretend" that we have random assignment), but it is important to keep in mind that random assignment minimizes bias, and that lack of random assignment may induce bias that needs to be considered in interpretation of results. I do not see that the model needs to be adjusted.
The statistical model required for the analysis of this study is fairly complicated, and certainly more complicated than you were and are prepared for. Keep reading! Then read more! Take a class, either in person or online (e.g., https://onlinecourses.science.psu.edu/stat502/). And if you are at a university, see whether you can find a statistician to consult with. It's a lot easier across a table than across the internet.
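The two ways of telling the procedure about 48 subjects, described in point (4), might look like the sketch below. It assumes GROUP is coded 1 and 2 and SUBJECTID 1 to 24 within each group; dissertation_uid and subject_uid are hypothetical names.

```sas
/* Option A: recode to 48 unique subject identifiers.
   Assumes GROUP is coded 1 or 2 and SUBJECTID 1-24 within group;
   dissertation_uid and subject_uid are hypothetical names. */
data dissertation_uid;
  set dissertation;
  subject_uid = (group - 1) * 24 + subjectid;   /* 1-24 and 25-48 */
run;

/* Check the number of distinct subjects (expect 48) */
proc freq data=dissertation_uid nlevels;
  tables subject_uid / noprint;
run;

/* Option B: keep the original coding and nest subject within group.
   Note: group|slc|trial1 expands to all main effects and all
   two- and three-way interactions of the three factors. */
proc glimmix data=dissertation;
  class group subjectid slc trial1;
  model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random trial1 / subject=slc*subjectid(group) type=ar(1) residual;
run;
```

Option B is the one recommended above; Option A is shown mainly so the coding problem is easy to see.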

sld · ‎02-05-2018

This paper is a good resource for different covariance structures.

sld · ‎02-05-2018

The Littell et al. text is not a one-day project 🙂 I'm looking forward to the next version.
Yes, I saw your message and then when I looked again, it was gone. Odd.
GLIMMIX offers all the same covariance structures as does MIXED, except for the Kronecker products. And GLIMMIX has a few more than MIXED. GLIMMIX also provides the option of sandwich estimators ("empirical"), including modifications that perform better with small samples and that are not available in MIXED. In my opinion, with few exceptions, GLIMMIX is a better tool than MIXED; I rarely use MIXED anymore.
The empirical option does not work at all well for analysis of this data set. According to an old paper by Walt Stroup (and other sources): "This approach assumes that the number of subjects per treatment is substantially greater than the number of times of observation. When the number of observation times is equal to or greater than the number of subjects per treatment, as often happens in agricultural experiments, the empirical estimate of Var(K'b) may actually be less than the model-based estimate and the resulting test-statistics may be wildly inflated." Using ddfm=kr2 works much better. Try both and see. (In GLIMMIX, empirical=mbn works well, too, but not the classic empirical.)
You need to specify the existence of 48 subjects rather than 24, which you can do by specifying subject=subjectid(group) rather than subject=subjectid.
If the response mean could theoretically change with levels of Trial1, then Trial1 needs to be included as a fixed effect in the MODEL statement.
Here is what I would consider as a first pass, not necessarily final:

/* Import data from CSV */
PROC IMPORT OUT= WORK.dissertation
    DATAFILE= "Dissertation.csv"
    DBMS=CSV REPLACE;
  GETNAMES=YES;
  DATAROW=2;
RUN;

/* Delete unfilled columns and unfilled rows */
data dissertation;
  set dissertation;
  drop var20-var56;
  where whole_1st= 1;
run;

/* Variable transformation */
data dissertation;
  set dissertation;
  gmp_syl_log = log(gmp_syl);
  gmp_syl_sqrt = sqrt(gmp_syl);
run;

title1 "Coding check";
proc freq data=dissertation;
  table group subjectid slc block trial1;
run;

title1 "Pattern of observations";
proc tabulate data=dissertation(where=(gmp_syl ne .));
  class group subjectid slc trial1 block;
  table group*subjectid, slc*trial1 / misstext="X";
run;

proc sort data=dissertation;
  by group subjectid slc trial1;
run;

title1 "Observed data";
proc sgpanel data=dissertation noautolegend;
  panelby slc group / columns=2;
  series x=trial1 y=gmp_syl / group=subjectid markers;
run;

title1 "Observed data, log scale";
proc sgpanel data=dissertation noautolegend;
  panelby slc group / columns=2;
  series x=trial1 y=gmp_syl_log / group=subjectid markers;
run;

proc sgpanel data=dissertation;
  panelby group subjectid / columns=6;
  series x=trial1 y=gmp_syl_log / group=slc markers;
run;

title1 "AR(1) among TRIAL1 levels using MIXED";
proc mixed data=dissertation covtest; *plots=all;
  class group subjectid slc trial1;
  model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
  random intercept / subject=subjectid(group);
  repeated trial1 / subject=slc*subjectid(group) type=ar(1);
  lsmeans trial1 / diff adjust=simulate(seed=98375);
  lsmeans group*slc;
run;

title1 "AR(1) among TRIAL1 levels using GLIMMIX";
proc glimmix data=dissertation plots=(studentpanel boxplot(fixed student));
  class group subjectid slc trial1;
  model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random trial1 / subject=slc*subjectid(group) type=ar(1) residual;
  lsmeans trial1 / diff adjust=simulate(seed=98375) lines;
  lsmeans group*slc / plot=meanplot(sliceby=group join cl);
  lsmestimate group*slc
    "Group effect: SLC 1 v 2" 1 -1  0  -1  1  0,
    "Group effect: SLC 1 v 3" 1  0 -1  -1  0  1,
    "Group effect: SLC 2 v 3" 0  1 -1   0 -1  1
    / adjust=simulate(seed=29847);
run;

This model has three hierarchical design levels: Subject (assigned randomly to Group), Session within Subject (assigned--not randomly--to SLC), and RepeatedMeasure nested within Session within Subject (assigned--not randomly--to Trial1).
gmp_syl is strongly skewed, but also appears to have an upper bound that plays a bit of havoc with the distribution. I do not know how this variable was obtained, but this upper bound is something to consider.
On the log-transformed scale, there is no substantial evidence of heteroscedasticity.
AR(1) was “best” relative to CS, ARH(1), AR(1)+RE, AR(1) by SLC, TOEP based on AICc.
I show code for both MIXED and GLIMMIX (which has some nice features that are not available in MIXED).
Feel free to ask questions after you have studied the syntax, consulted the documentation and the Littell et al. text, and such. Have fun!
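The "try both and see" comparison of standard-error approaches from earlier in this post could be set up roughly as follows. This is a sketch under the assumption that the first-pass GLIMMIX model above is used unchanged; the empirical fit drops ddfm=kr2 because the two approaches are alternatives and are not combined.

```sas
/* Fit 1: model-based standard errors with the KR2 adjustment */
title1 "Model-based standard errors, ddfm=kr2";
proc glimmix data=dissertation;
  class group subjectid slc trial1;
  model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
  random intercept / subject=subjectid(group);
  random trial1 / subject=slc*subjectid(group) type=ar(1) residual;
run;

/* Fit 2: small-sample-adjusted sandwich estimator (MBN).
   No ddfm=kr2 here: use one approach or the other, not both. */
title1 "Empirical (MBN) standard errors";
proc glimmix data=dissertation empirical=mbn;
  class group subjectid slc trial1;
  model gmp_syl_log = group|slc|trial1;
  random intercept / subject=subjectid(group);
  random trial1 / subject=slc*subjectid(group) type=ar(1) residual;
run;
```

Comparing the Type III F tests and the standard errors from the two fits shows how much the choice matters for this data set; replacing empirical=mbn with the classic empirical option would show the inflation described in the Stroup quotation.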

sld · ‎02-04-2018

One solution is to replace cells in Excel containing "." with blank cells (containing nothing). SAS will assign missing values to those fields.

sld · ‎02-04-2018

I moved the post from https://communities.sas.com/t5/SAS-Statistical-Procedures/PROC-GLM-repeated-measures-one-class-two-models/m-p/433653#M22817 to keep threads tidy.
I would not use the GLM procedure. I think in terms of mixed models, and the model for this study could be constructed as a split-plot design, with whole plots (sessions) in blocks (subjects) with repeated measures. This format would allow you to accommodate the order of assignment of food treatments to sessions, as well as the temporal correlation among the three measurement times. Maybe TIME would work well with a regression model, but there are only 3 levels and there's not much you can do other than a linear relationship. Maybe that would be appropriate for these data, maybe not.
For mixed models in SAS, SAS for Mixed Models, 2nd ed is an invaluable resource; an updated version (SAS® for Mixed Models: An Introduction) is supposed to be released soon. If you run mixed models, you will benefit from reading it. At a more advanced level, Generalized Linear Mixed Models: Modern Concepts, Methods and Applications is also invaluable.

sld · ‎02-04-2018

Thank you for the additional information. Were the SLC levels applied in the same order to each participant? First try attaching the data as a csv file, in case anyone else wants to participate. For mixed models in SAS, SAS for Mixed Models, 2nd ed is an invaluable resource; an updated version (SAS® for Mixed Models: An Introduction) is supposed to be released soon. If you run mixed models, you will benefit from reading it.

sld · ‎02-04-2018

Yes, you're right, it does look fine. Pasting the output into the message does not format well. You are still using TYPE=ARH(1) in the REPEATED statement. Try something simpler. ARH(1) estimates a variance for each level of Trial1 (of which there are 10), and you do not have enough subjects (n=24) to estimate all the parameters specified by the model. Edit: Or it is still possible that your model is not specified correctly. For example, I suspect that you have 24 x 3 x 2 = 144 subjects, but the syntax you are using, given the way you have coded the variable levels, tells the procedure that you have only 24. But you have not described the experimental design, so I am just guessing.
