proc mixed repeated measures no SEs or DFs

Posted 08-07-2017 10:04 AM
(2684 views)

I am trying to run a repeated measures model in proc mixed with a large data set (~2.5 million observations) using the following code.

```
proc mixed data=IBMmods.ac method=REML;
class year source month id;
model DO = source / ddfm=kr solution;
repeated month / subject=id type=cs;
random year;
run;
```

The model runs and the output indicates that the convergence criteria were met. However, in the Solution for Fixed Effects table the parameter estimates look strange, the SE of every estimate is 0, and so are the degrees of freedom; t-values and p-values are not produced. I've tried running the model with different covariance structures, with the same result. If I omit the RANDOM statement, the model runs fine and I get estimates that make sense, along with their SEs and DFs. I've also tried running the model with a subset of the data (~270,000 observations), with the same results. Any help or insight would be greatly appreciated. I've attached a dummy data set so that you can see the structure of the data I'm working with. Thanks.

- Tags:
- DF
- proc mixed
- SE

9 REPLIES

It sounds to me like one of your variables is always missing or constant.

--

Paige Miller


You have typos in your example dataset, but I'll presume that's not the case in the actual dataset. (id 8 and 23 are assigned to both source 1 and 2, and I am guessing that each id should be associated with only one source.)

In your example dataset, each id is associated with only one source and only one year, and there are four repeated measures on each id (one for each of four months). Consequently, id is nested within year. Your current code specifies that id, year and month are random effects factors, and that source is a fixed effects factor. Because neither year nor month is in the MODEL statement, you are assuming that the mean of DO does not vary by year or by month: year and month affect only the variance of DO. Your current code specifies that year and id are *crossed* random effects factors, but most of the year x id combinations have no data:

```
proc tabulate data=test;
class id year month source;
table source*id, year*month;
run;
```

I suspect that these missing combinations may be the source of your estimation problem, but I am not sure.

Assuming that my interpretation of your study design is correct, this is the model I would first consider:

```
proc mixed data=test;
class source id year month;
model DO = source;
random intercept source / subject=year;
repeated month / subject=id(year source) type=cs;
run;
```

I definitely would ponder whether year and/or month should be fixed effects factors rather than random effects factors, but your actual data set may have many more years and/or months than is evident in your example data set. In another thread, I made comments on the year random or fixed topic here: https://communities.sas.com/t5/SAS-Statistical-Procedures/How-to-analyze-a-split-plot-study-with-yea...
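For illustration only, here is a sketch of what treating year and month as fixed effects might look like. It assumes the same `test` dataset and variable names as the code above; including only main effects (no interactions) and keeping `ddfm=kr` from the original post are assumptions made for the example, not a recommendation for the real data.

```sas
/* Hypothetical variant: year and month as fixed effects instead of random */
proc mixed data=test;
class source id year month;
/* year and month now shift the mean of DO, so they appear in MODEL */
model DO = source year month / ddfm=kr solution;
/* repeated measures across months within each id, compound symmetry */
repeated month / subject=id(year source) type=cs;
run;
```

With year in the MODEL statement there is no RANDOM statement for year, so comparing this fit with the random-year model is one way to see whether the RANDOM statement is what triggers the zero SEs.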

HTH

Edited: I changed the RANDOM syntax to one that likely works better with big datasets.

No worries. Thanks for looking into this. I appreciate any suggestions that may lead to appropriate estimates.

Me, too 🙂

There might be clues in the actual output or log, if you would like to post those.

Is the large number of observations due to many, many id levels? How many years, and how many months?

This is all that's displayed in the log when I run the model. Nothing in it appears to indicate what the issue might be.

```
3 proc mixed data=IBMmods.actest method=REML;
NOTE: Writing HTML Body file: sashtml.htm
4 class year source month id;
5 model DO = source / ddfm=kr solution;
6 repeated month / subject=id type=cs;
7 random year;
8 run;

WARNING: Class levels for ID are not printed because of excessive size.
WARNING: ODS graphics with more than 5000 points have been suppressed. Use the PLOTS(MAXPOINTS= ) option in the PROC MIXED
         statement to change or override the cutoff.
NOTE: Convergence criteria met.
NOTE: PROCEDURE MIXED used (Total process time):
      real time 36.96 seconds
      cpu time 36.45 seconds
```

And the model output is attached.

There are many ids in the model spanning 4 months for each of 27 years. Individuals are different for each level of year and source. Could the large number of individuals cause problems when trying to look at the random effect of year?

I also just noticed in this output that while the Class Level Information table lists 731939 IDs, the Dimensions table shows only one subject. Additionally, the output indicates that all the observations are attributed to that one subject. Any thoughts on why this is happening?

Hmm.

It seems odd that the parameter estimates for intercept and source have 0 SE and 0 df, and yet the overall test of source in the Type III Tests table does not look unusual--*except* for denom df = 973 which strikes me as much too small.

Is each ID coded uniquely, as in your example dataset?

Should you have four months of data for each ID? (731939 IDs times 4 months would be 2927756 observations, not 2443672, but no missing values are reported.)
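One way to check both questions is sketched below. It assumes the variable names from the original post (`id`, `month`, `year`, `source` in `IBMmods.ac`); the table `work.id_counts` and the `n_*` column names are made up for the example.

```sas
/* One row per id: how many observations, months, years and sources it has */
proc sql;
create table work.id_counts as
select id,
       count(*) as n_obs,
       count(distinct month) as n_months,
       count(distinct year) as n_years,
       count(distinct source) as n_sources
from IBMmods.ac
group by id;
quit;

/* Distribution of those counts across ids */
proc freq data=work.id_counts;
tables n_obs n_months n_years n_sources;
run;
```

If every id is coded uniquely, `n_years` and `n_sources` should be 1 for all ids. The reported totals imply an average of about 3.3 observations per id (2443672 / 731939), so `n_obs` would show how many ids have fewer than four months.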

I'm beginning to suspect a structural problem with the dataset, perhaps only because I don't have any other ideas.

If you haven't already, I'd compute descriptive statistics to follow up on Paige's comment about one of the variables being always missing or constant.
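A minimal sketch of such a check, assuming the dataset and variable names from the original post and that `DO` is numeric:

```sas
/* Non-missing count, missing count and spread of the response */
proc means data=IBMmods.ac n nmiss min max std;
var DO;
run;

/* Number of levels (and missing levels) of each classification variable;
   NOPRINT suppresses the frequency tables, which would be huge for id */
proc freq data=IBMmods.ac nlevels;
tables year source month id / noprint;
run;
```

A standard deviation of 0 for `DO`, or a classification variable with a single level, would point to the kind of constant or always-missing variable described above.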

For your model with REPEATED / TYPE=CS, the code below is a different parameterization of the same model (as long as the CS parameter is not negative). I'd try it, and see if I got the same results.

```
proc mixed data=test;
class source id year month;
model DO = source / ddfm=kr solution;
random intercept / subject=year;
random intercept / subject=id(year source);
run;
```

And there's always SAS Tech Support!
