Multilevel models in SAS


07-27-2015 06:38 AM

I am building a three-level model in SAS using PROC MIXED. Apparently, SAS only produces the diagnostic plots for checking normality for the level 1 residuals. I was able to request the level 2 residuals using the "output SolutionR" option with the first random statement, but when I try the same with the second random statement I get the same output as for the first one. I would be grateful if someone could tell me how to request the level 2 and level 3 residuals simultaneously so that I can make the necessary plots for my model diagnostics. Thanks in advance.

All Replies


07-27-2015 08:31 AM

Could you share your PROC MIXED code? We might be able to optimize it enough to get the residuals you need.

Steve Denham


07-27-2015 08:52 AM

Thank you, Steve. This is not my real code and data (the data below are from Bristol University), but it is very similar:

data exam;

input SchoolID $2. StudentID Nor_examscore constant_1 STANDLRT stugender Schgender sch_ave_sc schvr bandst;

datalines;

1. 143 0.2613242 1 0.6190592 1 1 0.1661752 2 1

1. 145 0.1340672 1 0.2058022 1 1 0.1661752 2 2

1. 142 -1.723882 1 -1.364576 0 1 0.1661752 2 3

1. 141 0.9675862 1 0.2058022 1 1 0.1661752 2 2

1. 138 0.5443412 1 0.3711052 1 1 0.1661752 2 2

1. 155 1.7348992 1 2.1894372 0 1 0.1661752 2 1

1. 158 1.0396082 1 -1.116621 0 1 0.1661752 2 3

1. 115 -0.129085 1 -1.033970 0 1 0.1661752 2 2

1. 117 -0.939378 1 -0.538061 1 1 0.1661752 2 2

1. 113 -1.219486 1 -1.447227 0 1 0.1661752 2 3

1. 112 2.4086922 1 2.4373912 0 1 0.1661752 2 1

1. 137 0.6107292 1 2.1067862 0 1 0.1661752 2 1

1. 134 -1.836687 1 0.0404992 0 1 0.1661752 2 2

1. 124 -0.129085 1 1.1976192 0 1 0.1661752 2 1

.....

;

proc mixed data=exam COVTEST plots=all METHOD=ml;

class stugender (ref=first) SchoolID StudentID schvr bandst;

model Nor_examscore=stugender sch_ave_sc schvr bandst /s influence (iter=4);

random intercept /subject=StudentID(SchoolID) solution;

random intercept/subject=SchoolID solution;

ods output SolutionR=L2Resid;

run;

I am looking for a way to extract the residuals for the second random statement, the one for SchoolID. Thanks again.


07-28-2015 07:52 AM

I don't know whether this is a good method or not, but if you comment out the first random statement, I think the variance at the StudentID level would be absorbed into the residual variance, leaving SchoolID as the only random effect. That should at least give you an idea of the shape of the distribution and whether any schools, as a whole, are contributing a large part of the variance. I also think this is a case where your ML approach is probably superior to a REML approach.

I am very curious if this approach works, so please post back what happens.

Steve Denham
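The suggestion above can be sketched as follows. This is a minimal illustration, assuming the `exam` data set and variable names from the earlier post; the output data set name `SchoolOnly` is hypothetical, and `plots=all` and the `influence` option are dropped only to keep the sketch short. StudentID is also removed from the CLASS statement because it is no longer used.

```sas
/* Sketch only: school-level random intercept, student-level statement commented out. */
/* SchoolOnly is a hypothetical name for the SolutionR output data set.               */
proc mixed data=exam covtest method=ml;
  class stugender (ref=first) SchoolID schvr bandst;
  model Nor_examscore = stugender sch_ave_sc schvr bandst / s;
  /* random intercept / subject=StudentID(SchoolID) solution; */
  random intercept / subject=SchoolID solution;
  ods output SolutionR=SchoolOnly;   /* predicted school effects only */
run;
```

Because the covariance structure differs from the three-level model, the school-level predictions from this fit are not the same quantities as those from the full model.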

Solution

07-28-2015 09:17 AM

Thank you for your help, Steve. I think what you are suggesting would underestimate the variability, though: with the first random effect gone, the variability due to each student is ignored. Since the estimates in a multilevel model all depend on the covariance matrix (especially in the frequentist setting), this changes everything else, and I would probably get a wrong impression of the distribution of the residuals for the schools.

I did get an idea from a SAS consultant, Stephen Mistler. He pointed out that the residuals for both random effects are included in the single output data set, L2Resid, and left a short piece of code to separate them. I made a very slight change to finally obtain what I wanted. I am leaving the code here in case someone else encounters a similar problem.

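/* Level 2: student-within-school random intercepts */

data Le2Random;

set L2Resid;

if ^missing(studentid) & ^missing(schoolid);

run;

/* Level 3: school random intercepts (rows with no student ID) */

data Le3Random;

set L2Resid;

if ^missing(schoolid) & missing(studentid) & Estimate ^=0;

run;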
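Once the two data sets exist, the normality checks asked about in the original question can be run on each level separately. The following is a minimal sketch, assuming the `Le2Random` and `Le3Random` data sets created above and the `Estimate` column they carry over from the SolutionR table; the choice of PROC UNIVARIATE and the plot options is an assumption, not part of the original answer.

```sas
/* Sketch: normality diagnostics for the predicted random effects at each level. */
/* Assumes Le2Random and Le3Random from the data steps above.                    */
ods graphics on;

title "Student-level random intercepts";
proc univariate data=Le2Random normal;          /* NORMAL requests normality tests */
  var Estimate;
  histogram Estimate / normal;                  /* histogram with fitted normal curve */
  qqplot Estimate / normal(mu=est sigma=est);   /* Q-Q plot with reference line */
run;

title "School-level random intercepts";
proc univariate data=Le3Random normal;
  var Estimate;
  histogram Estimate / normal;
  qqplot Estimate / normal(mu=est sigma=est);
run;

title;   /* clear the title */
```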


07-28-2015 09:40 AM

Hah! So the values are in the dataset, but obscure. It's kind of like getting all of the differences between lsmeans in a factorial design with repeated measures. They are all in the dataset; getting the ones of interest is what takes some programming.

Thanks to you and Stephen Mistler for the code. I will use it myself, quite soon in fact. Give yourself an "answered correctly" on this one!

Steve Denham