BookmarkSubscribeRSS Feed
eunkum
Calcite | Level 5

I am comparing the set of residuals from the multivariate model to the set composed from separate univariate models. I want to examine if the use of univariate or multivariate models results in different levels of bias in estimated group (teacher) level residuals. I need to create absolute bias, sum of difference between teacher level residual  in tch_res from the residuals from the Multivariate model and the univariate model. The first half of the code generates the multivariate data. I tried to save the residuals from the Multivariate Multilevel model by using the code in bold. But I could not save the residuals. The residuals are blank in the data. Can you please see the code highlighted in bold and let me know how I can fix it? 

 

 


/*still need to add code to loop through the sample sizes*/
%let nt = 15; /*number of teachers*/
%let nb = 10; /*number of students*/
%let n1=3; /*number of outcomes*/

%let nanb = &nt*&nb;

/*teacher cov - this a cov matrix from the corr matrix*/
%let tcov = {.25 .175 .0225,
.175 .25 .0225,
.0225 .0225 .0625};

/*student cov*/
%let scov = {1 .2 .0225,
.2 1 .0225,
.0225 .0225 1};


/*mean vector*/
%let mv= {0,0,0};

proc iml;
gmv={50 50 30}; /*grand mean vector*/

ynew=j(&nanb,&n1,0); /*this is for the new outcomes, should be as long as the stu*tch*/
y=j(&nb,&n1,0);
tch_res=j(&nt,&n1,0);
stu_res=j(&nb,&n1,0);

call randseed(4321589);

do A = 1 to &nt;
tch_res[A,]= (RandNormal(1,&mv,&tcov)); /* tchres matrix*/

do B = 1 to &nb;
stu_res[B,]= (RandNormal(1,&mv,&scov)); /*stures matrix*/


do C = 1 to &n1;
if C = 1 then do;
y[B,1] = gmv[1,1] + tch_res[A,1] + stu_res[B,1];
end;
if C = 2 then do;
y[B,2] = gmv[1,2] + tch_res[A,1] + stu_res[B,2];
end;
if C = 3 then do;
y[B,3] = gmv[1,3] + tch_res[A,1] + stu_res[B,3];
end;

end;


end;
yy = round(y,.01);
tn = &nb*(A-1)+1;
tend = &nb*A;
ynew[(tn:tend)`,1:&n1] = yy;
y=J(&nb,&n1,0);

end;

id1 = (1:&nb)`; *print id1;
idk = repeat(id1,&nt); *print idk;

id2 = (1:&nt)`;
idt = repeat(id2,&nb); *print id21;
call sort(idt,1); *print idt;


/*save tch resids and stu resids in files*/
tchres= tch_res;
stures= stu_res;

create tchresids from tchres;append from tchres;close tchresids;
create sturesids from stures;append from stures;close sturesids;

ydone = idt||idk||ynew;

/*create data set to use in proc mixed*/
c = {idt idk y1 y2 y3};
create testdat from ydone [colname=c]; append from ydone; close testdat;
quit;

/*proc print data=testdat (obs=100); run; */




data newdat;
set testdat;
y=y1; wave=1; cwave=wave; output;
y=y2; wave=2; cwave=wave; output;
y=y3; wave=3; cwave=wave; output;

keep idt idk y wave cwave;
run;

proc print data=newdat width=uniform; run;


/*run the multivariate multilevel model*/
proc mixed data=newdat covtest;
class idt idk cwave;
model y = cwave / noint s residual;
random cwave / sub=idt type=un g gcorr solution;
repeated cwave / sub=idk(idt) type=un r rcorr;

ods output covparms=cp fitstatistics=fit;
run;


/*calc icc from covparm table*/
data cparms;set cp;
if subject = ' ' then subject = 'resid';
run;

proc transpose data=cp out=cp_trans;
var estimate;
id covparm subject;
run;

/* icc*/
data cp;
set cp_trans;
iccT1 = (UN_1_1_IDT)/(UN_1_1_IDT + UN_1_1_IDK_IDT_);
iccS1 = (UN_1_1_IDK_IDT_)/(UN_1_1_IDT + UN_1_1_IDK_IDT_);
iccT2 = (UN_2_2_IDT)/(UN_2_2_IDT + UN_2_2_IDK_IDT_);
iccS2 = (UN_2_2_IDK_IDT_)/(UN_2_2_IDT + UN_2_2_IDK_IDT_);
iccT3 = (UN_3_3_IDT)/(UN_3_3_IDT + UN_3_3_IDK_IDT_);
iccS3 = (UN_3_3_IDK_IDT_)/(UN_3_3_IDT + UN_3_3_IDK_IDT_);
run;

/*data generation check*/
PROC univariate data=testdat;
var y1 y2 y3;
run;

PROC corr data=testdat cov;
var y1 y2 y3;
run;

PROC univariate data=tchresids;
var COL1 COL2 COL3;
run;

PROC univariate data=sturesids;
var COL1 COL2 COL3;
run;

/*Apply independent univariate models*/
/*Math*/
proc mixed data=testdat covtest;
class idt idk;
model y1 = / s residual outp=mathresid;
random intercept / sub=idt type=un solution;
random intercept / sub=idk(idt) type=un solution;

ods output covparms=cpy1 fitstatistics=fity1;
run;

proc transpose data=cpy1 out=cpy1_trans;
var estimate;
id covparm subject;
run;

/*Read*/
proc mixed data=testdat covtest;
class idt idk;
model y2 = / s residual outp=readresid;
random intercept / sub=idt type=vc solution;
random intercept / sub=idk(idt) type=vc solution;

ods output covparms=cpy2 fitstatistics=fity2;
run;

/*Motivation*/
proc mixed data=testdat covtest;
class idt idk;
model y3 = / s residual outp=motresid;
random intercept / sub=idt type=vc solution;
random intercept / sub=idk(idt) type=vc solution;

ods output covparms=cpy3 fitstatistics=fity3;
run;

/RQ1/
/*Absolute bias, sum of difference between teacher level residual in tch_res from the resdiuals from MM model and the uniM models
sum of(udj - uhatdj) divided by N*/

/*save the residuals from the Multivariate multilevel model*/

PROC sort data=newdat;
by cwave;
run;

Data MMresids (keep=idt idk cwave MMmathresid MMreadresid MMmotresid);
set newdat;
if cwave='1' then MMmathresid=Resid;
if cwave='2' then MMreadresid=Resid;
if cwave='3' then MMmotresid=Resid;
run;

proc sort data=MMresids;
by idt idk;
run;

Data MMreadresid;
set MMresids;
if cwave='2';
run;

Data MMmathresid;
set MMresids;
if cwave='1';
run;

Data MMmotresid;
set MMresids;
if cwave='3';
run;

Data MMresids1;
merge MMreadresid (keep=MMreadresid) MMmathresid (keep=MMmathresid) MMmotresid (keep=MMmotresid);
run;

/*create a new file with the residuals*/
Data Residuals;
Merge MMresids1 mathresid (rename=(resid=mathUtchres) keep=resid idk idt) readresid
(rename=(resid=readUtchres) keep=resid) motresid (rename=(resid=motUtchres) keep=resid);
run;

Data Bias; /*this is the new file, bias is the sum of the difference for all teachers*/
set Residuals; /*this is the file with the data*/
Readdif= abs(MMreadresid - readUtchres);
Mathdif= abs(MMmathresid - mathUtchres);
Motdif = abs(MMmotresid - motUtchres);
run;

proc summary Data=Bias;
var Readdif Mathdif Motdif;
output out=Biassum sum=;
output out=Bias1 mean=;
run;

/* AVERAGE BIAS, sum of all the bias across 1000 replications, for each udj, divided by 1000*/

Data AveBias;
set Bias1;
bias=sum(of bias1-bias1000);
averagebias=(sum_range)/1000;
run;


PROC corr data=Bias;
var MMreadresid readUtchres;
run;

PROC corr data=Bias;
var MMmathresid mathUtchres;
run;

PROC corr data=Bias;
var MMmotresid motUtchres;
run;

PROC corr data=Bias;
var MMreadresid readUtchres;
run;

PROC corr data=Bias;
var MMmathresid mathUtchres;
run;

PROC corr data=Bias;
var MMmotresid motUtchres;
run;




 

7 REPLIES 7
PaigeMiller
Diamond | Level 26

Please save us some work. Show us the SASLOG. Are there any errors, or warnings, or variables that SAS says are uninitialized? Show us a portion of the output data set that doesn't have what you want.

--
Paige Miller
eunkum
Calcite | Level 5
Thank you for your response. Here is the SAS log. I've attached the data. MMmathresid, MMreadresid, and MMmotresid are blank in the data. 
 
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 Data MMresids (keep=idt idk cwave MMmathresid MMreadresid MMmotresid);
74 set newdat;
75 if cwave='1' then MMmathresid=Resid;
76 if cwave='2' then MMreadresid=Resid;
77 if cwave='3' then MMmotresid=Resid;
78 run;
 
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
75:10 76:10 77:10
NOTE: Variable Resid is uninitialized.
NOTE: There were 450 observations read from the data set WORK.NEWDAT.
NOTE: The data set WORK.MMRESIDS has 450 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
 
 
79
80 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
93
PaigeMiller
Diamond | Level 26

So the next thing for you to do is actually look at the values in MMresids. You are trying to convert character to numeric, and obviously this isn't working. So what do you see in the data set MMresids?

 

Adding: it doesn't appear that you are creating a variable RESID in the code anywhere. So that is most likely the reason RESID is uninitialized.

--
Paige Miller
eunkum
Calcite | Level 5

Thank you for your response. It is very useful. 

I've attached the values in the MMresid. The values for MMmathresid, MMreadresid, and MMmotresid are 0. 

Can I create a variable RESID as follows? Is it right that I use 1 for MMmathresid, 2 for MMreadresid, and 3 for MMmotresid? 

Data MMresids (keep=idt idk cwave MMmathresid MMreadresid MMmotresid);
set newdat;
if cwave='1' then MMmathresid='1';
if cwave='2' then MMreadresid='2';
if cwave='3' then MMmotresid='3';
run;

 

PaigeMiller
Diamond | Level 26

I've attached the values in the MMresid. The values for MMmathresid, MMreadresid, and MMmotresid are 0. 

 

Actually, I want YOU to look at MMRESIDS, and the code that is used to create it. I don't see 0, I see missing values.

 

There is no RESID value in there, so you can't assign the RESID values to MMmathresid or MMreadresid or MMmotresid.

 

So, RESID does not appear to be created anywhere in your code that I can see. If that is wrong, please show me. If RESID doesn't exist, then your code fails.

 

But how to fix it? I don't know, that's really a little beyond my understanding of what you are doing and how you are doing it. So please answer the question ... where do the residuals get calculated.

--
Paige Miller
eunkum
Calcite | Level 5
I created Resid by adding "outp=Resid" (Please see the first code). I've attached the data "newdat".
In the "newdat", there is Resid. But when I run the second code below, resid is not initialized.

proc mixed data=newdat covtest;
class idt idk cwave;
model y = cwave / noint s residual outp=Resid;
random cwave / sub=idt type=un g gcorr solution;
repeated cwave / sub=idk(idt) type=un r rcorr;

ods output covparms=cp fitstatistics=fit;
run;

PROC sort data=newdat;
by cwave;
run;

Data MMresids (keep=idt idk cwave MMmathresid MMreadresid MMmotresid);
set newdat;
if cwave='1' then MMmathresid=Resid;
if cwave='2' then MMreadresid=Resid;
if cwave='3' then MMmotresid=Resid;
run;



 

PaigeMiller
Diamond | Level 26

When you paste your screen captures into Microsoft Word, they become extremely difficult to read. Please don't do that any more. Just click on the Photos icon and attach your screen capture that way.

 

It does not appear that you are showing me MMResids or NEWDAT. But, I still want YOU to look at NEWDAT and see what the values of the variable named RESID are in there. You will probably find out that a variable named RESID is not in there. Is that correct? If it is correct, then you can't use NEWDAT to work with variable RESID.

 

I created Resid by adding "outp=Resid" (Please see the first code).

 

Okay, now we seem to have found the problem. OUTP=RESID creates a data set named RESID, it does not create a variable named RESID in NEWDAT. You need to merge the residuals in data set RESID into NEWDAT, or maybe just use data set RESID.

 

My point here is that you can do a lot of this debugging yourself by looking at the SASLOG and looking at the contents of your data sets. You would have seen immediately that variable RESID is not in MMresids and variable RESID is not in NEWDAT.

--
Paige Miller

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1888 views
  • 0 likes
  • 2 in conversation