Hi everyone,
I'm stuck with the following problem-
QS-The CARE Network conducted a clinical trial called “TReating Children to Prevent EXacerbations of Asthma (TREXA).” Examine the publication by Martinez et al (Lancet 2011) to get an overview of the study. The study design is a 2 × 2 factorial, and three outcome variables are provided (change in height, change in exhaled nitric oxide, and change in forced expiratory volume). Perform a MANOVA that includes checking normal assumptions, checking homogeneity of variance-covariance matrices, and providing appropriate confidence intervals for important effects.
dataset- Please see the attachment.
The SAS code I ran and the error message I got are as follows (attachment):
data TREXA;
set "TREXA.sas7bdat";
label height_change='Change in Height (cm)'
eno_change='Change in Exhaled Nitric Oxide (ppb)'
fev1_change='Change in Forced Expiratory Volume (L/min)';
run;
proc sort data=TREXA;
by drug_arm;
run;
proc univariate data=TREXA plots normal;
by drug_arm;
var height_change eno_change fev1_change;
histogram height_change eno_change fev1_change;
qqplot height_change eno_change fev1_change;
title "TREXA";
title2 'Histograms, Stem-and-Leaf Plots, Box Plots, and Q-Q Plots for Examining Marginal Distributions';
run;
*one way manova-;
proc iml;
***********************************************************************
* The intent of this program is to construct d_squared = one-half *
* times the statistical distance of each observation vector from the *
* mean vector. If the observation vectors follow a p-variate normal *
* distribution, then the d_squared values will follow a chi-square *
* distribution with p degrees of freedom. *
***********************************************************************;
start TREXA;
use TREXA;
read all var {height_change eno_change fev1_change} into x;
n=nrow(x);
xbar_prime=x[+,]/n;
deviations=x-(j(n,1,1)*xbar_prime);
s=(1/(n-1))*deviations`*deviations;
s_inverse=inv(s);
d_squared=0.5*vecdiag(deviations*s_inverse*deviations`);
create chisq_qqplot from d_squared [colname='d_squared'];
append from d_squared;
close chisq_qqplot;
finish TREXA;
run TREXA;
error-
999 proc iml;
NOTE: IML Ready
1000
***********************************************************
1000! ************
1001 * The intent of this program is to construct d_squared =
1001! one-half *
1002 * times the statistical distance of each observation
1002! vector from the *
1003 * mean vector. If the observation vectors follow a
1003! p-variate normal *
1004 * distribution, then the d_squared values will follow a
1004! chi-square *
1005 * distribution with p degrees of freedom.
1005! *
1006 **********************************************************
1006! *************;
1007 start TREXA;
1008 use TREXA;
1009 read all var {height_change eno_change fev1_change} into x
1009! ;
1010 n=nrow(x);
1011 xbar_prime=x[+,]/n;
1012 deviations=x-(j(n,1,1)*xbar_prime);
1013 s=(1/(n-1))*deviations`*deviations;
1014 s_inverse=inv(s);
1015 d_squared=0.5*vecdiag(deviations*s_inverse*deviations`);
1016 create chisq_qqplot from d_squared [colname='d_squared'];
1017 append from d_squared;
1018 close chisq_qqplot;
1019 finish TREXA;
NOTE: Module TREXA defined.
1020 run TREXA;
ERROR: (execution) Invalid argument or operand; contains
missing values.
operation : * at line 1013 column 24
operands : _TEM1004, deviations
_TEM1004 3 rows 288 cols (numeric)
deviations 288 rows 3 cols (numeric)
statement : ASSIGN at line 1013 column 1
traceback : module TREXA at line 1013 column 1
NOTE: Paused in module TREXA.
I can't figure out how to fix the problem. What should I write in the START and USE statements? Any help will be highly appreciated. Thanks!
ASR.
The ERROR message in the log clearly states that the data contain missing values. The person who wrote this code did not write it to accommodate missing values.
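One quick way to confirm this before touching the IML code is to count the missing values in each response variable. The sketch below assumes the data set and variable names shown in the question (TREXA, height_change, eno_change, fev1_change); it only reports counts and does not change the data.

```sas
/* Count nonmissing (N) and missing (NMISS) values for each response. */
/* Any NMISS greater than zero explains the IML error on the          */
/* deviations`*deviations product.                                    */
proc means data=TREXA n nmiss;
var height_change eno_change fev1_change;
run;
```

If NMISS is zero for all three variables, the missing values are coming from somewhere else and the log should be re-examined.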
You can conduct MANOVA analyses by using SAS procedures, such as GLM, which handle missing values. If you want to use this code, you can extract only the nonmissing observations by following the blog post "Complete cases: How to perform listwise deletion in SAS." For example, you might try something like this:
start TREXA;
use TREXA;
read all var {height_change eno_change fev1_change} into x;
/* handle missing values. See
http://blogs.sas.com/content/iml/2015/02/23/complete-cases.html */
/* return rows that have no missing values */
start CompleteCases(X);
return( loc(countmiss(X, "row")=0) );
finish;
/* exclude any row with a missing value */
start ExtractCompleteCases(X);
idx = CompleteCases(X);
if ncol(idx)>0 then
return( X[idx, ] );
else
return( {} );
finish;
X = ExtractCompleteCases(X); /* overwrite X with only complete cases */
/* ...continue program... */
n=nrow(x);
...
If you do decide to analyze the data with a PROC rather than with IML, I highly recommend you read:
http://blogs.sas.com/content/sastraining/2011/02/02/the-punchline-manova-or-a-mixed-model/
If you have missing values, you are much better off using PROC MIXED rather than using PROC GLM.
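For comparison, the PROC GLM route mentioned earlier in the thread might look roughly like the sketch below. It is a one-way layout that assumes drug_arm (the BY variable in the question) identifies the treatment groups; the names of the two separate factorial factors are not shown in the thread, so they are not used here. Note that when a MANOVA statement appears before the first RUN statement, PROC GLM excludes every observation that has a missing value in any of the dependent variables, which is the listwise deletion that makes PROC MIXED attractive when values are missing.

```sas
/* Sketch only: one-way MANOVA on the three change scores.       */
/* Assumes drug_arm is the grouping variable from the question.  */
proc glm data=TREXA;
class drug_arm;
model height_change eno_change fev1_change = drug_arm;
/* multivariate tests; PRINTE and PRINTH display the error and   */
/* hypothesis SSCP matrices                                      */
manova h=drug_arm / printe printh;
/* univariate pairwise differences with Bonferroni-adjusted      */
/* confidence limits                                             */
lsmeans drug_arm / pdiff cl adjust=bon;
run;
quit;
```

For the 2 × 2 factorial analysis requested in the assignment, the CLASS, MODEL, and MANOVA statements would need the two factor variables and their interaction in place of drug_arm.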