Multivariate normality

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

Multivariate normality

[ Edited ]

Hello all,

I am using macro program to check multinormality test. (%multnormal) I can get a table with several columns as a output. I need only multivariate tests values or p-values of "Mardia Skewness" "Mardia Kurtosis" and  "Henze-Zirkler T" in to a vector and use this vector for other calculations in proc iml; Please anybody tell me how can I do this...I spend 4,5 days to figure this out but still cant understand.

Thank you in advanced.

This is my program;

 

%macro multnorm (
data=_last_ , /* input data set */
var= , /* REQUIRED: variables for test */
/* May NOT be a list e.g. var1-var10 */
plot=yes , /* create chi-square plot? */
hires=yes /* high resolution plot? */
);
options nonotes;
%let lastds=&syslast;

/* Verify that VAR= option is specified */
%if &var= %then %do;
%put ERROR: Specify test variables in the VAR= argument;
%goto exit;
%end;

/* Parse VAR= list */
%let _i=1;
%do %while (%scan(&var,&_i) ne %str() );
%let arg&_i=%scan(&var,&_i);
%let _i=%eval(&_i+1);
%end;
%let nvar=%eval(&_i-1);

/* Remove observations with missing values */
%put MULTNORM: Removing observations with missing values...;
data _nomiss;
set &data;
if nmiss(of &var )=0;
run;

/* Quit if covariance matrix is singular */
%let singular=nonsingular;
%put MULTNORM: Checking for singularity of covariance matrix...;
proc princomp data=_nomiss outstat=_evals noprint;
var &var ;
run;
%if &syserr=3000 %then %do;
%put MULTNORM: PROC PRINCOMP required for singularity check.;
%put %str( Covariance matrix not checked for singularity.);
%goto findproc;
%end;
data _null_;
set _evals;
where _TYPE_='EIGENVAL';
if round(min(of &var ),1e-8)<=0 then do;
put 'ERROR: Covariance matrix is singular.';
call symput('singular','singular');
end;
run;
%if &singular=singular %then %goto exit;

%findproc:
/* Is IML or MODEL available for analysis? */
%let mult=yes; %let multtext=%str( and Multivariate);
%put MULTNORM: Checking for necessary procedures...;
proc iml; quit;
%if &syserr=0 %then %goto iml;
proc model; quit;
%if &syserr=0 and
(%substr(&sysvlong,1,9)>=6.09.0450 and %substr(&sysvlong,3,2) ne 10)
%then %goto model;
%put MULTNORM: SAS/ETS PROC MODEL with NORMAL option or SAS/IML is required;
%put %str( to perform tests of multivariate normality. Univariate);
%put %str( normality tests will be done.);
%let mult=no; %let multtext=;
%goto univar;


%iml:
proc iml;
reset;
use _nomiss; read all var {&var} into _x; /* input data */

/* compute mahalanobis distances */
_n=nrow(_x); _p=ncol(_x);
_c=_x-j(_n,1)*_x[:,]; /* centered variables */
_s=(_c`*_c)/_n; /* covariance matrix */
_rij=_c*inv(_s)*_c`; /* mahalanobis angles */

/* get values for probability plot and output to data set */
%if &plot=yes %then %do;
_d=vecdiag(_rij#(_n-1)/_n); /* squared mahalanobis distances */
_rank=ranktie(_d); /* ranks of distances */
_chi=cinv((_rank-.5)/_n,_p); /* chi-square quantiles */
_chiplot=_d||_chi;
create _chiplot from _chiplot [colname={'MAHDIST' 'CHISQ'}];
append from _chiplot;
%end;

/* Mardia tests based on multivariate skewness and kurtosis */
_b1p=(_rij##3)[+,+]/(_n##2); /* skewness */
_b2p=trace(_rij##2)/_n; /* kurtosis */
_k=(_p+1)#(_n+1)#(_n+3)/(_n#((_n+1)#(_p+1)-6)); /* small sample correction */
_b1pchi=_b1p#_n#_k/6; /* skewness test statistic */
_b1pdf=_p#(_p+1)#(_p+2)/6; /* and df */
_b2pnorm=(_b2p-_p#(_p+2))/sqrt(8#_p#(_p+2)/_n); /* kurtosis test statistic */
_probb1p=1-probchi(_b1pchi,_b1pdf); /* skewness p-value */
_probb2p=2*(1-probnorm(abs(_b2pnorm))); /* kurtosis p-value */

/* output results to data sets */
_names={"Mardia Skewness","Mardia Kurtosis"};
create _names from _names [colname='TEST'];
append from _names;
_probs=(_n||_b1p||_b1pchi||_probb1p) // (_n||_b2p||_b2pnorm||_probb2p);
create _values from _probs [colname={'N' 'VALUE' 'STAT' 'PROB'}];
append from _probs;
quit;

data _mult;
merge _names _values;
run;


%univar:
/* get univariate test results */
proc univariate data=_nomiss noprint;
var &var;
output out=_stat normal=&var ;
output out=_prob probn=&var ;
output out=_n n=&var ;
run;

data _univ;
set _stat _prob _n;
run;

proc transpose data=_univ name=variable
out=_tuniv(rename=(col1=stat col2=prob col3=n));
var &var ;
run;

data _both;
length test $15.;
set _tuniv
%if &mult=yes %then _mult;;
if test='' then if n<=2000 then test='Shapiro-Wilk';
else test='Kolmogorov';
run;

proc print data=_both noobs split='/';
var variable n test %if &mult=yes %then value;
stat prob;
format prob pvalue.;
title "MULTNORM macro: Univariate&multtext Normality Tests";
label variable="Variable"
test="Test" %if &mult=yes %then
value="Multivariate/Skewness &/Kurtosis";
stat="Test/Statistic/Value"
prob="p-value";
run;
%if &plot=yes %then
%if &mult=yes %then %goto plotstep;
%else %goto plot;
%else %goto exit;


%model:
/* Multivariate and Univariate tests with MODEL */
proc model data=_nomiss;
%do _i=1 %to &nvar;
&&arg&_i = _a;
%end;
fit &var / normal;
title "MULTNORM macro: Univariate&multtext Normality Tests";
run;
%if &plot ne yes %then %goto exit;


%plot:
/* compute values for chi-square Q-Q plot */
proc princomp data=_nomiss std out=_chiplot noprint;
var &var ;
run;
%if &syserr=3000 %then %do;
%put ERROR: PROC PRINCOMP in SAS/STAT needed to do plot.;
%goto exit;
%end;
data _chiplot;
set _chiplot;
mahdist=uss(of prin1-prin&nvar );
keep mahdist;
run;
proc rank data=_chiplot out=_chiplot;
var mahdist;
ranks rdist;
run;
data _chiplot;
set _chiplot nobs=_n;
chisq=cinv((rdist-.5)/_n,&nvar);
keep mahdist chisq;
run;


%plotstep:
/* Create a chi-square Q-Q plot
NOTE: Very large sample size is required for chi-square asymptotics
unless the number of variables is very small.
*/
%if &hires=yes %then proc gplot data=_chiplot;
%else proc plot data=_chiplot;;
plot mahdist*chisq;
label mahdist="Squared Distance"
chisq="Chi-square quantile";
title "MULTNORM macro: Chi-square Q-Q plot";
run;
quit;
%if &syserr=3000 %then %do;
%put MULTNORM: PROC PLOT will be used instead.;
%let hires=no;
%goto plotstep;
%end;


%exit:
options notes _last_=&lastds;
title;
%mend;
data cork;
input n e s w @@;
cards;
72 66 76 77 91 79 100 75
60 53 66 63 56 68 47 50
56 57 64 58 79 65 70 61
41 29 36 38 81 80 68 58
32 32 35 36 78 55 67 60
30 35 34 26 46 38 37 38
39 39 31 27 39 35 34 37
42 43 31 25 32 30 30 32
37 40 31 25 60 50 67 54
33 29 27 36 35 37 48 39
32 30 34 28 39 36 39 31
63 45 74 63 50 34 37 40
54 46 60 52 43 37 39 50
47 51 52 43 48 54 57 43
;

%multnorm(data=cork,var=n e s w);

?????

????

 

 

I HAVE ATTACHED MY OUTPUT. 

Thank you for any help.


OUTPUT.png

Accepted Solutions
Solution
‎11-30-2016 10:15 AM
Super User
Posts: 9,875

Re: Multivariate normality

Check PG's post. 
You don't need write any code. Just enter WORK library and open _BOTH dataset .

View solution in original post


All Replies
Respected Advisor
Posts: 4,820

Re: Multivariate normality

Maybe if you posted a sample of the macro output dataset to show its structure and the format required for your analysis someone could tell you how to proceed from one to the other.

PG
Occasional Contributor
Posts: 10

Re: Multivariate normality

Dear PG,

Thank you for your comment. I already posted.

Respected Advisor
Posts: 4,820

Re: Multivariate normality

Running the macro creates dataset _both (attached) which contains those values in a simple layout, except for "Henze-Zirkler T". Where do you want to go from there?

PG
Attachment
Occasional Contributor
Posts: 10

Re: Multivariate normality

Thank you so much PG. Achually I need all three test. I am just going to calculate the power of all three test in simulation.

Thanks again.

Super User
Posts: 9,875

Re: Multivariate normality

This macro is taken from support.sas.com ?
I remember @Rick write a blog about it before.


Check dataset _names which contains the P value you want.


create _names from _names [colname='TEST'];
append from _names;
_probs=(_n||_b1p||_b1pchi||_probb1p) // (_n||_b2p||_b2pnorm||_probb2p);
create _values from _probs [colname={'N' 'VALUE' 'STAT' 'PROB'}];
append from _probs;
SAS Super FREQ
Posts: 3,630

Re: Multivariate normality

For statistical and graphical tests of multivariate normality, see the article "Testing data for multivariate normality."  The article also discusses the macro and shows the output.

Occasional Contributor
Posts: 10

Re: Multivariate normality

Dr.Rick, 

Thank you very much for the information. Your article helped me a lot to understand and get the Macro program from support.sas.com.

Still I am in struggling to select only p value of three mult. normality tests and put into a matrix. I will check it again.

Thank you.

Occasional Contributor
Posts: 10

Re: Multivariate normality

Ksharp,
Thank you very much. But it gives me an error:
 
ASSIGN at line 360 column 3
361 create _values from _probs [colname={'N' 'VALUE' 'STAT' 'PROB'}];
ERROR: Matrix _probs has not been set to a value.
 
 I am not sure I missed something. Please let me know if you can figure it out.
Thank you.
Super User
Posts: 9,875

Re: Multivariate normality

use


ods trace on;
%multinorm()
ods trace off;


check the name of ods table you want.
And use 

ods output  xxx=want;
..

to get it.

Occasional Contributor
Posts: 10

Re: Multivariate normality

Dear kSharp;
If you don't mind, please can you explain little bit more. Write code...Thank you..
Solution
‎11-30-2016 10:15 AM
Super User
Posts: 9,875

Re: Multivariate normality

Check PG's post. 
You don't need write any code. Just enter WORK library and open _BOTH dataset .

Occasional Contributor
Posts: 10

Re: Multivariate normality

[ Edited ]

Hi Ksharp,

Thank you so much for the help. I am sorry for the  late reply. I am out of the town for the vacation. I will check this once I go back.

Again, thank you so much.

Occasional Contributor
Posts: 10

Re: Multivariate normality

Dear PG,
Thank you very much for your help. I greatly appreciate it.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 746 views
  • 1 like
  • 4 in conversation