BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
brownster
Calcite | Level 5

I am trying to do a correlation matrix between the missingness of different variables. For instance, how often/what's the correlation that gender and ethnicity are missing together. How would I go about this? I know how to do a correlation matrix between variables, but a little stuck on how to compare the missingness of each variable. Thanks!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

I think you are looking at association not correlation. Correlation is a specific calculation.

 

One way may be to look at distribution such as in proc freq. Create a format for non-missing / missing.

 

An example:

proc format ;
value nm
. = 'Missing'
other='Not missing'
;
/*if the variable is charcter*/
value $nm
' '= 'Missing'
other='Not missing'
;

data example;
   input age sex $;
datalines;
25 f
21 m
.  f
.  f
25 .
30 .
35 .
;

proc freq data=example;
   tables age*sex / missing;
   format age nm. sex $nm. ;
run;

The above is the examine the data. You could do a chi-square test for actual difference of distribution, some difference or odds ratios and relative risks as well.

 

Or don't format the variables if they have a small number of values and just use the MISSING option in the Tables statement so the missing values are considered.

View solution in original post

3 REPLIES 3
ballardw
Super User

I think you are looking at association not correlation. Correlation is a specific calculation.

 

One way may be to look at distribution such as in proc freq. Create a format for non-missing / missing.

 

An example:

proc format ;
value nm
. = 'Missing'
other='Not missing'
;
/*if the variable is charcter*/
value $nm
' '= 'Missing'
other='Not missing'
;

data example;
   input age sex $;
datalines;
25 f
21 m
.  f
.  f
25 .
30 .
35 .
;

proc freq data=example;
   tables age*sex / missing;
   format age nm. sex $nm. ;
run;

The above is the examine the data. You could do a chi-square test for actual difference of distribution, some difference or odds ratios and relative risks as well.

 

Or don't format the variables if they have a small number of values and just use the MISSING option in the Tables statement so the missing values are considered.

Reeza
Super User
ods select MissPattern;
proc mi data=Sashelp.Heart nimpute=0;
var AgeAtStart Height Weight Diastolic 
    Systolic MRW Smoking Cholesterol;
run;

https://blogs.sas.com/content/iml/2016/04/18/patterns-of-missing-data-in-sas.html

 

This is a simple approach. 

FreelanceReinh
Jade | Level 19

Hello @brownster,

 

You can also use PROC CORR to compute a correlation matrix from indicator variables defined as 1 if the original variable is missing, else 0, because the Pearson correlation coefficient of such variables equals the phi coefficient (a measure of association that you would normally compute with PROC FREQ).

 

Example using SASHELP.HEART:

/* Create table with information about missing levels of variables */

ods output nlevels=nlev;
proc freq data=sashelp.heart nlevels;
tables _all_ / noprint;
run;

/* Create indicator variables (1 if missing, else 0) for each variable with one or more missings */

proc sql noprint;
select cat('missing(',trim(tablevar),') as ',tablevar), tablevar
       into :indvars separated by ', ', :indvarlist separated by ' '
from nlev
where nmisslevels;

create table want as
select &indvars
from sashelp.heart;
quit;

/* Compute phi coefficients (=Pearson correl. coeff.) for all pairs of indicator variables */

proc corr data=want;
var &indvarlist;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 572 views
  • 1 like
  • 4 in conversation