Statistical Procedures

Programming the statistical procedures from SAS
QY_Liu
Obsidian | Level 7

Dear all, I would like to calculate the semi-partial correlation for complex survey data using PROC SURVEYREG. I have categorical independent variables. I appreciate your help if you have any suggestions.  

9 REPLIES
SteveDenham
Jade | Level 19

I think this might get you part-way there.  In PROC SURVEYREG, you can get the covariance matrix of the parameter estimates (COVB) as an output.  Save it using ODS OUTPUT, along with the parameter estimates.  The standard errors in the estimates table are the square roots of the diagonal of COVB, so dividing each covariance by the product of the two corresponding standard errors gives the correlation.  Now you have all the parts for calculating the correlation matrix of the parameters.
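A minimal sketch of that first step, assuming a data set work.a with outcome y, categorical predictors x1, x2, and x3, and a sampling weight named weight. The design variables strata_var and cluster_var and the output data set names work.pe and work.covb are placeholders, not names from the original post:

```sas
/* Sketch only: strata_var, cluster_var, work.pe, and work.covb are assumed names */
ods output ParameterEstimates=work.pe CovB=work.covb;

proc surveyreg data=work.a;
   strata  strata_var;                    /* design strata (assumed)       */
   cluster cluster_var;                   /* primary sampling units (assumed) */
   class   x1 x2 x3;                      /* categorical predictors        */
   model   y = x1 x2 x3 / solution covb;  /* COVB requests the covariance table */
   weight  weight;                        /* sampling weight               */
run;
```

The SOLUTION option is included because the parameter estimates for CLASS effects are not displayed without it.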

 

To get a semi-partial correlation seems much more involved.  You have to calculate the covariance holding one of the factors fixed, and with multiple levels of several variables, this gets difficult. Perhaps using the correlation matrix of the least squares means would provide something, as they are calculated holding all other factors constant.

 

SteveDenham

QY_Liu
Obsidian | Level 7

Thank you, Steve!

My model includes categorical independent variables, and the covariance matrix looks like the following picture. If it were not for the zero columns and rows, I could convert it to a correlation matrix.

QY_Liu_0-1600224912413.png
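One possible way around the zero rows and columns is to drop them before converting, since they correspond to the reference levels that the non-full-rank parameterization sets to zero. The sketch below assumes the COVB table was saved with ODS OUTPUT to a data set called work.covb (a placeholder name) and that its numeric columns hold exactly the square covariance matrix; check the column layout of your own data set before relying on this:

```sas
proc iml;
   /* Assumes the numeric columns of work.covb form the square covariance matrix */
   use work.covb;
   read all var _NUM_ into C[colname=names];
   close work.covb;

   idx   = loc(vecdiag(C) > 0);   /* positions of parameters with nonzero variance */
   Cfull = C[idx, idx];           /* drop the all-zero rows and columns            */
   R     = cov2corr(Cfull);       /* covariance matrix -> correlation matrix       */

   lbl = names[idx];
   print R[rowname=lbl colname=lbl];
quit;
```

Note that this is the correlation matrix of the parameter estimates, which is not the same thing as a semi-partial correlation between the outcome and a predictor.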

 

SteveDenham
Jade | Level 19

Try adding the NOINT option to the MODEL statement.  If you have no fixed effect interactions, I believe that should eliminate the zeroes introduced by the non-full rank parameterization.

 

SteveDenham

QY_Liu
Obsidian | Level 7

Thank you, Steve!

I added the NOINT option to the MODEL statement, but I still get zero rows and columns because the independent variables are categorical.

QY_Liu_0-1600282135933.png

 

 

SteveDenham
Jade | Level 19

Shoot.  That means we need to rethink the whole process. You should have the sample weights for each level of every variable, or at least should be able to calculate them.  Could you use PROC CORR with a WEIGHT and a PARTIAL statement?  You would want to partial out levels (which means that each level of a variable would need its own name), which could lead to a lot of values: there would be 30 for the data listed in the picture.  I think I might be reaching on this, though.  I am sure there would be a matrix solution via PROC IML, but I am not the one for a solution of that kind.  Calling @Rick_SAS or @IanWakeling or any others that answer posts in the IML community.

 

SteveDenham

QY_Liu
Obsidian | Level 7

Thank you, Steve.

I can use PROC CORR to get the correlation matrix.

proc corr data=work.a nosimple noprob vardef=weight;
   weight weight;   /* sampling weight variable */
run;

QY_Liu_0-1600497166652.png

 

QY_Liu
Obsidian | Level 7

I can use PROC CORR with VAR y, WITH x1, and PARTIAL x2 x3 (the covariates) to get the Pearson partial correlation coefficient between y and x1, controlling for x2 and x3. What would you suggest for the next step?

Thanks,
Qingyun 
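For reference, a sketch of the call described above, assuming y, x1, x2, and x3 are numeric (a categorical predictor would first need one indicator variable per level) and that the sampling weight is stored in a variable named weight:

```sas
/* Weighted partial correlation of y with x1, controlling for x2 and x3 */
proc corr data=work.a nosimple vardef=weight;
   var     y;        /* outcome                        */
   with    x1;       /* predictor of interest          */
   partial x2 x3;    /* covariates partialled out      */
   weight  weight;   /* sampling weight variable       */
run;
```

PROC CORR uses the weights in the point estimates only; it does not account for strata or clusters, so any p-values it reports should not be read as design-based.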

 

SteveDenham
Jade | Level 19

I am going to take a step backwards, to SURVEYREG.  Based on the information here: http://faculty.cas.usf.edu/mbrannick/regression/Partial.html , a semipartial correlation is the correlation between the residual for one variable and the raw data for another.  With the OUTPUT statement in SURVEYREG, you can get the residuals you want by regressing each independent variable in turn on the other independent variables, and then correlating those residuals with the raw outcome, making sure to use the weights as you have here.

 

At least that is what it looks like to me.
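A rough sketch of that residual approach for a single predictor, assuming x1 is numeric (or a single 0/1 indicator for one level of a categorical variable), that x2 and x3 are the other predictors, and that work.resid and x1_resid are placeholder names. Add your own STRATA and CLUSTER statements as the design requires:

```sas
/* Step 1: regress x1 on the other predictors and keep the residuals */
proc surveyreg data=work.a;
   class  x2 x3;                              /* remove if x2 and x3 are numeric */
   model  x1 = x2 x3;
   weight weight;
   output out=work.resid residual=x1_resid;   /* x1 with x2 and x3 removed */
run;

/* Step 2: weighted correlation between raw y and residualized x1 */
proc corr data=work.resid nosimple vardef=weight;
   var    y;
   with   x1_resid;
   weight weight;
run;
```

The correlation printed in step 2 would be the semipartial correlation of y with x1, since only x1 has had x2 and x3 removed while y is left raw. For a categorical x1 with several levels, this would have to be repeated for each indicator variable.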

 

SteveDenham

QY_Liu
Obsidian | Level 7

Thank you, Steve. I will get back to you if I have any updates. I really appreciate your help. Qingyun

