Regular Learner
Posts: 1

# PCA panel data in SAS

I am a PhD student working with panel data, and I am confused about whether I should use PCA to construct my CEO greed measure. CEO greed is an independent variable when I look at its effect on firm performance, and a moderator when I look at its effect on the relationship between entrepreneurial orientation and firm performance.

I am following the paper "When More Is Not Enough: Executive Greed and Its Influence on Shareholder Wealth" by Haynes et al. (2014), published in the Journal of Management. They also have panel data, and they measured CEO greed as the result of a PCA of three proxies.

I ran a PCA in SAS and got factors as well as eigenvalues for each of the three proxies. Should I multiply each factor by the corresponding original standardised variable and sum the products to get the final CEO greed measure to use in the fixed-effect panel data regression? Or do I multiply the eigenvalues by the original standardised variables instead?

I discussed this with my professor, and I noticed that in one of your replies you raised the same issue my supervisor did: by using an index produced by PCA, you lose the variation that each proxy might show on its own. However, what if the correlation matrix shows that the proxies are highly multicollinear, so that I cannot put them in the final fixed-effect regression equation as separate variables?

Also, how can I perform a PCA on panel data? Do I get separate PCA values for each firm in each year, or one value to use for all firms in all years?

Could you please help me? Below is the code I wrote. Thank you.

/*Principal component analysis for firm size*/

PROC Princomp DATA=year.mergedind simple METHOD=Prin PRIORS=one mineigen=1 ROTATE=varimax round SCREE CORR MSA RES;

var firm_size1 firm_size2 firm_size3;

by gvkey year;

Run;

/*Factor multiplication by each variable of the overall construct firm size*/

Data year.mergedind;

Set year.mergedind;

F1_size=sum(*firm_size1, *firm_size2, *firm_size3);

Run;

SAS Super FREQ
Posts: 4,237

## Re: PCA panel data in SAS

It sounds like you have some fundamental mathematical questions about PCA.  If your library has a copy of A User's Guide to Principal Components by J. Edward Jackson (1991), I highly recommend it. It is easy to read and very applied.

I can't answer all your questions, but I think there are some problems with your SAS code. First, that 'proc princomp' syntax seems to match PROC FACTOR, not PRINCOMP.  The DATA step also contains errors, I believe.
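For reference, here is a minimal sketch of what a valid PROC PRINCOMP call could look like. It reuses the data set and variable names from the question. The output data set WORK.SCORES, the prefix size_pc, and the choice to drop the BY statement (so that one set of weights is estimated from all firm-years pooled) are assumptions made for illustration, not something the paper prescribes.

```sas
/* Sketch only: WORK.SCORES and the prefix size_pc are hypothetical names. */
/* Without a BY statement, one PCA is estimated from all firm-years.       */
/* By default the analysis uses the correlation matrix, so the scores are  */
/* computed from the standardized variables.                               */
proc princomp data=year.mergedind out=work.scores n=1 prefix=size_pc;
   var firm_size1 firm_size2 firm_size3;
run;

/* WORK.SCORES now holds the original variables plus the score size_pc1 */
proc print data=work.scores(obs=10);
   var gvkey year firm_size1 firm_size2 firm_size3 size_pc1;
run;
```

The OUT= data set already contains the component score, so no separate DATA step is needed to build the index. Each score is the sum of the eigenvector coefficients times the standardized variables, so it varies by firm and year even though the weights are shared; the eigenvalues only report how much variance each component explains. If each gvkey-year pair has a single row, a BY gvkey year statement leaves one observation per group, which is not enough to estimate a PCA.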

My advice is to start with a basic PCA. Since your variables appear to be comparable, you might want to use the PCA of the covariance matrix. That removes the scaling issue and avoids the rotation transformation.  Then perform a regression on the 3 PCA factors and compare the results and predicted values with a regression analysis on the original variables. You should be able to figure out how the two regressions are related.
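The comparison described above can be sketched as follows. This is only an illustration under stated assumptions: Sashelp.Iris stands in for the real data, SepalLength, SepalWidth, and PetalLength play the role of the three proxies, PetalWidth plays the role of the response, and all WORK data set names are hypothetical.

```sas
/* Sketch only: Sashelp.Iris is a stand-in for the real panel data. */

/* 1. PCA of the covariance matrix; OUT= adds the scores Prin1-Prin3 */
proc princomp data=sashelp.iris cov out=work.pcs;
   var SepalLength SepalWidth PetalLength;
run;

/* 2. Regression on the three original variables */
proc reg data=work.pcs plots=none;
   model PetalWidth = SepalLength SepalWidth PetalLength;
   output out=work.pred_orig p=pred_orig;
run;
quit;

/* 3. Regression on all three principal component scores */
proc reg data=work.pred_orig plots=none;
   model PetalWidth = Prin1 Prin2 Prin3;
   output out=work.pred_both p=pred_pc;
run;
quit;

/* 4. Difference between the two sets of predicted values */
data work.check;
   set work.pred_both;
   diff = pred_orig - pred_pc;
run;

proc means data=work.check min max;
   var diff;
run;
```

Because the three component scores are an invertible linear transformation of the three centered variables, the two models have the same fit and the same predicted values, so diff should be zero up to rounding error. The fits differ only once fewer than three components are kept, which is the situation when a single index replaces the separate proxies.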

Lastly, if you want the most help from this forum, I encourage you to either include data or use data from the PROC PRINCOMP documentation or from SASHELP data sets.  That way you can ask specific questions and we can reproduce what you are seeing.
