
Implementing Exploratory Factor Analysis in SAS: Step-by-Step Demonstration - Part 2


In the previous post, I focused on the conceptual foundations of Exploratory Factor Analysis (EFA). While understanding the theory is essential, EFA truly comes to life when applied to real data.

 

In this follow-up post, I move from concept to practice by demonstrating how to implement Exploratory Factor Analysis using SAS. Through a step-by-step example, we will show how to determine the number of factors, extract and rotate factors, and interpret the results. This practical walk-through is designed to reinforce the ideas discussed earlier and help you apply EFA in your own analytical work.

 

This example analyzes socio-economic data from Harman (1976) consisting of five variables: total population (Population), median years of schooling (School), total employment (Employment), professional services (Services), and median house value (HouseValue). Using the EFA procedure in SAS Viya, I demonstrate how to perform exploratory common factor analysis.

 

Training Model in SAS Studio

 

Launch SAS Studio and submit the following program to start a CAS session and assign libraries.

 

cas;

libname myCASlib cas;

 

Note: Input data must be in a CAS table that is accessible in your CAS session. You must refer to this table by using a two-level name. The first level must be a CAS engine libref, and the second level must be the table name.

 

As part of the analysis, the first step is to identify how many underlying factors are needed to explain the variation in the data. The following statements invoke the EFA procedure:

 

proc efa data=myCASlib.SocioEconomics method=none;
   nfactors type=eigenvalue;
   nfactors type=proportion threshold=0.9;
run;

 

When using PROC EFA to decide how many latent dimensions to retain, you specify one or more NFACTORS statements. Each statement represents a different rule or criterion for choosing the number of factors.

 

In this example, we use two such criteria. The first is the eigenvalue criterion, which looks at the eigenvalues of the reduced correlation matrix—that is, the correlation matrix where the diagonal values are replaced with prior communality estimates. By default, any eigenvalue greater than 1 is taken as evidence of a meaningful latent factor. The second criterion used in this example is the proportion criterion. This approach looks at how much of the common variance in the data is explained as factors are added one by one. Using the THRESHOLD= option, we set the cutoff to 0.9, meaning we want to retain the smallest number of factors that together explain at least 90% of the common variance in the dataset. By default, PROC EFA uses squared multiple correlations (SMCs) as prior communality estimates, placing them on the diagonal of the correlation matrix before computing eigenvalues. Alternative priors can be specified using the PRIORS= option.
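As a rough illustration of how these two retention rules behave, here is a minimal Python sketch (not PROC EFA's internal code) applying the eigenvalue and proportion criteria to a hypothetical set of reduced-correlation-matrix eigenvalues:

```python
import numpy as np

def n_factors_eigenvalue(eigvals, cutoff=1.0):
    """Eigenvalue criterion: count eigenvalues above the cutoff."""
    return int(np.sum(np.asarray(eigvals) > cutoff))

def n_factors_proportion(eigvals, threshold=0.9):
    """Proportion criterion: smallest k such that the first k eigenvalues
    explain at least `threshold` of the total common variance."""
    ev = np.asarray(eigvals, dtype=float)
    cum = np.cumsum(ev) / ev.sum()
    return int(np.argmax(cum >= threshold)) + 1

# Hypothetical eigenvalues of a reduced correlation matrix
# (small negative values are possible, as discussed later in this post)
ev = [2.73, 1.72, 0.04, -0.02, -0.08]
print(n_factors_eigenvalue(ev))   # → 2
print(n_factors_proportion(ev))   # → 2
```

With these made-up eigenvalues both rules agree on two factors, which mirrors the agreement between criteria seen in this example.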

 

When the program runs successfully, it produces a summary of the input data, including the number of observations used and simple statistics such as the mean and standard deviation for each variable included in the analysis.

 

01_MS_Simple-Statistics.png

 


 

In this analysis, both criteria indicate that two latent factors are sufficient to describe the data. When multiple criteria are specified, PROC EFA combines their results to determine the final number of factors, using the minimum number suggested by the active criteria by default.

 

02_MS_Number-of-factors.png

 

A footnote in the output summarizes the minimum, maximum, mean, and median number of factors across all criteria. Since both criteria in this case suggest the same number of factors, all summary values are identical, leading to the final conclusion that two factors should be retained.

 

The eigenvalues of the reduced correlation matrix are then computed.

 

03_MS_Eigen-value-Matrix_no.of-Factors.png

 

In this example, the two largest positive eigenvalues together explain 101.31% of the common variance. This can occur because the reduced correlation matrix is not required to be positive definite, allowing for small negative eigenvalues. This pattern provides additional support for the conclusion that no more than two common factors may be sufficient to describe the data.

 

With the number of factors now determined, we move to the next step of the analysis and extract two common factors using the EFA procedure. Submit the following code to perform the factor extraction.

 

proc efa data=myCASlib.socioeconomics nfactors=2
   method=principal rotate=promax reorder;
run;

 

The NFACTORS=2 option specifies that two factors are to be extracted. The METHOD=PRINCIPAL option, which is the default, performs principal factor extraction. Other extraction methods—such as alpha factor analysis, maximum likelihood, or iterated principal factor analysis—can be used by specifying the appropriate option.

 

By default, PROC EFA uses squared multiple correlations as prior communality estimates, though alternative priors can be specified using the PRIORS= option.

 

The factors initially extracted using this method are orthogonal. To improve interpretability, the ROTATE=PROMAX option is used to rotate the factors. Promax rotation begins with an orthogonal varimax prerotation, followed by an oblique Procrustes rotation. A different prerotation method (orthogonal or oblique) can be specified using the PREROTATE= option.

 

Finally, the REORDER option arranges the output so that variables are listed according to the factor on which they have the highest absolute loading, making the results easier to interpret.

 

Note: When you specify the PRIORS=ONES and METHOD=PRINCIPAL options in the PROC EFA statement, the procedure performs a principal component analysis (PCA).

 

The successful execution of the code first outputs a summary of important information about the input data and simple statistics for the analysis variables. To start interpreting the output, I first look at the prior communalities table, which represents the initial estimate of the common variance for each analysis variable. These values are used to compute the reduced correlation matrix, that is, the observed correlation matrix for the analysis variables in which the diagonal entries are replaced by the prior communalities.

 

04_MS_Prior-Communality.png

 

High values (close to 1) indicate that most of a variable's variance is expected to be explained by the common factors, which is ideal for factor analysis. Next, you examine the eigenvalues of the reduced correlation matrix. The first two eigenvalues are positive and large, while the rest are near zero or negative. Together, factors 1 and 2 explain nearly 101% of the common variance, which strongly supports retaining two common factors.

 

05_MS_EigenValue_RedCorr.png

 

Next, you see the factor pattern with the two extracted factors. The Services variable has the largest loading on the first factor, and the Population variable has the smallest. The Population and Employment variables have the largest positive loadings on the second factor, and the HouseValue and School variables have large negative loadings.

 

06_MS_FactorPattern-Matrix.png

 

The variance explained by each factor is calculated as the sum of the squared factor loadings for that factor. So, for factor 1, the variance explained is calculated as follows:

 

Factor 1 variance = (0.87899)² + (0.74215)² + ⋯ + (0.62533)² = 2.734282722

 

Factor 2 variance = (−0.15847)² + (−0.57806)² + ⋯ + (0.76621)² = 1.716065401
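To make the arithmetic above concrete, this small Python sketch computes variance explained as column sums of squared loadings. The loading matrix here consists of purely hypothetical placeholder values, not the actual SAS output:

```python
import numpy as np

def variance_explained(loadings):
    """Variance explained by each factor: column sums of squared loadings."""
    L = np.asarray(loadings, dtype=float)
    return (L ** 2).sum(axis=0)

# Hypothetical 5-variable x 2-factor pattern matrix (illustration only)
L = np.array([
    [0.9, -0.2],
    [0.8, -0.5],
    [0.7,  0.1],
    [0.6, -0.4],
    [0.6,  0.8],
])
print(variance_explained(L))  # column sums of squares: 2.66 and 1.10
```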


07_MS_Variance-Explained.png

 Table B1

 

Now, what are these final communality estimates?

 

08_MS_FinalCommunality.png

 

Well, it represents the proportion of variance in each observed variable that is explained by the retained common factors (here, two factors), after the factor extraction is completed. Unlike prior communalities, which are starting guesses (SMCs), final communalities are model-based results. So, final communality is the sum of squared loadings across all retained factors for that variable. In the current example, let’s see how this can be estimated for the variable Population.

 

Final communality for Population = (0.62533)² + (0.76621)² ≈ 0.97811, and so on. The total is simply the sum of the final communalities across all variables and is estimated to be 4.450370. This represents the total variance explained by the two common factors across all five variables.
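As a quick numeric check, the Population communality can be reproduced from the two loadings quoted above (this is plain arithmetic, not a call to SAS):

```python
# Final communality = row sum of squared loadings across the retained factors.
# The two Population loadings are the ones quoted in this post.
population_loadings = [0.62533, 0.76621]
h2 = sum(x ** 2 for x in population_loadings)
print(round(h2, 4))  # → 0.9781
```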

 

After extracting the initial factors, the next step is factor rotation, which is performed to make the factor structure easier to interpret. The goal of rotation is to achieve a simple structure, where each variable loads strongly on one factor and has near-zero loadings on the others.

 

In this example, a promax rotation with a varimax prerotation is specified. As part of this process, PROC EFA first applies an orthogonal varimax rotation. This step rotates the initial factor loading matrix by postmultiplying it with an orthogonal transformation matrix that satisfies the varimax criterion. The idea is to encourage large loadings to become larger and small loadings to approach zero within each factor. Users do not compute this matrix manually; it is derived internally by the EFA procedure based on the extracted loadings and the selected rotation criterion. The orthogonal transformation matrix shown below defines how the original factors are rotated in multidimensional space without changing the total variance explained. Applying this transformation yields the varimax-rotated factor pattern, also shown below.

 

09_MS_Ortho_Transformation_Rotatedfactors.png

 

Note: To obtain the rotated factor pattern matrix through manual computation, multiply the unrotated factor pattern matrix by the orthogonal rotation (transformation) matrix, thereby applying the corresponding linear transformation to the loading structure.
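That manual computation can be sketched in a few lines of Python. The loading matrix and rotation angle below are hypothetical; the point is that postmultiplying by any orthogonal matrix leaves the communalities (row sums of squared loadings) unchanged:

```python
import numpy as np

theta = 0.5  # arbitrary rotation angle in radians (hypothetical)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # orthogonal 2x2 rotation

L = np.array([[0.9, -0.2],     # hypothetical unrotated loadings
              [0.8, -0.5],
              [0.6,  0.8]])

L_rot = L @ R                       # rotated factor pattern (postmultiplication)
h_before = (L ** 2).sum(axis=1)     # communalities before rotation...
h_after = (L_rot ** 2).sum(axis=1)  # ...and after
print(np.allclose(h_before, h_after))  # → True
```

This is the invariance discussed below: rotation redistributes variance across factors but leaves each variable's explained variance untouched.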

 

The rotated pattern reveals a much clearer structure: HouseValue, School, and Services load strongly on Factor 1, indicating that this factor represents a common socioeconomic dimension related to education, housing, and services. Population and Employment load strongly on Factor 2, suggesting a separate dimension associated with population size and labor activity. Services shows a stronger loading on the first factor than on the second, though both loadings are considerable. This indicates that Services demonstrates factorial complexity. This clearer separation of variables across factors is precisely the motivation for rotation. By improving interpretability without altering the underlying explanatory power of the model, rotation allows the latent dimensions to be more meaningfully labeled and communicated to stakeholders.

 

In an orthogonal factor solution, such as the varimax-rotated solution, the interpretation is straightforward: the factor loadings can be read as correlations between variables and factors. For example, in the varimax solution, HouseValue has a loading of about 0.94 on Factor 1, which can be interpreted as a strong correlation between that variable and the first factor. In contrast, Population has a loading of about 0.02 on Factor 1, indicating virtually no relationship with that factor. This direct “correlation interpretation” is possible because orthogonal factors are uncorrelated with one another.

 

Another important effect of varimax rotation is how it redistributes variance across factors. Before rotation, the two factors explain 2.73 and 1.72 units (Table B1) of common variance, respectively. After varimax rotation, the variance becomes more evenly balanced — approximately 2.35 and 2.10.

 

10_MS_Variance-Expl_PostRotation.png

 

This more even spread is typical of orthogonal rotations: they aim to simplify interpretation by making factor structures cleaner and more balanced. Most importantly, rotation does not change how much total variance the factors explain. The combined variance explained by the two factors remains the same before and after rotation. The same invariance holds for the communalities of the variables. Rotation simply redistributes variance across factors; it does not increase or decrease the total amount of variance explained.

 

You might be wondering why we see varimax output when we specified the ROTATE=PROMAX option to perform a promax rotation. This is because promax is a two-stage rotation, and varimax is the first stage. Varimax is orthogonal, so the factors are uncorrelated at this stage. Because varimax is an internal prerotation step, its results appear in the output even though the final solution is oblique. The varimax solution is then used as the starting point for the promax rotation. Promax allows the factors to become correlated, which is often more realistic in social and behavioral data.

 

Next, you will notice two additional matrices: the Procrustes Target Matrix and the Procrustes Transformation Matrix. You might naturally ask — why do we need a Procrustes target at all?

 

The Procrustes target matrix represents the ideal loading pattern that the EFA procedure wants the final oblique solution to approximate. Each row corresponds to a variable and each column to a factor. The entries indicate where a variable is expected to load strongly (values close to 1) and where it should load weakly (values close to 0).

 

In this example:

 

  • HouseValue and School are targeted to load almost entirely on Factor 1.
  • Population and Employment are targeted to load almost entirely on Factor 2.
  • Services is allowed to load on both factors, reflecting its more “bridging” role between the two dimensions.

 

Once this ideal pattern is defined, the procedure computes the Procrustes transformation matrix. This matrix is the actual linear transformation that rotates the varimax solution so that it aligns as closely as possible with the target matrix.

 

However, during an oblique rotation, factor variances must remain fixed at 1. For that reason, the transformation matrix is normalized before being applied. The normalized oblique transformation matrix is the one ultimately used to produce the final promax-rotated solution. This normalized transformation matrix is also shown below.

 

11_MS_Procrustes-Transformation.png
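For intuition, here is a simplified Python sketch of the promax steps just described: build a sign-preserving power target, fit it by least squares (the Procrustes step), and normalize the transformation so factor variances stay fixed at 1. This deliberately omits parts of PROC EFA's actual computation (for example, Kaiser normalization of the loadings), and the varimax loadings below are only loosely based on values quoted in this post:

```python
import numpy as np

def promax_sketch(V, power=3):
    """Simplified promax; V is a varimax-rotated pattern matrix."""
    # 1. Procrustes target: raise loadings to an odd power (sign-preserving),
    #    which drives small loadings toward zero faster than large ones.
    target = np.sign(V) * np.abs(V) ** power
    # 2. Least-squares (Procrustes) fit of V to the target.
    T, *_ = np.linalg.lstsq(V, target, rcond=None)
    # 3. Normalize columns of T so the implied interfactor correlation
    #    matrix (T'T)^-1 has a unit diagonal (factor variances stay 1).
    d = np.sqrt(np.diag(np.linalg.inv(T.T @ T)))
    T = T * d
    pattern = V @ T                 # promax-rotated pattern
    phi = np.linalg.inv(T.T @ T)    # interfactor correlations
    return pattern, phi

# Hypothetical varimax loadings (first row loosely echoes HouseValue's 0.94)
V = np.array([[0.94, 0.02],
              [0.90, 0.10],
              [0.79, 0.41],
              [0.02, 0.99],
              [0.16, 0.96]])
pattern, phi = promax_sketch(V)
print(np.round(np.diag(phi), 6))  # factor variances remain fixed at 1
```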

 

Using the normalized oblique transformation matrix produces the promax-rotated factor solution. Unlike the initial (unrotated) solution and the varimax solution, the promax solution allows the factors to become correlated.

 

Once promax rotation is applied, the factors are no longer constrained to remain perpendicular. As a result, they can tilt toward one another, introducing correlation. In this example, the Interfactor Correlations table shows that the two factors now have a correlation of 0.20.

 

12_MS_Interfactor-Correlation.png

 

This indicates a modest positive relationship between the two underlying dimensions. Conceptually, it suggests that the latent constructs represented by these factors are related, but still distinct.

 

A correlation of 0.20 suggests that more developed areas often attract population and employment. But size alone does not fully determine development: a region can be large without being highly developed, and a region can be relatively developed without being extremely large. So, the two factors are related but not redundant.

 

In orthogonal solutions (such as varimax), factor loadings can be interpreted directly as correlations between variables and factors. However, this interpretation does not hold for oblique solutions like promax.

 

When factors are allowed to correlate, the pattern matrix no longer contains simple correlations. Instead, it contains regression coefficients that represent the unique contribution of each factor to a variable. To interpret actual correlations between variables and factors, you must look at the factor structure matrix.

 

13_MS_Factor-Structure.png

 

In this example, the structure matrix shows a very strong correlation (0.986) between Population and Factor 2. This value can legitimately be interpreted as a correlation coefficient. By contrast, the corresponding value in the pattern matrix is 1.002 which clearly cannot represent a correlation, since correlations must lie between −1 and 1.
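The relationship between the two matrices can be checked directly: with correlated factors, the structure matrix equals the pattern matrix postmultiplied by the interfactor correlation matrix. Using the correlation of 0.20 and the Population pattern loading of 1.002 quoted above, together with a hypothetical small Factor 1 pattern value of −0.08, the structure value of 0.986 falls out:

```python
import numpy as np

phi = np.array([[1.0, 0.2],
                [0.2, 1.0]])                   # interfactor correlations (0.20 from the output)
pattern_population = np.array([-0.08, 1.002])  # Factor 1 value is hypothetical
structure_population = phi @ pattern_population
print(round(structure_population[1], 3))       # → 0.986 (a legitimate correlation)
```

Note how a pattern coefficient above 1 is perfectly legal, while the corresponding structure value stays within the [−1, 1] range of a correlation.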

 

When factors are uncorrelated (as in an orthogonal varimax solution), interpreting variance explained is straightforward. The variance explained by each factor simply adds up to the total common variance. However, once we move to an oblique rotation like promax, the situation changes. Because the factors are now correlated, part of the variance they explain overlaps. This makes it less straightforward to attribute variance uniquely to each factor.

 

14_MS_Variance-Explained_Ignoring_Eliminating-other-factors.png

 

In the table titled ‘Variance Explained by Each Factor, Ignoring Other Factors’, each factor’s contribution is calculated without adjusting for overlap with the other factor. In other words, we are counting the variance each factor explains on its own, even if some of that variance is also explained by the other factor. Because the factors are correlated, this shared variance gets counted more than once. That is why the total does not equal the total communality (4.45). The overlap between the factors causes double counting.

 

2.45 + 2.20 ≠ 4.45

 

On the other hand, the table titled ‘Variance Explained by Each Factor, Eliminating Other Factors’ shows the unique variance explained by each factor after removing the variance explained by the other factors. So, in the current example, Factor 1 explains 2.25 units of variance after removing what Factor 2 already explains. This gives a clearer picture of each factor’s independent importance.
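The double counting can be demonstrated numerically. In this sketch (all loadings hypothetical; interfactor correlation 0.20 as in the example), each factor's variance "ignoring" the other is taken as the column sum of squared structure loadings, one common definition; the per-factor totals then sum to more than the total communality because the shared variance is counted twice:

```python
import numpy as np

phi = np.array([[1.0, 0.2],
                [0.2, 1.0]])       # interfactor correlations
P = np.array([[0.95, 0.05],        # hypothetical oblique pattern matrix
              [0.90, 0.10],
              [0.70, 0.45],
              [0.02, 0.98],
              [0.15, 0.92]])
S = P @ phi                              # structure matrix
ignoring = (S ** 2).sum(axis=0)          # per-factor variance, ignoring overlap
communalities = np.diag(P @ phi @ P.T)   # variance explained per variable
print(ignoring.sum(), communalities.sum())  # the first total exceeds the second
```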

 

Last but not least, factor rotation affects interpretation, not explanation. It enhances clarity by reshaping the factor structure, but it does not change the total variance explained for any variable. As shown in the table below, the communalities remain identical across the initial, varimax, and promax solutions, with the total communality still equal to 4.45.

 

15_MS_VeryFinalCommunality.png

 

 

Key Takeaways

 

PROC EFA provides a transparent and powerful framework for uncovering latent structure in data. The goal is not just to extract factors, but to arrive at a solution that is both mathematically sound and substantively meaningful. And that is exactly what this demo was designed to illustrate.

 

In this demo, we began by determining that a two-factor solution was appropriate for the data. From there, we extracted the factors and examined the initial (unrotated) solution. While mathematically correct, the initial solution was not easily interpretable; variables loaded across factors in a way that lacked clear structure.

 

To improve interpretability, we first applied orthogonal rotation (Varimax). This redistributed the loadings to achieve a cleaner structure while keeping the factors uncorrelated. Next, we moved to oblique rotation (Promax), allowing the factors to correlate. The Promax procedure internally:

 

  1. Starts with the Varimax solution.
  2. Constructs a Procrustes target by exponentiating loadings.
  3. Computes a transformation matrix.
  4. Normalizes it to preserve unit factor variances.
  5. Produces correlated factors.

 

After Promax rotation, the factors became correlated and the factor pattern matrix no longer represented correlations. The factor structure matrix had to be used to interpret variable–factor correlations.

 

Implementing factor analysis in SAS Viya provides a structured yet flexible approach to uncovering latent patterns in data. With the right combination of extraction and rotation techniques, one can transform complex relationships into clear, interpretable insights that support better decision-making.

 

References:

Harman, H. H. (1976). Modern Factor Analysis (3rd ed.). Chicago: University of Chicago Press.

 


Find more articles from SAS Global Enablement and Learning here.

 
