About wmespi

wmespi · ‎11-30-2020

One more thing I would like to add is that the negative eigenvalues are the initial ones calculated from the reduced correlation matrix (the diagonals are not 1). The 11th eigenvalue is the first of these initially calculated eigenvalues to turn negative. Coincidentally, SAS will use no more than 10 factors, even if I specify 12 or 20 or whatever. My gut says this is because more factors might lead to a Heywood or ultra-Heywood case, so SAS prevents it. I did try specifying HEYWOOD and ULTRAHEYWOOD in the PROC FACTOR statement, but it still would retain a maximum of 10 factors regardless of the value I put in for NFACTORS. This leads me to believe that the negative values are not necessarily an issue, but instead provide an upper bound on the number of factors I could potentially have (in my case 10). I have not yet found any literature to support this, but I will do some digging.

Now, after it gets through the initial estimates, I've found that with some values of NFACTORS, I end up with a quasi-Heywood case where the communality for a variable is greater than its estimated reliability. I read that this should be met with just as much skepticism as an ultra-Heywood case, so I figured there were two options: remove the offending variable(s), or find the subset of factor structures where the quasi-Heywood case does not occur. Ultimately, every model with more than 4 factors ended up having this quasi-Heywood case. This suggests to me that any model with 5 or more factors is not plausible given my data, and any hypothetical models I explore moving forward will have 4 or fewer factors.

As a quick note, I did try removing the offending variable, but then another one exceeded its reliability upon re-running the program. I also figured changing the model structure was generally more favorable than dropping items.

wmespi · ‎11-30-2020

Also, with the negative eigenvalues I was expecting to have a Heywood or ultra-Heywood case, but that does not appear to be the situation. I would attach the table, but it is too large to get a decent screenshot and the formatting gets messed up when I paste it here. At the top of the table it does say "Prior Communality Estimates: SMC", so maybe there is a different communality estimate I should be using?

wmespi · ‎11-30-2020

On the discussion of negative eigenvalues with EFA, I have found the following links.

Stack Exchange (top response):
https://stats.stackexchange.com/questions/97802/how-to-correctly-interpret-a-parallel-analysis-in-exploratory-factor-analysis

Gently Clarifying the Application of Horn's Parallel Analysis to Principal Component Analysis Versus Factor Analysis:
https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1026&context=commhealth_fac

In the paper the author states that eigenvalues in factor analysis can be negative. I am still mulling over what the paper asserts as the main difference between the stopping criteria for parallel analysis with PCA vs FA, but I will read over it some more.
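A short derivation may help show why negative initial eigenvalues are expected when the priors are squared multiple correlations (SMC). This is only a sketch under stated assumptions: R is taken to be a positive definite p x p correlation matrix, r^{ii} denotes the i-th diagonal element of its inverse, R^{*} is the reduced matrix with SMCs on the diagonal, and the last step uses Weyl's inequality for symmetric matrices. None of this notation comes from the linked paper.

```latex
% Assumptions: R positive definite, r^{ii} = (R^{-1})_{ii}, e_i = i-th unit vector
\begin{align*}
\mathrm{SMC}_i &= 1 - \frac{1}{r^{ii}}, \qquad
R^{*} = R - D, \quad
D = \operatorname{diag}\!\left(\tfrac{1}{r^{11}}, \dots, \tfrac{1}{r^{pp}}\right) \\
r^{ii} &= e_i^{\top} R^{-1} e_i \;\le\; \lambda_{\max}(R^{-1}) = \frac{1}{\lambda_{\min}(R)}
\quad\Longrightarrow\quad \frac{1}{r^{ii}} \;\ge\; \lambda_{\min}(R) \\
\lambda_{\min}(R^{*}) &\le \lambda_{\min}(R) + \lambda_{\max}(-D)
= \lambda_{\min}(R) - \min_i \frac{1}{r^{ii}} \;\le\; 0
\end{align*}
```

Under these assumptions the smallest eigenvalue of the SMC-reduced matrix can never be positive, so negative values in the initial eigenvalue table do not by themselves indicate a defective correlation matrix.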

wmespi · ‎11-30-2020

Thank you for your help. To clarify, the code I am using is:

proc factor data=sport_polychoric rotate=quartimin method=uls;
run;

where sport_polychoric is my polychoric correlation matrix. I was under the impression that if I set my method to PRINCIPAL then I would no longer be doing a factor analysis and would instead be running a principal component analysis. I think the diagonals of the correlation matrix not being 1 is a central assumption of factor analysis. Thank you for your assistance and I will look over that link you shared.
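On the METHOD=PRINCIPAL point: as far as the PROC FACTOR documentation describes it, the PRIORS= option, not the method alone, decides whether the diagonal of the correlation matrix stays at 1. The sketch below contrasts the two cases and is only illustrative; it assumes the same sport_polychoric data set and writes out PRIORS= explicitly rather than relying on defaults.

```sas
/* Principal component analysis: diagonal of the correlation matrix stays at 1 */
proc factor data=sport_polychoric method=principal priors=one;
run;

/* Principal factor analysis: diagonal replaced by squared multiple correlations,
   so the reduced correlation matrix is factored and initial eigenvalues can be negative */
proc factor data=sport_polychoric method=principal priors=smc;
run;

/* ULS with the prior communality estimates stated explicitly
   (the output described above already reports SMC priors for METHOD=ULS) */
proc factor data=sport_polychoric method=uls priors=smc rotate=quartimin;
run;
```

Comparing the "Prior Communality Estimates" line and the initial eigenvalue table across the three runs shows which matrix is actually being factored in each case.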

wmespi · ‎11-30-2020

Responded to Rick_SAS and this post because I am unsure if you get a notification if I don't reply to your specific post. ^

wmespi · ‎11-30-2020

Is the fact that all my eigenvalues in the PRINCOMP output are non-negative sufficient evidence for my matrix being full rank? Thank you for your help! Do you have any suggestions for testing if my matrix is ill-conditioned? This link (https://mathworld.wolfram.com/ConditionNumber.html) explains how to check for it, but I am unsure of how to do the Singular Value Decomposition in SAS and then conduct this test.
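One way to check rank and conditioning without PROC IML is to reuse the eigenvalues that PROC PRINCOMP already computes. For a symmetric positive semidefinite matrix the singular values equal the eigenvalues, so the condition number is the largest eigenvalue divided by the smallest, and full rank requires every eigenvalue to be strictly positive (non-negative alone is not enough). The following is a minimal sketch: the data set names pc_stats and cond_check and the 1e-8 tolerance are hypothetical choices, and it assumes sport_polychoric is a TYPE=CORR data set as produced by OUTPLC=.

```sas
/* Eigenvalues of the full (unit-diagonal) correlation matrix */
proc princomp data=sport_polychoric(type=corr) outstat=pc_stats noprint;
run;

data cond_check;
    set pc_stats;
    where _TYPE_ = 'EIGENVAL';            /* the single row holding the eigenvalues */
    array ev{*} _numeric_;                /* one eigenvalue per analysis variable */
    max_eig   = max(of ev{*});
    min_eig   = min(of ev{*});
    full_rank = (min_eig > 1e-8);         /* 1 if every eigenvalue clears the tolerance */
    if min_eig > 0 then cond_number = max_eig / min_eig;
    keep max_eig min_eig full_rank cond_number;
run;

proc print data=cond_check noobs;
run;
```

A very large cond_number, or a min_eig close to zero, would point to near linear dependence among the items; what counts as "too large" is a judgment call and is not something this sketch decides.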

wmespi · ‎11-30-2020

Hi,

Sorry, I have been out of the office for the weekend. There are a few reasons I am doing exploratory factor analysis (as opposed to principal component analysis):
1) I want to generate hypothetical models for the underlying factor structure of the items on the questionnaire.
2) I want to determine if there are any problematic items that do not load strongly onto any factor or load strongly onto multiple factors. These items could be subject to review and/or removal.
3) I will then test these hypothetical models with confirmatory factor analysis. A previous study suggests that a 3-factor model is expected, but the scale of data we are working with is much larger, so we want to see if our findings agree.

To address some of the previous responses:

It makes intuitive sense that you would want multicollinearity to be present in the data for factor analysis to be of use. I think I was confusing that with one variable being a linear combination of other variables.

I will take out the MSA option, but I had originally included it so that I could see the measure of sampling adequacy. Why would the inclusion/exclusion of this metric change how my correlation matrix is calculated?

I had tried using the PARALLEL option as well as setting NFACTORS=PARALLEL, but I do not think SAS Enterprise Guide 7.1 supports these features.

The eigenvalues are generated from PROC FACTOR. Those are the initial eigenvalue estimates for 22 factors, not the final estimates for the retained factors.

The options passed into the EFA macro are sport_data (the name of the raw dataset), columns (the variables I want to analyze using factor analysis), and method (the method of factor analysis, in this case unweighted least squares).

After running PROC PRINCOMP on my data I obtain the following results: To be perfectly honest, I am not really sure what to make of this or how I use this process to determine if the matrix is full rank.
As an additional note, I passed the raw data into the process, not the polychoric correlation matrix. The image below is the result of when I run the process with the polychoric correlation matrix.

wmespi · ‎11-27-2020

The reason I am investigating multicollinearity is that I read it is not something you want in an EFA model and is potentially a cause of negative eigenvalues. I was wondering if SAS has an easy mechanism for checking for multicollinearity in your data, so that I could remove any potentially problematic questionnaire items and see if that makes my correlation matrix positive definite.

Aside from that, I was wondering what the general protocols are when dealing with negative eigenvalues in an EFA model. Do they affect the integrity of the model? They are only present in the initial factorization (# of factors = # of items).

If these negative eigenvalues are not of consequence, then this lets me proceed to my next question of deciding the appropriate number of factors to retain. I saw that the typical methods are the scree plot and the Kaiser criterion, but these are not considered to be very robust. I wanted to incorporate parallel analysis using the macro I had originally posted, but I was not sure if it can be used with ordinal data.

Lastly, as a way of supporting the use of factor analysis, I wanted to calculate the KMO measure and conduct Bartlett's test of sphericity. However, with both of these I am not sure if having ordinal data violates the assumptions used to calculate them. The KMO I can at least calculate in SAS using my polychoric correlation matrix and the unweighted least squares method, but Bartlett's test (I believe) can only be calculated with the maximum likelihood estimation method, which does not support polychoric correlation matrices.
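For reference, Bartlett's test of sphericity depends only on the determinant of the correlation matrix, the number of items p, and the sample size n: chi-square = -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom. A rough sketch of computing it from PROC PRINCOMP eigenvalues follows. The data set names pc_stats and bartlett are hypothetical, n = 6547 is the sample size given in the original post of this thread, and whether the chi-square reference distribution is appropriate for polychoric correlations is exactly the open question, so the p-value should be read with that caveat.

```sas
%let nobs = 6547;   /* sample size from the original post; adjust as needed */

/* Eigenvalues of the correlation matrix; their product is the determinant */
proc princomp data=sport_polychoric(type=corr) outstat=pc_stats noprint;
run;

data bartlett;
    set pc_stats;
    where _TYPE_ = 'EIGENVAL';
    array ev{*} _numeric_;                 /* one eigenvalue per item */
    p = dim(ev);
    log_det = 0;
    positive_definite = 1;
    do i = 1 to p;
        if ev{i} > 0 then log_det = log_det + log(ev{i});
        else positive_definite = 0;        /* ln|R| is undefined in this case */
    end;
    if positive_definite then do;
        chi_sq  = -(&nobs - 1 - (2*p + 5)/6) * log_det;
        df      = p*(p - 1)/2;
        p_value = 1 - probchi(chi_sq, df);
    end;
    keep p positive_definite chi_sq df p_value;
run;

proc print data=bartlett noobs;
run;
```

With a sample this large the statistic will almost certainly be significant, so it says little beyond "the items are correlated"; the KMO measure is usually the more informative of the two.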

wmespi · ‎11-25-2020

The method in the parallel analysis macro is unspecified for the simulation component, so I believe it will default to PCA. For calculating the actual eigenvalues in the parallel analysis and in the EFA macro, the method is unweighted least squares (ULS).

wmespi · ‎11-25-2020

To clarify, are you asking me to find and post data that is similar to my own? I will look and see what I can find. Is there any other context I can provide that would be helpful to you?

wmespi · ‎11-25-2020

As an addendum, I have just received word that Singular Value Decomposition is a promising lead for detecting multicollinearity. However, when I look to do this in SAS, all I see is code that relies on PROC IML. If this is not accessible to me, are there other ways of going about this?

wmespi · ‎11-25-2020

Hi,

This is my first time posting, so 1) please forgive my mistakes and 2) hit me with any suggestions for how I can help you help me better! For reference, I am using SAS Enterprise Guide.

I have recently been thrown into a project involving factor analysis. Because of the nature of the work, I will not be able to share any data. In lieu of that, I will try to walk you through the procedure. The idea is that we are trying to examine the psychometric properties of a certain instrument (22 questions) that has been administered to a novel population. The instrument data (ignoring demographic and identifying fields) takes the form of an n x 22 matrix where every field can take on an integer value from 1 to 5 (ordinal data). The paper we are more or less following can be seen here: https://europepmc.org/article/pmc/5046963

Unfortunately, the above paper does not provide a table of all of their eigenvalues, so I cannot see if that was an issue they ran into. I also looked at the original paper that this instrument was based on (https://pubmed.ncbi.nlm.nih.gov/15921473/), but it did not provide much guidance on the subject either.

In short, I am trying to run exploratory factor analysis on polychoric correlations, but some of my eigenvalues are less than zero (image of preliminary eigenvalues attached below). I have gone through a substantial amount of the literature, but it seems rather foggy when it comes to polychoric correlations (my data is ordinal, so Pearson's correlations are inappropriate). It is worth noting, however, that I do not have any Heywood or ultra-Heywood situations where my communalities are >=1. Although I will likely only end up retaining around 3-4 factors, I am assuming this implication of negative variance is problematic for the interpretability of my model.
The code I used for the factoring is as follows:

%macro efa(sport_data, columns, method);
    *Create polychoric correlations;
    *Delete noprint option to display coefficient alpha;
    ods graphics on;
    proc corr data=&sport_data polychoric outplc=sport_polychoric nomiss alpha noprint;
        var &columns;   /* analyze only the requested items */
    run;

    *Conduct Factor Analysis;
    *Calculate KMO measure;
    *Estimation Method: Unweighted least squares;
    *Rotation: Quartimin;
    *Minimum Eigenvalue: 1;
    proc factor data=sport_polychoric rotate=quartimin method=&method mineigen=1 scree corr msa;
        title "&sport_data";
    run;
%mend;

One source (https://www.tandfonline.com/doi/pdf/10.1080/10705511.2020.1735393) I have found outlined four potential reasons why the correlation matrix may be indefinite:
1) the number of observations is less than the number of items
2) not all correlations are based on the same number of cases
3) the variables are not linearly independent
4) there are items with 0 variance

We have a much larger sample size, n = 6,547 (after performing data cleaning and list-wise deletion), so numbers 1 and 2 should not be an issue. I ran proc means on all of my items and verified that none of them have zero variance. Number 3 is the only one I am uncertain about, and here lies my first question. Is there an efficient way to test for multicollinearity in SAS? I would ideally be able to test every item in the survey with respect to the other items. I presume I would then delete any problematic items.

My second question is, if the above are not the cause of my negative eigenvalues, what else could it be? Would some smoothing measures be appropriate, and if so, how would I go about implementing them in SAS?

Assuming I can resolve the negative eigenvalue issue, how are my assumptions affected when it comes to things like Bartlett's test of sphericity and the KMO measure? Are these even applicable when it comes to polychoric correlations, or are there other more appropriate measures?

Lastly, I am curious about parallel analysis and both its applicability and implementation. I have a macro (attached below) that I believe works and have adjusted slightly so that the actual/non-simulated eigenvalues are calculated using a polychoric correlation. I am not sure if, with parallel analysis, I should be feeding it a polychoric correlation or not.

I apologize for the lengthy post, but this is my first time ever using/learning about factor analysis. I would greatly appreciate your help and suggestions!

*Macro for conducting Parallel Analysis;
%macro parallel(data=_LAST_, var=_NUMERIC_, niter=1000, statistic=Median, method=uls);
    /*--------------------------------------*
     | Macro Parallel                       |
     | Parameters                           |
     |   data  = dataset to be analyzed     |
     |           (default: _LAST_)          |
     |   var   = variables to be analyzed   |
     |           (default: _NUMERIC_)       |
     |   niter = number of simulated        |
     |           datasets to create         |
     |           (default: 1000)            |
     |   statistic = statistic used to      |
     |           summarize eigenvalues      |
     |           (default: Median. Other    |
     |           possible values: P90,      |
     |           P95, P99)                  |
     | Output                               |
     |   Graph of actual vs. simulated      |
     |   eigenvalues                        |
     *--------------------------------------*/

    data _temp;
        set &data;
        keep &var;
    run;

    /* obtain number of observations and variables in dataset */
    ods output Attributes=Params;
    ods listing close;
    proc contents data=_temp;
    run;
    ods listing;

    data _NULL_;
        set Params;
        if Label2 eq 'Observations' then call symput('Nobs', Trim(Left(nValue2)));
        else if Label2 eq 'Variables' then call symput('NVar', Trim(Left(nValue2)));
    run;

    /* create polychoric matrix */
    proc corr data=_temp polychoric outplc=_temp noprint;
    run;

    /* obtain eigenvalues for actual data */
    proc factor data=_temp method=&method nfact=&nvar noprint
                outstat=E1(where=(_TYPE_ = 'EIGENVAL'));
        var &var;
    run;

    data E1;
        set E1;
        array A1{&nvar} &var;
        array A2{&nvar} X1-X&nvar;
        do J = 1 to &nvar;
            A2{J} = A1{J};
        end;
        keep X1-X&nvar;
    run;

    /* generate simulated datasets and obtain eigenvalues */
    %DO K = 1 %TO &niter;
        data raw;
            array X {&nvar} X1-X&nvar;
            keep X1-X&nvar;
            do N = 1 to &nobs;
                do I = 1 to &nvar;
                    X{I} = rannor(-1);
                end;
                output;
            end;
        run;

        /* create polychoric matrix */
        /*proc corr data=raw polychoric outplc=raw noprint;*/
        /* run;*/

        /* no METHOD= here, so the simulated eigenvalues use the default method */
        proc factor data=raw nfact=&nvar noprint
                    outstat=E(where=(_TYPE_ = 'EIGENVAL'));
            var X1-X&nvar;
        run;

        proc append base=Eigen data=E(keep=X1-X&nvar);
        run;
    %END;

    /* summarize eigenvalues for simulated datasets */
    proc means data=Eigen noprint;
        var X1-X&nvar;
        output out=Simulated(keep=X1-X&nvar) &statistic=;
    run;

    proc datasets nolist;
        delete Eigen;
    quit;

    proc transpose data=E1 out=E1;
    run;

    proc transpose data=Simulated out=Simulated;
    run;

    /* plot actual vs. simulated eigenvalues */
    data plotdata;
        length Type $ 9;
        Position + 1;
        if Position eq (&nvar + 1) then Position = 1;
        set E1(IN=A) Simulated(IN=B);
        if A then Type = 'Actual';
        if B then Type = 'Simulated';
        rename Col1 = Eigenvalue;
    run;

    title height=1.5 "Parallel Analysis - &statistic Simulated Eigenvalues";
    title2 height=1 "&nvar Variables, &niter Iterations, &nobs Observations";

    proc print data=plotdata;
    run;

    symbol1 interpol=join value=diamond height=1 line=1 color=blue;
    symbol2 interpol=join value=circle height=1 line=3 color=red;

    proc gplot data=plotdata;
        plot Eigenvalue * Position = Type;
    run;
    quit;
%mend parallel;

Date Last Visited	‎12-01-2020 12:27 PM

Re: What do I do when my exploratory factor analysis has negative eige...


What do I do when my exploratory factor analysis has negative eigenval...