How to do factor analysis on dummy variables?

04-25-2012 12:15 PM

I have a few dummy variables, as follows, and I need to do factor analysis on them. Can anyone help me out with this?

x1 x2 x3 x4

1 1 0 1

1 0 1 1

0 0 1 0

Accepted Solutions

Solution

07-03-2017 09:12 AM

Posted in reply to MikeTurner

04-27-2012 09:01 AM

All Replies

Posted in reply to MikeTurner

04-26-2012 09:31 AM

I think it is rather meaningless to use a technique like Factor Analysis (which was designed for continuous variables) on dummy variables. I can't imagine what the interpretation of the results would be.

Depending on what you are trying to do, something like Correspondence Analysis (designed for categorical variables) might be a better choice.

--

Paige Miller
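
For readers who want to try the correspondence analysis route suggested above, here is a minimal sketch. It assumes multiple correspondence analysis (the MCA option of PROC CORRESP) is the variant of interest. The data set name items, the seed, and the simulated data are hypothetical and do not come from the original poster.

```sas
/* Simulated data for illustration only: 200 observations, 4 binary variables. */
data items(drop=_i);
    array x{4};
    call streaminit(2024);
    do id = 1 to 200;
        do _i = 1 to dim(x);
            x{_i} = rand("BERNOULLI", 0.4);
        end;
        output;
    end;
run;

/* Multiple correspondence analysis: each level (0 and 1) of each
   variable becomes a point in a 2-dimensional map. */
proc corresp data=items mca dimens=2;
    tables x1 x2 x3 x4;
run;
```

Because the simulated variables are independent, no structure is expected in the resulting map; with real data, levels that tend to occur together should plot close to each other.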

Posted in reply to MikeTurner

04-26-2012 10:51 AM

I agree with PaigeMiller. What is the purpose of your analysis? To find relationships among the variables, or to identify patterns or groups among your observations?

PG

Posted in reply to MikeTurner

04-26-2012 11:59 AM

If you believe the underneath indicators are continuous latent variables, then you can create a matrix of tetrachoric correlations and use that matrix for your factor analysis. See the following SAS thread: https://communities.sas.com/thread/34748
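
A rough sketch of that idea follows. It assumes a SAS release in which PROC CORR supports the POLYCHORIC option and the OUTPLC= output data set, and that this data set can be read by PROC FACTOR as a TYPE=CORR data set. The data set names items and tetra, the seed, the loading of 0.8, and the simulated data are hypothetical.

```sas
/* Simulated data for illustration only: six 0/1 items, each obtained by
   dichotomizing a continuous score driven by one latent factor f. */
data items(drop=_i f);
    array x{6};
    call streaminit(2024);
    do id = 1 to 500;
        f = rand("NORMAL");
        do _i = 1 to dim(x);
            x{_i} = (0.8*f + rand("NORMAL") > 0);
        end;
        output;
    end;
run;

/* For binary variables, the polychoric correlation is the tetrachoric
   correlation. OUTPLC= saves the matrix to a data set. */
proc corr data=items polychoric outplc=tetra;
    var x1-x6;
run;

/* Factor the tetrachoric matrix instead of the raw 0/1 data.
   Principal factor extraction; one factor is an illustrative choice. */
proc factor data=tetra(type=corr) method=principal priors=smc nfactors=1;
    var x1-x6;
run;
```

A tetrachoric matrix is estimated pairwise, so it is not guaranteed to be positive definite; it is worth checking the eigenvalues before trusting the factor solution.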

Posted in reply to sounpra

04-26-2012 12:10 PM

It is hard for me to see how dummy variables can represent a situation where "the underneath indicators are continuous latent variables".

Even if such a situation exists, we don't know if the original poster's dummy variables qualify to represent continuous latent variables.

--

Paige Miller

Posted in reply to PaigeMiller

08-17-2016 01:42 AM

Hi, I need your help.

One of my variables is a dummy variable. It is latent and has five dimensions, and each dimension has 5 to 6 items. I want to run a factor analysis on that variable to check whether the items under each dimension can be kept as they are or should be removed from the list, so that the instrument measures the variable reliably. The variable is corporate social responsibility.

The other variable is competition, which is measured through the HH index.

Kindly suggest how I should proceed.

Thanks

Posted in reply to MikeTurner

04-26-2012 09:39 PM

Oh jeez, I might get some heat for this, but I think that as long as you have some direction for the variables or items that allows you to come up with a reasonable interpretation of the solution, it's fine to run a principal component analysis or even a factor analysis (with an extraction method other than maximum likelihood) with the usual proc factor, because the analysis is exploratory and descriptive. You are not testing anything; all you want to know is whether there are some natural groupings among the items or variables. The factor solution should give you an indication of that.
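
As a hedged illustration of that exploratory route, the sketch below runs proc factor directly on 0/1 items, once as a principal component analysis and once as a principal factor analysis (a non-ML extraction). The data set name items, the seed, the two-factor structure, and the simulated data are hypothetical and are there only to make the example self-contained.

```sas
/* Simulated data for illustration only: x1-x3 depend on latent f1,
   x4-x6 depend on latent f2, then each score is dichotomized. */
data items(drop=_i f1 f2);
    array x{6};
    call streaminit(2024);
    do id = 1 to 500;
        f1 = rand("NORMAL");
        f2 = rand("NORMAL");
        do _i = 1 to dim(x);
            if _i <= 3 then x{_i} = (0.8*f1 + rand("NORMAL") > 0);
            else x{_i} = (0.8*f2 + rand("NORMAL") > 0);
        end;
        output;
    end;
run;

/* Principal component extraction on the Pearson (phi) correlations. */
proc factor data=items method=principal nfactors=2 rotate=varimax;
    var x1-x6;
run;

/* Principal factor extraction: squared multiple correlations as priors. */
proc factor data=items method=principal priors=smc nfactors=2 rotate=varimax;
    var x1-x6;
run;
```

In the rotated loadings, one would look for x1-x3 and x4-x6 separating onto different factors; the number of factors here is an illustrative choice, not a recommendation.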

However, the situation is more complicated if you want to do a confirmatory factor analysis. In that case a specified model for the variables is tested and unfortunately I do not think that proc calis offers the most up-to-date methodology for "easily" testing those models with variables that are not continuous (and normally distributed), unless you have a huge sample size. You might have to use Mplus for that, yikes.

Posted in reply to MikeTurner

04-26-2012 11:10 PM

Here is a simple technique (mostly graphical) for exploring the similarity between binary variables, assuming you have more than 4 variables and not too many observations. It is illustrated with purely random Bernoulli trials.

/* Simulate 50 observations of 8 independent binary variables. */
data test(drop=_i);
    array x{8};
    call streaminit(1233);
    do id = 1 to 50;
        do _i = 1 to dim(x);
            x{_i} = rand("BERNOULLI", 0.25);
        end;
        output;
    end;
run;

/* Transpose so that each variable becomes a row (columns id1-id50). */
proc transpose data=test out=ttest prefix=id;
    var x:;
    id id;
run;

/* Bray-Curtis coefficients between variables; 0 is treated as "absent". */
proc distance data=ttest method=braycurtis out=braytest;
    var anominal(id:/absent=0);
    id _NAME_;
run;

/* Cluster analysis of the variables. Similarity is illustrated on a dendrogram. */
proc cluster data=braytest method=average outtree=trtest print=0;
    id _NAME_;
run;

proc tree horizontal data=trtest;
run;

/* 2-dimension representation of the similarity between variables. */
proc distance data=ttest method=dice out=dicetest;
    var anominal(id:/absent=0);
    id _NAME_;
run;

ods graphics on;

proc mds data=dicetest out=mdstest fit=distance dim=2;
    id _NAME_;
run;

The same approach, without the transposition, can be applied to explore the similarity between observations.

PG

Posted in reply to PGStats

04-27-2012 08:45 AM

Lots of good ideas in this thread, if only we knew what the original poster really wanted to do with his dummy variables.

--

Paige Miller
