SteveDenham
Jade | Level 19

True!  And it can incorporate 's work regarding validation, using the CROSSVAL options.

I was still afraid that 11K right-hand side variables could overwhelm even PLS, but given your endorsement for that size, I will defer.  It is the method for use.

And if the OP is interested in variable reduction, they can use the output as an exploratory tool in identifying strongly determining factors and strongly redundant variables.

Steve Denham

PaigeMiller
Diamond | Level 26

Well, I didn't say the OP has enough hardware resources, but that's a different issue. PLS doesn't require inverting a matrix if you use the NIPALS algorithm (which is the default in PROC PLS), so it doesn't really require huge amounts of memory, and it's pretty fast. If her machine can handle it, then that's the way to go.
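To illustrate why no matrix inversion is needed, here is a minimal sketch of NIPALS-style factor extraction for a single response, written in plain Python. It is not the PROC PLS implementation; the function name `pls1_nipals` and its arguments are made up for this example, and it assumes a small numeric data set held in lists. Each factor is built from matrix-vector products and a deflation step only.

```python
import math

def pls1_nipals(X, y, n_factors):
    """Extract PLS factors for one response using only matrix-vector products.

    X is a list of rows (lists of numbers), y a list of responses.
    Returns (factors, residual), where each factor is (w, t, p_load, q).
    Hypothetical helper for illustration, not a SAS routine.
    """
    n, p = len(X), len(X[0])
    # Center the predictors and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    f = [v - y_mean for v in y]
    factors = []
    for _ in range(n_factors):
        # Weights: direction in predictor space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        # Scores, loadings, and the regression of y on the scores.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this factor explains before extracting the next.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, t, p_load, q))
    return factors, f
```

The work per factor grows with the number of rows times the number of columns, and nothing the size of a predictors-by-predictors matrix is ever formed or inverted, which is the point made above about memory.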

--
Paige Miller
Ksharp
Super User

Hi Steve,

I am also considering another approach: Cronbach's coefficient alpha.

What do you think? Does that make sense?

Xia Keshan

art297
Opal | Level 21

I don't think so. Cronbach's alpha is a measure of internal consistency. I would start with a factor analysis but, rather than simply trying to eliminate variables, would attempt to group like measures together (e.g., the sum or mean of the items in a group) in order to reduce the number of variables.
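For reference, a minimal sketch of how the coefficient is computed, in plain Python. The function name `cronbach_alpha` and the column-wise input layout are assumptions made for this example. It shows that alpha summarizes how consistently a set of items moves together; it does not rank or select individual predictors.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items is a list of columns, one per item, each holding the scores of the
    same respondents in the same order. Hypothetical helper for illustration.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the total)
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))
```

Items that are perfect copies of each other give an alpha of 1, while unrelated items push it toward 0, so a high value only says that the items in a group can reasonably be summed or averaged.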

Ksharp
Super User

How about PROC GLMSELECT to select a subset of variables?

SteveDenham
Jade | Level 19

GLMSELECT doesn't support methods such as LASSO, only stepwise-like methods, so it runs into the same problems I outlined in my earlier post.

Steve Denham

gergely_batho
SAS Employee

GLMSELECT has LASSO!

SteveDenham
Jade | Level 19

That is good news.

Steve Denham

call_me_elaine
Calcite | Level 5

Thanks so much for your reply. You just gave me so much information and I need some time to understand it since I am totally totally a new user. At least I know stepwise is not the solution.

BTW, what is OP short for?

art297
Opal | Level 21

OP is short for Original Poster.

I'm not a statistician, but I have taken numerous statistics courses. I think that factor analysis, or PCA, will be a good first step to reduce your number of variables. That is, you could create combined scores that take into account collections of grouped measures. However, that would require business knowledge to determine whether the results (i.e., what to combine) seem to make sense.

PG's proposal was simply to create a model on a sample of your data and see if it held up on another sample of your data.
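As a rough sketch of that idea in plain Python: fit on one random sample, then check the error on the rows that were held back. The helper names (`split_sample`, `fit_line`, `mse`) and the single-predictor straight-line model are assumptions chosen to keep the example short; the same pattern applies to whatever model is actually used.

```python
import random

def split_sample(rows, n_train, seed=0):
    """Shuffle a copy of rows and split it into a training and a holdout part."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def fit_line(rows):
    """Least-squares intercept and slope for (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    return my - slope * mx, slope

def mse(rows, intercept, slope):
    """Mean squared prediction error of the fitted line on rows."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in rows) / len(rows)

# Usage: fit on the training sample only, then score the holdout sample.
rows = [(x, 2 * x + 1) for x in range(20)]
train, holdout = split_sample(rows, n_train=14)
intercept, slope = fit_line(train)
holdout_error = mse(holdout, intercept, slope)
```

If the holdout error is much larger than the training error, the model is fitting noise in the training sample rather than structure that carries over.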

call_me_elaine
Calcite | Level 5

Yes, that's what I am thinking now: do the variable reduction first. I tried PCA today, and the model has 160 components with an MSE of 0.0106. However, I didn't get the coefficients of these components, so I don't know how to interpret it.

I guess what PG means is N-fold cross-validation.

Thanks,

Elaine

stat_sas
Ammonite | Level 13

If you are looking for an interpretation based on the original variables as key drivers, then, along the same lines as Art suggested, there is a procedure, PROC VARCLUS, that produces groups of variables that are highly correlated internally but only weakly associated with variables belonging to other groups. You can use your experience, along with the information provided by PROC VARCLUS, to select a small number of variables for further analysis.

call_me_elaine
Calcite | Level 5

I did try PCA and used the first 160 components. The MSE is 0.0106. But I didn't get the coefficients of these components, or any indication of what the components are. I got the results but just don't know how to interpret them.
