Manny43
Calcite | Level 5

Hi,

I'm working with a number of biomarkers (about 16) and I need to combine them into a single index variable. What's the best way to do this?

For example, say I have variables var1 to var16 and I'm trying to compute one variable called var_index.

Would factor analysis be appropriate?

Thanks in advance.

M.

PGStats
Opal | Level 21

It depends on what your biomarkers are going to be an index of. Ultimately, do you want to identify a subset of biomarkers that will signal the presence or absence of some condition? Or do you want a proxy for some difficult-to-measure quantity (a continuous variable)? Please tell us more. - PG

Manny43
Calcite | Level 5

Thanks for your response. I need to create a physiological dysregulation index based on a set of biomarkers that include lipids, glucose levels, BMI and so on. I'd like to identify a subset of biomarkers out of the 16 variables that will signal the presence or absence of physiological dysregulation.

I hope that clarified the question.

Thanks,

M.

Reeza
Super User

I would suggest reading the literature to see what methods others in your field are using. I don't know enough about biomathematics, but I'd consider logistic regression based on how you framed your question. There may be other considerations, however.

PGStats
Opal | Level 21

I agree with Reeza: logistic regression would certainly be a good option. Look at variable selection methods, but do not follow their advice blindly! You could also explore your data with a simple partition tree analysis in JMP, which relies on fewer assumptions than logistic regression. Another option is discriminant analysis, which also offers variable selection methods (in PROC STEPDISC; limited but nonetheless useful).
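To make the logistic regression suggestion concrete, here is a minimal sketch of how fitted coefficients can turn a set of biomarkers into a single 0-to-1 index (the predicted probability of the condition). It is written in plain Python only so that it is self-contained; in SAS the corresponding tool would be PROC LOGISTIC, with the SELECTION= option on the MODEL statement for variable selection. Everything here is an assumption for illustration: the data are synthetic, only three standardized "biomarkers" are used instead of 16, and the helper names (fit_logistic, index_score, var_index) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    Returns [intercept, b1, ..., bk].
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        # Step along the average gradient.
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def index_score(row, w):
    """Predicted probability of the condition, usable as a 0-1 index."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))


# Synthetic, standardized "biomarkers" (NOT real data).
# By construction, only the first two influence the outcome; the third is noise.
random.seed(0)
X, y = [], []
for _ in range(400):
    row = [random.gauss(0, 1) for _ in range(3)]
    p_true = sigmoid(1.5 * row[0] - 1.0 * row[1])
    X.append(row)
    y.append(1 if random.random() < p_true else 0)

weights = fit_logistic(X, y)
var_index = [index_score(row, weights) for row in X]

print("coefficients (intercept, var1, var2, var3):", [round(b, 2) for b in weights])
print("index for first subject:", round(var_index[0], 3))
```

In this toy setup the coefficient on the noise variable should come out close to zero, which is the kind of signal a variable selection method acts on. As noted above, such output is a starting point for judgment, not a substitute for it.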

If you can SEE a pattern in the data but no simple method will find it, then you could go all the way and try neural networks, but only as a last resort.

hth

PG

Manny43
Calcite | Level 5

Thanks for your answers. I'll take the advice.

Cheers,

M
