Hi,
I'm working with a number of biomarkers (about 16) and I need to combine them into a single index variable. What's the best way to do this?
For example, say I have variables var1 to var16 and I want to compute one variable called var_index.
Would factor analysis be appropriate?
Thanks in advance.
M.
It depends on what your biomarkers are going to be an index of. Ultimately, do you want to identify a subset of biomarkers that will signal the presence or absence of some condition? Or do you want a proxy for some difficult-to-measure quantity (a continuous variable)? Please tell us more. - PG
Thanks for your response. I need to create a physiological dysregulation index based on a set of biomarkers that include lipids, glucose levels, BMI and so on. I'd like to identify a subset of biomarkers out of the 16 variables that will signal the presence or absence of physiological dysregulation.
I hope that clarified the question.
Thanks,
M.
I would suggest reading the literature and seeing what methods others in your field are using. I don't know enough about biomathematics, but given how you framed your question I'd consider logistic regression. There may be other considerations, however.
I agree with Reeza: logistic regression would certainly be a good option. Look at variable selection methods, but do not follow their advice blindly! You could also explore your data with a simple partition tree analysis in JMP, which relies on fewer assumptions than logistic regression. Another option is discriminant analysis, which also offers variable selection methods (in PROC STEPDISC, limited but nonetheless useful).
If you can SEE a pattern in the data but no simple method will find that pattern, then you could go all the way and try neural networks, but only as a last resort.
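To make the logistic regression idea concrete, here is a minimal sketch in plain Python rather than SAS, so it runs without any extra software. Everything in it is an assumption for illustration: the data are simulated, the outcome name dysreg is hypothetical, and the fit is a bare gradient descent with no variable selection. It only shows how a fitted model turns var1 to var16 into a single var_index (the predicted probability of dysregulation); it needs a known dysregulation status for each subject to train on.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def standardize(rows):
    """Z-score each column so coefficients are comparable across biomarkers."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    # "or 1.0" guards against a constant column (standard deviation of zero).
    sds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
        for j in range(p)
    ]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the logistic log-likelihood (no selection, no penalty)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w, b


def predict_index(X, w, b):
    """Predicted probability of the condition, used here as the index."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


# Simulated data: 200 subjects, 16 biomarkers (stand-ins for var1..var16).
random.seed(0)
n_subjects, n_markers = 200, 16
raw = [[random.gauss(0, 1) for _ in range(n_markers)] for _ in range(n_subjects)]

# Hypothetical truth: only the first three markers drive the outcome.
dysreg = [
    1 if random.random() < sigmoid(1.5 * r[0] + 1.0 * r[1] - 1.0 * r[2]) else 0
    for r in raw
]

X = standardize(raw)
w, b = fit_logistic(X, dysreg)
var_index = predict_index(X, w, b)  # one value in [0, 1] per subject

for j, wj in enumerate(w, start=1):
    print(f"var{j}: coefficient {wj:+.3f}")
```

The printed coefficients give a first impression of which markers carry the signal, but, as noted above, that kind of ranking should not be followed blindly.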
hth
PG
Thanks for your answers...I'll take the advice.
Cheers,
M