Good morning,
I am looking to program the elbow method to determine how many clusters to select in order to dichotomize my quantitative variable. Could anyone help me?
Thanks in advance,
Sincerely,
I'm not aware of any programming of the elbow method in SAS. But maybe others know how it can be done.
However, here are discussions about determining the number of clusters, both of which indicate that there is no universally agreed upon method.
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_introclus_sect010.htm
There is also the very simple option of keeping continuous variables continuous, rather than converting them to categories, in whatever analysis you want to do. That is easier than creating clusters.
<Pedantic mode: ON>
Dichotomous means two. So you have already decided there will be two clusters if you "dichotomize" anything.
<Pedantic mode: OFF>
So the question would be where the breakpoint should be. I would imagine PROC FREQ might give an idea of whether there is anything really worth treating as a "cluster".
Excuse me, I used the wrong term. I don't necessarily want to make 2 groups. I wanted to use PROC FASTCLUS to determine the clusters, but it requires you to specify the number of clusters you want, and that is what I don't know how to choose.
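One way to approximate the elbow method here is to run PROC FASTCLUS for a range of cluster counts and plot the overall R-square against the number of clusters. This is only a minimal sketch, not an established SAS recipe: the data set WORK.PATIENTS, the variable MARKER, the macro name %ELBOW, and the range of 2 to 8 clusters are all hypothetical placeholders, and it plots R-square (which rises and then flattens) rather than the within-cluster sum of squares.

```sas
/* Hypothetical inputs: WORK.PATIENTS with a numeric variable MARKER. */
%macro elbow(data=, var=, maxk=8);
  /* Start from an empty results table. */
  proc datasets lib=work nolist nowarn;
    delete elbow;
  quit;

  %do k = 2 %to &maxk;
    /* Fit k clusters; OUTSTAT= stores summary statistics for this run. */
    proc fastclus data=&data maxclusters=&k maxiter=100
                  outstat=_stat noprint;
      var &var;
    run;

    /* Keep only the overall R-square row and tag it with k. */
    data _rsq;
      set _stat;
      where _type_ = 'RSQ';
      k = &k;
      keep k over_all;
    run;

    proc append base=elbow data=_rsq;
    run;
  %end;

  /* Look for the k beyond which R-square stops improving much. */
  proc sgplot data=elbow;
    series x=k y=over_all / markers;
    xaxis integer label="Number of clusters (k)";
    yaxis label="Overall R-square";
  run;
%mend elbow;

%elbow(data=work.patients, var=marker, maxk=8);
```

The "elbow" is read off the plot by eye, so it remains a judgment call. The OUTSTAT= data set also holds rows such as PSEUDO_F and CCC that could be plotted the same way as a cross-check.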
People sometimes present a very narrow view of the problem ... "how do I determine the number of clusters?" I encourage you to present a wider view of the problem: "how do I determine the number of clusters if I want to perform analyses such as _____________ and ______________ on the clusters for data coming from the field of __________ "?
Context makes a difference. Depending on what you are doing, I could see different answers.
I have a biological marker on which I would like to carry out a prognostic analysis of survival. This marker is a continuous variable, but medical interpretation is difficult with a continuous variable, hence my desire to make groups.
Thank you, that's very helpful to me. Some thoughts
Again, I don't work in your field and don't know what the norms are for this type of analysis, but I like the first choice above best, unless I felt I could sell people on the second choice, in which case I would do that (especially if the model predicted better using a continuous rather than a discrete variable).