06-26-2012 01:16 PM
I am looking for an expression for calculating a cluster's center. Is the sample mean vector that expression? For example, if there are three variables in your X vector, [X1, X2, X3], is the mean vector or center the average of each one, [X1mean, X2mean, X3mean], so that the sample mean vector is calculated by averaging each of the 3 variables separately?
In any case, I am very confused about where to find an expression for calculating the center of each cluster. Any assistance would be greatly appreciated.
06-30-2012 06:31 AM
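Not quite. The calculation is similar to what you've proposed, except that a centroid is a point in n-dimensional space, in this case 3 dimensions. So, if your axes are x, y and z and the cluster contains N points, there would be three separate calculations, one per axis:
x_mean = (x_1 + x_2 + ... + x_N) / N
y_mean = (y_1 + y_2 + ... + y_N) / N
z_mean = (z_1 + z_2 + ... + z_N) / N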
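The centroid is then the point whose coordinates are these three means, i.e. the mean of the cluster's points. It is the point that minimizes the mean squared (Euclidean) distance to the cluster's members, which is the same as minimizing intra-cluster variance. Note that it does not in general minimize the plain mean distance; the point that does that is the geometric median, which is usually a different point.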
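If it helps to see the per-axis averaging written out, here is a minimal Python sketch. The function names `centroid` and `mean_squared_distance` and the sample `cluster` values are made up for illustration; they don't come from any particular clustering package.

```python
from statistics import fmean


def centroid(points):
    """Return the per-axis mean of a list of equal-length coordinate tuples."""
    if not points:
        raise ValueError("cluster has no points")
    # zip(*points) regroups the data by axis: all x values, all y values, ...
    return tuple(fmean(axis) for axis in zip(*points))


def mean_squared_distance(points, center):
    """Mean of the squared Euclidean distances from each point to center."""
    return fmean(
        sum((p - c) ** 2 for p, c in zip(point, center)) for point in points
    )


# Hypothetical 3-variable cluster: each tuple is (x, y, z).
cluster = [(1.0, 2.0, 3.0), (3.0, 4.0, 5.0), (5.0, 6.0, 10.0)]
center = centroid(cluster)
print(center)  # (3.0, 4.0, 6.0)
```

Comparing `mean_squared_distance(cluster, center)` with the same call for any other candidate point should show the centroid giving the smaller value, which is the variance-minimizing property described above.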