I find that clustering poses two main challenges:

1. One acts on a sample, and this has monumental consequences. Two different samples share no points with probability close to 1, so you can never claim replicability in clustering if you follow any of the many extant algorithms that proceed sequentially. Only if you act on "central points", which are actually local means, can you claim replicability.

2. Sequential methods do reach a solution, of course, but you never know how far that solution is from the optimal one.

Going parallel has two advantages:

1. You find "central points", that is, points surrounded by many others, so they do not move during iterations. Central points are local means, each with a surrounding subsample, aka a cluster. This makes their standard error much smaller than the standard deviation that measures the variability of single points. So if you follow the "any point is good" approach, where all points are equivalent, you are exposed to the full variability sigma, while if you act on central points, actually local means, you face a much smaller variability, a fraction of the sample standard deviation. That is why central points remain stable across iterations.

2. You avoid the worst of the sequential problems: the solution varying with the starting point. Because sequential methods act on points with high variability, the first point decides which solution you end up with.

In my opinion, the method should be parallel.
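The standard-error claim above can be checked numerically. Here is a minimal sketch (my own illustration, not the poster's algorithm): points drawn with standard deviation sigma, grouped into hypothetical neighborhoods of M points each; the resulting local means should vary roughly as sigma / sqrt(M).

```python
import random
import statistics

random.seed(0)

SIGMA = 1.0    # point-level standard deviation
M = 25         # assumed neighborhood (subsample) size per local mean
N_MEANS = 400  # number of local means to simulate

# Single points vary with standard deviation SIGMA;
# a local mean over M points varies with roughly SIGMA / sqrt(M).
points = [random.gauss(0.0, SIGMA) for _ in range(N_MEANS * M)]
local_means = [
    statistics.fmean(points[i * M:(i + 1) * M]) for i in range(N_MEANS)
]

sd_points = statistics.stdev(points)       # close to SIGMA = 1.0
se_means = statistics.stdev(local_means)   # close to SIGMA / 5 = 0.2

print(f"std of single points: {sd_points:.3f}")
print(f"std of local means  : {se_means:.3f}")
```

With M = 25 the local means fluctuate about five times less than individual points, which is the sense in which "central points" can stay put across iterations while single points cannot.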