Help Requested: Grouping Data with Do Loops/ Arrays based on Spatio-Temporal Coordinates

LKlein88 · Posted 03-30-2015 03:20 PM

I'm trying to write a formula that would do the following:

1) Identify which points start new clusters under given conditions: the Euclidean distance is sufficiently small (<600) and the time difference is sufficiently large (>15). The conditional statements are not the current issue, but renaming the BaseClusterID variable is.

2) I want to create a new cluster ID name for each cluster for which this condition holds (i.e. ID points 3, 223, 10344, and 16078 all satisfy the above conditions, so I'd want each of them given a different cluster ID: Cluster 1, 2, 3, and 4).

3) Every sequential ID point which satisfies this condition falls in the same cluster (so points 4 and 5 are in the same cluster as point 3).

I wanted to know if it was possible to achieve this renaming of clusters with Do loops and Arrays. Any assistance or direction would be much appreciated.

Here is a sample of what I have and what I am looking for:

Dataset One (what I have):

ID  BaseClusterID  DeltaTime  EucDistance
1   cluster_0      3          70
2   cluster_0      1          4000
3   cluster_0      22         25
4   cluster_0      2          80
5   cluster_0      2          200
...

Dataset Two (what I'm looking for):

ID  BaseClusterID  DeltaTime  EucDistance
1   cluster_0      3          70
2   cluster_0      1          4000
3   cluster_1      22         25
4   cluster_1      2          80
5   cluster_1      2          200
...
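To make the intended renaming concrete, here is a minimal sketch of the logic in Python rather than SAS, just to pin down the rule. It assumes the rows are already sorted by ID, that a row starts a new cluster when EucDistance < 600 and DeltaTime > 15, and that every following row stays in that cluster until the next such row. The function name `relabel_clusters` and its parameters are made up for illustration.

```python
def relabel_clusters(rows, max_dist=600, min_gap=15):
    """Assign a running cluster label in a single pass over sorted rows.

    Assumption: a row opens a new cluster when it is close in space
    (EucDistance < max_dist) but far in time (DeltaTime > min_gap).
    All later rows inherit that label until the next such row.
    """
    cluster_num = 0  # running counter carried from row to row
    relabeled = []
    for row in rows:
        if row["EucDistance"] < max_dist and row["DeltaTime"] > min_gap:
            cluster_num += 1  # this row is the first point of a new cluster
        # copy the row and overwrite only the cluster label
        relabeled.append({**row, "BaseClusterID": f"cluster_{cluster_num}"})
    return relabeled


# The five sample rows from Dataset One
sample = [
    {"ID": 1, "BaseClusterID": "cluster_0", "DeltaTime": 3, "EucDistance": 70},
    {"ID": 2, "BaseClusterID": "cluster_0", "DeltaTime": 1, "EucDistance": 4000},
    {"ID": 3, "BaseClusterID": "cluster_0", "DeltaTime": 22, "EucDistance": 25},
    {"ID": 4, "BaseClusterID": "cluster_0", "DeltaTime": 2, "EucDistance": 80},
    {"ID": 5, "BaseClusterID": "cluster_0", "DeltaTime": 2, "EucDistance": 200},
]

relabeled = relabel_clusters(sample)
for row in relabeled:
    print(row["ID"], row["BaseClusterID"], row["DeltaTime"], row["EucDistance"])
```

On these five rows the labels come out as in Dataset Two. Nothing here needs an array: a counter carried across rows is enough, which in a SAS DATA step would typically be a RETAIN-ed variable incremented under the same condition.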

ballardw · Posted 03-30-2015 03:31 PM

I think you'll need to provide a bit more information about the input data. From your dataset one I have no way/reason to tell that ID value 4 should be a different cluster than ID 1. I have to assume there are some groups of coordinates that are used as the base and another set compared with those and possibly there is a rule about which base(?) coordinates are considered when deciding which cluster value assignment is considered.

LKlein88 · Posted 03-30-2015 03:46 PM

I understand these concerns and appreciate the prompt response. So for this project, I want these data to be grouped based on proximity in time and space.

So imagine we're talking about points 1 - 9. Points 3 - 9 form a cluster. But if there was missing data between points 3 and 4, there might be a larger gap in time. It is still evident there is a cluster there, however, as all points are sufficiently close to one another. I'm looking to detect that Point 3 is the first point in this cluster and to read in that all other points are sufficiently close in time and space to say they are also points in this cluster.

ballardw · Posted 03-30-2015 03:53 PM

Without explicit data for coordinates I think I would approach this using Proc Fastclus. Possibly looking at creating the potential geographic clusters first and then applying the time element afterwards.

LKlein88 · Posted 03-30-2015 04:06 PM

I'll definitely look into Proc Fastclus, thank you much. I should also mention that I do have explicit data for the coordinates, and these take place after running an ST-DBSCAN analysis. I'm just wondering if it is possible to rename points 3 - 9 using Do loops/ Arrays.

ballardw · Posted 03-30-2015 04:57 PM

If you have a single column like Cluster, assigning values is easy. You should note that FASTCLUS by default will generate values like CLUSTER1, CLUSTER2, etc. to identify the groups of coordinates it recommends as a cluster. So your loop/array may not be needed.

LKlein88 · Posted 03-30-2015 05:20 PM

I understand automating the function is a quick and simple way to identify the groups of coordinates. However, this is post-cluster scan analysis for any datum that might've fallen through the cracks, an aspect that I feel is best handled through manual detection.
