Changes

The within-cluster variance (and therefore the F-statistic and variance explained) revealed an issue with the data that had to be fixed: the Python HCA script forces the decomposition of multitons into singletons at the end of its run. The HCA should stop once every distinct location is its own point, rather than artificially splitting startups that share a location into separate points. This issue likely doesn't affect the maximum R² method, but it does affect the heuristic method(s) that rely on layer indices.
I pushed through the change and reran everything (it took a couple of hours).
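
As a minimal sketch of the statistics mentioned above, the following computes the within-cluster sum of squares, the variance explained, and an F-statistic for one clustering layer. It is not the HCA script itself. The function names (<code>distinct_locations</code>, <code>cluster_stats</code>) are hypothetical, and it assumes the F-statistic is the usual pseudo-F (between-cluster variance over within-cluster variance, each divided by its degrees of freedom), which may differ from the definition used here.

```python
import math
from collections import defaultdict


def distinct_locations(points):
    """Number of distinct locations, i.e. the largest sensible cluster count."""
    return len(set(points))


def _sum_sq(points):
    """Sum of squared distances from each point to the centroid of the group."""
    n = len(points)
    centroid = [sum(coord) / n for coord in zip(*points)]
    return sum(
        sum((a - c) ** 2 for a, c in zip(p, centroid))
        for p in points
    )


def cluster_stats(points, labels):
    """Return (wss, variance_explained, pseudo_f) for one clustering layer.

    Assumption: pseudo_f = (BSS / (k - 1)) / (WSS / (n - k)).
    """
    groups = defaultdict(list)
    for p, label in zip(points, labels):
        groups[label].append(p)
    n, k = len(points), len(groups)
    if k < 2 or k >= n:
        # With k == n every point is a singleton and n - k == 0,
        # so the F-statistic is undefined.
        raise ValueError("need 2 <= k < n clusters")
    tss = _sum_sq(points)
    if tss == 0:
        raise ValueError("all points share one location")
    wss = sum(_sum_sq(g) for g in groups.values())
    r2 = 1 - wss / tss
    # WSS is zero when every cluster holds a single distinct location.
    f = math.inf if wss == 0 else ((tss - wss) / (k - 1)) / (wss / (n - k))
    return wss, r2, f
```

With points given as coordinate tuples, <code>distinct_locations(points)</code> gives the cluster count at which the HCA would stop under the fix described above; beyond that count, the statistics only reflect the forced splitting of co-located startups.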
=====Trying to find the elbow=====
 
The objective is to apply the [https://en.wikipedia.org/wiki/Elbow_method_(clustering) Elbow Method], which involves finding the [https://en.wikipedia.org/wiki/Knee_of_a_curve Knee of the curve] of either the F-statistic or variance explained.
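
One common way to locate a knee, shown here only as a sketch and not necessarily the method adopted on this page, is to take the point that lies farthest from the straight line joining the first and last points of the curve. The function name <code>find_knee</code> is hypothetical; both axes are rescaled to the unit interval first so that neither dominates the distance.

```python
import math


def find_knee(xs, ys):
    """Index of the point farthest from the chord joining the curve's endpoints."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) pairs")
    dx = xs[-1] - xs[0]
    dy = ys[-1] - ys[0]
    if dx == 0 or dy == 0:
        raise ValueError("endpoints must differ in both x and y")
    best_index, best_dist = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Rescale so the chord runs from (0, 0) to (1, 1).
        u = (x - xs[0]) / dx
        v = (y - ys[0]) / dy
        # Perpendicular distance from (u, v) to the line v = u.
        dist = abs(v - u) / math.sqrt(2)
        if dist > best_dist:
            best_index, best_dist = i, dist
    return best_index
```

Here <code>xs</code> would be the number of clusters in each layer and <code>ys</code> the F-statistic or variance explained for that layer. On noisy curves this heuristic can be unstable, so the result is best checked against a plot.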
====Fixing the layer index====
