
This section explores whether we could implement the '''actual''' elbow method (see https://en.wikipedia.org/wiki/Elbow_method_(clustering) ). The answer is that we might be able to, at least for some sub-sample of our data, but that it likely doesn't give us what we want.
 
I used distances calculated with ST_Distance and computed the '''variance explained''' using the equation below. The between-group variance is undefined for the first layer: it has <math>k=1</math> and <math>\bar{Y}_{i\cdot} = \bar{Y}</math> (i.e., its single all-encompassing hull's centroid is the overall mean), so its between-group variance is <math>n_i(0)^2/(0)</math>.
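The <math>n_i</math>, <math>\bar{Y}_{i\cdot}</math>, and <math>k-1</math> pieces come from the one-way ANOVA F statistic: with <math>k</math> hulls, <math>n_i</math> points in hull <math>i</math>, hull centroid <math>\bar{Y}_{i\cdot}</math>, overall mean <math>\bar{Y}</math>, and <math>N</math> points in total, a standard way to write the between-group to within-group variance ratio is:
 
:<math> \frac{\sum_{i=1}^{k} n_i \left(\bar{Y}_{i\cdot} - \bar{Y}\right)^2 / (k-1)}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_{i\cdot}\right)^2 / (N-k)} </math>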
 
I then calculated forward differences and added one to the answer, since using central differences would left-truncate the data. (An inspection of the data showed that the 'correct' answer is far more likely to be found at the left end of the data than at the right. Also, the central first difference bridges the observation, which can lead to misidentifying monotonicity.) Specifically, I used:
 
:<math> f'(x) = f(x + 1) - f(x) </math>
:<math> f''(x) = f(x+2) - 2 f(x+1) + f(x)</math>
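 
A minimal NumPy sketch of these two forward differences (the input name <code>varexp</code> is only illustrative; it stands for the series of per-layer variance explained):
 
<syntaxhighlight lang="python">
import numpy as np

def forward_differences(varexp):
    """Forward first and second differences of a 1-D series.

    f'(x)  = f(x + 1) - f(x)              -> length n - 1
    f''(x) = f(x + 2) - 2 f(x + 1) + f(x) -> length n - 2
    """
    varexp = np.asarray(varexp, dtype=float)
    d1 = varexp[1:] - varexp[:-1]                      # f'(x)
    d2 = varexp[2:] - 2 * varexp[1:-1] + varexp[:-2]   # f''(x)
    return d1, d2
</syntaxhighlight>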
 
I then used <math>f'(x)</math> to determine the layer index from which the variance explained was monotonic (i.e., there was no further change in sign of <math>f'(x)</math> at higher layer indices), found the layer index <math>i</math> at which <math>\mathrm{varexp}_i = \min(\mathrm{varexp})</math>, and marked <math>i+1</math> as the elbow layer.
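 
A minimal sketch of that selection in Python, under two assumptions that are not spelled out above: the undefined first-layer value has already been dropped from <code>varexp</code>, and the minimum is taken over the monotonic tail rather than over all layers:
 
<syntaxhighlight lang="python">
import numpy as np

def elbow_layer(varexp):
    """Pick the elbow layer from a series of per-layer variance explained.

    1. Use the forward first difference f'(x) to find the first layer
       index from which the series is monotonic (no later sign changes).
    2. Take the layer index i of the minimum variance explained in that
       monotonic tail (assumption: the minimum is restricted to the tail).
    3. Mark i + 1 as the elbow layer.
    """
    varexp = np.asarray(varexp, dtype=float)
    d1 = varexp[1:] - varexp[:-1]            # forward first difference
    signs = np.sign(d1)

    # first layer index after the last sign change of f'(x);
    # if the sign never changes, the whole series is monotonic
    changes = np.nonzero(signs[:-1] != signs[1:])[0]
    start = int(changes[-1]) + 1 if changes.size else 0

    i = start + int(np.argmin(varexp[start:]))   # index of minimum varexp
    return i + 1                                 # elbow layer
</syntaxhighlight>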
=====Background=====
