This algorithm has very poor time performance on large inputs, so we assume that a large set of points can be partitioned with k-means and that each resulting subproblem can then be solved independently.
In other words, we make the simplifying assumption that the Enclosing Circle Algorithm has [https://en.wikipedia.org/wiki/Optimal_substructure Optimal Substructure].
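
The split step can be sketched as follows. This is a minimal illustration, not the code in <code>circles.py</code>: the helper names <code>kmeans</code> and <code>split_into_subproblems</code>, the choice of two clusters per split, the recursive splitting, and the threshold value are all assumptions made for the example.

```python
import math
import random

SPLIT_THRESHOLD = 4  # hypothetical value; the real one is set in circles.py


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns the non-empty clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center.
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old center if a cluster is empty
                centers[i] = (
                    sum(x for x, _ in cluster) / len(cluster),
                    sum(y for _, y in cluster) / len(cluster),
                )
    return [c for c in clusters if c]


def split_into_subproblems(points, threshold=SPLIT_THRESHOLD):
    """Split a point set until every part has at most `threshold` points."""
    if len(points) <= threshold:
        return [points]
    parts = kmeans(points, 2)
    if len(parts) < 2:  # degenerate data (e.g. identical points): stop splitting
        return [points]
    result = []
    for part in parts:
        result.extend(split_into_subproblems(part, threshold))
    return result
```

Each returned part would then be passed to the enclosing circle algorithm on its own. Because the parts are solved separately, the combined result is only as good as the optimal-substructure assumption above.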
 
== Structure of Program ==
* TODO
== Parameters ==
* in <code>circles.py</code>:
** <code>PATH_SEPARATOR</code>: the string that separates parts of the filename for both input and output files. For example, an input could look like "St. Louis#MO#2017#0.tsv" for PATH_SEPARATOR = '#'
** <code>ITERATIONS</code>: the number of iterations to attempt for each <code>k</code> when searching for the minimum for that <code>k</code>
** <code>MIN_POINTS_PER_CIRCLE</code> (AKA <code>n</code>): the minimum number of data points that must be included in a circle
** <code>TIMEOUT_MINUTES</code>: maximum running time of a parallel instance of the algorithm
** <code>SPLIT_THRESHOLD</code>: if a dataset has more than this many data points, it will be split via k-means
** <code>EXECUTABLE_INSTANCE_PATH</code>: the path to circles.py
** <code>OUTJOINER_INSTANCE_PATH</code>: the path to outjoiner.py
** <code>DATA_DIRECTORY</code>: the input directory
** <code>OUTPUT_DIRECTORY</code>: the directory to write the outputs of circles.py to
** <code>GENERATE_REPORTS</code>: whether or not to call outjoiner.py (writes reports on the output of circles.py)
** <code>REPORT_DIRECTORY</code>: the directory to write reports to
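
To illustrate how <code>PATH_SEPARATOR</code> relates to a filename, the sketch below splits the example input name from the list above into its parts. The helper <code>split_filename</code> is hypothetical and is not taken from <code>circles.py</code>; what each part of the filename means is not specified here, so the code only separates the parts.

```python
import os

PATH_SEPARATOR = "#"  # same value as in the example filename above


def split_filename(filename, separator=PATH_SEPARATOR):
    """Return (parts, extension) for a separator-delimited filename.

    Hypothetical helper for illustration only.
    """
    # splitext cuts at the last dot, so "St. Louis" stays intact.
    stem, extension = os.path.splitext(filename)
    return stem.split(separator), extension


parts, extension = split_filename("St. Louis#MO#2017#0.tsv")
print(parts)      # ['St. Louis', 'MO', '2017', '0']
print(extension)  # .tsv

# Rebuilding a name with the same separator round-trips the input.
print(PATH_SEPARATOR.join(parts) + extension)  # St. Louis#MO#2017#0.tsv
```

Choosing a separator that cannot appear inside any part of the name keeps the split unambiguous.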
 
== Example Usage ==