The bundle contains:
*[http://www.edegan.com/repository/MatchLocations.pl MatchLocations.pl] - The main script, which initializes and processes the matching requests
*[http://www.edegan.com/repository/Match::GNS.pm Match::GNS.pm] - Interface to the GNS reference data (see below)
*[http://www.edegan.com/repository/Match::Patent.pm Match::Patent.pm] - Interface to the Patent Location data (see below)
*[http://www.edegan.com/repository/Match::Gram.pm Match::Gram.pm] - Custom NGram module
*[http://www.edegan.com/repository/Match::LCS.pm Match::LCS.pm] - A standard LCS module
*[http://www.edegan.com/repository/Match::PostalCodes.pm Match::PostalCodes.pm]
*[http://www.edegan.com/repository/Match::CleanStrings.pm Match::CleanStrings.pm] - Provides string cleaning routines for both the reference and source data
*[http://www.edegan.com/repository/PatentLocations-Stopwords.txt PatentLocations-Stopwords.txt] - A stop word file (tab delimited)
==Reference Data==
To reconcile multiple matches, the following process is undertaken:
*If there are both P and A matches, and more than one of either, then determine the P-A pair with the shortest distance between them using a [http://en.wikipedia.org/wiki/Haversine_formula Haversine formula] distance calculation based on the GNS-reported longitudes and latitudes. (Note that the Haversine formula is implemented in the Match::GNS.pm module. It remains accurate over short distances, where other methods, such as the spherical law of cosines, can suffer from rounding error.)
*If there are multiple P matches but no A matches, take the one that was arrived at first.
*If there are multiple A matches but no P matches, take the one that was arrived at first.
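
The reconciliation rules above can be sketched as follows. This is an illustrative Python sketch, not the Perl code in Match::GNS.pm. The function names, the <code>(name, lat, lon)</code> tuple layout, and the Earth radius constant are all assumptions made for the example, and the match lists are assumed to be in the order in which the matches were found.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def reconcile(p_matches, a_matches):
    """Pick one (P, A) result from lists of (name, lat, lon) tuples.

    Hypothetical helper: either element of the result is None when
    no match of that kind exists.
    """
    if p_matches and a_matches:
        # Choose the P-A pair with the shortest Haversine distance.
        # min() keeps the earliest pair on ties.
        return min(
            ((p, a) for p in p_matches for a in a_matches),
            key=lambda pair: haversine_km(pair[0][1], pair[0][2],
                                          pair[1][1], pair[1][2]),
        )
    if p_matches:
        return p_matches[0], None  # first P match found
    if a_matches:
        return None, a_matches[0]  # first A match found
    return None, None
```

For example, <code>reconcile([("P1", 0.0, 0.0), ("P2", 10.0, 10.0)], [("A1", 10.1, 10.1)])</code> selects the pair <code>P2</code> and <code>A1</code>, because <code>P2</code> is far closer to <code>A1</code> than <code>P1</code> is.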