Changes

Jump to navigation Jump to search
m
no edit summary
The candidate reference string with the highest gram and LCS scores, assuming that these scores meet the decision threshold is then selected as the closest match. If no candidate reference string meets the decision threshold the source string is left unmatched. The decision thresholds are configured in the MatchLocations.pl script, and sets of Ngram/LCS matchings, using different character sets, gram lengths and decision thresholds, are performed sequentially, with the currently unmatched source strings used as input for each round.
===Reconsiling Reconciling Multiple Matches===
In a small number of cases it is possible that the source string will achieve more than one A (Area) or P (Place) match. For example suppose the string "Glouchester Street Cambridge Cambridgeshire" where considered. This could concievably produce two P matches and one A match with the token matching algorithm detailed above.
Anonymous user

Navigation menu