2 bytes removed ,  12:33, 6 October 2020
|Has project status=Proposed
}}
<onlyinclude>[[The Matcher (Tool)]] matches and merges datasets using company names as identifiers. It is written in Perl and implements both name normalization and fuzzy matching. The normalization methods include 'Hall' and others used by the [[NBER Patent Data]] project. The fuzzy matching supports a range of techniques (N-gram, LCS, etc.) that can generate candidate lists for human processing or machine learning, or apply threshold-based cut-offs.</onlyinclude>
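
The general idea of normalizing company names, scoring pairs by N-gram overlap, and keeping candidates above a threshold can be illustrated with a minimal sketch. It is written in Python for brevity and is not The Matcher's Perl code. The suffix list, function names, padding scheme, Jaccard scoring, and the 0.5 threshold are all assumptions made for illustration; in particular, the suffix list is not the 'Hall' rule set.

```python
import re

# Hypothetical list of legal-form suffixes, for illustration only.
# This is NOT the 'Hall' normalization used by the NBER Patent Data project.
SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co", "company", "ltd", "llc"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, drop trailing legal-form suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def ngrams(text, n=3):
    """Return the set of character n-grams of text, padded with one space per side."""
    padded = f" {text} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard similarity between the n-gram sets of two normalized names."""
    ga, gb = ngrams(normalize(a), n), ngrams(normalize(b), n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def candidates(name, pool, threshold=0.5):
    """Return (score, other_name) pairs at or above threshold, best score first."""
    scored = [(similarity(name, other), other) for other in pool]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)
```

Calling `candidates("Acme Widget Co", ["Acme Widgets, Inc.", "Zenith Radio Corp"])` keeps only the first name in the pool. Lowering the threshold yields a longer candidate list suited to human review, while raising it behaves more like a strict cut-off.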
==Code==
