Extracting features from surnames entails encoding the frequency of [http://en.wikipedia.org/wiki/Ngram n-grams] together with other features such as string length. Recall that 1-grams (unigrams) are single letters or characters, 2-grams are called bigrams or digraphs, and 3-grams are called trigrams. In some applications, entire words, sentences, or other tokens serve as the grams.
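
The encoding described above can be sketched as follows. This is a minimal illustration, not the method used by any particular system: the function names <code>char_ngrams</code> and <code>surname_features</code>, the feature-key format, the lowercasing step, and the example surname are all assumptions made for the example.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return every contiguous character n-gram of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def surname_features(surname, max_n=3):
    """Encode a surname as n-gram counts (n = 1..max_n) plus its length.

    Lowercasing and the "<n>gram:<chars>" key format are illustrative
    choices, not something the text prescribes.
    """
    name = surname.lower()
    features = Counter()
    for n in range(1, max_n + 1):
        for gram in char_ngrams(name, n):
            features[f"{n}gram:{gram}"] += 1
    features["length"] = len(name)
    return dict(features)


# Hypothetical example surname.
features = surname_features("Hanna")
for key in sorted(features):
    print(key, features[key])
```

Each surname becomes a sparse mapping from feature name to count, so surnames of different lengths can still be compared on a shared set of feature names.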
==Assumption of Independence of Features==