Changes

Ecosystem Organization Classifier (view source)

Revision as of 16:05, 30 March 2019

There are two obvious classification methods for processing the textual descriptions. The first is a "Bag of Words" approach, which uses Term Frequency – Inverse Document Frequency (TF-IDF) to do basic natural language processing and select words or phrases that have discriminant capability. The second is a Word2Vec approach, which uses a shallow two-layer neural network to reduce each description to a vector with high discriminant potential. (See "Memo for Evan" in E:\mcnair\Projects\Incubators for further detail.) We will try both approaches.
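As a rough illustration of the bag-of-words idea, the following sketch computes TF-IDF weights with the Python standard library only and lists the highest-weighted terms per description. It is not the project's code: the sample descriptions, the function names (<code>tokenize</code>, <code>tfidf</code>, <code>top_terms</code>), and the smoothed IDF formula are all assumptions chosen for the example, and a real run would more likely use an established library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf is the term count divided by the document length.
    idf is log((1 + N) / (1 + df)) + 1, one common smoothed variant
    (an assumption here; other variants exist).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty descriptions
        weights.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights


def top_terms(weight, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical descriptions, invented for this example.
descriptions = [
    "We are an incubator that mentors early stage startups",
    "We are a venture capital fund that invests in startups",
    "We are a coworking space that rents desks to freelancers",
]

weights = tfidf(descriptions)
for description, weight in zip(descriptions, weights):
    print(top_terms(weight), "<-", description)
```

Terms that occur in every description (such as "we" and "are") receive the lowest IDF, so terms unique to one description (such as "incubator") rank above them. Those higher-ranked terms are the candidates with discriminant capability.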

====Code built already====

We have previously used bag-of-words in the [[Demo Day Page Google Classifier]] and in early versions of the [[Industry Classifier]]. Later versions of the [[Industry Classifier]] were based on our [[Deep Text Classifier]] project.

====First data====

For the first data, we are going to use organization descriptions from Crunchbase. Run this code on '''crunchbase3''' (see [[Crunchbase Database]]):

==Related Projects==

Ed
