Difference between revisions of "Ecosystem Organization Classifier"

Project
Ecosystem Organization Classifier
Project Information
Has title	Ecosystem Organization Classifier
Has start date
Has deadline date
Has project status	Active
Is dependent on	Crunchbase Database, VentureXpert Database
Does subsume	Defining Incubators, Incubator Seed Data, Incubators in Five Ecosystems
	Copyright © 2019 edegan.com. All Rights Reserved.

Revision as of 14:56, 30 March 2019

Introduction

The purpose of this project is to build a classifier that takes the description of an ecosystem organization (e.g., a startup, a venture capitalist, or an incubator) and either classifies the organization's type or distinguishes incubators from non-incubators.

Text Processing

There are two obvious methods for processing the textual descriptions. The first is a "Bag of Words" approach, which uses Term Frequency – Inverse Document Frequency (TF-IDF) to do basic natural language processing and select words or phrases that have discriminant capabilities. The second is a Word2Vec approach, which uses a shallow two-layer neural network to reduce descriptions to a vector with high discriminant potential. (See "Memo for Evan" in E:\mcnair\Projects\Incubators for further detail.) We will try both approaches.
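
To make the TF-IDF idea concrete, here is a minimal sketch in plain Python. It is not the project's implementation: the three descriptions are invented toy data, the helper names `tokenize` and `tfidf` are hypothetical, and the weighting uses one common unsmoothed variant (term frequency times ln(N / document frequency)); libraries typically apply smoothing and normalization on top of this.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * ln(N / document frequency).
    This is an unsmoothed textbook variant, chosen here for readability.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {
                term: (count / total) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


# Invented toy descriptions, for illustration only.
descriptions = [
    "An incubator offering mentorship and office space to early startups",
    "A venture capital firm investing in early stage startups",
    "A company building payment software for startups",
]

weights = tfidf(descriptions)

# Rank the terms of the first description from most to least distinctive.
ranked = sorted(weights[0].items(), key=lambda item: item[1], reverse=True)
for term, weight in ranked:
    print(f"{term:12s} {weight:.3f}")
```

In this toy corpus "startups" occurs in every description, so its weight is zero and it cannot help separate the classes, while "incubator" occurs in only one description and outranks "early", which occurs in two. The per-document weights would then serve as features for whichever classifier is chosen.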

Related Projects

Subsumed Projects: Defining Incubators, Incubator Seed Data, Incubators in Five Ecosystems

This project is dependent on: Crunchbase Database, VentureXpert Database

@@ Line 12: / Line 12: @@
 ===Text Processing===
-There are two possible classification methods for the processing the text of target HTML pages. The first is a "Bag of Words" approach, which uses Term Frequency – Inverse Document Frequency to do basic natural language processing and select words or phrases which have discriminant capabilities. The second is a Word2Vec approach which uses shallow 2 layer neural networks to reduce descriptions to a vector with high discriminant potential. (See "Memo for Evan" in E:\mcnair\Projects\Incubators for further detail.)
+There are two obvious classification methods for the processing the textual descriptions. The first is a "Bag of Words" approach, which uses Term Frequency – Inverse Document Frequency (TF-IDF) to do basic natural language processing and select words or phrases which have discriminant capabilities. The second is a Word2Vec approach which uses a shallow 2 layer neural network to reduce descriptions to a vector with high discriminant potential. (See "Memo for Evan" in E:\mcnair\Projects\Incubators for further detail.) We are going to be trying both approaches.
 ==Related Projects==
