Accelerator Demo Day

From edegan.com
Revision as of 15:21, 18 July 2018 by Leminh.ams (talk | contribs)


McNair Project
Accelerator Demo Day
Project Information
Project Title Accelerator Demo Day
Owner Minh Le
Start Date 06/18/2018
Deadline
Primary Billing
Notes
Has project status Active
Subsumes: Demo Day Page Parser, Demo Day Page Google Classifier
Copyright © 2016 edegan.com. All Rights Reserved.


Project

This project uses Selenium and machine learning to find good candidate web pages and to classify whether each page is a demo day page containing a list of cohort companies. The classifier currently uses scikit-learn's random forest model with a bag-of-words approach.

Code Location

The source code and relevant files for the project can be found here:

E:\McNair\Projects\Accelerator Demo Day\

Development Notes

The Crawler Functionality

To be updated

The Classifier

Input (Features)

The input (features) is currently the frequency with which each of X_NUMBER words appears in each document. The words are hand-selected.
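
A minimal sketch of this setup, assuming scikit-learn's CountVectorizer with a fixed vocabulary feeding a RandomForestClassifier. The keyword list, the toy documents, and the labels are hypothetical placeholders; the project's actual word list and training data are not given on this page.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical hand-selected words; the project's real list is not shown here.
KEYWORDS = ["accelerator", "cohort", "demo", "startups", "pricing", "login"]

# Toy stand-ins for crawled page text (1 = demo day page, 0 = other page).
train_docs = [
    "Demo day: meet the startups in our accelerator cohort",
    "Our latest cohort of startups pitched at demo day",
    "Login to view pricing for your account",
    "Pricing plans and login help",
]
train_labels = [1, 1, 0, 0]

# A fixed vocabulary means only the hand-selected words are counted,
# and the feature columns follow the order of KEYWORDS.
vectorizer = CountVectorizer(vocabulary=KEYWORDS)
X_train = vectorizer.transform(train_docs)

# random_state is fixed only to make this sketch reproducible.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, train_labels)

# Classify an unseen page using the same word-count features.
new_docs = ["Meet the cohort presenting at demo day"]
X_new = vectorizer.transform(new_docs)
predictions = clf.predict(X_new)
```

Fixing the vocabulary keeps the feature columns identical between training and prediction, which is what a hand-selected word list requires.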

Idea: Create a matrix in which the first column is the file BiBTex and the following columns are the words, so that the value at (file, word) is the frequency of that word in the file. Then split the matrix into an array of row vectors and feed each vector into the RNN.
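
The matrix idea above could be sketched as follows, using only the Python standard library. The function names, the file identifiers, and the word list are hypothetical, and the RNN itself is not implemented here; the sketch only produces the row vectors that would be fed to it.

```python
import re
from collections import Counter


def build_frequency_matrix(documents, words):
    """Return one row per file: [identifier, count(word_1), ..., count(word_n)]."""
    matrix = []
    for identifier, text in documents.items():
        # Lowercase and split on non-alphanumeric characters.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts = Counter(tokens)
        matrix.append([identifier] + [counts[w] for w in words])
    return matrix


def to_row_vectors(matrix):
    """Drop the identifier column so each row is a plain numeric vector."""
    return [row[1:] for row in matrix]


# Hypothetical word list and documents for illustration only.
WORDS = ["cohort", "demo", "startups"]
documents = {
    "page_a.txt": "Demo Day 2018: our cohort of startups. Demo videos below.",
    "page_b.txt": "Contact us about pricing.",
}

matrix = build_frequency_matrix(documents, WORDS)
vectors = to_row_vectors(matrix)
```

Keeping the identifier in the first column makes it possible to trace each vector back to its file, while the stripped row vectors are the numeric inputs a model would consume.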

Reading resources

http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf