7/13 - Pulled descriptions and industry tags from Crunchbase and matched them to the UUIDs we already have. The resulting tables are in the Industry Classifier update folder of Accelerators\Summer 2018. This morning, I read Christy's and Yang's wiki project pages. To start, I tried to figure out how best to build a new coding system for the industry flags given in Crunchbase. Looking at the old code in FinalIndustryClassifier_command.py, I can't find the file that builds the neural net. There are many versions floating around, and I'm not sure which is the correct one.
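A minimal sketch of the UUID matching step, using only the Python standard library. The column names (uuid, description, category_list) and the sample rows are assumptions made for illustration; they are not taken from the actual tables in the update folder.

```python
import csv
import io

# Hypothetical stand-in for the UUIDs we already have.
KNOWN_UUIDS = ["a1", "b2", "c3"]

# Hypothetical stand-in for a Crunchbase export; column names are assumptions.
CRUNCHBASE_CSV = """uuid,description,category_list
a1,Cloud backup for small businesses,"Software,Cloud Storage"
c3,Wearable heart-rate monitor,"Health Care,Hardware"
d4,Online grocery delivery,"E-Commerce,Food Delivery"
"""


def match_by_uuid(known_uuids, crunchbase_file):
    """Return (rows matched by UUID, list of known UUIDs with no row)."""
    rows = {row["uuid"]: row for row in csv.DictReader(crunchbase_file)}
    matched = {u: rows[u] for u in known_uuids if u in rows}
    missing = [u for u in known_uuids if u not in rows]
    return matched, missing


matched, missing = match_by_uuid(KNOWN_UUIDS, io.StringIO(CRUNCHBASE_CSV))
for uuid, row in matched.items():
    print(uuid, "|", row["category_list"], "|", row["description"])
print("No Crunchbase row for:", missing)
```

Keeping the list of unmatched UUIDs makes it easy to see how many companies would be left without a description or industry tag.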
 
---------------------------------------------------------------------------------------------
7/16 - Figured out which file rewrites Classifier.pkl and how all the code and test files fit together. I built a small training and test data set to work with, and I got IndustryClassifierCOPY.py to run on my data. I had to fix many index and key errors in the code, which is not commented at all. With 10 industry categories and 970 training data points, I think the accuracy rate is around 30%. I tried to run the code on a bigger training data set, hoping the accuracy rate would go up, but I got error messages back.
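To pin down the accuracy rate rather than estimate it, it may help to compute it on the test set and compare it with a simple baseline. The sketch below is not the code in IndustryClassifierCOPY.py; the labels are made-up examples, and the baseline (always predicting the most common training category) is an assumption about what a useful comparison would be. With 10 evenly sized categories, random guessing would be right about 10% of the time.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that equal the true label."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def majority_baseline(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return accuracy(test_labels, [most_common_label] * len(test_labels))


# Made-up labels for illustration only.
train_labels = ["Software"] * 5 + ["Health"] * 3 + ["Energy"] * 2
test_true = ["Software", "Health", "Software", "Energy", "Health"]
test_pred = ["Software", "Software", "Software", "Energy", "Energy"]

model_acc = accuracy(test_true, test_pred)
baseline_acc = majority_baseline(train_labels, test_true)
print(f"model accuracy: {model_acc:.2f}, majority baseline: {baseline_acc:.2f}")
```

If the classifier does not clearly beat the majority baseline, more training data or longer descriptions are worth trying before changing the model.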
7/17 - Tried to run test data through FinalIndustryClassifier.py, but it doesn't work, even though the same file works with IndustryClassifierCOPY.py. The Crunchbase descriptions are longer than the old ones from VentureXpert, and I think the accuracy rate may go up if I give the model more data and use these longer descriptions. I talked with Wei, and we worked through the details of sklearn and the code together. See [[Industry Classifier]] for the exact files I've been using.
 
7/18 - Organized the code into a new file called IndusryClassifierTEST.py. I cut about half of the original file, around 100 lines of repetitive or unused code, and confirmed that the new file gives the same results as the old one.
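One way to confirm that a trimmed file behaves like the original is to run both versions on the same inputs and list any differences. The sketch below shows the idea with two toy functions; classify_old and classify_new are hypothetical stand-ins, not the functions in the actual files. For a trained model, both versions would presumably also need to load the same Classifier.pkl (or train with the same random seed) for the comparison to be meaningful.

```python
def classify_old(description):
    """Hypothetical stand-in for the original code, with a repeated branch."""
    text = description.lower()
    if "software" in text:
        return "Software"
    if "software" in text:  # duplicate branch, never reached
        return "Software"
    if "clinic" in text:
        return "Health"
    return "Other"


def classify_new(description):
    """Hypothetical stand-in for the cleaned-up code."""
    text = description.lower()
    if "software" in text:
        return "Software"
    if "clinic" in text:
        return "Health"
    return "Other"


def find_mismatches(inputs, old_fn, new_fn):
    """Return (index, old_output, new_output) for every input that differs."""
    mismatches = []
    for i, item in enumerate(inputs):
        old_out, new_out = old_fn(item), new_fn(item)
        if old_out != new_out:
            mismatches.append((i, old_out, new_out))
    return mismatches


samples = [
    "Accounting software for freelancers",
    "Walk-in clinic network",
    "Solar panel installer",
]
mismatches = find_mismatches(samples, classify_old, classify_new)
print("identical results" if not mismatches else mismatches)
```

An empty mismatch list on a representative test set is good evidence that the removed lines were unused, though it only covers the inputs that were tried.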