Changes

[[Kyran Adams]] [[Work Logs]] [[Kyran Adams (Work Log)|(log page)]]
2018-04-16: Still working through using auto-generated features. It takes forever. :/ I reduced the number of words looked at to about 3000. This makes it a lot faster, and it seems like it should still be accurate, because the most frequent words are words like "demo" and "accelerator". I also switched from using Beautiful Soup for text extraction to [https://github.com/aaronsw/html2text html2text].
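A minimal sketch of the vocabulary cap described in this entry, assuming plain-text pages and a simple regex tokenizer. The helper names (<code>tokenize</code>, <code>build_vocabulary</code>, <code>vectorize</code>) and the sample pages are hypothetical and are not the project's actual code. If scikit-learn is used for this step, its <code>CountVectorizer</code> has a <code>max_features</code> parameter that applies the same kind of limit.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents, max_words=3000):
    """Return the max_words most frequent tokens across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [word for word, _ in counts.most_common(max_words)]


def vectorize(text, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Hypothetical sample pages, for illustration only.
pages = [
    "Our accelerator hosts a demo day every spring.",
    "Apply to the accelerator before demo day.",
    "Contact us about office space.",
]

# A tiny cap (3 instead of 3000) so the effect is visible on toy data.
vocab = build_vocabulary(pages, max_words=3)
features = [vectorize(page, vocab) for page in pages]
```

Capping the vocabulary keeps each feature vector at a fixed, small length, which is where the speedup comes from; words that are rare across the corpus are simply dropped.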
2018-04-16: I think I'm going to transition from using hand-picked feature words to automatically generated features. [http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html This webpage] has a good example. I could also use n-grams instead of unigrams. I might also consider using an SVM instead of a random forest, or a combination of the two.
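To illustrate the n-gram idea from this entry, here is a small sketch that produces unigrams and bigrams from one string. The function names and the sample text are hypothetical, and the tokenizer is an assumed simple regex. In scikit-learn, the <code>ngram_range</code> parameter of <code>CountVectorizer</code> controls the same thing.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n_min=1, n_max=2):
    """Return every n-gram with n_min <= n <= n_max, joined by spaces."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


# Unigrams come first, then bigrams.
example = ngrams(tokenize("Startup demo day"))
```

Bigrams such as "demo day" keep a phrase together that unigram counts would split into two unrelated features, at the cost of a larger vocabulary.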