Changes

<onlyinclude>
[[Kyran Adams]] [[Work Logs]] [[Kyran Adams (Work Log)|(log page)]]
 
2018-04-23: So auto-generated features actually reduce accuracy, probably because there isn't enough data. I've gone back to my hand-picked features and I'm just focusing on making the dataset larger.
2018-04-16: Still working on the auto-generated features. It takes forever. :/ I reduced the number of words looked at to about 3000. This makes it a lot faster, and it seems like it should still be accurate, because the most frequent words are words like "demo" and "accelerator". I also switched from Beautiful Soup to [https://github.com/aaronsw/html2text html2text] for text extraction. I might consider using [https://nlp.stanford.edu/IR-book/html/htmledition/sublinear-tf-scaling-1.html sublinear tf scaling] (a parameter in the tf model).
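A rough sketch of the two ideas in the 2018-04-16 entry, capping the vocabulary at the most frequent words and weighting term counts as 1 + log(tf). This is not the actual project code: the function names, the toy documents, and the tiny <code>max_words</code> value are made up for illustration, and it uses only the Python standard library rather than whatever model the classifier really runs on.

```python
import math
from collections import Counter


def build_vocabulary(documents, max_words=3000):
    """Return the max_words most frequent tokens across all documents."""
    counts = Counter()
    for tokens in documents:
        counts.update(tokens)
    return [word for word, _ in counts.most_common(max_words)]


def sublinear_tf_vector(tokens, vocabulary):
    """Map one tokenized document to sublinear tf weights.

    Each vocabulary word gets 1 + log(tf) if it occurs in the document,
    and 0 otherwise, so repeated words count for less than raw frequency.
    """
    tf = Counter(tokens)
    return [1.0 + math.log(tf[w]) if tf[w] > 0 else 0.0 for w in vocabulary]


# Hypothetical toy input: already-tokenized page text.
docs = [
    "demo day demo accelerator".split(),
    "accelerator program demo".split(),
]

# max_words=2 only to keep the example small; the log entry uses about 3000.
vocab = build_vocabulary(docs, max_words=2)
vectors = [sublinear_tf_vector(d, vocab) for d in docs]
```

With the cap in place every document becomes a fixed-length vector, which is where the speedup comes from. The log damping means a page that says "demo" twenty times is not treated as twenty times more relevant than a page that says it once.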