*[https://docs.scrapy.org/en/latest/index.html Scrapy]
: Similar to Beautiful Soup, but with more extensive and efficient functionality. Extracts data from HTML using "selectors" specified by CSS or XPath expressions.
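
To illustrate what selecting elements by an XPath expression looks like, here is a minimal sketch. It uses Python's standard-library ElementTree rather than Scrapy itself (a swap made so the example runs without extra installs); ElementTree supports only a limited XPath subset, whereas Scrapy's selectors accept full XPath and CSS. The markup and all variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a downloaded page.
PAGE = """
<html>
  <body>
    <h1>Accelerators</h1>
    <ul>
      <li class="item"><a href="/a">Alpha</a></li>
      <li class="item"><a href="/b">Beta</a></li>
      <li class="other"><a href="/c">Gamma</a></li>
    </ul>
  </body>
</html>
"""

root = ET.fromstring(PAGE.strip())

# XPath-style expression: every <a> directly inside an <li> whose class is "item".
links = root.findall(".//li[@class='item']/a")

names = [a.text for a in links]          # link text of each match
hrefs = [a.get("href") for a in links]   # href attribute of each match
title = root.findtext(".//h1")           # text of the first <h1>
```

The expression selects by structure and attribute instead of walking the tree by hand, which is the same idea a selector-based scraper relies on.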
 
==== pix2code ====
* [https://github.com/tonybeltramelli/pix2code pix2code]
: GitHub repo that contains the original reference implementation of the pix2code architecture. See the pix2code paper above. [https://www.youtube.com/watch?v=pqKeXkhFA3I&feature=youtu.be Video] demo of the trained neural network.
* [https://github.com/fjbriones/pix2code2 pix2code2]
: An attempt to improve pix2code through the use of autoencoders between the two LSTM layers.
* [https://github.com/emilwallner/Screenshot-to-code Screenshot-to-code]
: Another version of pix2code with a Bootstrap version that converts web page screenshots to HTML, with the potential to generalize to new design mock-ups.
* [https://github.com/andrewsoohwanlee/pix2code-pytorch pix2code PyTorch]
: pix2code implemented in PyTorch; also not ready for general usage yet.
=== DFS Encoding ===
: In conjunction with TensorFlow, Keras will support the deep learning components of the project. It is required for pix2code.
*[https://github.com/ziyan/spider SVM Classifier Training Algorithm]
: From the Yao, Zuo paper, this GitHub repo contains an algorithm for labeling the collected dataset using clustering, training an SVM with the labeled dataset, and using the SVM model to extract content from new webpages. Implemented in JavaScript, CoffeeScript, and Python.
 
* [https://www.h5py.org/ H5PY]
: The h5py package can be used to store large amounts of numerical data, and integrates well with NumPy.