Difference between revisions of "Listing Page Plugin Spec"

Project
Listing Page Plugin Spec
Project Information
Has title	Listing Page Plugin Spec
Has owner	Rex Bone
Has start date
Has deadline date
Has project status	Active
	Copyright © 2019 edegan.com. All Rights Reserved.

Revision as of 15:24, 1 April 2019

Plugin Overview

Incubator and accelerator websites follow no common standard, which raises the question of whether extracting information from them can feasibly be automated. A browser plugin guided by user input could serve as a first step towards fully automating the process. See LP_Extractor_Protocol for a comprehensive introduction to potential methods.

This design aims to create a tool that quickly identifies the HTML markup on a webpage and reduces it to a DSL for data extraction. Several input options will be considered: letting the user visually 'draw' a grid, either by dragging or by marking vertices, and mouse-over selection. Both the viable technical resources and usability will be evaluated.

Current List of sites to examine: Accelerator List

(E:\projects\Kauffman Incubator Project\02 Identify the client listing page\Listing Page Classifier)

Sample Webpage:

Image taken from 500kobe.com

Technical Specifications

HTML Layout Variations

HTML tree structure differs by site and web developer preference. A look at examples of accelerator websites reveals the following methods of organizing company data:

Each element with the "views-row" class represents a starting extraction point

div tag, class attribute

In certain cases, the page's existing style conventions (such as a shared class name) may be used to ease extraction.

If the user highlights one section, its label can be extracted and then used to locate the remaining start-ups.
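
The label-based lookup described above can be sketched in Python using only the standard library. This is a minimal illustration, not the plugin's implementation: the names `ClassBlockCollector`, `extract_rows` and `SAMPLE` are hypothetical, the sample markup is invented to resemble the "views-row" layout, and the sketch assumes the label taken from the user's highlighted section is a single class name.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassBlockCollector(HTMLParser):
    """Collect the visible text of every element carrying a given class."""

    def __init__(self, label):
        super().__init__()
        self.label = label   # class name taken from the highlighted section
        self.depth = 0       # nesting depth inside the current matching block
        self.blocks = []     # one list of text fragments per matching block

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            # Already inside a matching block: just track nesting.
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.label in classes:
            self.depth = 1
            self.blocks.append([])

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.blocks[-1].append(data.strip())


def extract_rows(html, label):
    """Return the text of each element whose class list contains `label`."""
    parser = ClassBlockCollector(label)
    parser.feed(html)
    parser.close()
    return [" ".join(parts) for parts in parser.blocks]


# Invented sample markup (assumption), modelled on the "views-row" example.
SAMPLE = """
<div class="view-content">
  <div class="views-row views-row-1"><h3>Acme Robotics</h3><p>Houston, TX</p></div>
  <div class="views-row views-row-2"><h3>Blue Kite<br>Labs</h3></div>
  <div class="footer">Contact us</div>
</div>
"""

if __name__ == "__main__":
    for row in extract_rows(SAMPLE, "views-row"):
        print(row)
```

The sketch matches on one class name only; sites that mark rows by tag position or by other attributes would need a different label.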

Browser Choice

Firefox
Chrome
Internet Explorer

Programming Language & Frameworks

Python
Node.js

User Input Styles

Drag + Drop
Marking Vertices
Mouse-Over
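
One way to compare the three input styles is to reduce each of them to a geometric test against element bounding boxes. The Python sketch below assumes the plugin can already obtain a bounding box for each candidate element; the `Box` type, the function names and the element names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def area(self):
        return (self.right - self.left) * (self.bottom - self.top)

    def contains(self, other):
        return (self.left <= other.left and self.top <= other.top
                and other.right <= self.right and other.bottom <= self.bottom)

    def contains_point(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def box_from_points(points):
    """Bounding box of a drag (start and end point) or of marked vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Box(min(xs), min(ys), max(xs), max(ys))


def select_by_region(elements, region):
    """Drag or vertex input: names of elements lying fully inside the region."""
    return [name for name, box in elements.items() if region.contains(box)]


def select_by_hover(elements, x, y):
    """Mouse-over input: the smallest element under the pointer, or None."""
    hits = [(box.area(), name) for name, box in elements.items()
            if box.contains_point(x, y)]
    return min(hits)[1] if hits else None


if __name__ == "__main__":
    # Hypothetical element boxes (assumption), as a plugin might report them.
    elements = {
        "page": Box(0, 0, 500, 400),
        "row-1": Box(10, 10, 200, 60),
        "row-2": Box(10, 70, 200, 120),
        "sidebar": Box(300, 10, 400, 300),
    }
    print(select_by_region(elements, box_from_points([(5, 5), (250, 130)])))
    print(select_by_hover(elements, 50, 80))
```

Dragging and vertex marking both collapse to the same region test here, so the choice between them is mainly a usability question rather than a technical one.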

Current Problems

"Infinite Scroll" webpages: It may be impossible to account for incubator websites that display company lists in an infinite-scroll style, because new entries load only as the user scrolls. Handling them would require multiple instances of user input.
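
If multiple rounds of user input are accepted, the results of each round would need to be combined without double-counting companies that appear in more than one capture. The Python sketch below shows one possible merge step; `merge_captures` is a hypothetical helper, and treating rows as equal when they match after whitespace and case normalization is an assumption, not a rule from this spec.

```python
def merge_captures(captures):
    """Merge row lists from successive captures, keeping first-seen order.

    Rows are treated as duplicates when they match after collapsing
    whitespace and ignoring case (an assumption made for this sketch).
    """
    seen = set()
    merged = []
    for batch in captures:
        for row in batch:
            key = " ".join(row.split()).casefold()
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged


if __name__ == "__main__":
    # Two hypothetical captures taken before and after the user scrolls.
    first = ["Acme Robotics", "Blue Kite Labs"]
    second = ["Blue  Kite labs", "Cobalt Health"]
    print(merge_captures([first, second]))
```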

@@ Line 21: / Line 21: @@
 ==Technical Specifications==
+===HTML Layout Variations===
+HTML tree structure differs by site and web developer preference. A look at examples of accelerator websites reveals the following methods of organizing company data:
+[[File:Divclassexample.PNG|thumb|Each "views-row" tag represents a starting extraction point]]
+# div tag, class parameter
+In certain cases, the existing style guide may be utilized for ease of extraction.
+If the user highlights one section, the label can be extracted and then used to locate the remaining start-ups.
+===Browser Choice===
+*Firefox
+*Chrome
+*Internet Explorer
+===Programming Language & Frameworks===
+*Python
+*Node.js
 ===User Input Styles===
@@ Line 26: / Line 44: @@
 * Marking Vertices
 * Mouse-Over
-===Browser Considerations===
-*Firefox
-*Chrome
-*Version Control
-===Language Considerations===
 ===Current Problems===
 * "Infinite Scroll" webpages: Potentially impossible to account for incubator websites which display company lists in an infinite scroll style. Would require multiple instances of user input.
