[[Listing Page Classifier Progress|Progress Log (updated on 4/15/2019)]]
====Site Map Generator====
The site map generator finds all internal links of a website from two user inputs: the homepage URL and the depth to search.
[[File:WebPageTree.png|900px]]
We treat each internal page as a tree node, and each node can have multiple children. Taking the above picture as an example, the homepage is the first tree node, which is given as an input to our function, and this node has 4 children: page 1, page 2, page 3, and page 4. Given this intuition, we have built the following two algorithms to find all internal links of a website from the two user inputs (homepage URL and depth):
* Breadth-First Search (BFS) approach: we examine all pages (nodes) at the same depth before moving down to the next depth.
E:\projects\listing page identifier\Internal_Link\Internal_url_BFS.py
* Depth-First Search (DFS) approach: we visit a page (node) and then all of its children along the current path before examining the page's neighboring nodes. For example, assuming the furthest depth a user wants to go is 2, we start with the homepage, examine its first child, page 1, and then visit page 1's children before moving on to page 2.
E:\projects\listing page identifier\Internal_Link\Internal_url_DFS.py
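The two traversal orders can be illustrated with a minimal sketch. This is not the code in the scripts listed above; it assumes a hypothetical in-memory dictionary <code>SITE</code> and a placeholder function <code>get_internal_links</code> in place of actually downloading pages and extracting their links, so that only the order of visits is shown. The names <code>crawl_bfs</code> and <code>crawl_dfs</code> are also illustrative.

```python
from collections import deque

# Hypothetical site structure standing in for real page fetches:
# each key is a page, each value lists the internal links found on it.
SITE = {
    "home": ["page1", "page2", "page3", "page4"],
    "page1": ["page1a", "page1b"],
    "page2": ["page2a", "home"],   # links back to the homepage
    "page1a": ["deep"],            # "deep" sits at depth 3
}


def get_internal_links(url):
    """Placeholder for fetching `url` and extracting its internal links."""
    return SITE.get(url, [])


def crawl_bfs(homepage, max_depth):
    """Visit every page at one depth before moving to the next depth."""
    visited = [homepage]
    seen = {homepage}
    queue = deque([(homepage, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand pages at the depth limit
        for link in get_internal_links(url):
            if link not in seen:
                seen.add(link)
                visited.append(link)
                queue.append((link, depth + 1))
    return visited


def crawl_dfs(homepage, max_depth):
    """Follow one path down to the depth limit before trying siblings."""
    visited = []
    seen = set()

    def visit(url, depth):
        seen.add(url)
        visited.append(url)
        if depth == max_depth:
            return  # do not expand pages at the depth limit
        for link in get_internal_links(url):
            if link not in seen:
                visit(link, depth + 1)

    visit(homepage, 0)
    return visited


if __name__ == "__main__":
    print(crawl_bfs("home", 2))
    print(crawl_dfs("home", 2))
```

With a depth of 2, both functions skip the page at depth 3 and ignore the link back to the homepage. On this sample site they differ only in the order of visits: BFS lists page 1 through page 4 before any of their children, while DFS lists page 1's children immediately after page 1.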