Changes

LinkedIn Crawler (Python) (view source)

Revision as of 11:59, 5 April 2017

E:\McNair\Projects\LinkedIn Crawler

The code from the original Summer 2016 Project can be found in:

web_crawler\linkedin

The next section provides details on the construction and functionality of the scripts located in the linkedin directory.

The old documentation said that the programs/scripts (see details below) are located on our [[Software Repository|Bonobo Git Server]].

branch: researcher/linkedin

directory: /linkedin

==Accounts==

pass: McNair2017

Real Account:

email: ed.edgan@rice.edu

pass: This area has intentionally been left blank.

=LinkedIn Scripts=

==Overview==

This section provides a file-by-file breakdown of the contents of the folder located at:

E:\McNair\Projects\LinkedIn Crawler\web_crawler\linkedin

The main script to run is:

run_linkedin_crawler.py

==crawlererror.py==

This script defines a simple class for error messages. Other scripts use it to raise errors to the user when something goes wrong with the crawler.
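As an illustration only, an error class of this kind might look like the sketch below. It assumes the class subclasses Python's built-in `Exception`. The names `CrawlerError`, `message`, and `require_login` are hypothetical and are not taken from crawlererror.py.

```python
class CrawlerError(Exception):
    """Hypothetical error type raised when the crawler hits a problem."""

    def __init__(self, message):
        # Pass the message to Exception so str(error) returns it.
        super().__init__(message)
        self.message = message


def require_login(logged_in):
    # Hypothetical helper showing how another script might raise the error.
    if not logged_in:
        raise CrawlerError("Crawler is not logged in to LinkedIn.")
    return True
```

A calling script could wrap crawler steps in `try` / `except CrawlerError` and print the message for the user.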

==linked_in_crawler.py==

This script defines a class that provides navigation around the traditional LinkedIn site. It begins with a list of global XPaths that Selenium uses throughout the process to locate elements within the HTML. The following are some important functions to keep in mind when designing original programs using this code.
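To show how an XPath picks out an element, here is a small sketch that uses only the Python standard library. The XPath constants and the tiny login page are hypothetical. Selenium evaluates full XPath against the live page, whereas `xml.etree.ElementTree` supports only a subset, so this illustrates the idea rather than the crawler's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical XPath constants; the real script defines its own.
USERNAME_XPATH = ".//input[@id='username']"
PASSWORD_XPATH = ".//input[@id='password']"

# A tiny, well-formed stand-in for a login page.
PAGE = """
<html><body><form>
  <input id="username" type="text" />
  <input id="password" type="password" />
</form></body></html>
""".strip()

root = ET.fromstring(PAGE)

# Each XPath selects the <input> element whose id attribute matches.
username_field = root.find(USERNAME_XPATH)
password_field = root.find(PASSWORD_XPATH)

print(username_field.get("type"))  # text
print(password_field.get("type"))  # password
```

Keeping the XPaths as named constants at the top of the file means that a change in the site's HTML only needs to be fixed in one place.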

===login(self, username, password)===

This function takes a username and password, and logs in to LinkedIn.
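A login step of this shape could be sketched as follows. This is not the code from linked_in_crawler.py. The class name, the XPath values, and `LOGIN_URL` are all assumptions made for illustration. The sketch accepts any object that behaves like a Selenium WebDriver, so it does not import Selenium itself.

```python
# Hypothetical values; the real XPaths live at the top of linked_in_crawler.py.
LOGIN_URL = "https://www.linkedin.com/login"
USERNAME_XPATH = "//input[@id='username']"
PASSWORD_XPATH = "//input[@id='password']"
SUBMIT_XPATH = "//button[@type='submit']"


class LinkedInCrawlerSketch:
    """Hypothetical stand-in for the crawler class."""

    def __init__(self, driver):
        # driver is expected to behave like a Selenium WebDriver.
        self.driver = driver

    def login(self, username, password):
        # Open the login page, fill in both fields, then submit the form.
        self.driver.get(LOGIN_URL)
        # "xpath" is the string value of Selenium's By.XPATH locator.
        self.driver.find_element("xpath", USERNAME_XPATH).send_keys(username)
        self.driver.find_element("xpath", PASSWORD_XPATH).send_keys(password)
        self.driver.find_element("xpath", SUBMIT_XPATH).click()
```

With Selenium installed, the class could be given a real browser driver such as `webdriver.Firefox()`. Passing the driver in also makes it possible to test the method with a fake driver.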

==Functionality==

Peterjalbert

