Google Scholar Crawler

Revision as of 18:37, 28 November 2017

==Code Written for McNair==

===downloadPDFs.py===

====Overview====
This code is located at E:\McNair\Software\Google_Scholar_Crawler\downloadPDFs.py. The program takes a key term to search for and the number of result pages to search, and it collects information about the papers in those results. It depends on Selenium because Google Scholar blocks traditional crawling. It runs somewhat slowly to avoid being blocked by the website.

====How to Use====
Before you run the program, create the directory where you want all the results to go. Inside this directory, create a folder called "BibTeX". For example, I could make a folder called "My_Crawl" in E:\McNair\Projects\Patent_Thickets. Inside My_Crawl, I should make sure I have a "BibTeX" folder. You should also choose a search term and how many pages you want to search.

Open the program downloadPDFs.py in Komodo. At the very end of the program, type:

''main(your query, your output directory, your num pages)''

Replace "your query" with the search term you want (like "patent thickets", making sure to include the quotes around the term). Replace "your output directory" with the directory you want the files to go to. Using my example above, I would type "E:\McNair\Projects\Patent_Thickets\My_Crawl", making sure to include the quotes around the directory. Finally, replace "your num pages" with the number of pages you want to search.

Click the play button in the top center of the screen.
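The folder setup described above can also be done from Python instead of by hand. The following is only a minimal sketch: the helper name prepare_output_dir is hypothetical and is not part of downloadPDFs.py, and the page count of 3 in the commented call is just an illustrative value.

```python
import os


def prepare_output_dir(output_dir):
    """Create output_dir and its "BibTeX" subfolder if they are missing.

    Hypothetical helper, not part of downloadPDFs.py. Returns the path
    of the BibTeX folder.
    """
    bibtex_dir = os.path.join(output_dir, "BibTeX")
    # exist_ok=True makes it safe to call again on an existing crawl folder.
    os.makedirs(bibtex_dir, exist_ok=True)
    return bibtex_dir


# Example call, left commented out because it needs the E: drive and
# downloadPDFs.py's own main(). The arguments follow the order given in
# the instructions above; the value 3 is only an illustration.
#
# prepare_output_dir(r"E:\McNair\Projects\Patent_Thickets\My_Crawl")
# main("patent thickets", r"E:\McNair\Projects\Patent_Thickets\My_Crawl", 3)
```

Writing the directory as a raw string (the r prefix before the opening quote) keeps Python from reading the backslashes in a Windows path as escape sequences.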

ChristyW
