====Using Python files====
'''To use STEP1_crawl.py''':
INPUT: a list of company names (or any other search terms) whose websites you want to find by searching Google
OUTPUT: a list of the company names, each with its top X Google results
1. Set LIST_FILEPATH (line 26) to the name of the file that contains the list of terms you want to search.
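The input and output described above can be sketched as follows. This is only an illustration, not the contents of STEP1_crawl.py: the one-term-per-line file format, the example file name, and the helper names parse_terms, read_search_terms and collect_results are all assumptions, and the search function is a placeholder for whatever Google lookup the script actually performs.

```python
# Hypothetical sketch of the STEP1 data flow; not the real STEP1_crawl.py.
LIST_FILEPATH = "companies.txt"  # assumed example name; set to your own list file


def parse_terms(lines):
    """Strip whitespace and drop blank lines (assumes one search term per line)."""
    return [line.strip() for line in lines if line.strip()]


def read_search_terms(list_filepath):
    """Read the search terms from the list file."""
    with open(list_filepath, encoding="utf-8") as handle:
        return parse_terms(handle)


def collect_results(terms, search, top_x):
    """Pair each term with at most top_x results.

    `search` is a placeholder: any callable that takes a term and
    returns a list of result URLs.
    """
    return [(term, search(term)[:top_x]) for term in terms]
```

With a search function in hand, collect_results(read_search_terms(LIST_FILEPATH), search, 5) would yield one (name, results) pair per line of the list file.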
'''To use STEP2_findcorrecturl.py''':
INPUT: output file from STEP1
OUTPUT: a file in the same format as the STEP1 output, except that any URL whose match score does not exceed the threshold you set is replaced with "no match"
1. Set f to the name of the STEP1 output file, and set g to the desired name of the output file for this step.
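One way the threshold filtering could work is sketched below. The scoring method is an assumption: this example compares the company name with the main label of each URL's host using difflib.SequenceMatcher, which may differ from what STEP2_findcorrecturl.py does. THRESHOLD, domain_label, similarity and filter_urls are hypothetical names, and URLs are assumed to include a scheme such as https://.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse

THRESHOLD = 0.6  # hypothetical value; tune it for your own data


def domain_label(url):
    """Return the first label of the host, ignoring a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host.split(".")[0]


def similarity(name, url):
    """Score in [0, 1] comparing the cleaned name with the host label."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum())
    return SequenceMatcher(None, cleaned, domain_label(url)).ratio()


def filter_urls(name, urls, threshold=THRESHOLD):
    """Keep URLs scoring over the threshold; replace the rest with 'no match'."""
    return [url if similarity(name, url) > threshold else "no match" for url in urls]
```

Raising the threshold makes the filter stricter, so more URLs are replaced with "no match"; lowering it keeps looser matches.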