This Python script takes a text file of company names and uses the Crunchbase Snapshot to look up the founders of each company. If Crunchbase has no record of a company's founders, a generic LinkedIn search is unlikely to return useful results. The script writes a new text file in which each company name is replaced with "CompanyName Founder FounderName", with one entry per founder listed in the Crunchbase Snapshot. This new text file can be used directly with the LinkedIn Crawler to generate more accurate search results and retrieve the corresponding HTML pages.
 
The functions in the format_founders.py script are described below.
 
===create_pickle()===
This function creates a pickled Python dictionary from people.csv, a file in the Crunchbase Snapshot. To use a different dataset in the future, pickle a dictionary with the same structure as the one this function produces, then use that pickled result in the next function to reformat your queries.
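
The pickling step might look roughly like the sketch below. It is not the code from format_founders.py: the function name <code>build_founder_index</code>, the column names (<code>affiliation_name</code>, <code>first_name</code>, <code>last_name</code>) and the choice to key the dictionary by lower-cased company name are all assumptions made for illustration. It also assumes every row in the CSV is a founder, so any filtering the real script does is not shown.

```python
import csv
import pickle
from collections import defaultdict


def build_founder_index(csv_path, pickle_path,
                        company_col="affiliation_name",
                        first_col="first_name",
                        last_col="last_name"):
    """Group people by company and pickle the resulting dictionary.

    The default column names are assumptions; adjust them to match
    the header row of the dataset actually being used.
    """
    founders = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            company = (row.get(company_col) or "").strip()
            # Join whichever name parts are present into one string.
            name = " ".join(
                part.strip()
                for part in (row.get(first_col) or "", row.get(last_col) or "")
                if part.strip()
            )
            # Skip rows that lack either a company or a person name.
            if company and name:
                founders[company.lower()].append(name)

    index = dict(founders)
    with open(pickle_path, "wb") as handle:
        pickle.dump(index, handle)
    return index
```

A different dataset can be substituted by passing its own column names, as long as the pickled result keeps the same shape: a dictionary mapping a company name to a list of founder names.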
 
===reformat(pathname, output_filename)===
This function takes the pathname of a text file and an output filename, and converts each company name in the text file into a searchable query using the data from the pickled Crunchbase Snapshot. The new text file with the corrected queries is saved to the output filename.
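
As a rough illustration of the rewriting step, the sketch below reads one company name per line and writes one "CompanyName Founder FounderName" line per founder. It is a sketch under stated assumptions, not the script's actual implementation: the name <code>reformat_queries</code> and the extra <code>founder_index</code> parameter are hypothetical (the real function loads the pickled data itself), the dictionary is assumed to map lower-cased company names to lists of founder names, and skipping companies with no known founders is an assumed behaviour.

```python
def reformat_queries(pathname, output_filename, founder_index):
    """Write one 'Company Founder Name' query per known founder.

    founder_index is assumed to map a lower-cased company name to a
    list of founder names. Returns the number of queries written.
    """
    written = 0
    with open(pathname, encoding="utf-8") as src, \
            open(output_filename, "w", encoding="utf-8") as dst:
        for line in src:
            company = line.strip()
            if not company:
                continue  # ignore blank lines in the input file
            # Assumption: companies with no founder record are skipped.
            for founder in founder_index.get(company.lower(), []):
                dst.write(f"{company} Founder {founder}\n")
                written += 1
    return written
```

With an input file containing the single line <code>Acme</code> and an index of <code>{"acme": ["Ada Lovelace"]}</code>, the output file would contain <code>Acme Founder Ada Lovelace</code>.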
===Results with Accelerator Data===
