===Fall 2017===
<onlyinclude>
[[Peter Jalbert]] [[Work Logs]] [[Peter Jalbert (Work Log)|(log page)]]
2017-12-21: Last minute adjustments to the Moroccan Data. Continued working on [[Selenium Documentation]].
2017-12-20: Working on Selenium Documentation. Wrote 2 demo files. Wiki Page is available [http://www.edegan.com/wiki/Selenium_Documentation here]. Created 3 spreadsheets for the Moroccan data.
2017-12-19: Finished fixing the Demo Day Crawler. Changed files and installed as appropriate to make LinkedIn crawler compatible with the RDP. Removed some of the bells and whistles.
2017-12-18: Continued finding errors with the Demo Day Crawler analysis. Rewrote the parser to remove any search terms that were in the top 10000 most common English words according to Google. Finished uploading and submitting Moroccan data.
2017-12-15: Found errors with the Demo Day Crawler. Fixed scripts to download Moroccan Law Data.
2017-12-14: Uploading Morocco Parliament Written Questions. Creating script for next Morocco Parliament download. Begin writing Selenium documentation. Continuing to download TIGER data.
2017-12-06: Running Morocco Parliament Written Questions script. Analyzing Demo Day Crawler results. Continued downloading for TIGER geocoder.
2017-11-28: Debugging Morocco Parliament Crawler. Running Demo Day Crawler for all accelerators and 10 pages per accelerator. TIGER geocoder is back to Forbidden Error.
2017-11-27: Rerunning Morocco Parliament Crawler. Fixed KeyTerms.py and running it again. Continued downloading for TIGER geocoder.
2017-11-20: Continued running [http://www.edegan.com/wiki/Demo_Day_Page_Parser Demo Day Page Parser]. Fixed KeyTerms.py and trying to run it again. Forbidden Error continues with the TIGER Geocoder. Began Image download for Image Classification on cohort pages. Clarifying specs for Morocco Parliament crawler.
2017-11-16: Continued running [http://www.edegan.com/wiki/Demo_Day_Page_Parser Demo Day Page Parser]. Fixed KeyTerms.py and trying to run it again. Forbidden Error continues with the TIGER Geocoder. Began Image download for Image Classification on cohort pages. Clarifying specs for Morocco Parliament crawler.
2017-11-15: Continued running [http://www.edegan.com/wiki/Demo_Day_Page_Parser Demo Day Page Parser]. Wrote a script to extract counts that were greater than 2 from Keyword Matcher. Continued downloading for [http://www.edegan.com/wiki/Tiger_Geocoder TIGER Geocoder]. Finished re-formatting work logs.
2017-11-14: Continued running [http://www.edegan.com/wiki/Demo_Day_Page_Parser Demo Day Page Parser]. Wrote an HTML to Text parser. See the Demo Day Page Parser page for file location. Continued downloading for [http://www.edegan.com/wiki/Tiger_Geocoder TIGER Geocoder].
2017-11-13: Built [http://www.edegan.com/wiki/Demo_Day_Page_Parser Demo Day Page Parser].
2017-11-09: Running demo version of Demo Day crawler (Accelerator Google Crawler). Fixing worklog format.
2017-11-07: Created file with 0s and 1s detailing whether Crunchbase has the founder information for an accelerator. Details posted as a TODO on the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List] page. Still waiting for feedback on the PostGIS installation from [http://www.edegan.com/wiki/Tiger_Geocoder Tiger Geocoder]. Continued working on Accelerator Google Crawler.
2017-11-06: Contacted Geography Center for the US Census Bureau, [https://www.census.gov/geo/about/contact.html here], and began email exchange on PostGIS installation problems. Began working on the [http://www.edegan.com/wiki/Selenium_Documentation Selenium Documentation]. Also began working on an Accelerator Google Crawler that will be used with Yang and ML to find Demo Days for cohort companies.
2017-11-01: Attempted to continue downloading, however ran into HTTP Forbidden errors. Listed the errors on the [http://www.edegan.com/wiki/Tiger_Geocoder Tiger Geocoder Page].
2017-10-31: Began downloading blocks of data for individual states for the [http://www.edegan.com/wiki/Tiger_Geocoder Tiger Geocoder] project. Wrote out the new wiki page for installation, and beginning to write documentation on usage.
2017-10-30: With Ed's help, was able to get the national data from Tiger installed onto a database server. The process required much jumping around and changing users, and all the things we learned are outlined in [http://www.edegan.com/wiki/Database_Server_Documentation#Editing_Users the database server documentation] under "Editing Users".
2017-10-25: Continued working on the [http://www.edegan.com/wiki/PostGIS_Installation TigerCoder Installation].
2017-10-24: Throw some addresses into a database, use address normalizer and geocoder. May need to install things. Details on the installation process can be found on the [http://www.edegan.com/wiki/PostGIS_Installation PostGIS Installation page].
2017-10-23: Finished Yelp crawler for the [http://www.edegan.com/wiki/Houston_Innovation_District Houston Innovation District Project].
2017-10-19: Continued work on Yelp crawler for the [http://www.edegan.com/wiki/Houston_Innovation_District Houston Innovation District Project].
2017-10-18: Continued work on Yelp crawler for the [http://www.edegan.com/wiki/Houston_Innovation_District Houston Innovation District Project].
2017-10-17: Constructed ArcGIS maps for the agglomeration project. Finished maps of points for every year in the state of California. Finished maps of Route 128. Began working on selenium Yelp crawler to get cafe locations within the 610-loop.
2017-10-16: Assisted Harrison on the USITC project. Looked for natural language processing tools to extract complainants and defendants along with their location from case files. Experimented with pulling based on parts of speech tags, as well as using geotext or geograpy to pull locations from a case segment.
2017-10-13: Updated various project wiki pages.
2017-10-12: Continued work on Patent Thicket project, awaiting further project specs.
2017-10-05: Emergency ArcGIS creation for Agglomeration project.
2017-10-04: Emergency ArcGIS creation for Agglomeration project.
2017-10-02: Worked on ArcGIS data. See Harrison's Work Log for the details.
2017-09-28: Added collaborative editing feature to PyCharm.
2017-09-27: Worked on big database file.
2017-09-25: New task -- Create text file with company, description, and company type.
#[http://www.edegan.com/wiki/VC_Database_Rebuild VC Database Rebuild]
#psql vcdb2
#table name, sdccompanybasecore2
#Combine with Crunchbasebulk

#TODO: Write wiki on linkedin crawler, write wiki on creating accounts.

2017-09-21: Wrote wiki on Linkedin crawler, met with Laura about patents project.
2017-09-20: Finished running linkedin crawler. Transferred data to RDP. Will write wikis next.
2017-09-19: Began running linkedin crawler. Helped Yang create RDP account, get permissions, and get wiki setup.
2017-09-18: Finished implementation of the Experience Crawler, continued working on the Education Crawler for LinkedIn.
2017-09-14: Continued implementing LinkedIn Crawler for profiles.
2017-09-13: Implemented LinkedIn Crawler for main portion of profiles. Began working on crawling Experience section of profiles.
2017-09-12: Continued working on the [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler for Accelerator Founders Data]. Added to the wiki on this topic.
2017-09-11: Continued working on the [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler for Accelerator Founders Data].
2017-09-06: Combined founders data retrieved with the Crunchbase API with the crunchbasebulk data to get linkedin urls for different accelerator founders. For more information, see [http://www.edegan.com/wiki/Crunchbase_Data here].
2017-09-05: Post Harvey. Finished retrieving names from the Crunchbase API on founders. Next step is to query crunchbase bulk database to get linkedin urls. For more information, see [http://www.edegan.com/wiki/Crunchbase_Data here].
2017-08-24: Began using the Crunchbase API to retrieve founder information for accelerators. Halfway through compiling a dictionary that translates accelerator names into proper Crunchbase API URLs.
2017-08-23: Decided with Ed to abandon LinkedIn crawling to retrieve accelerator founder data, and instead use crunchbase. Spent the day navigating the crunchbasebulk database, and seeing what useful information was contained in it.
2017-08-22: Discovered that LinkedIn Profiles cannot be viewed through LinkedIn if the target is 3rd degree or further. However, if entering LinkedIn through a Google search, the profile can still be viewed if the user has previously logged into LinkedIn. Devising a workaround crawler that utilizes Google search. Continued blog post [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) here] under Section 4.
2017-08-21: Began work on extracting founders for accelerators through LinkedIn Crawler. Discovered that Python3 is not installed on RDP, so the virtual environment for the project cannot be fired up. Continued working on Ubuntu machine.
</onlyinclude>

===Spring 2017===
2017-05-01: Continued work on HTML Parser. Uploaded all semester projects to git server.
2017-04-20: Finished the HTML Parser for the [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler]. Ran HTML parser on accelerator founders. Data is stored in projects/accelerators/LinkedIn Founder Data.
2017-04-19: Made updates to the [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler] Wikipage. Ran LinkedIn Crawler on accelerator data. Working on an html parser for the results from the LinkedIn Crawler.
2017-04-18: Ran LinkedIn Crawler on matches between Crunchbase Snapshot and the accelerator data.
2017-04-17: Worked on ways to get correct search results from the [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler]. Worked on an HTML Parser for the results from the LinkedIn Crawler.
2017-04-13: Worked on debugging the logout procedure for the [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler]. Began formulation of process to search for founders of startups using a combination of the LinkedIn Crawler with the resources from the [http://www.edegan.com/wiki/Crunchbase_2013_Snapshot CrunchBase Snapshot].
2017-04-12: Work on bugs with the [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler].
2017-04-11: Completed functional [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) crawler of LinkedIn Recruiter Pro]. Basic search functions work and download profile information for a given person.
2017-04-10: Began writing functioning crawler of LinkedIn.
2017-04-06: Continued working on debugging and documenting the [http://www.edegan.com/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler]. Wrote a test program that logs in, searches for a query, navigates through search pages, and logs out. Recruiter program can now login and search.
2017-04-05: Began work on the LinkedIn Crawler. Researched on launching Python Virtual Environment.
2017-04-03: Finished debugging points for the Enclosing Circle Algorithm. Added Command Line functionality to the Industry Classifier.
2017-03-29: Worked on debugging points for the Enclosing Circle Algorithm.
2017-03-28: Finished running the Enclosing Circle Algorithm. Worked on removing incorrect points from the data set (see above).
2017-03-27: Worked on debugging the Enclosing Circle Algorithm. Implemented a way to remove interior circles, and determined that translation to latitude and longitude coordinates resulted in slightly off center circles.
2017-03-23: Finished debugging the brute force algorithm for [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm]. Implemented a method to plot the points and circles on a graph. Analyzed runtime of the brute force algorithm.
2017-03-21: Coded a brute force algorithm for the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm].
2017-03-20: Worked on debugging the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm].
2017-03-09: Continued running [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] on the Top 50 Cities. Finished script to draw Enclosing Circles on a Google Map.
2017-03-08: Continued running [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] on the Top 50 Cities. Created script to draw outcome of the Enclosing Circle Algorithm on Google Maps.
2017-03-07: Redetermined the top 50 cities which Enclosing Circle should be run on. Data on the [http://www.edegan.com/wiki/Top_Cities_for_VC_Backed_Companies Top 50 Cities for VC Backed Companies can be found here.] Ran [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] on the Top 50 Cities.
2017-03-06: Ran script to determine the top 50 cities which Enclosing Circle should be run on. Fixed the VC Circles script to take in a new data format.
2017-03-02: Cleaned up data for the VC Circles Project. Created histogram of data in Excel. See [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] Project. Began work on the [http://www.edegan.com/wiki/LinkedInCrawlerPython LinkedIn Crawler].
2017-03-01: Created statistics for the VC Circles Project.
2017-02-28: Finished downloading geocoded data for VC Data as part of the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] Project. Found bug in Enclosing Circle Algorithm.
2017-02-27: Continued to download geocoded data for VC Data as part of the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] Project. Assisted work on the [http://www.edegan.com/wiki/Industry_Classifier Industry Classifier].
2017-02-23: Continued to download geocoded data for VC Data as part of the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] Project. Installed C++ Compiler for Python. Ran tests on difference between Python and C wrapped Python.
2017-02-22: Continued to download geocoded data for VC Data as part of the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] Project. Helped out with [http://www.edegan.com/wiki/Industry_Classifier Industry Classifier Project].
2017-02-21: Continued to download geocoded data for VC Data as part of the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] Project. Researched into C++ Compilers for Python so that the Enclosing Circle Algorithm could be wrapped in C. Found a recommended one [https://www.microsoft.com/en-us/download/details.aspx?id=44266 here].
2017-02-20: Continued to download geocoded data for VC Data as part of the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] Project. Assisted work on the [http://www.edegan.com/wiki/Industry_Classifier Industry Classifier].
2017-02-16: Reworked [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] to create a file of geocoded data. Began work on wrapping the algorithm in C to improve speed.
2017-02-15: Finished [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm] applied to the VC study. Enclosing Circle algorithm still needs adjustment, but the program runs with the temporary fixes.
2017-02-14: Worked on the application of the Enclosing Circle algorithm to the VC study. Working on bug fixes in the Enclosing Circle algorithm. Created wiki page for the [http://www.edegan.com/wiki/Enclosing_Circle_Algorithm Enclosing Circle Algorithm].
2017-02-13: Worked on Neural Net for the [http://www.edegan.com/wiki/Industry_Classifier Industry Classifier Project].
2017-02-08: Worked on Neural Net for the [http://www.edegan.com/wiki/Industry_Classifier Industry Classifier Project].
2017-02-07: Fixed bugs in parse_cohort_data.py, the script for parsing the cohort data from the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Added descriptive statistics to the cohort data Excel file.
4/3/2017 10:00-12:00: Finished debugging points for the Enclosing Circle Algorithm. Added Command Line functionality to the Industry Classifier.
2017-02-02: Out sick, independent research and work from RDP. Brief research into the [http://jorgeg.scripts.mit.edu/homepage/wp-content/uploads/2016/03/Guzman-Stern-State-of-American-Entrepreneurship-FINAL.pdf Stern-Guzman algorithm]. Research into [http://www.edegan.com/wiki/interactive_maps Interactive Maps]. No helpful additions to map embedding problem.
4/5/2017 9:45-11:45: Began work on the LinkedIn Crawler. Researched launching a Python Virtual Environment.
2017-02-01: Notes from Session with Ed: Project on US university patenting and entrepreneurship programs (writing code to identify universities in assignees), search Wikipedia (XML then bulk download), student pop, faculty pop, etc. Circle project for VC data will end up being a joint project to join accelerator data. Pull descriptions for VC. Founders of accelerators in linkedin. LinkedIn cannot be caught (pretend to not be a bot). Can eventually get academic backgrounds through linkedin. Pull business registration data, Stern/Guzman Algorithm. GIS on top of geocoded data. Maps that works on wiki or blog (CartoDB), Maps API and R. NLP Projects, Description Classifier.
4/6/2017 14:00-17:15: Continued working on debugging and documenting the [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler]. Wrote a test program that logs in, searches for a query, navigates through search pages, and logs out. Recruiter program can now login and search.
2017-01-31: Built WayBack Machine Crawler. Updated documentation for coordinates script. Updated profile page to include locations of code.
4/10/2017 10:00-12:00: Began writing functioning crawler of LinkedIn.
4/11/2017 14:30-17:15: Completed functional [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) crawler of LinkedIn Recruiter Pro]. Basic search functions work and download profile information for a given person.
2017-01-30: Optimized enclosing circle algorithm through memoization. Developed script to read addresses from accelerator data and return latitude and longitude coordinates.
4/12/2017 10:00-12:00: Work on bugs with the [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler].
2017-01-26: Continued working on the Google sitesearch project. Discovered crunchbase, changed project priority. Priority 1, split accelerator data up by flag, priority 2, use crunchbase to get web urls for the cohorts, priority 3, make internet archive wayback machine driver. Located [http://www.edegan.com/wiki/Whois_Parser Whois Parser].
4/13/2017 14:30-17:45: Worked on debugging the logout procedure for the [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler]. Began formulation of process to search for founders of startups using a combination of the LinkedIn Crawler with the data resources from the [http://mcnair.bakerinstitute.org/wiki/Crunchbase_2013_Snapshot CrunchBase Snapshot].
4/17/2017 10:00-12:00: Worked on ways to get correct search results from the [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler]. Worked on an HTML Parser for the results from the LinkedIn Crawler.
2017-01-25: Finished parser for cohort data of the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Some data files still need proofreading as they are not in an acceptable format. Began working on the Google sitesearch project.
4/18/2017 14:30-17:15: Ran LinkedIn Crawler on matches between Crunchbase Snapshot and the accelerator data.
2017-01-24: Worked on parser for cohort data of the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Cohort data file created, debugging is almost complete. Will begin work on the Google accelerator search soon.
4/19/2017 10:00-12:00: Made updates to the [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler] Wikipage. Ran LinkedIn Crawler on accelerator data. Working on an HTML parser for the results from the LinkedIn Crawler.
2017-01-23: Worked on parser for cohort data of the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Preliminary code is written, working on debugging.
4/20/2017 14:30-17:45: Finished the HTML Parser for the [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler]. Ran HTML parser on accelerator founders. Data is stored in projects/accelerators/LinkedIn Founder Data.
2017-01-19: Downloaded pdfs in the background for the [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Government Crawler Project]. Created parser for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project], completed creation of final data set (yay!). Began working on cohort parser.
5/1/2017 13:00-17:00: Continued work on HTML Parser. Uploaded all semester projects to git server.
2017-01-18: Downloaded pdfs in the background for the [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Government Crawler Project].
<font size="5">'''Fall 2017'''</font>
2017-01-13: Continued making text files for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Downloaded pdfs in the background for the [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Government Crawler Project].
8/21/2017 14:00-17:00: Began work on extracting founders for accelerators through LinkedIn Crawler. Discovered that Python3 is not installed on RDP, so the virtual environment for the project cannot be fired up. Continued working on Ubuntu machine.
2017-01-12: Continued making text files for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Downloaded pdfs in the background for the [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Government Crawler Project].
8/22/2017 14:00-16:00: Discovered that LinkedIn Profiles cannot be viewed through LinkedIn if the target is 3rd degree or further. However, if entering LinkedIn through a Google search, the profile can still be viewed if the user has previously logged into LinkedIn. Devising a workaround crawler that utilizes Google search. Continued blog post [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) here] under Section 4.
2017-01-11: Continued making text files for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Downloaded pdfs in the background for the [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Government Crawler Project].
8/23/2017 14:00-15:30: Decided with Ed to abandon LinkedIn crawling to retrieve accelerator founder data, and instead use crunchbase. Spent the day navigating the crunchbasebulk database, and seeing what useful information was contained in it.
2017-01-10: Continued making text files for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Downloaded pdfs in the background for the [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Government Crawler Project].
8/24/2017 14:30-16:30: Began using the Crunchbase API to retrieve founder information for accelerators. Halfway through compiling a dictionary that translates accelerator names into proper Crunchbase API URLs.
9/5/2017 14:00-16:00: Post Harvey. Finished retrieving names from the Crunchbase API on founders. Next step is to query crunchbase bulk database to get linkedin urls. For more information, see [http://mcnair.bakerinstitute.org/wiki/Crunchbase_Data here].
9/6/2017 14:00-15:30: Combined founders data retrieved with the Crunchbase API with the crunchbasebulk data to get linkedin urls for different accelerator founders. For more information, see [http://mcnair.bakerinstitute.org/wiki/Crunchbase_Data here].
===Fall 2016===
2016-12-08: Continued making text files for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project].
9/11/2017 14:00-17:00: Continued working on the [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler for Accelerator Founders Data].
2016-12-07: Continued making text files for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project].
9/12/2017 14:00-16:00: Continued working on the [http://mcnair.bakerinstitute.org/wiki/LinkedIn_Crawler_(Python) LinkedIn Crawler for Accelerator Founders Data]. Added to the wiki on this topic.
2016-12-06: Learned how to use git. Committed software projects from the semester to the McNair git repository. Projects can be found at: [http://www.edegan.com/wiki/E%26I_Governance_Policy_Report Executive Order Crawler], [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Foreign Government Web Crawlers], [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) F6S Crawler and Parser].
9/13/2017 14:00-15:30: Implemented LinkedIn Crawler for main portion of profiles. Began working on crawling Experience section of profiles.
2016-12-02: Built and ran web crawler for Center for Middle East Studies on Kuwait. Continued making text files for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project].
9/14/2017 13:30-15:30: Continued implementing LinkedIn Crawler for profiles.
2016-12-01: Continued making text files for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Seed List project]. Built tool for the [http://www.edegan.com/wiki/E%26I_Governance_Policy_Report E&I Governance Report Project] with Christy. Adds a column of data that shows whether or not the bill has been passed.
9/18/2017 14:00-17:00: Finished implementation of Experience Crawler, continued working on Education Crawler for LinkedIn.
2016-11-29: Began pulling data from the accelerators listed [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) here]. Made text files for about 18 accelerators.
9/19/2017 14:30-16:30: Began running linkedin crawler. Helped Yang create RDP account, get permissions, and get wiki setup.
2016-11-22: Transferred downloaded Morocco Written Bills to provided SeaGate Drive. Made a "gentle" F6S crawler to retrieve HTMLs of possible accelerator pages documented [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) here].
9/20/2017 14:00-15:30: Finished running linkedin crawler. Transferred data to RDP. Will write wikis next.
2016-11-18: Converted Executive Order PDFs to text files using Adobe Acrobat DC. See [http://www.edegan.com/wiki/E%26I_Governance_Policy_Report Wikipage] for details.
#TODO: Write wiki on linkedin crawler, write wiki on creating accounts.
2016-11-17: Wrote a crawler to retrieve information about executive orders, and their corresponding pdfs. They can be found [http://www.edegan.com/wiki/E%26I_Governance_Policy_Report here]. Next step is to run code to convert the pdfs to text files, then use the parser fixed by Christy.
9/21/2017 14:00-16:00: Wrote wiki on Linkedin crawler, met with Laura about patents project.
2016-11-15: Finished download of Moroccan Written Question pdfs. Wrote a parser with Christy to be used for parsing bills from Congress and eventually executive orders. Found bug in the system Python that was worked out and rebooted.
9/25/2017 14:00-17:00: New task -- Create text file with company, description, and company type.
#[http://mcnair.bakerinstitute.org/wiki/VC_Database_Rebuild VC Database Rebuild]
#psql vcdb2
#table name, sdccompanybasecore2
#Combine with Crunchbasebulk
2016-11-11: Continued to download Moroccan data in the background. Attempted to find bug fixes for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Project] crawler.
9/27/2017 14:00-16:00: Worked on big database file.
2016-11-10: Continued to download Moroccan data and Kuwait data in the background. Began work on [http://www.edegan.com/wiki/Google_Scholar_Crawler Google Scholar Crawler]. Wrote a crawler for the [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator Project] to get the HTML files of hundreds of accelerators. The crawler ended up failing; it appears to have been due to HTTPS.
9/28/2017 13:30-15:30: Added collaborative editing feature to PyCharm.
2016-11-08: Continued to download Moroccan data in the background. Finished writing code for the embedded files on the Kuwait Site. Spent time debugging the frame errors due to the dynamically generated content. Never found an answer to the bug, and instead found a workaround that sacrificed run time for the ability to work. [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Web Driver]
10/2/2017 14:00-17:00: Worked on ArcGIS data. See Harrison's Work Log for the details.
2016-11-04: Continued to download Moroccan data in the background. Finished writing initial Kuwait Web Crawler/Driver for the Middle East Studies Department. Middle East Studies Department asked for additional embedded files in the Kuwait website. [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Web Driver]
10/4/2017 14:00-16:00: Emergency ArcGIS creation for Agglomeration project.
2016-11-03: Continued to download Moroccan data in the background. Dr. Egan fixed systems requirements to run the GovTrack Web Crawler. Made significant progress on the Kuwait Web Crawler/Driver for the Middle East Studies Department.
10/5/2017 14:15-15:45: Emergency ArcGIS creation for Agglomeration project.
2016-11-01: Continued to download Moroccan data in the background. Went over code for GovTracker Web Crawler, continued learning Perl. [http://www.edegan.com/wiki/Govtrack_Webcrawler_(Wiki_Page) GovTrack Web Crawler] Began Kuwait Web Crawler/Driver.
10/12/2017 14:00-15:30: Continued work on Patent Thicket project, awaiting further project specs.
2016-10-21: Continued to download data for the Moroccan Parliament Written and Oral Questions. Looked over [http://www.edegan.com/wiki/Christy_Warden_(Twitter_Crawler_Application_1) Christy's Twitter Crawler] to see how I can be helpful. Dr. Egan asked me to think about how to potentially make multiple tools to get cohorts and other sorts of data from accelerator sites. See [http://www.edegan.com/wiki/Accelerator_Seed_List_(Data) Accelerator List]. He also asked me to look at the [http://www.edegan.com/wiki/Govtrack_Webcrawler_(Wiki_Page) GovTrack Web Crawler] for potential ideas on how to bring this project to fruition.
10/13/2017 14:00-15:00: Updated various project wiki pages.
2016-10-20: Continued to download data for the Moroccan Parliament Written and Oral Questions. Updated Wiki page. Started working on the Twitter project with Christy. [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Web Driver]
10/16/2017 14:00-17:00: Assisted Harrison on the USITC project. Looked for natural language processing tools to extract complainants and defendants along with their location from case files. Experimented with pulling based on parts of speech tags, as well as using geotext or geograpy to pull locations from a case segment.
2016-10-18: Finished code for Oral Questions web driver and Written Questions web driver using selenium. Now, the data for the dates of questions can be found using the crawler, and the pdfs of the questions will be downloaded using selenium. [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Web Driver]
10/17/2017 15:00-17:00: Constructed ArcGIS maps for the agglomeration project. Finished maps of points for every year in the state of California. Finished maps of Route 128. Began working on selenium Yelp crawler to get cafe locations within the 610-loop.
2016-10-14: Finished Oral Questions crawler. Finished Written Questions crawler. Waiting for further details on whether that data needs to be tweaked in any way. Updated the Moroccan Web Driver/Web Crawler wiki page. [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Web Driver]
10/18/2017 14:15-15:45: Continued work on Yelp crawler for [http://mcnair.bakerinstitute.org/wiki/Houston_Innovation_District Houston Innovation District Project].
2016-10-13: Completed download of Moroccan Bills. Working on either a web driver screenshot approach or a webcrawler approach to download the Moroccan oral and written questions data. Began building Web Crawler for Oral and Written Questions site. Edited Moroccan Web Driver/Crawler wiki page. [http://www.edegan.com/wiki/Moroccan_Parliament_Web_Crawler Moroccan Web Driver]
10/19/2017 14:30-16:30: Continued work on Yelp crawler for [http://mcnair.bakerinstitute.org/wiki/Houston_Innovation_District Houston Innovation District Project].
2016-10-11: Fixed Arabic bug, files can now be saved with Arabic titles. Monarchy bills downloaded and ready for shipment. House of Representatives Bill mostly downloaded, ratified bills prepared for download. Started learning scrapy library in Python for web scraping. Discussed idea of screenshot-ing questions instead of scraping.
10/23/2017 14:00-17:00: Finished Yelp crawler for [http://mcnair.bakerinstitute.org/wiki/Houston_Innovation_District Houston Innovation District Project].
2016-10-07: Learned unicode and utf8 encoding and decoding in Arabic. Still working on transforming an ascii url into printable unicode.
10/24/2017 14:00-16:00: Throw some addresses into a database, use address normalizer and geocoder. May need to install things. Details on the installation process can be found on the [http://mcnair.bakerinstitute.org/wiki/PostGIS_Installation PostGIS Installation page].
2016-10-06: Discussed with Dr. Elbadawy the desired file names for Moroccan data download. The consensus was that the bill programs are ready to launch once the files can be named properly, and the questions data must be retrieved using a web crawler which I need to learn how to implement. The naming of files is currently drawing errors in going from Arabic, to url, to download, to filename. Debugging in process. Also built a demo selenium program for Dr. Egan that drives the McNair blog site on an infinite loop.
10/25/2017 14:15-15:45: Continued working on the [http://mcnair.bakerinstitute.org/wiki/PostGIS_Installation TigerCoder Installation].
2016-10-03: Moroccan Web Driver projects completed for driving of the Monarchy proposed bills, the House of Representatives proposed bills, and the Ratified bills sites. Begun process of devising a naming system for the files that does not require scraping. Tinkered with naming through regular expression parsing of the URL. Structure for the oral questions and written questions drivers is set up, but need fixes due to the differences in the sites. Fixed bug on McNair wiki for women's biz team where email was plain text instead of an email link. Took a glimpse at Kuwait Parliament website, and it appears to be very different from the Moroccan setup.
10/30/2017 14:00-16:00: With Ed's help, was able to get the national data from Tiger installed onto a database server. The process required much jumping around and changing users, and all the things we learned are outlined in [http://mcnair.bakerinstitute.org/wiki/Database_Server_Documentation#Editing_Users the database server documentation] under "Editing Users".
2016-09-30: Selenium program selects view pdf option from the website, and goes to the pdf webpage. Program then switches handle to the new page. CTRL S is sent to the page to launch save dialog window. Text cannot be sent to this window. Brainstorm ways around this issue. Explored Chrome Options for saving automatically without a dialog window. Looking into other libraries besides selenium that may help.
Starting to use non-military timecodes because I want to
2016-09-29: Re-enroll in Microsoft Remote Desktop with proper authentication, set up Selenium environment and Komodo IDE on Remote Desktop, wrote program using Selenium that goes to a link and opens up the print dialog box. Developed computational recipe for a different approach to the problem.
10/31/2017 2pm-4pm: Began downloading blocks of data for individual states for the [http://mcnair.bakerinstitute.org/wiki/Tiger_Geocoder Tiger Geocoder] project. Wrote out the new wiki page for installation, and beginning to write documentation on usage.
2016-09-26: Set up Staff wiki page, work log page; registered for Slack, Microsoft Remote Desktop; downloaded Selenium on personal computer, read Selenium docs. Created wiki page for Moroccan Web Driver Project.
11/1/2017 2-3:30pm: Attempted to continue downloading, however ran into HTTP Forbidden errors. Listed the errors on the [http://mcnair.bakerinstitute.org/wiki/Tiger_Geocoder Tiger Geocoder Page].
11/6/2017 2-5pm: Contacted Geography Center for the US Census Bureau, [https://www.census.gov/geo/about/contact.html here], and began email exchange on PostGIS installation problems. Began working on the [http://mcnair.bakerinstitute.org/wiki/Selenium_Documentation Selenium Documentation]. Also began working on an Accelerator Google Crawler that will be used with Yang and ML to find Demo Days for cohort companies.
==Notes==
*Ed moved the Morocco Data to E:\McNair\Projects from C:\Users\PeterJ\Documents
