Joe Reilly (Work Log)

From edegan.com
Revision as of 18:43, 2 August 2017 by Jreilly (talk | contribs)

Joe Reilly Work Logs (log page)

2017-06-20: Joined the center! Wrote my page. Started on Collecting SBIR Data. Finished collecting SBIR Data; saved in bulk(D:)--> McNair-->Projects-->SBIR. Began researching VC funds in the file Venture Funds in E:\McNair\Projects\Houston\VCData.

2017-06-21: Continued researching VC funds in the file Venture Funds in E:\McNair\Projects\Houston\VCData.

2017-06-22: Finished researching VC funds in the file Venture Funds in E:\McNair\Projects\Houston\VCData. Began grouping.

2017-06-23: Finished grouping VC funds in the file Venture Funds in E:\McNair\Projects\Houston\VCData. Researched whether based in Houston, and whether they should be considered alive.

2017-06-27: Sorted VC funds in E:\McNair\Projects\Houston\VCData; deleted non-operating ones; finalized groups. Began researching the relative size of different sectors in Houston's economy. Work saved in E:\McNair\Projects\Houston\Industries.

2017-06-28: Began adding cohorts to each new accelerator in E:\McNair\Projects\Accelerators, saving each accelerator's cohort in E:\McNair\Projects\Accelerators\Data as (acceleratorname).cohort, as a text file.

2017-06-29: Continued adding cohorts to each new accelerator in E:\McNair\Projects\Accelerators, saving each accelerator's cohort in E:\McNair\Projects\Accelerators\Data as (acceleratorname).cohort, as a text file. Searched through documents in E:\McNair\Projects\SimplerPatentData\data\extracts\applications, in the modern and vintage folders, for examples of patents of the following type: utility, plant, reissue, and design, in versions 1.5, 1.6, 4.0, 4.1, 4.2, 4.3, 4.4, and 4.5. Placed examples in the folder E:\McNair\Projects\SimplerPatentData\data\examples. As mentioned in the wiki page (and all but confirmed with regex searches of hundreds of the patent documents), we appear only to have data on utility patents, except for a few plant patents.

2017-06-30: Continued adding cohorts to each new accelerator in E:\McNair\Projects\Accelerators, saving each accelerator's cohort in E:\McNair\Projects\Accelerators\Data as (acceleratorname).cohort, either as an excel file or a text file. Added addresses to companies in E:\McNair\Projects\Houston\VCData.


2017-07-11: Continued adding addresses to companies in Venture Funds in E:\McNair\Projects\Houston\VCData. Note: The file "VC Data" is now called "Venture Funds".

2017-07-12: Added addresses/PO box locations to Venture Funds in E:\McNair\Projects\Houston\VCData to remaining VC firms. Organized list. Began organizing cohort data collected on 6/30 w/ regex to streamline searching and gathering of cohort names themselves. Compiled addresses, along with other categorized info, in an excel file called "VC firms with address, basic sector info", saved in E:\McNair\Projects\Houston\VCData. For each cohort in E:\McNair\Projects\Accelerators\Accelerator Match, added headers to the tabbed name, founder, description, etc. Continued searching for addresses of the firms with no addresses listed in "VC firms with address, basic sector info". Double-checked addresses.

Note: The Houston VC data write up is on Houston_Entrepreneurship_Ecosystem_Project#VC_Funds_in_Houston

2017-07-13: Helped Diana with researching proportion of Houston's city budget allocated towards startup funding/entrepreneurship (apparently none...). Gathered info on IT and Procurement sections of Budget. Researched proportion of Houston budget allocated towards IT and Procurement. Excel and text files saved in E:\McNair\Projects\Houston\Budget. Continued researching relative sizes of industries in Houston, gathered relevant info and links in "Industry breakdown..." in E:\McNair\Projects\Houston\Industries.

2017-07-14: Used data from links in "Industry breakdown by GDP..." in E:\McNair\Projects\Houston\Industries to create Excel charts of Houston employment & Gross Area Product broken down by industry. Saved charts in the same folder. Energy, health, and (for employment data) IT sectors were emphasized, in line with the goal of communicating the idea that, as substantial parts of the Houston economy, those industries will benefit from supporting local startups. Looked up example charts in the file 2017ReportV1 in E:\McNair\Projects\Houston\Houston Ecosystem Recommendations, both for ideas for future charts and to get an idea of quantitative VC data in Houston. Fixed typos, improved incomplete keys. Cleaned the file Venture Funds in E:\McNair\Projects\Houston\VCData: added research on funds declared dead to make sure they actually are. Double-checked that all firms in the master list were accounted for in the grouped list of VC firms.

2017-07-18: For the file "hubs list" in Z:\Hubs\2017\hubs_data, researched whether organizations not listed as hubs (aka shaded red) in "Hubs Data v2_16" (located in the same folder) should be considered hubs, under the definition that a hub 1) has a coworking space, 2) provides mentorship, and 3) offers coding classes/tech events for cohort companies. Whether the hub had an accelerator or was tech focused was also noted.

2017-07-19: Continued editing "hubs list" in Z:\Hubs\2017\hubs_data, researching organizations marked as questionable hubs. Used websites like Alexa and similarsites.com to find hubs with websites similar to hubs in the "hubs list" file. Only found 1 new hub. Began searching possible hubs in the file "Raw Program list" in E:\McNair\Projects\Hubs\summer 2016.

2017-07-20: Searched through the firms in "Raw Program list" in E:\McNair\Projects\Hubs\summer 2016 to determine if they could be considered hubs based on the definition listed above. If they were, they were added to the list of new hubs in "hubs list" in Z:\Hubs\2017\hubs_data.

2017-07-21: Confirmed whether each hub in last year's hub list (in "Hubs Data v2_16" in Z:\Hubs\2017\hubs_data) is still operating.

2017-07-25: Noted hubs that met the new definition but were not considered hubs in hubs_list in the same file. Copied all hubs data from "hubs list" in Z:\Hubs\2017\hubs_data to "Joe hub list 2017" in the same folder. Searched hubs.txt and "Potential Hubs" in Z:\Hubs\2017\hubs_data for new hubs; added new ones to "Joe hubs list 2017".

2017-07-26: Began Patent Schema Reconciliation, creating a text document of xpaths for the following nodes: patent number, filing number, grant date, kind, type, application number, and filing date. Saved file in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation.

Notes: I am assuming "application number" in the patent code means "filing number", because the word "filing" appears nowhere in the code, and there is already a different number, under the "publication reference" header, that seems to be referring to the patent number. It's likely that the number under which the patent is internally filed is called "application number", and appears under the header "application reference", and that the (publicized) patent number appears under the header "publication reference".

An example block of code from a granted, v4.5, plant patent, used for the example xpaths that follow:

 <us-bibliographic-data-grant>
   <publication-reference>
     <document-id>
       <country>US</country>
       <doc-number>PP027502</doc-number>
       <kind>P3</kind>
       <date>20161227</date>
     </document-id>

For the above code, I identified (what I think are accurate) xpaths for the nodes of patent number (//us-bibliographic-data-grant/publication-reference/document-id/doc-number), kind (//us-bibliographic-data-grant/publication-reference/document-id/kind), and grant date (//us-bibliographic-data-grant/publication-reference/document-id/date). I am adding the xpaths for these nodes, as well as the others mentioned above, for the 4 types of patents, for each version, for both granted and applications. Still have to do xpaths for granted version 2.5 for all types, and all applications. Waiting on Oliver about whether we need xpaths for more nodes other than the 6 example nodes.
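A minimal sketch of how the three xpaths above could be checked against the example block, using Python's standard-library xml.etree.ElementTree. Assumptions not in the original notes: the excerpt is closed off and wrapped in a us-patent-grant root element so that it parses on its own, the names SAMPLE, XPATHS, and extract_fields are hypothetical, and a "." is prefixed to each xpath because ElementTree only accepts relative paths on an element.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The example block from above, with closing tags added and an assumed
# <us-patent-grant> wrapper so the excerpt is well-formed on its own.
SAMPLE = """
<us-patent-grant>
  <us-bibliographic-data-grant>
    <publication-reference>
      <document-id>
        <country>US</country>
        <doc-number>PP027502</doc-number>
        <kind>P3</kind>
        <date>20161227</date>
      </document-id>
    </publication-reference>
  </us-bibliographic-data-grant>
</us-patent-grant>
"""

# The xpaths exactly as written in the log entry.
XPATHS = {
    "patent_number": "//us-bibliographic-data-grant/publication-reference/document-id/doc-number",
    "kind": "//us-bibliographic-data-grant/publication-reference/document-id/kind",
    "grant_date": "//us-bibliographic-data-grant/publication-reference/document-id/date",
}


def extract_fields(xml_text):
    """Return the text of the first node matched by each xpath (None if absent)."""
    root = ET.fromstring(xml_text)
    fields = {}
    for name, xpath in XPATHS.items():
        # ElementTree rejects absolute paths on an element, so search
        # relative to the root by prefixing ".".
        node = root.find("." + xpath)
        fields[name] = node.text.strip() if node is not None and node.text else None
    return fields


fields = extract_fields(SAMPLE)
# The grant date is stored as YYYYMMDD in the example block.
grant_date = datetime.strptime(fields["grant_date"], "%Y%m%d").date()
print(fields)
print(grant_date.isoformat())
```

A None value for a field would flag an xpath that does not match a given version or patent type, which is the kind of per-version check the reconciliation calls for.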

2017-07-27: Double-checked that the xpaths in http://mcnair.bakerinstitute.org/wiki/Equivalent_XPath_and_APS_Queries#Query_Equivalences were accurate for v4.0, v4.1, and v4.2, and added to the page xpaths for the nodes listed on that wiki page for v<4.3. Began adding xpaths for other nodes Oliver noted would be helpful, like Invention Title. Went over new hubs definition with Hira; ensured no hubs on "Joe hub list 2017" (see above) were actually just incubators, and that they all had coding/tech events/programs with substance. Took 17 hubs off total. Saved new list in Z:\Hubs\2017\hubs_data, called Joe hub list 2017 w comments.

2017-07-28: Continued working on xpaths. For Top50_Table in E:\McNair\Projects\Ecosystem\Ranking, found and entered necessary data. Source for city population and area is in the same folder, titled "City area chart".

2017-08-01: For the xpath project: Found which patent versions for text files in E:\McNair\Projects\SimplerPatentData\data\examples\granted had PRIORITY_CLAIMS_DATE, PRIORITY_CLAIMS_COUNTRY, and PRIORITY_CLAIMS_PATENT_NUMBER; noted which ones did, and added their xpaths to the file of xpaths, Patent Schema Reconciliation.txt in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation. Checked which types and versions had pct document numbers, updated xpaths in http://mcnair.bakerinstitute.org/wiki/Equivalent_XPath_and_APS_Queries#Query_Equivalences and Patent Schema Reconciliation.txt. Began the same process with IPCR_Subclass and the following xpaths on http://mcnair.bakerinstitute.org/wiki/Equivalent_XPath_and_APS_Queries#Query_Equivalences. Began listing examples for each xpath. Used data from roundplus.txt in Z:\VentureCapitalData\SDCVCData\vcdb to create charts, saved as New2017Report(Aug) in E:\McNair\Projects\Houston\2017Report.

2017-08-02: Edited Excel charts from 08-01. Continued on xpath project; completed IPCR sections. For Copy of Rankingv3_Diana's_workingfile in E:\McNair\Projects\Ecosystem\Ranking, added data on population, political activity in the 2016 presidential election, and whether each city had a university (using a filter of organizations that gave out doctorate degrees in Carnegie Classifications 2015_cleaned in E:\McNair\Projects\University Patents).