{{McNair ProjectsProject|Has project output=|Project TitleHas title=Matching VentureOne (Data)|Topic Area=Patents and Innovation|OwnerHas owner=Ariel Sun, Rosemarie Ziedonis|Start TermHas start date=Summer 2016|StatusDue Date=NA|Has sponsor=McNair Center|Has notes=|Is dependent on=Active|DeliverableDepends upon it=Other|Primary BillingHas project status= AccMcNair01Complete
}}
=Updated=
'''New Requirements'''
*Re-run the match using ''both'' name-related fields in the startups_cl.dta file: “name” and “name_prev”.
#The latter field pulls in patents applied for under a former name of the same company.
*In the output file, please include:
#the field “entityid” (from startups_cl.dta) that corresponds to each startup. This step is critical; otherwise, we can’t link patents filed under alternative names of the company to the same firm.
#all assignee-related fields in your patent data (e.g., assignee name, and any original and current USPTO assignee codes listed for the patent), merged in from your patent files.
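The requirement above can be illustrated with a minimal Python sketch. It is not the project's SQL script: the patent column names (<code>patent</code>, <code>assignee</code>), the <code>normalize</code> helper, and the sample records are all assumptions made for illustration. It shows one way to match on both “name” and “name_prev” while carrying “entityid” through to the output.

```python
from collections import defaultdict

def normalize(name):
    # Placeholder normalizer (assumption): uppercase and collapse whitespace.
    return " ".join((name or "").upper().split())

def match_on_both_names(startups, patents):
    """Link patents to startups via either "name" or "name_prev".

    startups: dicts with keys entityid, name, name_prev (as in startups_cl.dta).
    patents:  dicts with keys patent, assignee (hypothetical column names).
    """
    by_assignee = defaultdict(list)
    for pat in patents:
        by_assignee[normalize(pat["assignee"])].append(pat)

    rows = []
    for firm in startups:
        for field in ("name", "name_prev"):
            key = normalize(firm.get(field))
            if not key:
                continue  # no value recorded in this name field
            for pat in by_assignee.get(key, []):
                rows.append({
                    "entityid": firm["entityid"],  # ties both names to one firm
                    "matched_on": field,
                    "patent": pat["patent"],
                    "assignee": pat["assignee"],
                })
    return rows

# Made-up sample records, for illustration only.
startups = [
    {"entityid": 101, "name": "Acme Devices", "name_prev": "Acme Labs"},
    {"entityid": 102, "name": "Zeta Systems", "name_prev": ""},
]
patents = [
    {"patent": "P1", "assignee": "ACME DEVICES"},
    {"patent": "P2", "assignee": "Acme  Labs"},
    {"patent": "P3", "assignee": "Other Co"},
]
rows = match_on_both_names(startups, patents)
```

Because every output row carries “entityid”, patents found under the former name end up attached to the same firm as those found under the current name.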
 
'''Output'''
 
Text files are in: <code>E:\McNair\Projects\Venture One Data\</code>
#<code>summarytablefinal</code>: summary of the number of patents and grant years for all companies
#<code>fullyjoinedtable</code>: all patent and assignee information for entities that have patents (combining 3 and 4)
#<code>fullyjoinednow</code>: patent information under the current name of the company
#<code>fullyjoinedprev</code>: patent information under the previous name of the company
 
A new version of the SQL script can be found at: <code>E:\McNair\Projects\Venture One Data\sql script.txt</code>
 
'''Notes'''

In the <code>summarytablefinal</code> table (covers all entities), the variables are:
*Entity Name | Standard Orgname | Number of patent
*Previous Name | Previous Standard Orgname | Previous Number of Patent
*Total number of Patent
*Original ID | Revised ID
*min grant year | max grant year | avg grant year
Note: one company (Z-KAT) has exactly the same entity name and previous name, so its patents are double counted. Its total number of patents should be 11 instead of 22.
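One way to avoid this kind of double counting is to count distinct patent numbers across the two matches instead of adding the two counts. The sketch below is a hedged illustration of that idea in Python; the function name and the placeholder patent identifiers are hypothetical and are not taken from the project's script.

```python
def total_distinct_patents(patents_now, patents_prev):
    """Count each patent once, even if it matched under both names."""
    return len(set(patents_now) | set(patents_prev))

# Placeholder identifiers (assumption): 11 patents matched under the current
# name, and the same 11 matched again because the previous name is identical.
same_name_now = [f"pat{i}" for i in range(11)]
same_name_prev = list(same_name_now)

naive_total = len(same_name_now) + len(same_name_prev)  # adds the counts: 22
distinct_total = total_distinct_patents(same_name_now, same_name_prev)  # 11
```

For firms whose two names differ and share no patents, the distinct count equals the simple sum, so the same rule can be applied to every entity.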
 
In the <code>fullyjoinedtable</code> table (covers only entities that have patents):
*It includes all patent and assignee variables, 33 variables in total.
*Variables starting with 'asg' hold assignee information, e.g. asgtype = assignee type.
*The remaining variables hold patent information.
 
=Old=
==Overview==
In this matching process, we join patent data to VentureOne companies and count the number of patents affiliated with each company.
 
===Raw Data===
*All variables: EntityName, Employees, City, State, Zip, AreaCode, Business Status, IndustryGroup, etc.
*Variable used for matching: EntityName
 
Original patent data is in our database: <code>128.42.44.181/bulk/allpatent</code>
 
===Procedure===
We first get the standard company names for VentureOne companies from the source VentureOne data set. Then we standardize the names of the companies that have patents from our patent database. Based on the common standard company names, we join patent information to VentureOne companies.
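The procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions: the suffix list, the <code>standardize</code> and <code>join_patents</code> helpers, and the sample records are hypothetical, and the project's actual standardization rules may differ.

```python
import re

# Hypothetical list of legal suffixes to drop (assumption, not the project's rules).
SUFFIXES = {"INC", "CORP", "CORPORATION", "LLC", "LTD", "CO", "COMPANY"}

def standardize(name):
    """Uppercase, replace punctuation with spaces, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^A-Z0-9 ]", " ", name.upper()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def join_patents(company_names, patents):
    """Map each company name to the patents whose assignee standardizes to the same key."""
    index = {}
    for pat in patents:
        index.setdefault(standardize(pat["assignee"]), []).append(pat["patent"])
    return {name: index.get(standardize(name), []) for name in company_names}

# Made-up sample records, for illustration only.
companies = ["Acme Devices, Inc.", "Zeta Corp"]
patents = [{"patent": "P1", "assignee": "ACME DEVICES INC"}]
joined = join_patents(companies, patents)
```

Companies without a matching assignee keep an empty list, so they remain visible in later company-level counts.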
===Final Matched Tables===
#A summary table displaying the number of patents owned and the minimum, maximum, and average grant year for each company (including the ones that own no patents). It can be found at: <code>E:\McNair\Projects\Venture One Data\venturesummary.txt</code>
#A table containing all patent information for the companies that have patents. It can be found at: <code>E:\McNair\Projects\Venture One Data\venturefullyjoined.txt</code>
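As a rough illustration of how a company-level summary like the first table could be built, the Python sketch below aggregates matched (company, grant year) pairs and keeps companies with no patents. The function name, field names, and sample data are assumptions, not the contents of the project's files.

```python
from statistics import mean

def summarize(company_names, matched):
    """Build one summary row per company.

    matched: iterable of (company_name, grant_year) pairs from the patent match.
    Companies with no matched patents get a count of 0 and empty year fields.
    """
    years = {name: [] for name in company_names}
    for name, year in matched:
        if name in years:
            years[name].append(year)

    rows = []
    for name in company_names:
        ys = years[name]
        rows.append({
            "company": name,
            "n_patents": len(ys),
            "min_grant_year": min(ys) if ys else None,
            "max_grant_year": max(ys) if ys else None,
            "avg_grant_year": mean(ys) if ys else None,
        })
    return rows

# Made-up sample data, for illustration only.
summary = summarize(
    ["ACME DEVICES", "ZETA"],
    [("ACME DEVICES", 1998), ("ACME DEVICES", 2002)],
)
```

Starting from the full company list, rather than from the matched patents, is what keeps the firms without patents in the output.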
===Desired Variables===
Below is the list of variables that were in the STATA file we were given:
 Contains data from C:\Users\ArielSun\Downloads\allpats_3sectors_06jun13.dta
   obs:        19,409
  vars:            36                          11 Jun 2016 17:31
  size:    10,655,541                          (_dta has notes)
 ----------------------------------------------------------------------------------------------------
                  storage display value
 variable name    type    format  label  variable label
 ----------------------------------------------------------------------------------------------------
 id_vone          double  %9.0g          VentureOne id
 name             str39   %39s           startup name
 patent           str9    %9s            patno in string
 apn              str6    %6s            pat application number
 nmi              str40   %40s           inventor name
 ttl              str244  %40s           invention title
 nma              str65   %65s           original assignee
 ocd              str15   %15s           main us patent class
 icd              str15   %15s           main intl patent class
 apd              float   %td            application date
 gdateold         float   %td            Grant date
 fnd_year         float   %8.0g          startup founding year
 last_yr          float   %9.0g          * OLD last_yr, 2006; see notes
 source           byte    %8.0g          1 if 2012 delphion searches; else from 2004/5 search
 pdate            float   %td            priority date, delphion; may pre-date application date if provisional apps
 utility          float   %9.0g          * 1 if utility patent as initially awarded; 0 if other (reissued, reexamed, design
 state_country    str3    %9s            state/country of first inventor listed
 asscode          float   %9.0g          assignee code; basic.dta
 ayear            int     %9.0g          application year
 amonth           byte    %9.0g          application month
 atype            str1    %9s            * initial assignee type; see notes
 class            str3    %9s            3 digit us pat class
 subclass         str6    %9s            patent subclass
 gdate            int     %d             grant, or issuance, date
 industry         str15   %15s           semi, software, or med devices
 state_hq         str2    %9s            firm hq location; vone
 status06         str4    %9s            * status of firm known in 2006; rhs truncation varies by sector
 exitdate         str8    %9s            exit date, if known
 exityr           str4    %9s            exit year, if known
 status08         str6    %9s            * status of firm in 2008, see notes
 last_yr08        int     %8.0g          * exityr if ipo/acq, else 2008
 dcohort          float   %9.0g          1 if founding yr during 1987-99
 lastyr08_minu~r  float   %9.0g
 dsearch_assign   float   %9.0g          1 if searches of pat assignment data need to be conducted; carlosn confirm?
 carlos_chk       float   %9.0g          carlos: pls confirm assignment data = compiled for these pats
 entityid         long    %12.0g         unique startup id as of 2008, vone
 ----------------------------------------------------------------------------------------------------
 * indicated variables have notes
 ----------------------------------------------------------------------------------------------------
==Detailed Data Processing==
#All data are in the <code>allpatentsprocessed</code> database. Access it by logging on to <code>researcher@McNair DBServ:/bulk/allpatentsprocessed</code>
#A script detailing the processing procedure can be found at <code>E:\McNair\Projects\Venture One Data\patent data script.txt</code>
 
==The matched data==
 
We are giving back two files:
*One is at the patent level and contains information on the 38,497 patents held by 1,557 of the 3,357 companies.
*The other is at the company level and aggregates patent information for all 3,357 companies.
<includeonly>
[[Category: McNair Projects]]
</includeonly>
