RETech Data – Donald Bowen

Bowen-Hoberg-Fresard Patent-Text Data

⭐ ⭐ ⭐Newly updated through 2024!⭐ ⭐ ⭐
Click here to download public data covering patent applications through 2021 and grants through 2024!

About These Measures

The Bowen, Hoberg, and Fresard measures provided here are based on the text in the section of patents describing the innovation.

They provide researchers a new way to characterize innovation within public firms, startups, places and more. Importantly, they are distinct from existing measures and do not have look-ahead bias: they only use information available in the patent itself.

This Data is Provided by

Donald Bowen (Lehigh University),
Laurent Fresard (Università della Svizzera italiana),
and Gerard Hoberg (University of Southern California)

and developed within “Rapidly Evolving Technologies and Startup Exits” which is forthcoming in Management Science. Please cite that study when using or referring to any data or code in this repository.

RETech

“RETech” measures whether the patent pertains to a technological area that is rapidly evolving (i.e., following breakthroughs) or stable.

Higher levels of our measure detect patents in new areas and those in subsequent waves of development. High-RETech patents substitute for existing technologies rather than complement them, receive more citations, and elicit stronger stock market reactions.

Among measures without look-ahead bias, RETech has the strongest association with notable breakthrough patents (like lasers, DNA modifications, satellites, Google’s PageRank, and more).

RETechCat

RETechIndex

Top 20 Patents by RETech

Among patents applied for from 2009-2019, these are the top 20 patents by RETech.

Patent	RETech	NBER Cat	App Year	Grant Year	Title
8268964	93	Chemicals	2009	2012	MHC peptide complexes and uses thereof in infectious diseases
10336808	93	Drugs & Medicine	2012	2019	MHC peptide complexes and uses thereof in infectious diseases
10633715	91	Drugs & Medicine	2016	2020	Gene controlling shell phenotype in palm
9481889	91	Drugs & Medicine	2013	2016	Gene controlling shell phenotype in palm
9067987	91	Drugs & Medicine	2012	2015	Neisserial antigenic peptides
8394390	91	Drugs & Medicine	2012	2013	Neisserial antigenic peptides
11047011	90	Drugs & Medicine	2016	2021	Immunorepertoire normality assessment method and its use
8624015	90	Chemicals	2012	2014	Probe set and method for identifying HLA allele
10607717	89	Comps & Commun	2014	2020	Method for subtyping lymphoma types by means of expression profiling
10405749	88	Drugs & Medicine	2015	2019	RNA agents for P21 gene modulation
9765315	87	Drugs & Medicine	2013	2017	Cellulose and/or hemicelluloses degrading enzymes from Macrophomina phaseolina and uses thereof
10655102	87	Drugs & Medicine	2014	2020	Identification and isolation of human corneal endothelial cells (HCECS)
8110199	87	Drugs & Medicine	2009	2012	Streptococcus pneumoniae proteins and nucleic acid molecules
10731174	86	Drugs & Medicine	2018	2020	Plants showing a reduced wound-induced surface discoloration
10835585	86	Drugs & Medicine	2016	2020	Shared neoantigens
10570457	85	Drugs & Medicine	2015	2020	Methods for predicting drug responsiveness
9700502	84	Drugs & Medicine	2010	2017	Methods for generating new hair follicles, treating baldness, and hair removal
9642789	84	Drugs & Medicine	2010	2017	Methods for generating new hair follicles, treating baldness, and hair removal
11220714	83	Drugs & Medicine	2016	2022	Method of diagnosing bladder cancer
9029525	83	Chemicals	2012	2015	Compositions and methods for inhibiting expression of GSK-3 genes
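
A ranking like the one above can be reproduced from patent-level rows with a filter on application year followed by a descending sort. The snippet below is a minimal sketch, not the authors' code: the field names (`patent`, `retech`, `app_year`) are assumptions and may not match the column names in the downloadable file, and the sample mixes four rows copied from the table with one made-up row.

```python
from typing import NamedTuple


class Patent(NamedTuple):
    patent: str    # patent number (hypothetical field name)
    retech: float  # RETech value (hypothetical field name)
    app_year: int  # application year (hypothetical field name)


def top_by_retech(patents, first_year, last_year, n=20):
    """Return the n highest-RETech patents applied for in [first_year, last_year]."""
    in_window = [p for p in patents if first_year <= p.app_year <= last_year]
    # Sort descending by RETech; sorted() is stable, so ties keep input order.
    return sorted(in_window, key=lambda p: p.retech, reverse=True)[:n]


# Four rows copied from the table above plus one made-up row.
sample = [
    Patent("8624015", 90, 2012),
    Patent("8268964", 93, 2009),
    Patent("10633715", 91, 2016),
    Patent("0000000", 99, 2005),  # made-up row, outside the 2009-2019 window
    Patent("9029525", 83, 2012),
]

top3 = top_by_retech(sample, 2009, 2019, n=3)
print([p.patent for p in top3])
```

The made-up 2005 row has the highest value but is dropped by the application-year filter, which is the point of restricting the window before ranking.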

Tech Breadth

“Tech Breadth” measures how widely (or narrowly) the patent’s text is spread across technological fields. Patents with low breadth (i.e., 0) are niche and can be understood by scientists familiar with a single field of study. High breadth indicates that the patent draws on ideas from many fields and will likely require teams with diverse knowledge to implement. As such, we expect low-breadth patents to be more redeployable and complementary to technology stacks outside the inventing firm.
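
To build intuition for what "spread across fields" means, the sketch below computes one minus a Herfindahl-style concentration of a patent's field shares. This is an illustrative stand-in chosen for this example, not the construction used for the data (see the paper for that); the function name, the field labels, and the weights are all made up. It does reproduce the qualitative behavior described above: a single-field patent scores 0, and the score rises as text is spread more evenly over more fields.

```python
def breadth_illustration(field_weights):
    """One minus the Herfindahl concentration of a patent's field shares.

    Illustrative stand-in only; NOT the paper's construction of Tech Breadth.
    `field_weights` maps a field label to a non-negative weight.
    """
    total = sum(field_weights.values())
    if total <= 0:
        raise ValueError("field weights must sum to a positive number")
    shares = [w / total for w in field_weights.values()]
    return 1.0 - sum(s * s for s in shares)


# Made-up examples: one field only vs. four equally weighted fields.
single = breadth_illustration({"chemistry": 1.0})
mixed = breadth_illustration(
    {"chemistry": 1.0, "software": 1.0, "optics": 1.0, "biology": 1.0}
)
print(single, mixed)
```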

Usage Notes

1. The dataset contains raw values (i.e. not winsorized) and we recommend winsorizing by application year before using them.
2. We generally recommend using the data above by application year to match the timing of the innovation best.
3. You can use this Stata function to convert patent-level variables into group-time variables (e.g. firm-year, state-year, MSA-quarter).
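
For readers working outside Stata, the first and third notes can be sketched in plain Python: clip each variable to quantile bounds computed within its application year, then average the clipped values by group and year. This is a hedged illustration under several assumptions, not the authors' Stata function: the field names (`firm`, `app_year`, `retech`) are hypothetical, the page does not state winsorization cutoffs (1% and 99% are only a common default), the mean is just one possible aggregate, and the sample rows are made up.

```python
from collections import defaultdict
from statistics import mean


def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already-sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def winsorize_by_year(rows, var, lower=0.01, upper=0.99):
    """Clip `var` to its [lower, upper] quantiles within each application year."""
    by_year = defaultdict(list)
    for r in rows:
        by_year[r["app_year"]].append(r[var])
    bounds = {}
    for year, vals in by_year.items():
        s = sorted(vals)
        bounds[year] = (quantile(s, lower), quantile(s, upper))
    clipped = []
    for r in rows:
        lo, hi = bounds[r["app_year"]]
        clipped.append({**r, var: min(max(r[var], lo), hi)})
    return clipped


def firm_year_mean(rows, var):
    """Average a patent-level variable within each (firm, application year)."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["firm"], r["app_year"])].append(r[var])
    return {key: mean(vals) for key, vals in groups.items()}


# Made-up patent-level rows; wide cutoffs so clipping is visible in a tiny sample.
rows = [
    {"firm": "A", "app_year": 2010, "retech": 0},
    {"firm": "A", "app_year": 2010, "retech": 10},
    {"firm": "B", "app_year": 2010, "retech": 20},
    {"firm": "B", "app_year": 2010, "retech": 30},
    {"firm": "B", "app_year": 2010, "retech": 100},
    {"firm": "A", "app_year": 2011, "retech": 5},
    {"firm": "A", "app_year": 2011, "retech": 7},
]
firm_year = firm_year_mean(
    winsorize_by_year(rows, "retech", lower=0.25, upper=0.75), "retech"
)
print(firm_year)
```

Computing the bounds separately for each application year keeps an outlier in one cohort from setting the cutoffs for another, which is why the winsorization step comes before the firm-year aggregation.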

Please see the paper for details on the construction of the measures. Questions can be directed to Donald Bowen. Pointers to errors or omissions, as well as corrections, are welcome.

Power Users Only

A GitHub repo contains our code library that downloads Google patent pages, parses and cleans the text, and creates variables. Users interested in creating measures from patent text or modifying ours should go there.