Research Datasets

To advance research on matters relevant to intellectual property, entrepreneurship, and innovation, the Office of Chief Economist (OCE) releases datasets that allow for study of the economics of patents and trademarks, which is an element of the USPTO economics research agenda. OCE presents these data for the first time in forms convenient for public use and academic research, consistent with the agency's responsibility to make patent and trademark information open and transparent. The release also supports White House policy championing transparency and access to government under the "data.gov" umbrella of initiatives. Because these data have not been commonly used in the research community, OCE has developed supplementary documentation that describes each dataset comprehensively and offers initial findings.

The following datasets and accompanying documentation are available for download.

Cancer Moonshot Patent Data

The USPTO Cancer Moonshot Patent Data contains detailed information on published patent applications and granted patents relevant to cancer research and development (R&D). We generate the dataset using USPTO examiner tools to execute a series of queries designed to identify the various fields and subject matter that cancer-related innovations encompass. The final dataset consists of roughly 270,000 patent documents spanning the 1976 to 2016 period and is intended to help identify promising R&D on the horizon in diagnostics, therapeutics, data analytics, and model biological systems.

Patent Examination Research Dataset (Public PAIR)

The Patent Examination Research Dataset (PatEx) contains detailed information on 9.2 million publicly viewable patent applications filed with the USPTO through December 2014. The data are sourced from the Public Patent Application Information Retrieval system (Public PAIR). The data files include information on each application’s characteristics, prosecution history, continuation history, claims of foreign priority, patent term adjustment history, publication history, and correspondence address information.

PatentsView

PatentsView is a prototype patent data visualization and analysis platform intended to increase the value, utility, and transparency of US patent data. The initiative is supported by the Office of Chief Economist in the US Patent & Trademark Office (USPTO), with additional support from the US Department of Agriculture (USDA). The PatentsView platform is built on a newly developed database that longitudinally links inventors, their organizations, locations, and overall patenting activity. The platform uses data derived from USPTO bulk data files. These data are provided for research purposes and do not constitute the official USPTO record. The data visualization tool, query tool, and flexible API enable a broad spectrum of users to examine the dynamics of inventor patenting activity over time and space. They also permit users to explore patent technologies, assignees, citation patterns, and co-inventor networks.

Historical Patent Data Files

Patent classification systems are largely designed for administrative purposes, which limits their value for most research purposes. To address this deficiency, Hall, Jaffe, and Trajtenberg (2001) developed a higher-level classification for the National Bureau of Economic Research (NBER) Patent Citation Data File by aggregating U.S. Patent Classification (USPC) classes into economically relevant technology categories. While this NBER classification scheme has proven valuable for researchers investigating US patent grants, comparable information on patent applications remained unavailable. For that reason, OCE developed a probability-matching algorithm to apply NBER classifications to patent applications as well as in-force and expired patents. From the matched data, we construct the USPTO Historical Patent Data Files, four research datasets containing time series and micro-level data by NBER sub-category on applications, grants, and in-force patents spanning two centuries of innovation.
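As a rough illustration of what aggregating USPC classes into broader technology categories involves, the following Python sketch counts patent documents by category using a lookup table. It is a minimal sketch only: the class codes, the category names, and the function count_by_category are hypothetical placeholders, not the published NBER concordance, and the sketch does not reproduce OCE's probability-matching algorithm.

```python
from collections import Counter

# Hypothetical, illustrative mapping from USPC class codes to higher-level
# technology categories. These pairings are placeholders and do NOT come
# from the NBER concordance.
CLASS_TO_CATEGORY = {
    "100": "Category A",
    "200": "Category A",
    "300": "Category B",
}


def count_by_category(uspc_classes, mapping=CLASS_TO_CATEGORY,
                      default="Unclassified"):
    """Count documents per category, given each document's primary class code.

    Class codes missing from the mapping are counted under `default`.
    """
    return Counter(mapping.get(code, default) for code in uspc_classes)


# Four documents identified only by their (made-up) primary class codes.
counts = count_by_category(["100", "200", "300", "999"])
```

A researcher working with the actual files would replace the placeholder table with the concordance distributed in the dataset documentation.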

Patent Assignment Dataset

The USPTO allows parties to record assignments of patents and patent applications in order to maintain, as far as possible, a complete history of claimed interests in a patent. The USPTO also permits recording of other documents that affect title (such as certificates of name change and mergers of businesses) or are relevant to patent ownership (such as licensing agreements, security interests, mortgages, and liens). The Patent Assignment Dataset contains detailed information on 6.8 million patent assignments and other transactions recorded at the USPTO since 1970 and involving roughly 11.1 million patents and patent applications. The Patent Assignment Dataset is updated annually.

Trademark Assignment Dataset

The USPTO allows parties to record assignments of trademark applications and registrations to maintain a complete history of claimed interests in a mark. The Trademark Assignment Dataset contains detailed information on more than 873,000 assignments and other transactions recorded at the USPTO since 1952 and involving 1.6 million unique trademark properties. The Trademark Assignment Dataset is updated annually.

Trademark Case Files Dataset

The Trademark Case Files Dataset contains detailed information on 7.7 million trademark applications filed with or registrations issued by the USPTO between 1870 and December 2015. It is derived from the USPTO main database for administering trademarks and includes data on mark characteristics and designs, prosecution events, ownership, classification, renewal history, foreign priority, and international registration. The Trademark Case Files Dataset is updated annually.

Legal Authority

The release of these data is consistent with the agency's responsibility under 35 USC 2 to make information about patents and trademarks available to the public. Providing research datasets that allow for study of the economics of patents and trademarks is also an element of the USPTO economics research agenda. The release further supports the Obama administration's policy championing transparency and access to government under the "data.gov" umbrella of initiatives.