Historical Patent Data Files

Patent classification systems are largely designed for administrative purposes, which limits their value for research. To address this deficiency, Hall, Jaffe, and Trajtenberg (2001) developed a higher-level classification for the National Bureau of Economic Research (NBER) Patent Citation Data File by aggregating U.S. Patent Classification (USPC) classes into economically relevant technology categories. While this NBER classification scheme has proven valuable for researchers investigating U.S. patent grants, comparable information on patent applications remained unavailable. For that reason, the Office of Chief Economist (OCE) developed a probability-matching algorithm to apply NBER classifications to patent applications as well as in-force and expired patents. From the matched data, we construct the USPTO Historical Patent Data Files, four research datasets containing time series and micro-level data by NBER sub-category on applications, grants, and in-force patents spanning two centuries of innovation. Our hope is that researchers will make use of these data, which for the first time enable detailed study of the complex dynamics between new filings, pendency, and abandonment, and which put into context recent trends in patenting activity, litigation, and technological change.

The USPTO Historical Patent Data Files comprise four datasets, along with the intermediate files used to generate them:

  • The annual dataset contains counts of in-force and issued patents from 1840 to 2014 by NBER sub-category.
  • The monthly dataset contains monthly counts of applications, issued patents, and in-force patents by application status, disposal type (abandoned, issued, or pending), and NBER sub-category from 1981 to 2014.
  • The monthly_disposal dataset contains counts of applications by disposal type for each monthly application cohort by NBER sub-category from 1981 to 2014.
  • The historical_masterfile contains micro-level application, NBER sub-category, and prosecution data on 2.2 million patent applications filed from 1981 to 2014 and 8.9 million patents issued through 2014.  
  • Three intermediate files (orders, orders_class, and orders_subclass) used to generate the four datasets are also available for download.
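
As a rough illustration of how cohort-level counts such as those in the monthly_disposal dataset might be used, the sketch below computes the share of each disposal type within every application cohort and NBER sub-category. The column names (cohort_month, nber_subcat, disposal_type, applications) and the sample rows are hypothetical placeholders, not the actual file layout or real USPTO figures; check the header of the downloaded file and the accompanying documentation before adapting it.

```python
import csv
import io
from collections import defaultdict

# Toy rows invented for illustration only; NOT actual USPTO figures.
# Column names are assumptions; the real file's header may differ.
SAMPLE_CSV = """cohort_month,nber_subcat,disposal_type,applications
1981-01,11,abandoned,30
1981-01,11,issued,60
1981-01,11,pending,10
1981-02,11,abandoned,25
1981-02,11,issued,75
"""


def disposal_shares(csv_file):
    """Return {(cohort, subcat): {disposal_type: share}} from a CSV file object."""
    # Sum application counts per cohort/sub-category and disposal type.
    counts = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(csv_file):
        key = (row["cohort_month"], row["nber_subcat"])
        counts[key][row["disposal_type"]] += int(row["applications"])

    # Convert counts to shares of each cohort's total applications.
    shares = {}
    for key, by_type in counts.items():
        total = sum(by_type.values())
        shares[key] = {t: n / total for t, n in by_type.items()} if total else {}
    return shares


shares = disposal_shares(io.StringIO(SAMPLE_CSV))
for (cohort, subcat), by_type in sorted(shares.items()):
    print(cohort, subcat, by_type)
```

To run the same calculation on a downloaded file, pass an open file handle (for example, from open(path, newline="")) in place of the in-memory sample, after renaming the columns to match the real header.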

A document describing these data is available and can be cited as: Marco, Alan C. and Carley, Michael and Jackson, Steven and Myers, Amanda F., The USPTO Historical Patent Data Files: Two Centuries of Innovation (June 1, 2015). SSRN working paper, available at http://ssrn.com/abstract=2616724

For questions, please email EconomicsData@uspto.gov

Data Files

File Name    2014
Output files:
105 KB
117 KB
758 MB
228 MB
280 KB
425 KB
22.8 MB
37.7 MB
Intermediate files:
49.8 KB
169 KB
75 MB
15.6 MB
3.96 MB
5.34 MB


* Note: the files marked with an asterisk have been compressed into a ZIP archive.