The Patent Claims Research Dataset contains detailed information on claims from U.S. patents granted between 1976 and 2014 and U.S. patent applications published between 2001 and 2014. The dataset is derived from the Patent Application Publication Full-Text and Patent Grant Full Text files, available at https://bulkdata.uspto.gov/, to which the Office of the Chief Economist (OCE) applied a Python algorithm to identify individual claims as well as the dependency relationships between claims. From the parsed claims text, OCE created six data files containing individually parsed claims, claim-level statistics, and document-level statistics, including newly developed measures of patent scope.
A document describing the motivation behind the patent scope measurements and the trends in them is available and can be cited as: Marco, Alan C. and Sarnoff, Joshua D. and deGrazia, Charles, Patent Claims and Patent Scope (October 2016). USPTO Economic Working Paper 2016-04. Available at SSRN: https://ssrn.com/abstract=2844964
The OCE developed these data files for public use and encourages users to identify fixes and improvements. Please send all feedback to: EconomicsData@uspto.gov
Patent Claims Research Dataset Documentation
Download individual data files:
The direct download page is here.
Note: The DTA (Stata dataset) files are saved in the Stata-13 data file format.
Note: The code used to parse the Patent Application Publication Full-Text and Patent Grant Full Text files and generate the datasets below will be made available soon.
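Illustration: Until that code is published, the idea of identifying dependency relationships between claims can be sketched roughly as follows. This is a minimal illustration only and is not the OCE algorithm. It assumes that a dependent claim refers to its parent claims by number (for example, "of claim 1" or "as claimed in claims 2 or 3"); the names CLAIM_REF, find_dependencies, and build_dependency_map are hypothetical and do not come from the dataset or its documentation.

```python
import re

# Assumption: a dependent claim cites its parent(s) as "claim N" or
# "claims N, M or K". Ranges such as "claims 1 to 3" are NOT expanded here.
CLAIM_REF = re.compile(
    r"\bclaims?\s+(\d+(?:(?:\s*(?:,|\bor\b|\band\b)\s*)+\d+)*)",
    re.IGNORECASE,
)


def find_dependencies(claim_text):
    """Return the sorted claim numbers referenced in claim_text.

    An empty list means no reference was found, which this sketch treats
    as an independent claim.
    """
    refs = set()
    for match in CLAIM_REF.finditer(claim_text):
        # Pull every number out of the matched list of claim references.
        refs.update(int(n) for n in re.findall(r"\d+", match.group(1)))
    return sorted(refs)


def build_dependency_map(claims):
    """Map each claim number to the claim numbers it refers to.

    `claims` is a dict of {claim_number: claim_text}.
    """
    return {num: find_dependencies(text) for num, text in claims.items()}


if __name__ == "__main__":
    # Made-up claim text, used only to exercise the sketch.
    sample = {
        1: "A method comprising a first step and a second step.",
        2: "The method of claim 1, wherein the first step is repeated.",
        3: "The method as claimed in claims 1 or 2, further comprising a third step.",
    }
    for number, parents in build_dependency_map(sample).items():
        print(number, parents)
```

A purely pattern-based approach like this one will miss unusual phrasings and claim ranges, so it should be read as a starting point for exploring the claims text rather than a substitute for the dependency fields provided in the data files.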