Patent Examination Research Dataset (Public PAIR)

The original release of the Patent Examination Research Dataset (PatEx) contains detailed information on 9.2 million publicly viewable patent applications filed with the USPTO through December 2014. Two updates of the dataset are also available; the most recent, posted in November 2017, covers activity through the early summer of that year. The data are sourced from the Public Patent Application Information Retrieval system (Public PAIR). There are several data files, each of which corresponds to a tab on USPTO’s Public PAIR web portal. The data files include information on each application’s characteristics, prosecution history, continuation history, claims of foreign priority, patent term adjustment history, publication history, and correspondence address information.
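
Because each data file describes a different facet of the same applications, an analysis will often need to bring rows from two files together. The following is a minimal sketch of one way to do that in Python with only the standard library. It assumes, without confirmation from this page, that both CSV files share an identifier column named application_number and are UTF-8 encoded; that column name, the function names, and the encoding are illustrative assumptions, and the appendices listed under Documentation describe the actual data elements.

```python
import csv
from collections import defaultdict


def index_by_application(csv_path, key="application_number"):
    """Group the rows of one CSV file by application identifier.

    The default key name is an assumption, not taken from the documentation.
    """
    grouped = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            grouped[row[key]].append(row)
    return grouped


def attach_history(applications_path, history_path, key="application_number"):
    """Yield (application_row, history_rows) pairs.

    history_rows is an empty list when the second file has no rows
    for that application.
    """
    history = index_by_application(history_path, key)
    with open(applications_path, newline="", encoding="utf-8") as handle:
        for application in csv.DictReader(handle):
            yield application, history.get(application[key], [])
```

This sketch holds the whole second file in memory, so it is best suited to the smaller files or to a sample; the largest files would more likely call for a database or a chunked approach.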

A document describing these data sets is available and can be cited as: Graham, Stuart J.H. and Marco, Alan C. and Miller, Richard, The USPTO Patent Examination Research Dataset: A Window on the Process of Patent Examination (November 30, 2015). Available at SSRN: https://ssrn.com/abstract=2702637

The OCE developed these data files for public use and encourages users to identify fixes and improvements. Please provide all feedback to EconomicsData@uspto.gov.

Documentation

Understanding how patent examination records become public is crucial to the proper analysis of the PatEx data. The document therefore focuses primarily on the coverage of the underlying Public PAIR data and how that coverage has evolved over time. It also includes several appendices that provide more detailed descriptions of the data elements in each of the files. These appendices can also be accessed separately:

Appendix A: Description of the Application Data Tab Release

Appendix B: Description of the Transaction History Tab Release

Appendix C: Description of the Continuity Data Tab Release

Appendix D: Description of the Foreign Priority Tab Release

Appendix E: Description of the Patent Term Adjustment Tab Release

Appendix F: Description of the Address and Attorney/Agent Tab Release

Notes Regarding 2015 PatEx Data Files

Data Files

Each of the files below can be downloaded in either Stata-14 (DTA) or CSV format.

Download a full set of data files (2014): [.dta format (5.42 GB)] [.csv format (4.33 GB)]

Download a full set of data files (2015): [.dta format (5.56 GB)] [.csv format (4.99 GB)]

Download a full set of data files (2016): [.dta format (4.98 GB)] [.csv format (4.36 GB)]

Download individual data files (the direct download pages are here: 2014, 2015, 2016).

File Name               2014 DTA   2014 CSV   2015 DTA   2015 CSV   2016 DTA   2016 CSV
application_data        1.53 GB    585 MB     1 GB       635 MB     1.1 GB     681 MB
all_inventors           229 MB     225 MB     271 MB     268 MB     348 MB     347 MB
transactions            2.55 GB    2.45 GB    2.99 GB    2.86 GB    2.02 GB    1.91 GB
event_codes             75 KB      21.2 KB    34.3 KB    22.4 KB    36.4 KB    22.8 KB
status_codes            8.56 KB    3.53 KB    5.83 KB    3.66 KB    5.87 KB    3.74 KB
continuity_parents      49.9 MB    48.7 MB    67.2 MB    52.9 MB    73.2 MB    58 MB
continuity_children     40.9 MB    40.9 MB    44 MB      43.8 MB    47.9 MB    47.7 MB
foreign_priority        36.5 MB    35.2 MB    38.3 MB    37.1 MB    40.7 MB    39.4 MB
pat_term_adj            823 MB     747 MB     943 MB     860 MB     1.12 GB    1.01 GB
pta_summary             19.6 MB    16.2 MB    22.4 MB    17.9 MB    25.1 MB    20.1 MB
correspondence_address  165 MB     243 MB     216 MB     260 MB     236 MB     280 MB
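
Several of the CSV files listed above run to multiple gigabytes, so loading one whole into memory may not be practical on an ordinary workstation. The sketch below shows one possible way to stream a file row by row with Python's standard csv module and tally the values of a single column. The function names, the UTF-8 encoding, and the file and column names in the usage note are assumptions made for illustration, not details confirmed by this page.

```python
import csv
from collections import Counter
from typing import Dict, Iterator


def iter_rows(csv_path: str) -> Iterator[Dict[str, str]]:
    """Yield one row at a time as a dict keyed by the header names.

    Only the current row is held in memory, so file size is not a constraint.
    UTF-8 encoding is assumed here.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle)


def count_by_column(csv_path: str, column: str) -> Counter:
    """Tally how often each distinct value of `column` occurs in the file."""
    counts: Counter = Counter()
    for row in iter_rows(csv_path):
        counts[row[column]] += 1
    return counts
```

For example, count_by_column("transactions.csv", "event_code") would tally events by code, provided the downloaded file has that name and contains a column called event_code; both are hypothetical here and should be checked against the relevant appendix.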

Additional Resources

A good primer for the art of patent examination is the Manual of Patent Examining Procedure.