1. How are the USPTO data products organized?

USPTO bulk electronic data products are generally organized by patents and trademarks and by issue date or publication date.

Patent products include patent grants and patent application publications in image, text, text-and-image, and bibliographic forms, as well as additional information such as patent assignments and maintenance fee events.

Trademark products include registration images, application text, assignment text, and Trademark Trial and Appeal Board (TTAB) text.

2. Why is patent and trademark information published in bulk format?

The bulk format allows users to obtain complete data sets rather than retrieving individual files one at a time. These data sets are likely to be of interest to researchers, commercial vendors, academics and consultants, and not to most members of the public.

If you just want to look up an individual patent or trademark, you can get that information without using bulk data. USPTO searchable data is viewable through the Patent Full-Text Database or Trademark Electronic Search System (TESS).

3. Why is there a fee for some data products?

The USPTO plans to eventually provide all data products online at no charge. Most data products are already available from the USPTO at no charge. A few data products are available from the USPTO for a fee, either because they are provided on physical media or because of bandwidth considerations. These products are also available online at no charge from Reed Tech Patents or Reed Tech Trademarks.

4. How current are the datasets?

Data products obtained directly from USPTO are available on the date of publication.

5. How large are the bulk data sets?

Individual bulk data files generally range in size from a few Megabytes to several Gigabytes. Collections of data can be several Terabytes.

6. Are all types of patents included in the bulk data sets?

Patent data sets include:

  • Design Patents
  • Plant Patents
  • Reexamination Certificates (available only in Patent Grant Image files)
  • Reissue Patents
  • Statutory Invention Registration (SIR) documents
  • Utility Patents

7. Are there any restrictions on using the bulk data sets?

There are no restrictions on the use of the data in these products, unless otherwise prohibited by law or specific agreement.

8. What is Extensible Markup Language (XML)?

XML is a standard way of storing structured data. It is hierarchical and can be applied to many situations (in this case, to patent grant and published application information). In general, XML files are designed to be used by programmers with specialized tools. For background information, a good reference is the Wikipedia XML article.

9. How do I view bulk data sets in XML?

The bulk data sets can be viewed with an XML reader. A generic XML reader can extract the XML element structure. In order to perform useful automated processing with the documents, however, a program needs specific knowledge of the XML schema used, which the USPTO has documented online.

The ZIP files contain concatenated XML documents with the file extension “XML.” Because each file holds many XML documents back to back, it is not a single well-formed XML file and will not be immediately readable by an ordinary XML parser. Instead, the file must be broken into individual XML documents by splitting it at the XML declarations and/or DOCTYPE declarations.
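The splitting step described above can be sketched in Python as follows. This is a minimal illustration, not a USPTO-provided tool: the function name `split_concatenated_xml` and the sample element names are hypothetical, and the sketch assumes every embedded document begins with its own XML declaration. It reads the whole text into memory, so for multi-gigabyte files a line-by-line variant of the same idea would be more practical.

```python
import re
from xml.etree import ElementTree as ET

# Matches the XML declaration that starts each embedded document.
# The \s after "xml" avoids matching other processing instructions
# such as <?xml-stylesheet ...?>.
XML_DECL = re.compile(r"<\?xml\s[^>]*\?>")


def split_concatenated_xml(text):
    """Split concatenated XML text into a list of standalone documents.

    Assumption: each embedded document starts with an XML declaration.
    """
    starts = [m.start() for m in XML_DECL.finditer(text)]
    if not starts:
        # No declaration found: treat the text as one document (or none).
        stripped = text.strip()
        return [stripped] if stripped else []
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


# Hypothetical two-document sample; the element names are made up
# for illustration and are not taken from any USPTO DTD.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<doc id="1"><title>First</title></doc>\n'
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<doc id="2"><title>Second</title></doc>\n'
)

documents = split_concatenated_xml(sample)
# Each piece can now be handed to an ordinary XML parser.
roots = [ET.fromstring(d.encode("utf-8")) for d in documents]
ids = [root.get("id") for root in roots]
print(ids)  # ['1', '2']
```

Once split, each piece is a well-formed document in its own right, so any standard XML parser can process it.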

10. Where do I find XML documentation?

www.uspto.gov/products/xml-resources.jsp

The documentation includes machine-readable Document Type Definitions (DTD) and human-readable documentation for the XML formats suitable for use by a customer who wishes to extract information from the XML files.

11. What other XML resources are available?

Links to older versions of the documentation may be found at:  www.uspto.gov/products/xml-retrospective.jsp

The USPTO generally does not update old files when it migrates to a new XML version, so users accessing data from several different years may need to use more than one version of DTDs.
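When files from several years are processed together, one way to decide which DTD applies to each document is to read its DOCTYPE declaration. The sketch below assumes the DOCTYPE uses a `SYSTEM` identifier; the helper names `doctype_info` and `group_by_dtd` and the DTD file names in the sample are hypothetical and do not correspond to actual USPTO DTDs.

```python
import re
from collections import defaultdict

# Captures the root element name and, if present, the SYSTEM identifier.
DOCTYPE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[^\s>\[]+)'
    r'(?:\s+SYSTEM\s+"(?P<system>[^"]*)")?'
)


def doctype_info(document):
    """Return (root_name, system_id) from the DOCTYPE, or None if absent."""
    match = DOCTYPE.search(document)
    if match is None:
        return None
    return match.group("root"), match.group("system")


def group_by_dtd(documents):
    """Group document indexes by the DTD named in their DOCTYPE."""
    groups = defaultdict(list)
    for index, document in enumerate(documents):
        info = doctype_info(document)
        key = info[1] if info else None
        groups[key].append(index)
    return dict(groups)


# Hypothetical sample documents; the DTD file names are invented.
docs = [
    '<?xml version="1.0"?>\n<!DOCTYPE record SYSTEM "example-v1.dtd">\n<record/>',
    '<?xml version="1.0"?>\n<!DOCTYPE record SYSTEM "example-v2.dtd">\n<record/>',
    '<?xml version="1.0"?>\n<record/>',
]

groups = group_by_dtd(docs)
print(groups)
```

Documents grouped under `None` carry no DOCTYPE and would need to be matched to a DTD version by other means, such as the publication year of the file they came from.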

12. Who do I contact for additional information?

Questions and suggestions can be directed to ipd@uspto.gov.

 

  • Where do I find XML format documentation for patent and trademark bulk data?

    Documentation for bulk patent and trademark data may be found at: http://beta.uspto.gov/learning-and-resources/xml-resources

    The documentation includes machine-readable Document Type Definitions (DTDs) and human-readable documentation for the XML formats used by an XML programmer wishing to extract information from the XML files.

  • What XML resources are available at the USPTO to process bulk data?

    Bulk data uses different versions of XML depending on the year of data publication. Links to older versions of the documentation may be found at http://www.uspto.gov/learning-and-resources/xml-resources/xml-resources-retrospective

    The USPTO generally does not update old files when it migrates to a new XML version, so users accessing data from different years may need to use multiple DTDs associated with the corresponding XML version to process the data.

  • Who do I contact for more bulk data information?

    Questions and suggestions can be directed to ipd@uspto.gov.

  • Are there any restrictions on using the bulk data products?

    There are no restrictions on the use of the data in these products, unless otherwise prohibited by law or specific agreement.

  • What is Extensible Markup Language (XML)?

    XML is a standard way of storing structured data. It is hierarchical and can be applied to many situations (in this case, to patent grant and published application information). In general, XML files are designed to be used by programmers with specialized tools. For background information, a good reference is the Wikipedia XML article.

  • How large are the bulk data products?

    Individual bulk data files generally range in size from a few Megabytes to several Gigabytes. Collections of data can be several Terabytes.

  • What types of patents are included in the bulk data products?

    Patent bulk data includes:

    • Design Patents
    • Plant Patents
    • Reexamination Certificates (available only in Patent Grant Image files)
    • Reissue Patents
    • Statutory Invention Registration (SIR) documents
    • Utility Patents

  • Why is there a fee for some bulk data products?

    The USPTO plans to eventually provide all bulk data products online at no charge. Most bulk data products are already available from the USPTO at no charge. A few bulk data products are available from the USPTO for a fee, either because they are provided on physical media or because of bandwidth considerations. These products are also available online at no charge from Reed Tech Patents or Reed Tech Trademarks.

  • How current are the bulk data products?

    Bulk data products are available on the date of publication.

  • How do I find bulk data products?

    Bulk data products are generally organized by type of intellectual property: patents or trademarks. Then they are organized by issue date or publication date.

    Patent data includes patent grants and patent application publications in image-only, text, text-and-image, and bibliographic forms, as well as additional information such as patent assignments and maintenance fee events.

    Trademark data includes application and registration images, application text, assignment text, and Trademark Trial and Appeal Board (TTAB) text.

    Download bulk data: https://eipweb.uspto.gov/soms/, http://patents.reedtech.com, or http://trademarks.reedtech.com.

  • Do I need patent or trademark bulk data products?

    Most members of the public do not need patent and trademark bulk data products. Bulk data is likely to be used by researchers, commercial vendors, academics and consultants.

    To look up an individual patent or trademark, you do not need bulk data. Search the Patent Full-Text Database or the Trademark Electronic Search System (TESS).