Further documentation concerning the MSA_ORI data file
------------------------------------------------------
(text primarily as extracted from the TAF report,
"United States Patent Grants By State, County, and
Metropolitan Area.")

3/1/2001


Federal Information Processing Standards
(FIPS) codes for each geographic area are 
provided. 

The geographic distribution of patents is based on 
the residence of the inventor whose name appears 
first on the printed patent. Only utility patents 
granted from 1990 through 1999 with a first-named inventor 
who resided in the United States are included in this 
report. (1)

The patent data used to prepare this file originate
from the USPTO's Technology Assessment and Forecast 
(TAF) database. Records in this relational database 
contain patent status and bibliographic information 
that is used by the Information Products Division, 
Technology Assessment and Forecast Branch, to develop 
statistical summaries of patent activity. A general 
description of the methodology used to generate the 
geographic distributions of patent activity in TAF
reports is contained in the sections that follow.

For additional information about this documentation, 
the MSA_ORI file, or other patent data that are available 
from the USPTO, please contact the Information Products 
Division, Technology Assessment and Forecast Branch by 
telephone at (703) 306-2600, or by facsimile at 
(703) 306-2737.
 
Written correspondence should be addressed to:

  USPTO
  Information Products Division, 
    Technology Assessment and Forecast Branch
  PK3--Suite 441
  Washington, DC 20231


DATA SOURCES AND METHODOLOGY

The distributions of patents by state and by metropolitan 
area are derived from a distribution of patents at
the county level. Unfortunately, the inventor county of 
residence seldom appears on patent records. Most patent 
records show only the inventor city and state of residence. 
Full address information, including zip code, is present 
only when an individual (as opposed to an organization) 
owns the patent. Fewer than one quarter of the patents awarded 
to U.S. resident inventors are individually owned. Therefore, 
a distribution of patents by county is, to a large extent, 
based on inventor city and state data. 

The methodology used to determine inventor county of residence 
involves matching the inventor city, state, and zip code 
(if present) with corresponding data from a geographic 
reference file that also contains the county location of 
each combination of city, state, and zip code in the United 
States.
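
The matching step described above can be sketched in Python as 
follows. This is only an illustration under stated assumptions: the 
record layout, the function names, and the sample rows (including the 
placeholder state "XX" and the county codes) are hypothetical and are 
not taken from the USPTO program or its reference file. The fallback 
from a zip code match to a city and state match is also an assumption.

```python
from collections import defaultdict

# Hypothetical reference rows: (city, state, zip code, county code).
# All values are placeholders for illustration only.
REFERENCE_ROWS = [
    ("SPRINGFIELD", "XX", "00001", "99001"),
    ("SPRINGFIELD", "XX", "00002", "99003"),
    ("RIVERTON", "XX", "00003", "99005"),
]


def build_indexes(rows):
    """Index the reference rows by (city, state) and by (city, state, zip)."""
    by_city_state = defaultdict(set)
    by_city_state_zip = defaultdict(set)
    for city, state, zip_code, county in rows:
        by_city_state[(city, state)].add(county)
        by_city_state_zip[(city, state, zip_code)].add(county)
    return by_city_state, by_city_state_zip


def candidate_counties(city, state, zip_code, indexes):
    """Return the sorted list of counties an inventor address may belong to.

    The zip code is used when it is present and matches a reference row;
    otherwise (an assumption of this sketch) only city and state are used.
    An empty list means the address could not be matched at all.
    """
    by_city_state, by_city_state_zip = indexes
    key_city = city.strip().upper()
    key_state = state.strip().upper()
    if zip_code:
        hit = by_city_state_zip.get((key_city, key_state, zip_code))
        if hit:
            return sorted(hit)
    return sorted(by_city_state.get((key_city, key_state), set()))
```

With these sample rows, an address that carries a zip code resolves to 
one county, while the same city and state without a zip code resolves 
to two candidate counties.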

The geographic reference file used by USPTO to determine 
inventor county of residence contains 185,000 records that 
identify the city, state, zip code, and county location of 
populated places in the United States. This file was developed 
from three data sources:

  * Federal Information Processing Standards Publication 55: 
    Guideline: Codes for Named Populated Places, Primary County 
    Divisions, and Other Locational Entities of the United States 
    and Outlying Areas (Department of Commerce, National 
    Institute of Standards and Technology);
 
  * the Geographic Names Information System (U.S. Department 
    of the Interior, U.S. Geological Survey); and
  
  * a data file based on the 1990 City-State file developed 
    by the United States Postal Service.

Additional records are continuously added to the geographic 
reference file to reflect naming conventions that frequently 
appear on patent records, such as abbreviations (e.g., "hgts." 
or "hts." for "heights"). New records are also added for 
frequently misspelled city names (e.g., "Tuscon" for "Tucson") 
and for new zip codes.
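
One way to picture the addition of variant-spelling records is the 
short Python sketch below. It is a hypothetical illustration, not the 
USPTO procedure: the dictionary layout, the function name 
`add_alias_records`, and the placeholder county code are assumptions. 
For simplicity it handles only whole-name variants; abbreviations 
inside a longer place name would need word-level handling.

```python
def add_alias_records(reference, aliases):
    """Return a copy of `reference` extended with variant spellings.

    reference: dict mapping (city, state) -> set of county codes
    aliases:   dict mapping a variant city spelling -> canonical spelling
    Each variant receives the same counties as its canonical spelling.
    """
    # Copy the sets so the caller's reference data is left unchanged.
    extended = {key: set(counties) for key, counties in reference.items()}
    for (city, state), counties in reference.items():
        for variant, canonical in aliases.items():
            if canonical == city:
                extended.setdefault((variant, state), set()).update(counties)
    return extended


# Hypothetical sample data; "COUNTY_A" is a placeholder county code.
reference = {("TUCSON", "AZ"): {"COUNTY_A"}}
aliases = {"TUSCON": "TUCSON"}
extended = add_alias_records(reference, aliases)
```

After the call, `extended` holds a record for the misspelled name that 
points to the same county as the correct spelling, so either form 
found on a patent record resolves to the same place.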

About 88% of utility patents can be associated with one specific 
county by matching inventor address information with records 
in the geographic reference file. The remaining 12% of utility 
patents contain inventor addresses that correspond to two or more 
counties. Usually, this situation occurs when the inventor address 
corresponds to a single, populated place that spans multiple counties. 
Occasionally, this situation can occur when the inventor address 
corresponds to two or more places within a state that share the same 
name. In some states, for example, the same township name can appear 
in ten or more counties.(2)

A relatively small number of utility patents (0.02% of total patents 
issued) cannot be associated with any county due to insufficient 
or incorrect inventor address information. These patents are identified 
as county UNKNOWN.

The program that matches inventor address data to geographic reference 
data also assigns a weight to each patent to indicate the number of 
counties associated with that patent. When a patent is associated with a
single county, the county weight equals 1. When a patent is associated 
with multiple counties within a state, a fractional weight is assigned 
(e.g., 0.5 for two counties, 0.33 for three counties). Patent counts for
each county are calculated from the sum of these weights. State and 
metropolitan area counts are calculated from the appropriate county 
totals.
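
The weighting arithmetic can be illustrated with a minimal Python 
sketch. The function names, the input layout, and the sample 
identifiers are hypothetical. The sketch assumes that each candidate 
county receives the patent's fractional weight, and it uses exact 
fractions rather than rounded decimals such as 0.33, so that every 
patent contributes exactly 1 to the overall total.

```python
from collections import defaultdict
from fractions import Fraction


def county_counts(patent_counties):
    """Sum fractional weights per county.

    patent_counties: dict mapping a patent id -> list of candidate counties.
    A patent with n candidate counties adds 1/n to each of them; a patent
    with no candidate county is counted under "UNKNOWN".
    """
    counts = defaultdict(Fraction)
    for counties in patent_counties.values():
        if not counties:
            counts["UNKNOWN"] += 1
            continue
        weight = Fraction(1, len(counties))
        for county in counties:
            counts[county] += weight
    return dict(counts)


def roll_up(counts, county_to_area):
    """Aggregate county counts into larger areas (state or metropolitan area).

    Counties missing from `county_to_area` are left out of the totals.
    """
    totals = defaultdict(Fraction)
    for county, value in counts.items():
        area = county_to_area.get(county)
        if area is not None:
            totals[area] += value
    return dict(totals)


# Hypothetical sample: four patents and three counties A, B, and C.
patents = {"P1": ["A"], "P2": ["A", "B"], "P3": ["A", "B", "C"], "P4": []}
counts = county_counts(patents)
metro_totals = roll_up(counts, {"A": "M1", "B": "M1"})
```

In this sample, county A receives 1 + 1/2 + 1/3 = 11/6 patents, and 
the four county-level counts (including UNKNOWN) sum to exactly 4, 
the number of patents.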

METROPOLITAN AREAS

The numbers of patents awarded to inventors within a county do not 
necessarily reflect the level of inventive activity that is occurring 
within that county. County totals are based on inventor county of
residence, which is not necessarily the same as the inventor county of 
employment. A distribution of patent activity by metropolitan area 
is, by definition, more likely to encompass both the residential and 
the employment areas of an inventor.

A metropolitan area is a geographic area that contains a large population 
nucleus, plus surrounding communities with a high degree of social and 
economic integration with that population nucleus. At a minimum, the 
population nucleus consists of a city or urbanized area with a population 
of at least 50,000 inhabitants.

The Office of Management and Budget established criteria to define several 
categories of metropolitan areas. These criteria consider factors such as 
percent urban, population density, and patterns of commuting to work to 
establish the limits of the metropolitan area. With the exception of the 
New England region, the geographic limits of a metropolitan area are 
defined by county boundaries. This facilitates the comparison 
of various socio-economic data that are compiled at the county level. In 
the New England region, cities and towns have a high degree of administrative 
importance and these are typically used to define the boundaries of a 
metropolitan area. One exception is the New England County Metropolitan Area
(NECMA). The limits of a NECMA are defined by county boundaries, instead of 
cities and towns, in order to facilitate analysis of socio-economic data that 
cannot be compiled at the sub-county level.

Other categories of metropolitan area include the Metropolitan Statistical 
Area (MSA), formerly known as a Standard Metropolitan Statistical Area, 
the Consolidated Metropolitan Statistical Area (CMSA), and the Primary 
Metropolitan Statistical Area (PMSA). Generally, an MSA includes a city or 
urban area with at least 50,000 inhabitants, plus surrounding counties 
(or subcounties in New England) with metropolitan characteristics. The CMSA 
is a much larger metropolitan area that has at least one million inhabitants 
and meets certain other definitional criteria. CMSAs are composed of PMSAs, 
and PMSAs are composed of counties (or subcounties in New England).

New metropolitan areas can be established and changes in the geographic 
composition of current metropolitan areas can occur each year. The metropolitan 
areas identified in this report were announced by the Office of Management and 
Budget and were effective July 1, 1999. Some metropolitan areas in this report 
were not established for the entire period 1990-1999, although patent data are 
presented for the entire period. Thus, the total number of patents awarded to 
inventors who resided in metropolitan areas can change for earlier years if 
new or modified metropolitan areas are announced in the current year. In 
addition, changes to previously published data can result from corrections to 
the patent file. The Office for Patent and Trademark Information continuously 
monitors the content of the patent file to ensure the accuracy of the
data. The number of patents associated with a particular geographic area may 
change slightly as state/country code errors in the patent file are identified 
and corrected.

--------------------------
(1) The majority of patents issued by the USPTO are utility (i.e., 
invention) patents. Other types of patents and patent documents 
issued by USPTO, but not included in this report, are plant 
patents, design patents, statutory invention registration documents, 
and defensive publications.

(2) Although relatively few patents contain place names that correspond 
to multiple places within a single state, this can be a serious 
source of error in the number of patents associated with some counties. 
This is particularly true if a larger city with a high volume of patent 
activity (e.g., Mountain View, CA) shares the same name as a small
town with little or no patent activity. The procedure for identifying 
these situations was to first identify any incorporated place, census 
designated place or minor civil division with a 1990 population of 
2,500 or more inhabitants. These place names were then compared to place 
names on the geographic reference file to identify multiple places with the 
same name within each state. The resulting place names were queried on 
USPTO's Automated Patent System to determine whether patents were issued to 
inventors in one or more of the places with the same name. The residence 
of other inventors and assignees was used to determine the probable 
geographic location of each patent. If research indicated high patent 
activity was concentrated in some, but not all, of the locations, duplicate 
place names without patent activity were deleted from the geographic 
reference file.

