PRESENTATION BY GREGORY AHARONIAN
SOURCE TRANSLATION AND OPTIMIZATION
MR. AHARONIAN:  Before I address the topic -- I'll mainly be
speaking about software prior art -- there were three kind of
little tidbits that came out of other discussions I thought
I'd share with everyone.
About a year ago, a group either with the German Patent Office
or the European Patent Office did a study of the maintenance
fee renewal process for German patents.  In Germany I guess
they're done every year as opposed to being done every three
or four years, as in the U.S.  So that from an economic
analysis point of view, yearly data is very easy to analyze.
They found that for the computer software industry -- no, for
the computer industry as a whole, that the average length of
the patent was about six or seven years before they
effectively stopped renewing the patent.  So these talks about
lowering the patent life, I mean you could go down as far as
about seven years.  And if you actually look at renewal rates,
it would have absolutely no impact.
It's a little known study, but it's one that probably should
be circulated more widely.
The second thing that was also talked about is, there are a
growing number of investment funds in New York City that are
pooling money to find people with patents so they can go chase
lawsuits and stuff.  So that all these problems we're talking
about are going to get a lot worse because there's going to be
a lot more money floating around to play these games.  Especially in
the field of software, there's a definite window of time
before it gets really messy with the monies being thrown into
this stuff.
And the third is a patent I just came across out of Microsoft
that -- I had seen something in there that I had never seen
before.  In the preamble to the specification, they said that
part of the patent specification contained copyrighted
material.  And there was a warning in there of some sort.
It was the first time I've actually seen anything copyrighted
inside of a patent.  And I'm wondering if this is going to be
a whole new family of hybrid copyright patent things that are
going to confuse everyone to death.
COMMISSIONER GOFFNEY:  Did that happen to be code that was in
there?
MR. AHARONIAN:  I didn't look.  I was just examining
something over at the Public Search Room, and the first page
had this paragraph that I Xeroxed because it was just
something I'd never seen before.
PARTICIPANT:  For clarification, it's a notice that says that,
for purposes -- that you can copy this patent application or
patent, once it issues, for any purpose you want related to
the patent application.  But you can't -- all other copyright
rights are reserved.  And it's a common practice by
practitioners.
MR. AHARONIAN:  I'd never seen it before, and I thought it
was kind of interesting.  A couple of us were chuckling.
I'm here to talk about software prior art, and I happen to
know a little bit about the subject.
Software prior art comes up in six areas of activities.  In
the information disclosure document, when the applicant files
a document, during the patent examination when the examiner is
dealing with issues of novelty and obviousness, during
reexaminations when somebody is going to challenge it,
infringement lawsuits, and the circuit court decisions.
Each of these needs to have access to what's been done in the
field before.  Actually, in terms of economic activities,
which dwarfs all software prior art activities, there's just
some general software technology trends for reuse, well, they
have actually the same question:  What is out there that
exists that can be used?
For many years now, at least eight years now, I've been
maintaining a very large -- the largest software prior art
reuse database in the country.  I have information on over 15,000
computer programs coming out in government, corporate, and
university facilities, 5,000 patents, and over 100,000
abstracts to articles in the field.
This is in a sense an active collection.  Each of the items
are items that I've actively sought out to include in my
database and examined either in depth or just briefly to look
at them.
I'm located in the Boston area, and in this modern era I'm
located on the Internet.
One of the things I do is that every year or two years I
publish a directory of -- what I call the Government Source
Code Directory -- since a lot of the public domain software,
a lot of the university software, even a lot of the corporate
software is actually funded under government contract, except
for obviously corporate commercial software.
The current directory has the titles to about 10,000 programs. 
It's actually a pretty good guide to both what is
state-of-the-art, what is historical, how to classify
software.  It's just such a large body of information that
there is a lot you can do with it.
I run a business of helping companies get at the software,
helping them reuse it in their business practices, helping
them examine the technology inside of it, things of that
nature.  It's a very rich source material.  This country
spends about $50 billion a year developing this stuff.  And
there are a lot of good programmers working here, so that
there's a tremendous wealth of technology available.
I also, in recent years, as software patenting has become
active, and I tend to share a lot of the information I have,
I've started up something called Internet Patent News Service,
where each week I mail out over the Internet the titles and
numbers to the most recent patents and the most recent gazette
that happens to hit the Boston Public Library where I do a lot
of my research.
I have about 1,000 subscribers around the world, many of which
are actually rebroadcast sites, gopher sites where they
collect the information and make it available -- 880 of the
subscribers get the news service, where I, for example,
announce the PTO hearings and other such things.  Nine hundred
or so are electronic, most of which are software.  There is a
tremendous demand on the Internet for software patenting
information.  These people would kill for almost anything.
I have all types of people, government agencies, people in 35
states, 28 countries, corporations, universities, and one of
the Texas patent depositories got tired of getting their data
so late down there they just figured they'd get it through me.
This is a map that was collected by one of the Internet node
maintainers of traffic flow over the Internet.  And it's kind
of a pretty picture which I like showing to people.  But it
also kind of shows both the sites where a lot of software
activity is going on in the U.S., where I track a lot of the
software.  That does come out, where a lot of the software
prior art is being made, and where actually a lot of my Patent
News Service subscribers are.
It's all pretty much the same thing.  And not surprisingly,
there are heavy concentrations in New York, Boston, and
Washington on the East Coast, obviously.  And then up on the
West Coast, it's the Bay Area, Silicon Valley, and down in LA,
San Diego.  There's a decent movement in Texas and somewhere
in the Midwest.  But for the most part, it's regionalized into
the five big tech cities of the country that are up there.
Where do you find software prior art?  Well, the sources I
find out when I'm traveling around the country are in these
seven categories:  technical reports, both government,
corporate, and academic; journal articles; conference
proceedings; theses and books -- and university theses are
probably one of the richest sources of software prior
art, in a timely sense; commercial products; Internet files;
bulletin board systems, which are in many cases not part of
the Internet formally but tend to store growing amounts of
information; and in software patents.
Each of those sources of information has a legacy of history
behind the organizations involved with them.  And you have to
learn about them to learn how to search through them.
What types of software prior art do I search for?  Well,
obviously, source code is the most obvious one to look for,
since that is the best description of a program.
Then there are object libraries and executables.  There are
flow charts and state charts.  There is pseudo-code which you
see in a lot of journal articles.  There are patent claims. 
Obvious things, obvious descriptions of the software.
Then there are some things that kind of border on the software
field, the SPICE and VHDL circuit description languages, and
with the growing convergence of hardware and software, they
too become prior art of a sort that have to be searched for,
even though to most people concerned searching for software
prior art, they would not look in such sources.
Then spreadsheets and numerical data also can be considered
software as a form.
Now, when I think of software prior art -- and what follows is
a series of slides that I'm going to give you a tour of where
I hang out most of my life.  When I think of software prior
art, I think of dusty, grungy old basements.  That's where
most of this type of literature can be found.  You have to
look for it.  But this is where you're going to find it a lot
of times -- dark basements, with endless stacks of materials
that you have to search through one by one.
Most of this stuff is not on computer databases at all.  The
only way you're going to really find any of this stuff is to
pull out these volumes one by one and flip through them.  It's
a very tedious, lengthy process.  It's the only way it really
can be done.
This happens to be a collection of books dealing purely with
software.  So in some cases, the information is fairly
compact.  These I think are programming language books in a
variety of languages.  I think those green books up top are
all the ADA books.
In some cases, the information is tightly concentrated, and it
does make the search easier.
In other cases, for example, the bookcases you see in the
background, are for the subject matters of physics and
engineering.  Normally you wouldn't consider searching through
such stacks for software, especially since they're really not
in software.  But there is a growing amount of software prior
art in such subjects.  Physicists do a lot of cutting edge
software development that does qualify as prior art.  And when
you deal with stacks like that, the books are very scattered
in there and it takes a long time to go through them all.
Another source is journals.  This is a series of journals. 
And most of the journals up there come from one of the leading
societies, the ACM.  I think the third and fourth rows up
there are mostly the ACM journals.
But there are a variety of other journals in related fields to
software that all have to be searched through, all coming out
every month, all potentially sources of prior art.  And each
journal has a family of editors and reviewers behind it,
associations behind it.  There are certain styles of software
in there.  Knowing that is very important to tracking software
prior art.
The journals you just saw up there were one current month's
work for all the journals from like A to Z.  There are
tremendous numbers of them.  These are all the back journals. 
In this case, for those familiar with searching for such
stuff, the IEEE has the previous journals around.  They just
use lots of different colors for their journal covers, and you
can usually identify which section of the library deals with
them.  But in each case you have to flip through each one of
these volumes to find stuff.
Then there are collections of technical reports.  And these
tend to be even more unorganized and scattered about.  But
even there, there is structure to how they are kept.  If
you'll see, in the middle you'll see some white journals with
a colorful band across them.  Those happen to belong to the
Electric Power Research Institute, and they actually do some
software development which they've had patents on.  So you
have to search through all of them.
Next to them are some orange journals which are characteristic
of the Japanese Atomic Energy Research Institute.  And again,
they have software.  In that case, it's even more difficult to
search for that stuff because their reports tend to be all in
Japanese except for an English abstract and source code,
usually FORTRAN or something.  And you know, I can read
FORTRAN, but I'm still trying to learn to read Japanese.
But again, it's there and it's something that has to be dealt
with.
These are again even older technical reports.  These are so
old that they've lost most of their colors.  The orange ones
are the NASA reports, and NASA tends to have bright orange and
dull blue covers.  The middle ones are from a European defense
group, AGARD, that has a lot of software prior art.
There are endless numbers of these in libraries all over the
country, that require one to go through them.
In some cases, the volumes of reports are so great that no
library could contain them all, and you reduce them down with
microfiche.  This happens to be one subsection of a collection
of microfiche for NASA technical reports.
Again, you have to go through each one of these one by one,
stick them into the microfiche reader, and examine them to see
if they have prior art, flow charts, whatever.  It's not a fun
process, and I've got a fair number of cuts on my fingers over
the years from going through these things.
Again, here are more cabinets of microfiche.  And in the
background you see microform, which is a different type of
film, with its set of printers.  And it's just endless volumes
of these things.
One of the richest sources of software prior art is
university theses, because universities tend to let their students do
things that are as wacky as wacky can be, mainly because
students are there to learn how to do wacky things as opposed
to doing anything really meaningful.  So a lot of the ideas --
I mean, something like Compton's patent, I initially laughed
at it because I've seen theses in the '80s that did all types
of things with CD ROMs, because back then they were first
coming out.  And some student said, hey, there's a new CD ROM,
let me try doing something educational with it.
Unfortunately most thesis information is not on any database,
and it's very hard to find short of actually going to each
university and flipping through these reports one by one.  It
can be a pain.
And finally, there is in the academic community, even in the
corporate research community, the preprint system where people
tend to distribute copies of their reports before they're
published, or in many cases they don't even get published,
they just pass them out anyway.
These things are very unorganized, and you tend to find them
in stacks on carts.  I think this is actually an IBM library
in the Boston area I happened to be floating through. 
Searching that stuff is a pain.
Now, increasingly computers are making an impact on the
library world.  This is the main reference section for one
such library.  But in terms of prior art, most of the really
interesting stuff predates most databases so that, while such
computer systems will help in the future, they really won't
help in the past.
Of course, I complain about a lot of the places I hang out. 
But this happens to be out the window of one of the MIT
libraries, and during the summer it's a very pretty view.  So
it is somewhat relaxing sometimes in doing my prior art
searches.
Now, in San Jose -- and once again I'd like to reiterate it
out here -- recent developments in the hardware design world
are really blurring the distinctions between hardware and
software.  And I'll disagree with some of the others who say
that there are such distinctions.  While this will have an
impact on patenting issues and procedures, it has a great
impact on software prior art because it opens up tremendous
sections of hardware research over the past 20 years as
potential software prior art.
There exist programs that allow me to scan in circuits, what
anyone would consider to be pure pieces of hardware, and turn
them into a software algorithm.  That means that in building
a software prior art database you have to include all of the
hardware prior art that exists out there because nowadays it
can be turned into software.
And based on some counts I've made, there's at least twice as
much hardware prior art as there is software prior art, so it
basically triples the size of such an effort.
This is just a little article on a company in Germany that
combined CASE tools, which is basically software engineering,
with their hardware design tools, so that within one
environment for the most part the engineer doesn't even care
what the end result will be, hardware or software.  He's just
worrying about processes and algorithms and devices and things
like that.  At the end he pushes a button to get out a chip or
a computer program.  So that this issue of prior art is
becoming more complicated even as we're holding these
hearings.
Building prior art databases is not for amateurs.  I mean,
over the past ten years at least eight government efforts have
tried to do similar things, and they all have failed for a
variety of reasons.  It's a very complicated process.  There
are at least 10 different knowledge classification schemes
I've had to learn over the years, Library of Congress, IEEE
has one, ACM has one, I have two, the Patent Office has one,
there's the Dewey decimal system.
When you're going through all these sources of information out
there, each one classifies its stuff differently.  And to do
these searches effectively and cost-efficiently, you have to
know each one.  It's a tremendous amount of information.
There have been suggestions that the Internet could be a
substitute.  I'm very skeptical.  I think that doing prior art
searches and requests over the Internet has actually caused
more problems than it will solve.
In recent months a variety of different people have actually
asked me how much it would cost to build a truly useful
software prior art database.  My guess, based on what I've
been doing over the past eight years, is that you need a
minimum of $10 million, plus $2 million a year as maintenance.
Now, that might seem a lot, but remember, this is to track a
$50 billion a year development process.  And out of that, $10
million is fairly minor.  But given the vast amount of
literature that already exists out there, you're going to need
a very rapid development effort to catch up with all of that,
plus future development efforts to do so into the future.
With the databases I already have in my knowledge, I could
reject about a quarter of all existing software patents.  So
I would think there is indeed a problem.  And most people have
recognized that.
As a kind of incentive to the Patent Office, if they're
considering actually building such databases, the software
prior art database would have even greater benefits to the
U.S. software development community.  And you could score a
fair number of brownie points by helping them out at the same
time.
The last slide illustrates some of the problems we're now
facing with software patents.  This is from the January 4th,
1994 Official Gazette.  And it's a patent from IBM for
choosing items off of a menu.
Now, the Official Gazette includes the first claim and a
diagram of the best mode embodiment.  And it is inconceivable
to me that in 1994 the best mode embodiment of a menu
selection system is what appears in the Gazette and what
appears in the patent.  I haven't examined this patent in
detail, but I suspect what we see there reflects what's in the
rest of it.
Those types of menu selection systems date back to the '60s. 
And the fact that something was issued with such diagrams
makes me kind of nervous that the problem is even worse than
we think it is.
But like I said, you flip open recent Gazettes, and you'll see
patents in there that are truly questionable.
That's it.
COMMISSIONER LEHMAN:  Thank you very much, Mr. Aharonian.
Well, I think we did pretty well today for a snowy day.  We
actually got all but a handful of people that were supposed to
testify, and we got a couple more from yesterday.  And I want
to thank everybody for coming through the snow.
As I indicated, this hearing transcript will be made available
after February 21st.  But we're happy to accept more
supplemental information, either written information that can
be sent directly to us, or information that can be sent to
Jeff Kushan on the Internet.
We're always open to information at any time, even two, three
years from now if you -- you know, reelect President Clinton,
we'll be available for information even then, and then maybe
President Gore, and then maybe President Hillary Clinton.  By
then we'll have the prior art database completely resolved,
that problem.
So anyway, thank you very much, and have a good day.
(Whereupon, the hearing in the above-entitled matter was
adjourned.)

Due to the inclement weather, a number of speakers were unable
to attend or provide oral remarks.  Prepared remarks from
these individuals have been included in the transcripts in
response to their request.