PRESENTATION BY GREGORY AHARONIAN
SOURCE TRANSLATION AND OPTIMIZATION
MR. AHARONIAN: Before I address the topic -- I'll mainly be speaking about software prior art -- there were three kind of little tidbits that came out of other discussions I thought I'd share with everyone. About a year ago, a group either with the German Patent Office or the European Patent Office did a study of the maintenance fee renewal process for German patents. In Germany I guess they're done every year as opposed to being done every three or four years, as in the U.S. So that from an economic analysis point of view, yearly data is very easy to analyze. They found that for the computer software industry -- no, for the computer industry as a whole, that the average length of the patent was about six or seven years before they effectively stopped renewing the patent. So these talks about lowering the patent life, I mean you could go down as far as about seven years. And if you actually look at renewal rates, it would have absolutely no impact. It's a little known study, but it's one that probably should be circulated more widely. The second thing that was also talked about is, there are a growing number of investment funds in New York City that are pooling money to find people with patents so they can go chase lawsuits and stuff. So that all these problems we're talking about are going to get a lot worse because there's going to be a lot more floating around to play these games. Especially in the field of software, there's a definite window of time before it gets really messy with the monies being thrown into this stuff. And the third is a patent I just came across out of Microsoft that -- I had seen something in there that I had never seen before. In the preamble to the specification, they said that part of the patent specification contained copyrighted material. And there was a warning in there of some sort. It was the first time I've actually seen anything copyrighted inside of a patent.
And I'm wondering if this is going to be a whole new family of hybrid copyright patent things that are going to confuse everyone to death.
COMMISSIONER GOFFNEY: Did that happen to be code that was in there?
MR. AHARONIAN: I didn't look. I was just examining something over at the Public Search Room, and the first page had this paragraph that I Xeroxed because it was just something I'd never seen before.
PARTICIPANT: For clarification, it's a notice that says that, for purposes -- that you can copy this patent application or patent, once it issues, for any purpose you want related to the patent application. But you can't -- all other copyright rights are reserved. And it's a common practice by practitioners.
MR. AHARONIAN: I'd never seen it before, and I thought it was kind of interesting. A couple of us were chuckling. I'm here to talk about software prior art, and I happen to know a little bit about the subject. Software prior art comes up in six areas of activities. In the information disclosure document, when the applicant files a document, during the patent examination when the examiner is dealing with issues of novelty and obviousness, during reexaminations when somebody is going to challenge it, infringement lawsuits, and the circuit court decisions. Each of these needs to have access to what's been done in the field before. Actually, in terms of economic activities, which dwarfs all software prior art activities, there's just some general software technology trends for reuse, well, they have actually the same question: What is out there that exists that can be used? For many years now, at least eight years now, I've been maintaining a very large -- the largest software prior art reuse database in the country. I have information on over 15,000 computer programs coming out in government, corporate, and university facilities, 5,000 patents, and over 100,000 abstracts to articles in the field. This is in a sense an active collection.
Each of the items are items that I've actively sought out to include in my database and examined either in depth or just briefly to look at them. I'm located in the Boston area, and in this modern era I'm located on the Internet. One of the things I do is that every year or two years I publish a directory of -- what I call the Government Source Code Directory -- since a lot of the public domain software, a lot of the university software, even a lot of the corporate software is actually funded under government contract, except for obviously corporate commercial software. The current directory has the titles to about 10,000 programs. It's actually a pretty good guide to both what is state-of-the-art, what is historical, how to classify software. It's just such a large body of information that there is a lot you can do with it. I run a business of helping companies get at the software, helping them reuse it in their business practices, helping them examine the technology inside of it, things of that nature. It's a very rich source material. This country spends about $50 billion a year developing this stuff. And there are a lot of good programmers working here, so that there's a tremendous wealth of technology available. I also, in recent years, as software patenting has become active, and I tend to share a lot in the information I have, I've started up something called Internet Patent News Service, where each week I mail out over the Internet the titles and numbers to the most recent patents and the most recent gazette that happens to hit the Boston Public Library where I do a lot of my research. I have about 1,000 subscribers around the world, many of which are actually rebroadcast sites, gopher sites where they collect the information and make it available -- 880 of the subscribers get the news service, where I, for example, announce the PTO hearings and other such things. Nine hundred or so are electronic, most of which are software. 
There is a tremendous demand on the Internet for software patenting information. These people would kill for almost anything. I have all types of people, government agencies, people in 35 states, 28 countries, corporations, universities, and one of the Texas patent depositories got tired of getting their data so late down there they just figured they'd get it through me. This is a map that was collected by one of the Internet node maintainers of traffic flow over the Internet. And it's kind of a pretty picture which I like showing to people. But it also kind of shows both the sites where a lot of software activity is going on in the U.S., where I track a lot of the software. That does come out, where a lot of the software prior art is being made, and where actually a lot of my Patent News Service subscribers are. It's all pretty much the same thing. And not surprisingly, there are heavy concentrations in New York, Boston, and Washington on the East Coast, obviously. And then up on the West Coast, it's the Bay Area, Silicon Valley, and down in LA, San Diego. There's a decent movement in Texas and somewhere in the Midwest. But for the most part, it's regionalized into the five big tech cities of the country that are up there. Where do you find software prior art? Well, the sources I find out when I'm traveling around the country are in these seven categories: technical reports, both government, corporate, and academic; journal articles; conference proceedings; theses and books -- and university theses are probably one of the richest sources of software prior art, in a timely sense; commercial products; Internet files; bulletin board systems, which are in many cases not part of the Internet formally but tend to store growing amounts of information; and in software patents. Each of those sources of information has a legacy of history behind the organizations involved with them. And you have to learn about them to learn how to search through them.
What types of software prior art do I search for? Well, obviously, source code is the most obvious one to look for, since that is the best description of a program. Then there are object libraries and executables. There are flow charts and state charts. There is pseudo-code, which you see in a lot of journal articles. There are patent claims. Obvious things, obvious descriptions of software. Then there are some things that kind of border on the software field, the SPICE and VHDL circuit description languages, and with the growing convergence of hardware and software, they too become prior art of a sort that have to be searched for, even though to most people concerned with searching for software prior art, they would not look in such sources. Then spreadsheets and numerical data also can be considered software as a form. Now, when I think of software prior art -- and what follows is a series of slides that I'm going to give you a tour of where I hang out most of my life. When I think of software prior art, I think of dusty, grungy old basements. That's where most of this type of literature can be found. You have to look for it. But this is where you're going to find it a lot of times -- dark basements, with endless stacks of materials that you have to search through one by one. Most of this stuff is not on computer databases at all. The only way you're going to really find any of this stuff is to pull out these volumes one by one and flip through them. It's a very tedious, lengthy process. It's the only way it really can be done. This happens to be a collection of books dealing purely with software. So in some cases, the information is fairly compact. These I think are programming language books in a variety of languages. I think those green books up top are all the Ada books. In some cases, the information is tightly concentrated, and it does make the search easier.
In other cases, for example, the bookcases you see in the background, are for the subject matters of physics and engineering. Normally you wouldn't consider searching through such stacks for software, especially since they're really not in software. But there is a growing amount of software prior art in such subjects. Physicists do a lot of cutting edge software development that does qualify as prior art. And when you deal with stacks like that, the books are very scattered in there and it takes a long time to go through them all. Another source is journals. This is a series of journals. And most of the journals up there come from one of the leading societies, the ACM. I think the third and fourth rows up there are mostly the ACM journals. But there are a variety of other journals in related fields to software that all have to be searched through, all coming out every month, all potentially sources of prior art. And each journal has a family of editors and reviewers behind it, associations behind it. There are certain styles of software in there. Knowing that is very important to tracking software prior art. The journals you just saw up there were one current month's work for all the journals from like A to Z. There are tremendous numbers of them. These are all the back journals. In this case, for those familiar with searching for such stuff, the IEEE has the previous journals around. They just use lots of different colors for their journal covers, and you can usually identify which section of the library deals with them. But in each case you have to flip through each one of these volumes to find stuff. Then there are collections of technical reports. And these tend to be even more unorganized and scattered about. But even there, there is structure to how they are kept. If you'll see, in the middle you'll see some white journals with a colorful band across them.
Those happen to belong to the Electric Power Research Institute, and they actually do some software development which they've had patents on. So you have to search through all of them. Next to them are some orange journals which are characteristic of the Japanese Atomic Energy Research Institute. And again, they have software. In that case, it's even more difficult to search for that stuff because their reports tend to be all in Japanese except for an English abstract and source code, usually in FORTRAN or something. And you know, I can read FORTRAN, but I'm still trying to learn to read Japanese. But again, it's there and it's something that has to be dealt with. These are again even older technical reports. These are so old that they've lost most of their colors. The orange ones are the NASA reports, and NASA tends to have bright orange and dull blue covers. The middle ones are from a European defense group, AGARD, that has a lot of software prior art. Endless number of these in these libraries all over the country, that require one to go through them. In some cases, the volumes of reports are so great that no library could contain them all, and you reduce them down with microfiche. This happens to be one subsection of a collection of microfiche for NASA technical reports. Again, you have to go through each one of these one by one, stick them into the microfiche reader, and examine them to see if they have prior art, flow charts, whatever. It's not a fun process, and I've got a fair number of cuts on my fingers over the years from going through these things. Again, here are more cabinets of microfiche. And in the background you see microform, which is a different type of film, with its set of printers. And it's just endless volumes of these things.
One of the richest sources of software prior art are university theses, because they tend to let their students do things that are as wacky as wacky can be, mainly because students are there to learn how to do wacky things as opposed to doing anything really meaningful. So a lot of the ideas -- I mean, something like Compton's patents I initially laughed at it because I've seen theses in the '80s that did all types of things with CD ROMs, because back then they were first coming out. And some student said, hey, there's a new CD ROM, let me try doing something educational with it. Unfortunately most thesis information is not on any database, and it's very hard to find short of actually going to each university and flipping through these reports one by one. It can be a pain. And finally, there is in the academic community, even in the corporate research community, the preprint system where people tend to distribute copies of their reports before they're published, or in many cases they don't even get published, they just pass them out anyway. These things are very unorganized, and you tend to find them in stacks on carts. I think this is actually an IBM library in the Boston area I happened to be floating through. Searching that stuff is a pain. Now, increasingly computers are making an impact on the library world. This is the main reference section for one such library. But in terms of prior art, most of the really interesting stuff predates most databases so that, while such computer systems will help in the future, they really won't help in the past. Of course, I complain about a lot of the places I hang out. But this happens to be out the window of one of the MIT libraries, and during the summer it's a very pretty view. So it is somewhat relaxing sometimes in doing my prior art searches. 
Now, in San Jose -- and once again I'd like to reiterate it out here -- recent developments in the hardware design world are really blurring the distinctions between hardware and software. And I'll disagree with some of the others who say that there are such distinctions. While this will have an impact on patenting issues and procedures, it has a great impact on software prior art because it opens up tremendous sections of hardware research over the past 20 years as potential software prior art. There exist programs that allow me to scan in circuits, what anyone would consider to be a pure piece of hardware, and turn them into a software algorithm. That means that in building a software prior art database you have to include all of the hardware prior art that exists out there because nowadays it can be turned into software. And based on some counts I've made, there's at least twice as much hardware prior art as there is software prior art, so it basically triples the size of such an effort. This is just a little article on a company in Germany that combined CASE tools, which is basically software engineering, with their hardware design tools, so that within one environment for the most part the engineer doesn't even care what the end result will be, hardware or software. He's just worrying about processes and algorithms and devices and things like that. At the end he pushes a button to get out a chip or a computer program. So that this issue of prior art is becoming more complicated even as we're holding these hearings. Building prior art databases is not for amateurs. I mean, over the past ten years at least eight government efforts have tried to do similar things, and they all have failed for a variety of reasons. It's a very complicated process. There are at least 10 different knowledge classification schemes I've had to learn over the years, Library of Congress, IEEE has one, ACM has one, I have two, the Patent Office has one, there's the Dewey Decimal System.
When you're going through all these sources of information out there, each one classifies its stuff differently. And to do these searches effectively and cost-efficiently, you have to know each one. It's a tremendous amount of information. There have been suggestions that the Internet could be a substitute. I'm very skeptical. I think that doing prior art searches and requests over the Internet has actually caused more problems than it will solve. In recent months a variety of different people have actually asked me how much it would cost to build a truly useful software prior art database. My guess is, based on what I've been doing over the past eight years, is that you need a minimum of $10 million, plus $2 million a year as maintenance. Now, that might seem a lot, but remember, this is to track a $50 billion a year development process. And out of that, $10 million is fairly minor. But given the vast amount of literature that already exists out there, you're going to need a very rapid development effort to catch up with all of that, plus future development efforts to do so into the future. With the databases I already have in my knowledge, I could reject about a quarter of all existing software patents. So I would think there is indeed a problem. And most people have recognized that. As a kind of incentive to the Patent Office, if they're considering actually building such databases, the software prior art database would have even greater benefits to the U.S. software development community. And you could score a fair number of brownie points by helping them out at the same time. The last slide illustrates some of the problems we're now facing with software patents. This is from the January 4th, 1994 Official Gazette. And it's a patent from IBM for choosing items off of a menu. Now, the Official Gazette includes the first claim and a diagram of the best mode embodiment. 
And it is inconceivable to me that in 1994 the best mode embodiment of a menu selection system is what appears in the Gazette and what appears in the patent. I haven't examined this patent in detail, but I suspect what we see there reflects what's in the rest of it. Those types of menu selection systems date back to the '60s. And the fact that something was issued with such diagrams makes me kind of nervous that the problem is even worse than we think it is. But like I said, you flip open recent Gazettes, and you'll see patents in there that are truly questionable. That's it.
COMMISSIONER LEHMAN: Thank you very much, Mr. Aharonian. Well, I think we did pretty well today for a snowy day. We actually got all but a handful of people that were supposed to testify, and we got a couple more from yesterday. And I want to thank everybody for coming through the snow. As I indicated, this hearing transcript will be made available after February 21st. But we're happy to accept more supplemental information, either written information that can be sent directly to us, or information that can be sent to Jeff Kushan on the Internet. We're always open to information at any time, even two, three years from now if you -- you know, reelect President Clinton, we'll be available for information even then, and then maybe President Gore, and then maybe President Hillary Clinton. By then we'll have the prior art database completely resolved, that problem. So anyway, thank you very much, and have a good day.
(Whereupon, the hearing in the above-entitled matter was adjourned.)
Due to the inclement weather, a number of speakers were unable to attend or provide oral remarks. Prepared remarks from these individuals have been included in the transcripts in response to their request.