Consolidation is key in much of today’s global economy, and nowhere is that as evident as with Optical Character Recognition (OCR) software. OCR software converts the text on scanned images of a printed page into characters that can be indexed and searched by a full-text-search program or edited in a word processor or text editor. Lawyers need OCR software to recycle draft contracts, interrogatories, the results of document production, or any other document that originates in someone else’s computer.

In the last year it seems that ScanSoft, a publicly held corporation at one time owned by Xerox, acquired most of the well-known OCR software that hadn’t been owned by Caere Corporation, an established name in OCR that had been on an OCR acquisition program of its own. ScanSoft then acquired Caere. This week we look at three different OCR solutions from ScanSoft: TextBridge Pro Millennium Business Edition from the company’s Xerox tradition; the new edition of PaperPort from its Visioneer acquisition; and Version 10 of OmniPage Pro, the entry from ScanSoft’s Caere antecedents.

OCR

The first OCR program that we remember did little more than look at a scanned page and attempt to translate each separate image of each character on the page into its text equivalent. The program, which required a co-processor board to handle the heavy translation effort, worked well with very clean typewriter originals, particularly those printed with an IBM Selectric ball with a font known as OCR B, but not as well with newspapers, books, faxes, carbon copies (this was a long time ago), or even second- or third-generation Xerox copies.

“Working well” meant operating at 98 or 99 percent accuracy. This sounded good, but it meant that on a typical double-spaced page with about 400 words or 2,000 characters, there were between 20 and 40 errors.
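The arithmetic behind that figure is easy to check. The short Python sketch below is only an illustration, not anything taken from the products reviewed: the function name `expected_errors` is made up for this example, and it assumes that the accuracy figure is measured per character and that every miss counts as one error.

```python
def expected_errors(chars_per_page: int, accuracy_percent: float) -> float:
    """Estimate misrecognized characters on one page.

    Assumption (not from any vendor): accuracy is per character,
    and each misrecognized character counts as exactly one error.
    """
    return chars_per_page * (100 - accuracy_percent) / 100


# A typical double-spaced page: about 400 words, or 2,000 characters.
for accuracy in (98, 99):
    errors = expected_errors(2000, accuracy)
    print(f"{accuracy} percent accuracy: about {errors:.0f} errors per page")
```

At 98 percent the estimate comes to 40 errors on a 2,000-character page, and at 99 percent to 20, which is the range quoted above.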
Cleanup wasn’t particularly difficult with the help of a good word processor’s spell-checker, but both software and scanners were very expensive, and some secretaries found it easier to just key in the page on a typewriter.

Today’s OCR products are almost perfect on even complex proportional fonts printed with laser or inkjet printers, and they can deal with newspapers and books. Simple image recognition of characters has been supplemented with smart technology that can deal with columns, tables, and other complex page formatting. Further, these programs use word recognition that compares each word with a dictionary and lets the user know when the characters it thinks it has recognized form a word that isn’t in the dictionary and so may be an error. (As far as we can tell, none of these programs moves to the sort of “phrase recognition” that speech recognition products like ViaVoice or NaturallySpeaking use.) Modern programs also pick up font type, size, and attributes such as bold or italics. The output from these programs can be automatically sent to your choice of word processing programs.

THE PRODUCTS

We tested the top-of-the-line OmniPage Pro and TextBridge Pro Millennium Business Edition, each with a $500 price tag. The programs use different OCR engines, but both deal with English or a variety of foreign languages, and both will “zone” each page to properly deal with columns, tables, and other complex formatting. An OCR program that doesn’t “zone” well can push the lines from a two-column page together, so that each recognized line consists of half a line from column one and half from column two.

We tested both programs on a variety of documents around the office, including multipage contracts and pleadings. Both programs ran a TWAIN-compliant Hewlett-Packard OfficeJet 500 scanner with an automatic document feeder (ADF).
With our combination of a scanner capable of about six letter-sized pages per minute and a 266 MHz Pentium II processor, OmniPage seemed to be able to read the scan and recognize it at the same time, making the entire process a bit faster than with TextBridge. The best way to deal with a long document, of course, is to toss it into the ADF and take a coffee break.

Both products let us customize the zoning on individual pages, but both worked well enough with automatic zone recognition. Both programs pointed out unrecognized words and suggested possible alternatives, but we thought that OmniPage’s method of presenting choices and highlighting possible errors was easier to see than TextBridge’s. Both programs did considerably better than 99 percent with the variety of documents that we were using to test: two or three possible errors per page was typical, and about half of these were specialized legal vocabulary. Both programs let us add such words to a user dictionary so that they wouldn’t show up as errors the next time they were encountered. OmniPage did a better job picking up page formatting, but we had to do some formatting cleanup with the results of either product.

OmniPage Pro, but not TextBridge, has a text-to-speech option that lets the computer read the recognized pages aloud. Listening to the recognized text while following the scanned pages seems to be an excellent way to proof the OCR output. The program does a good job of reading the words in a pleasant voice, but it does not specifically mention punctuation, which makes the procedure less than fully useful for any document in which the placement of a comma or semicolon is important.
The program also has a nasty habit of reading numbers as words, so that our local zip code is read as “sixty thousand four hundred thirty” rather than the much easier to understand “six zero four three zero.”

TextBridge, but not OmniPage Pro, lets the user output the finished text as a PDF (Portable Document Format) file, a handy feature if you work with PDF. Users of OmniPage Pro will have to save to a word processor that can output PDF or use Adobe Acrobat to convert the result to a PDF file. We didn’t test the $79 standard, non-business edition of TextBridge Millennium, but we are told that the only difference between the two is that $79 doesn’t buy PDF output. Although we like OmniPage Pro a little better than TextBridge, if you don’t need PDF the $79 non-business TextBridge is a much better deal.

The $60 PaperPort Deluxe 7.0 uses the TextBridge recognition engine and does, oddly enough, output to PDF, but we found it better as a personal, desktop-clutter-removal system, great for bills and receipts, letters, and the like. You can also set the program to automatically index the documents that it scans and OCRs, so that you can find them when needed. Version 7 doesn’t differ much from earlier versions, but it is a good buy if your OCR and scanning needs aren’t great.

WEB OF THE WEEK

We were impressed by comments from a number of the product exhibitors at the recent ATLA gathering, but perhaps none more than the answer given by a representative of USLaw.com, yet another web site that seeks to provide the public with information about lawyers and the law and to put potential clients together with good lawyers. The site’s representative was proud to show us that when he typed the word “lawyer” into the AltaVista search engine, the USLaw site popped up as an AltaVista recommendation. “How much did you pay for that?” we asked. “Fifteen million dollars!”

Never mind that the banner from Martindale-Hubbell’s lawyers.com appeared near the top of the page.
We were told that’s old stuff that will be ending its advertising cycle soon and won’t be renewed.

USLaw.com presumably expects to make back that $15 million and a lot more from monthly revenues from advertising and other purchases from small firms and solos. The company will not only put you together with potential clients, but will also offer discounts on firm essentials such as “legal research, management and marketing consulting services, temporary staffing, accounting services, computer equipment, office supplies,” and “even public relations.” If you fall into the small firm/solo category, take a look at the site. Potential clients who search for “lawyer” on AltaVista will. The URL, of course, is www.uslaw.com.

SUMMARY AND DETAILS

The latest $500 versions of the TextBridge and OmniPage OCR systems are both excellent products, but if you don’t need PDF output, the $79 non-business edition of TextBridge is an excellent buy. PaperPort Deluxe Version 7.0 is little changed from earlier versions, but it’s still a great personal scanning and OCR product to help eliminate desktop paper clutter.

OmniPage Pro 10, price: $499.99. TextBridge Pro Millennium, Business Edition, price: $499. TextBridge Pro Millennium, price: $79. PaperPort Deluxe 7.0, price: $79.

All require a computer with a Pentium processor running Microsoft Windows 95 / 98 / 2000 or Windows NT 4.0. OmniPage requires 50 to 90 Mbytes of hard disk space; TextBridge requires 40 Mbytes; PaperPort requires 60 Mbytes.

ScanSoft Inc., 9 Centennial Dr., Peabody, Mass., 01960. Phone: 978-977-2000. Web: www.scansoft.com.

Barry D. Bayer practices law and writes about computers from his office in Homewood, Ill. You may send comments or questions to his e-mail address [email protected] or write c/o Law Office Technology Review, P.O. Box 2577, Homewood, Ill. 60430.

Copyright © 2021 ALM Media Properties, LLC. All Rights Reserved.