Project: Removing Footnotes for Screen Readers

Here’s a problem that I imagine is common for any blind or visually impaired student or scholar: many academic books are not available on tape through the Library of Congress. The usual solution is to get a hard copy of the book and scan it, page by page, into a computer. The user then runs optical character recognition (OCR) on the scanned images to turn them into text.

My experience here is limited to Kurzweil 1000, version 8.5 (two upgrades behind the current version), because that’s the software my (blind) partner uses. Kurzweil does a decent job of recognizing the text, automatically orienting the page, detecting the language, and converting everything to its own proprietary format (which appears to be a text file with additional tags before each word that indicate phonetic pronunciation, or something similar). Up to this point all is well and good. Here’s the problem: academic books have footnotes at the bottom of the page, and Kurzweil doesn’t recognize them as such, so it treats them no differently from the book’s main text.

The footnotes make the book annoying to read at best and impossible to understand at worst. The other day, after my third hour of removing footnotes by hand (scroll down, silence the reading, identify the garbled text, select it, remove it, repeat), I got to thinking: there has to be a better way. I have some ideas for simple heuristics to determine what is a footnote, and a few blue-sky ideas for improvements that would be very spiffy. So my current project is to create an application (in Java, for convenience, portability and, of course, accessibility) to help with editing plain-text books that have been OCR’d.
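
To give a flavor of what a "simple heuristic" could look like, here is a rough sketch of one possibility. It is not the project's actual design. Everything in it is an assumption made for illustration: the class name FootnoteSketch, the idea that a page arrives as a list of plain-text lines, the guess that a footnote begins with a one- to three-digit marker, and the use of a "sensitivity" value to mean the fraction of the page, counted from the bottom, that gets searched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch: guesses where a page's footnotes begin. */
class FootnoteSketch {
    // Assumption: a footnote starts with a 1-3 digit marker followed by a letter,
    // e.g. "12 See Smith" or "3. Ibid." A year such as "1987 was" does not match.
    private static final Pattern MARKER =
            Pattern.compile("^\\s*\\d{1,3}[.)]?\\s+\\p{L}.*");

    /**
     * Returns the index of the first line guessed to be a footnote,
     * or lines.size() if nothing in the searched region looks like one.
     * sensitivity (0.0 to 1.0) is the fraction of the page, counted
     * from the bottom, that is searched.
     */
    static int footnoteStart(List<String> lines, double sensitivity) {
        int n = lines.size();
        int firstCandidate = n - (int) Math.round(n * sensitivity);
        for (int i = Math.max(0, firstCandidate); i < n; i++) {
            if (MARKER.matcher(lines.get(i)).matches()) {
                return i;
            }
        }
        return n;
    }

    /** Preview: the lines that would be removed; the page itself is untouched. */
    static List<String> preview(List<String> lines, double sensitivity) {
        return new ArrayList<>(lines.subList(footnoteStart(lines, sensitivity), lines.size()));
    }

    /** The page with the guessed footnote block cut off. */
    static List<String> strip(List<String> lines, double sensitivity) {
        return new ArrayList<>(lines.subList(0, footnoteStart(lines, sensitivity)));
    }

    public static void main(String[] args) {
        List<String> page = List.of(
                "The argument of the second chapter turns on a distinction",
                "that earlier critics largely ignored.",
                "It was first drawn in 1987.",
                "12 See the introduction to the revised edition.",
                "13 Ibid., 44.");
        System.out.println("Would remove: " + preview(page, 0.5));
        System.out.println("Would keep:   " + strip(page, 0.5));
    }
}
```

A body line that happens to begin with a small number ("12 apostles were...") would fool this sketch, which is exactly why a preview step and an adjustable sensitivity seem worth having: a lower value searches less of the page and so removes less.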

Here are the purposes of the program:

  • Allow users to edit a scanned book in Kurzweil’s .kes format or in .txt format
  • Provide footnote removal tools with adjustable sensitivity/aggression, and the ability to preview the changes
  • Provide regular expression search and replace, with some useful pre-made replacements
  • Maintain a dictionary of replacements similar to Kurzweil’s “Apply Corrections” function, but with added functionality
  • (Maybe) Display frequency-weighted spelling mistakes and offer the ability to replace them all
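
The regular expression and corrections-dictionary items in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name CleanupSketch, the two "pre-made" rules, and the reading of a corrections dictionary as a whole-word lookup table are all hypothetical, and nothing here reflects how Kurzweil's own "Apply Corrections" function works.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of pre-made regex replacements plus a corrections dictionary. */
class CleanupSketch {
    // Pre-made rules, applied in insertion order. Both are illustrative guesses
    // at cleanup that OCR'd text commonly needs.
    static final Map<Pattern, String> PREMADE = new LinkedHashMap<>();
    static {
        // Rejoin a word split across a line break: "exam-\nple" -> "example".
        // Caveat: this also joins real compounds such as "well-\nknown".
        PREMADE.put(Pattern.compile("(\\p{L})-\\r?\\n(\\p{L})"), "$1$2");
        // Collapse runs of spaces or tabs into a single space.
        PREMADE.put(Pattern.compile("[ \\t]{2,}"), " ");
    }

    /** Applies every pre-made rule to the text, in order. */
    static String applyPremade(String text) {
        for (Map.Entry<Pattern, String> rule : PREMADE.entrySet()) {
            text = rule.getKey().matcher(text).replaceAll(rule.getValue());
        }
        return text;
    }

    /** Replaces each misread word with its correction, matching whole words only. */
    static String applyCorrections(String text, Map<String, String> corrections) {
        for (Map.Entry<String, String> c : corrections.entrySet()) {
            // Pattern.quote and Matcher.quoteReplacement keep user-supplied
            // words from being interpreted as regex syntax.
            Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(c.getKey()) + "\\b");
            text = wholeWord.matcher(text).replaceAll(Matcher.quoteReplacement(c.getValue()));
        }
        return text;
    }

    public static void main(String[] args) {
        String raw = "This is an exam-\nple of  tbe problem.";
        Map<String, String> corrections = new LinkedHashMap<>();
        corrections.put("tbe", "the");
        System.out.println(applyCorrections(applyPremade(raw), corrections));
    }
}
```

Keeping the rules in an ordered map means they run in a predictable sequence, which matters once one rule's output can feed another's input.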

And here are some loose initial requirements:

  • Accessible to blind and visually impaired users
  • Preserves .kes formatting and allows editing without seeing .kes markup
  • User-friendly, correct, robust
  • Open-source (license to be determined)

As soon as possible, I will upload screenshots (well described, of course) and write about design decisions, implementation progress and, eventually, interesting code snippets. Next up: a basic user interface prototype within a week.
