Tuesday, February 04, 2003

Old Book Index Database

[This is my first post on the new blogging system. It's actually a re-post from last year.]

Only a sliver of everything published before the Web is currently on the Web. While some folks want to scan in whole books, I'd like to start a project for scanning in only the indexes at the back of non-fiction books. Authors and editors have already put a great deal of effort into indexing these texts, and the scans, once run through OCR, could be turned into a searchable concept database.
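To make the idea a bit more concrete, here is a minimal sketch of how OCR'd index text might be folded into a searchable lookup. Everything in it is an assumption for illustration: the sample index entries, the book title, the "heading, page, page-range" line format, and the function names are all hypothetical, and real OCR output would be much messier (sub-entries, "see also" references, misread digits).

```python
import re
from collections import defaultdict

# Hypothetical OCR output from one book's back-of-book index.
# The "heading, page refs" line format is an assumption.
SAMPLE_INDEX = """\
Alchemy, 12, 45-47
Bacon, Francis, 3, 88
Printing press, 101-103, 150
"""

# A page reference is a number or a simple range such as 45-47.
PAGE_REF = re.compile(r"^\d+(?:-\d+)?$")


def parse_index_line(line):
    """Split one index line into (heading, [page refs]); None if unusable."""
    parts = [p.strip() for p in line.split(",")]
    refs = []
    # Page references sit at the end of the line; peel them off from the right.
    while parts and PAGE_REF.match(parts[-1]):
        refs.insert(0, parts.pop())
    heading = ", ".join(parts)
    if not heading or not refs:
        return None
    return heading, refs


def build_database(books):
    """Map each lowercased heading to a list of (book title, page refs)."""
    db = defaultdict(list)
    for title, index_text in books.items():
        for line in index_text.splitlines():
            parsed = parse_index_line(line)
            if parsed:
                heading, refs = parsed
                db[heading.lower()].append((title, refs))
    return db


def search(db, query):
    """Return (heading, book, refs) for every heading containing the query."""
    q = query.lower()
    return [
        (heading, title, refs)
        for heading, entries in sorted(db.items())
        if q in heading
        for title, refs in entries
    ]


db = build_database({"A Hypothetical 1920s Book": SAMPLE_INDEX})
for hit in search(db, "bacon"):
    print(hit)
```

A plain substring match over headings is the simplest thing that could work; a real version would probably want fuzzy matching to cope with OCR errors.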

Would this really be useful? It would be to me. For example, I have an ongoing project to research an obscure old writer who is almost never mentioned these days. I keep a stored eBay search query for his name, and one day it turned up a book from the 1920s whose index mentioned him; the seller had scanned the index, OCR'd it, and included it in the auction description. That's what suggested the Old Book Index Database to me.