Book of Books by William Evans; Outline Study of the Bible by William Evans

COMPLETE

7 replies to this topic

#1 Josh Bond

    Administrator

  • Administrators
  • 2,891 posts
  • Location: Gallatin, TN

Posted 15 September 2011 - 11:46 AM

Had someone run these books through an OCR scanner. Thought I'd try an experiment with how tough it is to OCR text and produce nicely formatted e-Sword modules.

The raw text is much better than what archive.org or Google books normally produces. I expect these 2 books to be done in the next two weeks, maybe sooner, depends on what else pops up. I doubt anyone else would happen to be working on these texts since they are not available on the Internet.

#2 mdiazu

    New to Bible Support

  • Veterans
  • 7 posts

Posted 22 September 2011 - 07:07 AM

Thanks Josh.
I really like William Evans.

I'll be waiting :rolleyes:

God bless,
Manuel.

#3 Josh Bond

Posted 22 September 2011 - 09:31 AM

I still hope to get these modules done on time. I've been working on an ecommerce (aprons) site for my wife and haven't been making any modules. The design work is done, just have to load the inventory / pictures, tweak the menus, and configure the supporting pages.

#4 Josh Bond

Posted 05 October 2011 - 04:39 PM

A number of real-life things did pop up, but a few days ago I started on these two books. This experimental project went very well. I chose these books for two reasons: first, I like William Evans, and second, the books are small, which made them ideal for my first OCR project. I must say, OCRing my own text into an e-Sword module wasn't very hard at all. In fact, it was much easier than working with text from Archive.org and Google Books. The error rate was much lower.

At least one Bible software company sells one of these books for $10. And now that text is "liberated" into e-Sword format. :) If you know me, you know I really don't like the practice of selling public domain text. I understand the economic reasons for it--I still don't like it. And now that I can produce my own text to work with, this opens up all kinds of new doors.

I'm not sure what my next OCR project will be. Anyone know of a good book(s) that isn't too big of a project?

Links to Completed e-Sword Modules:
Outline Study of the Bible: http://www.biblesupp...trative-charts/
Book of Books: http://www.biblesupp...-book-of-books/

Josh

#5 APsit190

    e-Sword Tools Developer

  • Members (T)
  • 2,872 posts
  • Location: Land of the Long White Cloud (AKA New Zealand)

Posted 05 October 2011 - 08:13 PM


Hey Josh,
This looks really interesting. Could you either IM me or email me with some info on how I could get started doing weird things like that too? You know, software requirements and all that sort of stuff.


Bless ya oodles
Stephen (Php 1:21).

#6 Josh Bond

Posted 06 October 2011 - 12:27 AM

I started out by sending some material to a book scanner, and the results came back so good that I thought, well, maybe I can come close to this myself. Non-destructive book scanning is expensive and requires all sorts of special equipment, digital cameras, and crazy stuff. Destructive scanning, on the other hand, is inexpensive.

If you're willing to destroy the book, destructive scanning is the way to go. You need a way to cut the book's spine off (an industrial paper cutter, or a copy shop like Kinkos here in the US), a sheet-fed scanner (like the Fujitsu ScanSnap S1500), and OCR software to perform character recognition on the page images in the PDF (OmniPage Professional 18 or ABBYY FineReader 11). Right now, I'm toying with an older sheet-fed scanner I already happened to have. I don't have a way to remove the spines, so I go to Kinkos for that. This video got me started:

Just taking the PDF from Archive.org or Google Books and running OmniPage or ABBYY FineReader on it produces much better text than the raw text provided by Archive.org or Google Books. So if you already have a PDF somewhere with decent resolution and clarity, you don't need to scan the book.
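
For anyone who wants to tidy the plain text an OCR program exports before formatting it as a module, here is a minimal Python sketch. It is only an illustration, not the workflow used for these two books: the function name `clean_ocr_text` is hypothetical, and it assumes the OCR output is plain text with hard line breaks inside paragraphs and blank lines between them.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Tidy raw OCR text (hypothetical helper, not part of any OCR product).

    Assumes paragraphs are separated by one or more blank lines.
    """
    # Normalise line endings so the patterns below only deal with "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "Scrip-\ntures" -> "Scriptures".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Split into paragraphs on blank lines.
    paragraphs = re.split(r"\n\s*\n", text)
    # Unwrap each paragraph: collapse line breaks and repeated spaces to one space.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    # Drop empty paragraphs and separate the rest with a single blank line.
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "The Scrip-\ntures are\nprofitable.\n\n\nSecond para-\ngraph."
    print(clean_ocr_text(sample))
```

One caveat: the hyphen rule also joins genuine compounds that happen to break at the end of a line (for example "well-" / "known" becomes "wellknown"), so the result still needs proofreading against the scanned pages.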

#7 patchworkid

    Resource Builder

  • Members (T)
  • 1,554 posts
  • Location: Old England

Posted 06 October 2011 - 04:21 AM

Hi Josh,

If you want me to make a map module for this topic, please let me know.

thanks
Patchworkid
Merismos the Scriptures with Patchworkid's Study Bible Set: http://www.biblesupp...tudy-bible-set/, MySword: http://www.biblesupp...tudy-bible-set/

#8 Josh Bond

Posted 06 October 2011 - 02:42 PM

A couple of you private messaged me that when you converted Outline Study of the Bible to TheWord, the table formatting was lost. I am attaching a TheWord version that corrects these issues. I also embedded the scanned images directly into the text rather than linking to the images.
