Posted 08 July 2011 - 03:16 AM
Hi all. As you know, this is a very big project for me. I have uploaded the Lang's Cmt (a Word document) in the Topics section.
If anyone would like to help, it would be great. This might help.
Hey Niobi,
Attached is the sample document I sent. Here is what I did:
I searched for: ([a-z])(^13)([a-z])
Replaced: \1 \3
Searched: ([a-z],)(^13)([a-z])
Replaced: \1 \3
Searched: "
Replaced: "
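If the text is ever exported to a plain .txt file, the first two searches above could be approximated outside Word. This is only a rough sketch in Python, not what was actually run on the document; the function name is made up for illustration, and "\n" stands in for Word's ^13.

```python
import re

def join_broken_lines(text: str) -> str:
    """Join lines that were split mid-sentence (hypothetical helper)."""
    # Normalize Windows/Mac line endings so "\n" plays the role of ^13.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Covers both ([a-z])(^13)([a-z]) and ([a-z],)(^13)([a-z]):
    # a lowercase letter (optionally followed by a comma), a return,
    # then a lowercase letter. The lookahead leaves the next letter
    # unconsumed so several short lines in a row are joined in one pass.
    return re.sub(r"([a-z],?)\n(?=[a-z])", r"\1 ", text)

print(join_broken_lines("the quick\nbrown fox.\nNext paragraph"))
```

A return followed by a capital letter is left alone, which is what keeps real paragraph breaks in place.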
BUT this will be a HUGE project. The text came from a scan run through OCR, so almost every line has a typo or misspelling. Sentences sometimes end with odd punctuation (OCR typos). There's no formatting: no bold, no italics, no tabs, no indents, etc. Then there's the verse formatting. I converted some of the Roman numerals to digits, but the verse formatting has many typos as well, so some Roman numerals still remain in the verse references.
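For the Roman numerals that are left, something like the following might save some typing. It is a sketch under a big assumption: that a reference looks like a Roman numeral, a period, a space, and then the verse digits (for example "John iii. 16"). The real references may be shaped differently, and the names here are invented for the example.

```python
import re

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100}

def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral (up to the hundreds) to an integer."""
    values = [ROMAN_VALUES[ch] for ch in numeral.lower()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (iv = 4, ix = 9).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total

# Assumed reference shape: Roman chapter, period, space, verse digits.
REFERENCE = re.compile(r"\b([ivxlc]+)\. (\d+)", re.IGNORECASE)

def convert_references(text: str) -> str:
    """Replace the Roman chapter numeral and keep the rest of the reference."""
    return REFERENCE.sub(
        lambda m: f"{roman_to_int(m.group(1))}. {m.group(2)}", text
    )

print(convert_references("John iii. 16"))
```

The pattern is loose on purpose, so an ordinary word made only of those letters (such as "civil") followed by a period and a number would be converted too. The results still need a look by eye.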
What you have is a raw text dump, with tens of thousands of misspellings and typos. And you will almost have to go line by line through the text. A spell check would help, but be careful with Greek/Hebrew words, which an English spellcheck won't recognize.
There are still some lines that will need the line break manually removed because the search missed it (we can't remove all line breaks because then we wouldn't have paragraphs). But 99% of the bad line breaks are gone.
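To find the leftover bad breaks without reading every line, one option is to list the lines that do not end in sentence-ending punctuation but are followed by more text. This is a rough sketch for a plain-text export; the function name and the list of "ending" characters are my own guesses, not anything taken from the document.

```python
# Characters assumed to mark a legitimate end of line.
TERMINAL = (".", "?", "!", ":", '"', ")")

def suspicious_breaks(text: str) -> list[int]:
    """Return 1-based numbers of lines that probably end with a bad break."""
    lines = text.splitlines()
    flagged = []
    for number, (line, following) in enumerate(zip(lines, lines[1:]), start=1):
        stripped = line.rstrip()
        # Flag a non-empty line without terminal punctuation
        # when the next line also has text on it.
        if stripped and following.strip() and not stripped.endswith(TERMINAL):
            flagged.append(number)
    return flagged

sample = "In the beginning God\ncreated the heaven.\n\nNext paragraph."
print(suspicious_breaks(sample))
```

It only reports candidates. Headings and verse lines will show up too, so each one still has to be checked by hand.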
In Word, when you use the Search/Replace feature, a Search Replace Box appears. Click on the More button. More options should appear. Click the Use Wildcards box.
Try searching for this: [a-z]^13[a-z]
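And for the replacement text, don't put anything.
That will usually delete most extra returns. Be aware that the pattern also matches the letter on each side of the return, so an empty replacement removes those two letters as well; putting the parts in parentheses and replacing with \1 \3, as in the searches above, keeps them. You might need a space before and/or after ^13. It all depends on the document.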
Thanks again for your help.
Patchworkid