Thanks
Edited by gad4, 24 October 2019 - 08:58 AM.
Posted 24 October 2019 - 08:56 AM
Posted 24 October 2019 - 09:03 AM
I bought the PDF Conversion Utility from Microsoft.
It is available as an app from their Microsoft Store online.
It works well on clean, legible PDF files, where the pages
do not come out as "IMAGES" in the resulting ".DOC" or ".DOCX"
file.
I have also come across a good online converter that will take
the "IMAGES" in a Word document and convert them to text, oftentimes
preserving the Greek text....
djmarko53
Posted 25 October 2019 - 12:59 AM
It is better to convert first from PDF to HTM and save it in UTF-8 encoding (that is the Unicode encoding you need for the Hebrew or Greek scripts).
Open the HTM file with Notepad and type:
<!DOCTYPE html>
<html class="client-nojs">
<head>
<meta charset="UTF-8"/>
</head>
<body>
Enter here the text under the body and "save as" in UTF-8 encoding.
βιβλος is Book in Greek.
</body>
</html>
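If you would rather not type this by hand, the same thing can be sketched in Python. This is only an illustration, assuming Python 3 is installed; the names `build_page` and `write_page` and the file name `greek.htm` are made up for the example and do not come from any converter.

```python
import html
from pathlib import Path

# Same minimal page as above, with a placeholder for the body text.
TEMPLATE = """<!DOCTYPE html>
<html class="client-nojs">
<head>
<meta charset="UTF-8"/>
</head>
<body>
{body}
</body>
</html>
"""


def build_page(text: str) -> str:
    """Wrap plain text in the minimal HTML template."""
    # Escape &, < and > so the text cannot break the markup;
    # Greek and Hebrew letters are left untouched.
    return TEMPLATE.format(body=html.escape(text, quote=False))


def write_page(path: str, text: str) -> None:
    """Save the page as UTF-8, matching the <meta charset> declaration."""
    Path(path).write_text(build_page(text), encoding="utf-8")


if __name__ == "__main__":
    write_page("greek.htm", "βιβλος is Book in Greek.")
```

Opening the resulting file in a browser should show the Greek word correctly, because the bytes on disk and the declared charset agree.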
Edited by Katoog, 25 October 2019 - 01:18 AM.
Restored Holy Bible 17 and the Restored Textus Receptus
Posted 25 October 2019 - 12:11 PM
Thanks for the ideas.
Marco, I have tried different software and none of it worked. Before purchasing the software, did you run into the problem I experienced, and did this solve it?
Katoog, I tried using Adobe Acrobat to output to HTML, but I did not see an option for UTF-8. I also did not see an HTML output option in OmniPage. I'm curious how the Notepad text should be included in the HTML output. Could you give a more step-by-step procedure and say which software you are using?
Edited by gad4, 25 October 2019 - 05:47 PM.
Posted 26 October 2019 - 02:46 AM
Normally, if you convert a PDF with Hebrew and Greek script to HTML using Adobe Acrobat, it uses UTF-8 by default (you can tell if the Hebrew and Greek script displays correctly in a browser).
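Besides looking at the page in a browser, the file can be checked with a few lines of Python. This is a rough sketch, assuming Python 3; the function names `is_utf8` and `scripts_found` are made up for the example. Note that a file containing only plain English letters also passes the UTF-8 check, so the second function looks for actual Greek or Hebrew characters.

```python
import unicodedata


def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def scripts_found(text: str) -> set:
    """Report which of GREEK / HEBREW appear, based on Unicode character names."""
    found = set()
    for ch in text:
        # Characters without a name (e.g. control characters) give "".
        name = unicodedata.name(ch, "")
        for script in ("GREEK", "HEBREW"):
            if name.startswith(script):
                found.add(script)
    return found


if __name__ == "__main__":
    sample = "βιβλος is Book in Greek.".encode("utf-8")
    print(is_utf8(sample), scripts_found(sample.decode("utf-8")))
```

If the file passes the UTF-8 check but no Greek or Hebrew is found where you expect it, the letters were probably lost earlier in the conversion (for example, replaced by question marks).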
Take any .htm file and change the .htm in the file name to .txt,
or open or create a .txt file with Notepad.
Click on the File menu and choose "Save As" (or press Ctrl+Shift+S).
To the left of the Save button there is an "Encoding" menu.
If the menu shows ANSI or another UTF format, click it and change it to UTF-8,
then click on the Save button.
Now your document is saved in UTF-8.
Delete everything in it (Ctrl+A, then the Delete key on your keyboard).
Then copy this into the file:
<!DOCTYPE html>
<html class="client-nojs">
<head>
<meta charset="UTF-8"/>
</head>
<body>
Enter here the text under the body and "save as" in UTF-8 encoding.
βιβλος is Book in Greek.
</body>
</html>
Save it again and change the .txt extension to .htm or .html.
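The "Save As" with UTF-8 step can also be done by a small script, which helps if there are many files. This is a sketch only, assuming Python 3; `resave_as_utf8` is a made-up name, and the default source encoding `cp1253` (Windows Greek) is just a guess at what "ANSI" means on a particular PC.

```python
from pathlib import Path


def resave_as_utf8(src: str, dst: str, source_encoding: str = "cp1253") -> None:
    """Read src in its original encoding and write it to dst as UTF-8."""
    # cp1253 is an assumption; use "cp1255" for Windows Hebrew, or
    # whatever encoding the file was really saved in.
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
```

For example, `resave_as_utf8("page.txt", "page.htm")` would read a hypothetical `page.txt` and write a UTF-8 copy named `page.htm`. If the source encoding is guessed wrong, the Greek or Hebrew will come out garbled, so check the result in a browser.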
Edited by Katoog, 26 October 2019 - 02:47 AM.
Restored Holy Bible 17 and the Restored Textus Receptus
Posted 26 October 2019 - 11:30 AM
Posted 28 October 2019 - 09:26 PM
I tried MS Word PDF to DOC, Adobe export to DOC, Adobe output to JPG and then OCR to DOC, and OmniPage OCR from JPG to DOC. Is there something I need to add or do differently?
Thanks
Is the PDF you are starting with image or text? If you are using OmniPage Pro, I assume it's image. OCR is not even close to perfect, and less so with Greek. The Hebrew never did work right, so I replaced it with Strong's numbers. What I had to do was proof the resulting text against the un-OCR'ed PDF, manually fix the English, and then manually enter the Greek.
DSaw
May God change our hearts to what the truth is
2Ti_2:15 Study to shew thyself approved unto God, a workman that needeth not to be ashamed, rightly dividing the word of truth.
Rom_9:16 So then it is not of him that willeth, nor of him that runneth, but of God that sheweth mercy.