
Converting Text


#1 patchworkid



Posted 05 August 2011 - 10:10 AM

Hi all.
Here is how to convert a PDF to RTF for use in the Tool Tip Tool, and from there into an e-Sword module.

1. Open the PDF document, Select All, and Copy (this gives you better formatting than saving as text).
2. Open Word and Paste (from the PDF).

[If the pasted text is hard to work with, i.e. it has too many returns, use the searches below.]

Replace - More - Use Wild Cards

Searched for: ([a-z])(^13)([a-z])
Replaced: \1 \3

Searched: ([a-z],)(^13)([a-z])
Replaced: \1 \3

Searched: "
Replaced: "
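
For anyone who prefers to clean the text outside Word, the first two searches above can be sketched in Python. This is only an illustration, not part of the Word workflow: it assumes the text has been saved as plain text with "\n" line breaks (Word's ^13 is a carriage return), and the function name and sample text are made up.

```python
import re

def join_broken_lines(text: str) -> str:
    """Join lines that were split in the middle of a sentence."""
    # Word: ([a-z])(^13)([a-z]) -> \1 \3
    # A break between two lowercase letters becomes a space.
    text = re.sub(r"([a-z])\n([a-z])", r"\1 \2", text)
    # Word: ([a-z],)(^13)([a-z]) -> \1 \3
    # Same idea when the line ends in a comma.
    text = re.sub(r"([a-z],)\n([a-z])", r"\1 \2", text)
    return text

# Hypothetical sample; the blank line between paragraphs is left alone.
sample = (
    "In the beginning God created the heaven\n"
    "and the earth,\n"
    "and it was good.\n"
    "\n"
    "Next paragraph."
)
print(join_broken_lines(sample))
```

Because the pattern needs a lowercase letter on both sides of the break, real paragraph breaks (which follow a full stop or a blank line) are not touched.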

Make sure that you're searching for: (÷*\))

and replacing with: \1 ^13

and not the other way around. Doing it the other way around will create your error.

Wildcards must be enabled.
\1 ^13
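
As a rough sketch of what that search does, here is a Python equivalent. It assumes Word's * stops at the first closing parenthesis it reaches (so the lazy ".*?" is used), that line breaks are "\n", and that the markers look like the made-up sample below; none of that comes from the original document.

```python
import re

def break_after_marker(text: str) -> str:
    """Insert a space and a line break after each '÷ ... )' marker."""
    # Word: (÷*\)) -> \1 ^13
    # '.*?' is the shortest run of characters up to the next ')'.
    return re.sub(r"(÷.*?\))", r"\1 " + "\n", text)

# Hypothetical sample text with two markers on one line.
sample = "÷Gen 1:1) In the beginning ÷Gen 1:2) And the earth"
print(break_after_marker(sample))
```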

How do you do a double return?
^p^p or ^13^13

On this particular document, I searched for: ([a-z])(^13)([a-z])

And my replacement text was: \1 \3

In Word, when you use the Search/Replace feature, a Search Replace Box appears. Click on the More button. More options should appear. Click the Use Wildcards box.

Try searching for this: [a-z]^13[a-z]

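And for the replacement text, don't put anything.

That will usually delete most extra returns. You might need a space before and/or after ^13; it all depends on the document. Be aware that an empty replacement also removes the letter on each side of the return, because both letters are part of the match. To keep them, use the grouped search ([a-z])(^13)([a-z]) with \1 \3 as the replacement.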
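
To see what an empty replacement really removes, here is a small Python comparison. It is a sketch only: "\n" stands in for ^13, the sample text is invented, and the lookaround form is a Python feature that Word's wildcard search does not have.

```python
import re

text = "the quick\nbrown fox"

# Matching the letters and replacing with nothing deletes the letters too.
lossy = re.sub(r"[a-z]\n[a-z]", "", text)

# Lookarounds test for the letters without consuming them,
# so only the line break itself is removed.
break_only = re.sub(r"(?<=[a-z])\n(?=[a-z])", "", text)

# Capturing the letters and writing them back with a space,
# like ([a-z])(^13)([a-z]) -> \1 \3 in Word.
spaced = re.sub(r"([a-z])\n([a-z])", r"\1 \2", text)

print(repr(lossy))
print(repr(break_only))
print(repr(spaced))
```

The third form is usually the one you want for text pasted from a PDF, since the break normally replaces a space between two words.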

Add power to Word searches with regular expressions

Sometimes, the ÷ is placed a line or two above where you need it. This is an easy fix with Microsoft Word's search and replace functionality. I intend to write a longer tutorial here, but you can simply use Word's search and replace feature to remove lines and bring the symbol to where you need it. Searching for ÷^13 -or- ÷^13^13 -or- ÷^p -or- ÷^p^p and replacing with ^13÷ -or- ^13^13÷ -or- ^p÷ -or- ^p^p÷ will sometimes be sufficient (you're just finding the line breaks/paragraph breaks and putting the symbol after them instead of before them). Other times, you may need to perform a regular expression search/replace.
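
The simple case described above (moving the ÷ from before the breaks to after them) can be sketched in Python as follows. The function name and sample are made up, and "\n" is assumed to stand in for ^13 or ^p.

```python
import re

def move_marker_after_breaks(text: str) -> str:
    """Move each '÷' from before a run of line breaks to after it."""
    # Covers both ÷^13 -> ^13÷ and ÷^13^13 -> ^13^13÷ in one pass,
    # because '\n+' captures one or more consecutive breaks.
    return re.sub(r"÷(\n+)", r"\1÷", text)

# Hypothetical sample: one marker two lines too high, one a single line too high.
sample = "Heading ÷\n\nVerse text\nMore ÷\nLast"
print(repr(move_marker_after_breaks(sample)))
```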

By Colin Wilcox,
Graham Mayor, and
Klaus Linke

Have you ever wanted to do more than use the basic find-and-replace functions in Word? Wildcard characters and regular expressions can make those operations much more flexible and powerful.

Microsoft Word 97, 2000, and 2002

Have you ever had to make a large number of repetitive changes to a document by hand? For example, have you ever had to find and remove duplicate rows from a large table, or transpose a list of names (change them from "Colin Wilcox" to "Wilcox, Colin")? That type of repetitive find-and-replace work gets old in a big hurry, doesn't it?

You can automate many of those find-and-replace tasks. Microsoft Word provides a set of wildcard characters that you can use to build regular expressions, combinations of literal text and wildcard characters. You can use regular expressions to find text that matches a given pattern and then replace those matches with new text.

If this all sounds complex, don't worry. We'll introduce it in easy steps, explain things as we go, and provide several working examples. You can use the information in this column with Word 97, 2000, and 2002. The user interfaces vary slightly between the versions, but you can accomplish the tasks described here with each version.

A quick spin through the jargon

To start, let's define a couple of terms:

A wildcard character is a keyboard character that you can use to represent one or many characters. For example, the asterisk (*) typically represents any string of characters, and the question mark (?) typically represents a single character.
In our case, a regular expression is a combination of literal and wildcard characters that you use to find and replace patterns of text. The literal text characters indicate text that must exist in the target string of text. The wildcard characters indicate the text that can vary in the target string.
That may seem a bit abstract, but you've seen (and most likely used) wildcard characters and regular expressions since you first began computing. For example, the Open dialog box (on the File menu, click the Open command) uses the asterisk wildcard character extensively.

And, if you ever used the MS-DOS operating system, you probably used a command and a simple regular expression to copy files:
copy *.doc a:

That command uses the asterisk wildcard character and the .doc literal text string to copy a set of Word documents to drive A. If you look around a bit, you'll see that Microsoft Windows® and the Microsoft Office applications use wildcard characters everywhere.
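
The same kind of file-name matching can be tried out in Python with the standard fnmatch module. This is just a sketch to show what *.doc selects; the file names are invented.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing.
files = ["letter.doc", "notes.txt", "report.doc", "doc.bak"]

# '*' stands for any run of characters; '.doc' must appear literally at the end.
matches = [name for name in files if fnmatchcase(name, "*.doc")]
print(matches)  # ['letter.doc', 'report.doc']
```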

Try it!

The steps in this section explain how to use a regular expression that transposes names. Keep in mind that you always use the Find and Replace dialog box to run your regular expressions. Also, remember that if an expression doesn't work as expected, you can always press CTRL+Z to undo your changes, and then try another expression.

To transpose names

1. Start Word and open a new, blank document.
2. Copy this table and paste it into the document:
Doris Hartwig
Tamara Johnston
Daniel Shimshoni
3. Press CTRL+F to open the Find and Replace dialog box.
4. If you don't see the Use wildcards check box, click More, and then select the check box. If you don't select the check box, Word treats the wildcard characters as text.
5. Click the Replace tab, and then enter the following characters in the Find what box. Make sure you include the space between the two sets of parentheses: (<*>) (<*>)
6. In the Replace with box, enter the following characters. Make sure you include the space between the comma and the second backslash: \2, \1
7. Select the table, and then click Replace All. Word transposes the names and separates them with a comma, like so:
Hartwig, Doris
Johnston, Tamara
Shimshoni, Daniel
At this point, you may wonder what to do if some or all of your names contain middle initials. See the first example in Putting regular expressions to work in Word for more information.
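
For comparison, the name transposition above can be sketched with Python's re module. This is an approximation rather than Word's own behavior: \b and \w+ are assumed to stand in for Word's < > word markers and the asterisk.

```python
import re

names = "Doris Hartwig\nTamara Johnston\nDaniel Shimshoni"

# Word: find (<*>) (<*>), replace with \2, \1
# \b(\w+) captures one whole word; the space between the groups is literal.
transposed = re.sub(r"\b(\w+) (\w+)\b", r"\2, \1", names)
print(transposed)
```

As in Word, the numbered placeholders in the replacement refer to the groups in the order they appear in the search pattern.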

The next section explains how those regular expressions work.

What makes the expression tick

From here on, keep this principle in mind: The content of a document controls most (but not all) of the design of your regular expressions. For example, in the sample table you used earlier, each cell contained two words. If the cell contained two words and a middle initial, you'd use a different expression.

Let's examine each expression from the inside out:

In the first expression, (<*>) (<*>):

The asterisk (*) returns all the text in the word.
The less than and greater than symbols (< >) mark the start and end of each word, respectively. They ensure that the search returns a single word.
The parentheses and the space between them divide the words into distinct groups: (first word) (second word). The parentheses also indicate the order in which you want search to evaluate each expression.
In other words, the expression says: "Find both words."
NOTE Searching on this expression, (*) (*>), produces the same results. However, the expression in the example is easier to describe, and you should use restricting characters whenever you can, because doing so ensures greater accuracy in your results.

In the second expression, \2, \1:

The backslash (\) works with the numbers to serve as a placeholder. (You can also use the backslash to find other wildcard characters. See the next section for more information.)
The comma after the first placeholder inserts the correct punctuation between the transposed names.
In other words, the expression says: "Write the second word, add a comma, write the first word."
Next, let's take a look at the full set of wildcard characters and what they do.

Wildcard character reference

The following table lists and describes the wildcard characters that are available for use in Word. Keep one fact in mind as you go: Wildcard characters become more powerful when you combine them.

Any single character: ?
s?t finds "sat" and "set." This character also finds the chosen combination of characters within a word. For example, it could locate "set" within "inset."

Any string of characters: *
s*d finds "sad" and "started." The asterisk returns all characters and spaces that lie between the literal characters. For example, if you use the s*t expression to search the phrase "analysis system," the asterisk returns "st" as a match. That is default behavior. Word does not limit the number of characters that the asterisk can match, and it does not require that characters or spaces reside between the literal characters that you use with the asterisk. So, be careful when using the asterisk, because it can return a lot of unwanted results.

The beginning of a word: <
<(inter) finds all the words that start with "inter," such as "interesting" and "intercept," but not "splintered."

The end of a word: >
(in)> finds all the words that end with "in," such as "in" and "within," but not "interesting."

One of the specified characters: [ ]
w[io]n finds "win" and "won" but not "worn," because the "r" is not specified.
Always use brackets in pairs. If you use an opening bracket, you also use the closing bracket.

Any single character in a given range of characters: [x-z]
[r-t]ight finds "right" and "sight." The ranges you specify must be in ascending order. In other words, you can specify [a-m], but not [m-a].

Any single character except the characters in the range inside the brackets: [!x-z]
t[!a-m]ck finds "tock" and "tuck," but not "tack" or "tick."

Exactly n occurrences of the previous character or expression: {n}
fe{2}d finds "feed" but not "fed." f[a-z]{2}d finds "find," "feed," and "food," but not "fed."
f([a-z]){2}d finds "feed" and "food," but not "find" or "fed."
Always use braces in pairs. If you use an opening brace, you also use the closing brace.

At least n occurrences of the previous character or expression: {n,}
fe{1,}d finds "fed" and "feed."

From n to m occurrences of the previous character or expression: {n,m}
10{1,3} finds "10," "100," and "1000."

One or more occurrences of the previous character or expression: @
lo@t finds "lot" and "loot."

Any wildcard character: \wildcard_character
[\?] finds all question mark wildcard characters, [\*] finds all asterisk wildcard characters, and so on.

To group characters and establish orders of evaluation: ()
Use parentheses (also called round brackets) to create complex regular expressions. The example earlier in this column, and the reference article Putting regular expressions to work in Word, demonstrate some of the ways you can use parentheses.

Examples of regular expressions at work

Admittedly, the regular expression syntax is a bit cryptic. So, we created Putting regular expressions to work in Word, a page of examples that demonstrates some of the ways you can use regular expressions. If you'd like to read some of the source material for this article, see Finding and replacing characters using wildcards on the Microsoft Word MVP FAQ site.
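
If you want to experiment with the wildcard reference outside Word, the sketch below pairs several of the reference examples with what I assume are the nearest Python equivalents and checks them against the words quoted in the reference. The Python patterns are my own translation, not something from the article, and the two engines do not behave identically in every case.

```python
import re

# (Word wildcard, assumed Python equivalent, words it should find, words it should not)
EXAMPLES = [
    ("s?t",       r"s.t",       ["sat", "set"],               ["st"]),
    ("<(inter)",  r"\binter",   ["interesting", "intercept"], ["splintered"]),
    ("(in)>",     r"in\b",      ["in", "within"],             ["interesting"]),
    ("w[io]n",    r"w[io]n",    ["win", "won"],               ["worn"]),
    ("[r-t]ight", r"[r-t]ight", ["right", "sight"],           ["light"]),
    ("t[!a-m]ck", r"t[^a-m]ck", ["tock", "tuck"],             ["tack", "tick"]),
    ("fe{2}d",    r"fe{2}d",    ["feed"],                     ["fed"]),
    ("fe{1,}d",   r"fe{1,}d",   ["fed", "feed"],              ["fd"]),
    ("lo@t",      r"lo+t",      ["lot", "loot"],              ["lt"]),
]

def check(examples):
    """Return a list of (wildcard, word, problem) for every mismatch."""
    failures = []
    for wildcard, pattern, hits, misses in examples:
        rx = re.compile(pattern)
        for word in hits:
            if not rx.search(word):
                failures.append((wildcard, word, "should match"))
        for word in misses:
            if rx.search(word):
                failures.append((wildcard, word, "should not match"))
    return failures

print(check(EXAMPLES))  # an empty list means every pair agrees
```

The main differences to remember are that Word writes negation as [!...] where Python uses [^...], Word's @ corresponds to Python's +, and Word's < and > word markers are both written \b in Python.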

Graham Mayor and Klaus Linke are Microsoft Word Most Valuable Professionals (MVPs). For more information about MVPs and the MVP program, see the Microsoft MVP Site and MVPs.org.
Colin Wilcox writes for the Office Help team. In addition to contributing to the Office Power User Corner column, he writes articles and tutorials for Microsoft Data Analyzer.

If you have any questions, please ask.

Merismos the Scriptures with Patchworkid's Study Bible Set: http://www.biblesupp...tudy-bible-set/, MySword: http://www.biblesupp...tudy-bible-set/

#2 Scribe



Posted 05 August 2011 - 11:21 AM

