Frequently Asked Question List for TeX
While TeX’s hyphenation rules are good, they’re not infallible: you will occasionally find words TeX just gets wrong. So for example, TeX’s default hyphenation rules (for American English) don’t know the word “manuscript”, and since it’s a long word you may find you need to hyphenate it. You can “write the hyphenation out” each time you use the word:
... man\-u\-script ...
Here, each of the \-
commands is converted to a hyphenated break,
if (and only if) necessary.
That technique can rapidly become tedious: you’ll probably only accept it if there are no more than one or two wrongly-hyphenated words in your document. The alternative is to set up hyphenations in the document preamble. To do that, for the hyphenation above, you would write:
\hyphenation{man-u-script}
and the hyphenation would be set for the whole document. Barbara Beeton publishes articles containing lists of these “hyphenation exceptions”, in TUGboat; the hyphenation “man-u-script” comes from one of those articles.
What if you have more than one language in your document? Simple: select the appropriate language, and do the same as above:
\usepackage[french]{babel}
\selectlanguage{french}
\hyphenation{re-cher-cher}
(nothing clever here: this is the “correct” hyphenation of the word,
in the current tables). However, there’s a problem here: just as
words with accent macros in them won’t break, so an \hyphenation
commands with accent macros in its argument will produce an error:
\usepackage[french]{babel}
\selectlanguage{french}
\hyphenation{r\'e-f\'e-rence}
tells us that the hyphenation is “improper”, and that it will be “flushed”.
But, just as hyphenation of words is enabled by selecting an 8-bit
font encoding, so \hyphenation
commands are rendered proper again
by selecting that same 8-bit font encoding. For the hyphenation
patterns provided for “legacy”, the encoding is
Cork, so the complete sequence is:
\usepackage[T1]{fontenc}
\usepackage[french]{babel}
\selectlanguage{french}
\hyphenation{r\'e-f\'e-rence}
The same sort of performance goes for any language for which 8-bit fonts and corresponding hyphenation patterns are available. Since you have to select both the language and the font encoding to have your document typeset correctly, it should not be a great imposition to do the selections before setting up hyphenation exceptions.
Modern TeX variants (principally XeTeX and LuaTeX) use unicode, internally, and distributions that offer them also offer UTF-8-encoded patterns; since the hyphenation team do all the work “behind the scenes”, the use of Unicode hyphenation is deceptively similar to what we are used to.
FAQ ID: Q-hyphexcept
Tags: hyphenation