LF stands for Line Feed, and comes from the old typewriter / teletype idea of a command to move the print head down a line;
CR/LF together indicate moving down a line and back to the left of the page.
The history is not relevant to today's computers in principle, but in practice they all use one of these legacy conventions, and there's nothing we can do about it but pick one.
V.86. One space or two at the end of a sentence?
Whichever you prefer, but if using two spaces, please use them only at the end of a sentence, not after abbreviations like "Dr." and "per cent.", and not after non-sentence-ending punctuation like the question-mark in the sentence: "Must you go? when the night is yet so black!"
Many people have strong views on either side of the "one space or two?" question, and we're not about to try and argue with them. Use whichever is most natural for you.
However, if using two, you take responsibility for deciding where the sentence ends. You can't just place two spaces after every period, question-mark and exclamation mark, since periods are also used for abbreviations and ellipses, and question-marks and exclamation-marks don't always end sentences.
V.87. How do I indicate paragraphs?
Just leave a blank line before each paragraph.
V.88. Should I indent the start of every paragraph?
No.
Printers do this when publishing paper books because they do not leave blank lines in the text, but there is no need for indenting in our eBooks.
V.89. Are there any places where I should indent text?
Yes. You should always make poetry look like the original, and that may mean indenting some lines, for example:
  I was a child and she was a child,
    In a kingdom by the sea;
  But we loved with a love that was more than love--
    I and my Annabel Lee;

Even when poetry doesn't have indented lines, it is a good idea to indent quotations embedded in prose. Remember, others will be converting your text later--to HTML, to PDA reader formats, to formats that don't even exist yet--and much of this conversion will be done automatically, by computer programs. It is very hard for a program to know when it can and can't re-wrap lines to fit a screen size unless it has a clear signal that this line should not be wrapped. This is one of the biggest problems with auto-converting PG texts.
Just about all formatting programs "know" that lines that are indented shouldn't be wrapped, so by indenting lines just a space or two, you can prevent
  I think that I shall never see
  A poem lovely as a tree.
from turning into
I think that I shall never see A poem lovely as a tree.
in some future reader's eBook.
You don't really need to do this in texts where the whole book is poetry or blank verse, since these will probably be recognized as whole books that shouldn't be rewrapped, but when there are a few lines of quotation amid an acre of straight prose, a few spaces will be a life-saver. Even in the original plain text version, the extra spaces serve to set the quotation off from the main text.
You shouldn't get carried away and indent things 20 spaces for this reason, though. Anything up to four spaces is reasonable; more is excessive. If you're indenting many short verses in this way, keep your number of spaces for indentation consistent throughout the book.
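To make the point concrete, here is a minimal sketch in Python of the kind of rewrapping a converter might do. It is only an illustration under my own assumptions: the function name rewrap, the 40-character width, and the rule "leave alone any block that contains a line starting with a space" are invented for this example and are not taken from any particular PG conversion tool.

```python
import textwrap


def rewrap(text: str, width: int = 40) -> str:
    """Rewrap prose paragraphs; leave blocks with indented lines untouched.

    Hypothetical helper: blocks are separated by one blank line, and a
    block is treated as verse or quotation if any line starts with a space.
    """
    out = []
    for block in text.split("\n\n"):
        lines = block.split("\n")
        if any(line.startswith(" ") for line in lines):
            # Indentation is the signal: keep the line breaks as typed.
            out.append(block)
        else:
            # Ordinary prose: join the lines and wrap to the new width.
            out.append(textwrap.fill(" ".join(lines), width=width))
    return "\n\n".join(out)


sample = (
    "He quoted the poem to her as they walked along the river:\n"
    "\n"
    "  I think that I shall never see\n"
    "  A poem lovely as a tree.\n"
    "\n"
    "She did not reply."
)
result = rewrap(sample, width=40)
print(result)
```

The prose is refitted to the narrower width, while the two indented lines come through exactly as typed. Without the leading spaces, the couplet would have been joined into one line like any other paragraph.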
There are some other times when you may judge it best to indent, where text is indented in the paper book, like newspaper headlines or pictures of handwritten notes.
V.90. Can I use tabs (the TAB key) to indent?
No.
The problem with tab characters is that they act differently in different applications. Typically a tab will move the text to the next tab stop, which might be four spaces on your PC, but 20, or none, on someone else's. The effects are unpredictable.
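You can see the effect for yourself with a few lines of Python. This is just a sketch: the tab stops of 4, 8 and 20 are arbitrary values chosen for illustration, and lines_with_tabs is a hypothetical helper you might use to find stray tabs before submitting a text.

```python
# One line of verse indented with a single tab character.
line = "\tI think that I shall never see"

# The same character renders differently depending on the tab stop in use.
for tab_stop in (4, 8, 20):
    print(repr(line.expandtabs(tab_stop)))


def lines_with_tabs(text: str) -> list[int]:
    """Return the 1-based numbers of the lines that contain a tab."""
    return [
        number
        for number, content in enumerate(text.splitlines(), start=1)
        if "\t" in content
    ]


print(lines_with_tabs("first line\n\tsecond line\nthird\tline"))
```

If the helper reports any lines, replace each tab with the number of spaces you actually want.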
V.91. How should I treat dashes (hyphens) between words?
In typography, there are four standard types of dashes: the hyphen, the en-dash, the em-dash, and the three-em-dash.
Originally, printers called these the "em-dash" because it was the same width as the capital letter M in whichever font they were using, the "en-dash" because it was the same width as the capital letter N, and the "three-em-dash" because it was as long as three capital Ms.
The hyphen is used for hyphenated words, like "en-dash" itself, or "to-day" or "drawing-room". For this, you just press the single dash or hyphen key on your keyboard.
In typography, the en-dash is a little longer than the hyphen, and is typically used for duration, where you could substitute the word "to". For example, if you were printing "1830-1874", or "9:00-5:30", you would use an en-dash instead of a hyphen. The en-dash is also sometimes used as hyphenation between words that are already hyphenated, for example, "bed-room-sitting-room" might use an en-dash as its central dash to emphasize that it is a different type of separator from the plain hyphens before "room". However, there is no ASCII character for an en-dash, and we use the hyphen in these cases. (HTML and some character sets do provide separate entities for en-dash and em-dash.)
The em-dash is shown in print as a longer dash, and for PG purposes, you should render it as two hyphens with no spaces around them.
You use the em-dash as a kind of parenthesis--as I am doing here--or to indicate a break in thought or subject within a sentence. There is no ASCII equivalent of the em-dash; there is no key on your keyboard that you can press to get one. For PG texts, we represent the em-dash as two dashes with no space between or around them--like this.
The em-dash can also be used at the end of a sentence or speech to indicate that the speaker stopped or trailed off. For example:
  "When I saw you with Emily, I thought you were-- I thought she was--"
In a case like this, there may be a space following the em-dash, and the context may demand that there should be a space following the em-dash, not because of the em-dash as such, but to make the break between the statements or sentences clear.
These two hyphens represent one character, so you should never break them at line end, with one hyphen at the end of the first line and the other at the start of the second. If you have an em-dash near line end, you can break the line either before or after the em-dash, but never in the middle.
The fourth type of dash, the three-em-dash, is used to represent a missing word, or an undetermined number of missing letters. You will often see it in a sentence like:
  Dr. P------ was known for his honesty.
or
  Dr. ------ was known for his honesty.
where there is a convention that the character's name has been redacted. Logically, we should represent the three-em-dash as six dashes, but you may reduce that to four. Whichever you choose, do use it consistently in the text you're producing.
Unlike with the em-dash, you should leave a space in such cases wherever a space would have been before the letters were replaced by dashes.
Here's a summary table of the dashes:
  Name            ASCII     Used for

  Hyphen          -         Hyphenated Words
  En-dash         -         Durations, like "3:00-5:30"
  Em-dash         --        Break in sentence or parenthetical comment
  Three-em-dash   ------    Indicating a word that was edited out.
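If your source text came from a word processor or a web page, it may contain real typographic dashes instead of ASCII. The following is a minimal sketch in Python of how the table above could be applied mechanically. The names DASH_MAP and ascii_dashes are my own, and the choice of six hyphens for the three-em-dash is just one of the two lengths this FAQ allows; adjust it if you have settled on four.

```python
# Map typographic dashes to the plain-ASCII renderings in the table above.
DASH_MAP = {
    "\u2010": "-",       # Unicode hyphen
    "\u2013": "-",       # en-dash
    "\u2014": "--",      # em-dash
    "\u2e3b": "------",  # three-em-dash (assumed: six hyphens, not four)
}


def ascii_dashes(text: str) -> str:
    """Replace each typographic dash with its ASCII equivalent."""
    for dash, replacement in DASH_MAP.items():
        text = text.replace(dash, replacement)
    return text


print(ascii_dashes("He lived 1830\u20131874."))
print(ascii_dashes("a kind of parenthesis\u2014as here\u2014or a break"))
print(ascii_dashes("Dr. P\u2e3b was known for his honesty."))
```

Note that this sketch deliberately leaves spaces alone. Whether a space belongs beside a dash depends on context, as described above, so that decision is still yours.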
V.92. How should I treat dashes replacing letters?
If the dashes obviously represent individual letters, use the same number of hyphens. Otherwise, you can use a three-em-dash (see above: 6 or 4 hyphens) in such places.
A common convention when a character in a novel is using bad language, or when reference is given to a character whose full name is not being used, is to replace the letters with dashes. For example,
  "That D---l, Mr. C------s will regret his hasty actions!"
In this case, it is clear that "D---l" is meant to represent "Devil" and that there is a character whose name begins with "C" and ends in "s" whose name is not spelled out in full. Where the book makes it clear how many letters are represented by hyphens, just use that number of hyphens.
Where the number of letters omitted is not clear, you can decide how long you want to make your extended dash. Typographers often use the "three-em-dash" for this, so called because it is as wide as three capital Ms. Logically, since we represent an em-dash by two hyphens, we might represent a three-em-dash as six, but if you feel that six hyphens is too long, you can choose a shorter length, like four. Whichever you choose, keep it consistent within your text:
  It was in the town of S----, walking on M---- Street, that
  Sowerby came upon Dr. T---- taking the morning air.
V.93. What about hyphens at end of line?
Remove the hyphens from single words that were wrapped by the printer at line-end on the paper copy. Where two words are joined with a hyphen, you can leave the hyphen at end of the text line.
Books are usually printed with words broken at end of line to make the right side of the text perfectly even. You should remove all such hyphens. For example, in the sentence:
  Mary's mouth tightened as she saw the marks on the car-
  pet, and her hands balled into fists.
you should remove the hyphen from "carpet".
Words which are strung together and hyphenated by the author pose a different question. It is perfectly OK from the point of view of a reader of the plain text version for such a hyphen to occur at end of line, for example:
  Now that the guns were silent, convoys brought badly-
  needed medical supplies and food.
However, be aware that if somebody later rewraps the text for use in a different format like HTML, it is possible that they will introduce a space where it should not be:
Now that the guns were silent, convoys brought badly- needed medical supplies and food.
so there is still a small disadvantage to having a hyphen at line-end.
Sometimes it's not entirely clear whether the hyphen is there because it has to be, or just because it happens to fall at the end of the line:
  Daisy rushed to the door, but there were no letters for her to-
  day, and she retreated sadly.
Sometimes "today" is written as "to-day", especially in older works. So which is this? Should we remove the hyphen or not? In this case, the best thing to do is search the rest of the text for the same word, and see whether it is consistently hyphenated or not in other places.
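That search is easy to automate. Here is a rough sketch in Python, under my own assumptions: the function name eol_hyphen_report is invented, words are matched with a simple regular expression, and the sample text is made up for the demonstration. For every word split by a hyphen at line end, it counts how often the rest of the text spells the word joined and how often hyphenated.

```python
import re
from collections import Counter


def eol_hyphen_report(text: str) -> dict:
    """Count joined vs. hyphenated spellings for words split at line end."""
    # Pairs like ("to", "day"): first half ends a line with a hyphen.
    splits = re.findall(r"(\w+)-\n\s*(\w+)", text)
    # Count whole words; a hyphen inside a line stays part of the word.
    words = Counter(w.lower() for w in re.findall(r"\w+(?:-\w+)*", text))
    report = {}
    for first, second in splits:
        report[(first, second)] = {
            "joined": words[(first + second).lower()],
            "hyphenated": words[(first + "-" + second).lower()],
        }
    return report


text = (
    "Daisy rushed to the door, but there were no letters for her to-\n"
    "day, and she retreated sadly.\n"
    "\n"
    "\"I shall write again to-day,\" she said. \"To-day or never.\"\n"
)
report = eol_hyphen_report(text)
print(report)
```

In this made-up sample the word appears as "to-day" twice elsewhere and never as "today", which suggests keeping the hyphen. When both counts are zero, the program can't help, and you have to fall back on your own judgment.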
V.94. What should I do with italics?
There are three different ways volunteers currently render italics: like THIS, like _this_ and like /this/. Pick one, and use it consistently in your text.
There are really two questions here: "How should I render italics?" and "When should I render italics?"
The original PG standard for italics was to render emphasis italics as CAPITALS, using underscores for an italicized _I_, and do nothing for non-emphasis italics like foreign words and names of ships, and this is still the most common usage. For reading a plain-text file in a plain text editor, it is still arguably the most reader-friendly usage as well.
It has two drawbacks:
1. if you do want to preserve italics for non-emphasis words, you may end up with a very ugly text where there are too many capitals.
2. it is impossible to convert CAPITALS reliably back into italics, since the original text might have had a capital letter, or even been all capitals in the first place. This is especially true of automatic conversion for people who want to read PG texts on eBook readers.
To overcome these problems, many volunteers now use _underscores_ or /slants/ to render italics. These allow you to preserve all italics without creating an ugly plain-text, and to remove the ambiguity of CAPITALS. Underscores are more popular than slants, but some people feel that underscores should properly be reserved for underlined text. Since printers tend to avoid underlines, however, there aren't many books where this causes a real conflict.
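To show why underscores convert back cleanly where CAPITALS do not, here is a small sketch in Python. The function name underscores_to_html is my own, and the pattern is deliberately simple: it assumes each italic span opens and closes on the same line, which will not hold for every text.

```python
import re


def underscores_to_html(text: str) -> str:
    """Turn _word_ spans into <i>word</i>, one line at a time.

    Simplifying assumption: an italic span never crosses a line break.
    """
    return re.sub(r"_([^_\n]+)_", r"<i>\1</i>", text)


print(underscores_to_html("She sailed on the _Hispaniola_ that spring."))
print(underscores_to_html("I _must_ go to the NATO meeting."))
```

In the second sentence the underscores mark exactly one word as italic, and "NATO" is left alone. Had the emphasis been rendered as MUST, a program would have no way to tell it apart from the capitals that were in the original.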
V.95. Yes, but I have a long passage of my book in italics! I can't really CAPITALIZE or _otherwise_ /mark/ all that text, can I?
No, you really can't. On the other hand, if the author intended that section to stand out, you don't want to ignore that information and withhold it from future readers.
What you can do is format it differently from the rest of the text. For example, if you're averaging a 68-character line throughout normal paragraphs, you could reasonably use shorter lines, like 58 characters, for the italicized section. Going a step further, you could shorten the lines and indent them a space or two as well. This will give a clear signal to future readers and converters that this section is to be treated specially.
V.96. Should I capitalize the first word in each chapter?
No.
Capitalization of the first word is often used in printed material to emphasize the break at the start of a section or chapter on the paper, but it is not necessary in an eBook, and leads to the same kind of ambiguity as does the capitalization of italics, and for far less reason.
If you feel you really must capitalize the first word, we probably won't stop you, but if so, please do it consistently throughout the book, not just in one or two places, so that a future reader can be certain that these capitalized words were a chapter-head convention, and not otherwise intended for emphasis.
V.97. What is a Transcriber's Note? When should I add one?
A Transcriber's Note is a small section you can add to a text you produce to give the reader some information about changes you made to the book when rendering it into text.
A Transcriber's Note is not the same as a footnote--a footnote is part of the text you have transcribed; a Transcriber's Note is a note that you add to the text, explaining something you have done or omitted. If there is a Transcriber's Note, it may be at the top or the end of the text, and it should be clearly marked so that a reader cannot confuse it with the main text or an introduction.
The main thing is to ensure that a reader cannot confuse text that you have added with text that was in the original book.
Transcriber's Notes are rarely needed, but if, for example, you found misprints in the text, or things that might look like misprints even though they're not, you may note them here, if it seems relevant. If there is an image in the book that is important to the content, you may describe it in a note. If there was unusual typography that you had to represent in some uncommon way, you might well explain that here.
You don't need to add a Transcriber's Note just for common conversions like italics, and you should not use such a note to add your own comments or views about the text or the author. It's just there to let the reader know what decision you have made about rendering the text.
Here are some examples of Transcriber's Notes:
Transcriber's Note:
The irregular inclusion or omission of commas between repeated words ("well, well"; "there there", etc.) in this etext is reproduced faithfully from the 1914 edition . . .
Transcriber's Note:
Inserted music notation is represented like [MUSIC--2 bars, melody] or [MUSIC--4-part, 8 bars]
[Transcriber's Note: This letter was handwritten in the original.]
Transcriber's Note:
The spelling "Freindship" is thus in the original book.
Transcriber's Note: Some words which appear to be typos are printed thus in the original book. A list of these possible misprints follows:
If there is an image that is important to the content you may describe it at the point in the text where it appears, for example:
[Transcriber's Note: Here there is a map of three islands just West of and parallel to a coastline running SW to NE, with a big X marked on the North of the middle island. A spur of land extends from the mainland, sheltering the islands from the north-east.]
Transcriber's Notes that apply to the whole text should be placed at the start or end of the text--your choice. Notes that pertain to a specific point in the text, like the map example above, should be placed at the point in the text where they are relevant, but not interrupting a paragraph except where it cannot be avoided.
V.98. Should I keep page numbers in the e-text?
No. But there are exceptional cases . . .
In general, the page numbers of the original book are irrelevant when making a reader's edition for PG; they are annoying and intrusive for anyone trying to read it, and if you did keep them, they would probably be removed by anyone converting it. Get rid of them!
But there are a few books where page numbers are appropriate. Non-fiction books that use page numbers as internal cross-references are the prime example; if, on page 204, the text reads
"Our studies of plants (see pp. 141-145) show that this is true."
and this kind of cross-reference is frequent throughout the text, then it is probably best to keep the page numbers, since it is otherwise very difficult to honor the author's intent.
In the more common case where cross-references exist, but are not frequent, and not essential to the text, you have several choices: leave the cross-references in, meaningless though the page numbers are, remove the cross-references, change the cross-references to something relevant (like "Start of Chapter 12" instead of "pages 141-145"), or, if you can make it work in context, insert references in the text for the cross-references to point to, like [Reference: Plants] and then reformat the cross-reference like "Our studies of plants (see [Reference: Plants]) show that this is true."
There are a few other cases, where the text you create is likely to be the subject of study or reference, in which it may also be desirable to retain page numbering.
When there are pages at the end of the book with notes referring to page numbers, the simplest answer is to change the page number references to chapter numbers, and add a quote from the page referred to if it's not already in the book's end-notes. That way, a reader can search for the phrase.
V.99. In the exceptional cases where I keep page numbers, how should I format them?
Within brackets of your choice, with one space either side, simply added to the text at the exact point of the page break. Unless there is some [142] special reason, you shouldn't insert a line break or new paragraph when indicating a page number; just insert it in the text, as I did with "142" above.
You should use whichever of round brackets, (143) square brackets, [144] or curly brackets {145} is not used (or least used) within the main text itself, and then use it consistently. Try to make sure that your page numbers cannot be confused with anything else.
Don't run your[146]page[147]numbers right up against words with spaces omitted; this just makes the text hard to read. Use spaces before and after.
Where the page break is at the start of a chapter or headed section, you can put it on a line of its own, for example:
[148]
Where a paragraph begins on a new page, you should put the page number at the start of the paragraph, as:
[149] With the extinction of the dinosaurs . . .
V.100. Should I keep Tables of Contents?
Yes, but just keep the contents themselves, and not the page numbers for each chapter or section, except where you have kept the page numbers in the whole text. When you have removed the page numbers from the book, it doesn't make much sense to leave them in the TOC.
Here, for example, is a typical TOC. In the original text, each chapter had a page number beside it:
   1  When the Duchess was Dead
   2  Lady Mary Palliser
   3  Francis Oliphant Tregear
   4  It is Impossible
   5  Major Tifto
   6  Conservative Convictions
   8  He is a Gentleman
   9  'In Media Res'
  10  Why not like Romeo if I Feel like Romeo?
  11  Cruel
  12  At Richmond
Note that I have indented the lines here, to give a sign to automatic converters that these lines should not be wrapped into one paragraph.
V.101. Should I keep Indexes and Glossaries?
If you are working from a pre-1923 publication, then yes.
If you are working from a modern reprint, you must be careful not to take any of the text that might have been added by the modern publisher. If you have any doubt about whether the index or glossary was part of the original printing, you should leave it out. Often with reprints, under your Clearance Line [V.37], you may see an instruction not to use indexes. In such cases, or if there is any doubt at all, don't.
V.102. How do I handle a break from one scene to another, where the book uses blank lines, or a row of asterisks?
Use a blank line, followed by a line of 3 or 5 spaced asterisks or dashes, followed by another blank line.
In a printed book, where the point of view switches from one character to another, or some other break in the narrative is made without a new chapter or headed section, the publisher will often denote the break just by a couple of blank lines. This gives the reader a cue to notice that the point of view has switched, and avoids confusion.
However, a printed book cannot be edited or changed, while an eBook will be edited and converted over its lifetime, and it is likely that if you denote this break just by a couple of blank lines, as in the book, your break may be lost. For example, in automated conversion to a PDA reader format, it is common to merge multiple blank lines into one.
In making a PG e-text, you may indicate this break by a couple of additional blank lines, but, if your text is later converted into another format such as HTML, the extra blank lines may get lost in the editing or rendering. Or the person doing the conversion may simply think that the extra blank line was a mistake, and remove it. To guard against this, you should add an unambiguous visual break such as a line of spaced asterisks:
* * * * *
The exact layout of your break is not really important, and you can use whatever format you prefer: a blank line followed by five spaced asterisks followed by another blank line, for example, or two blank lines, with dashes instead of asterisks. Just make sure that future readers can be in no doubt that you intended to indicate a break that was really in the original printed text.
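Here is a tiny demonstration in Python of how a blank-line-only break gets lost. It is a sketch under my own assumptions: collapse_blank_lines is a made-up stand-in for whatever a real converter does, and the two sample passages are invented.

```python
import re


def collapse_blank_lines(text: str) -> str:
    """Merge any run of blank lines into a single blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)


# A scene break shown only by extra blank lines, as in the paper book.
blank_only = "She closed the door.\n\n\n\nMeanwhile, the train left.\n"

# The same break, marked with a line of spaced asterisks.
marked = (
    "She closed the door.\n"
    "\n"
    "  *   *   *   *   *\n"
    "\n"
    "Meanwhile, the train left.\n"
)

print(repr(collapse_blank_lines(blank_only)))
print(repr(collapse_blank_lines(marked)))
```

After collapsing, the first passage looks like two ordinary consecutive paragraphs, and the scene break is gone. The second passage passes through unchanged, because the row of asterisks is real text that no blank-line cleanup will touch.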
V.103. How should I treat footnotes?
In a printed text, the most common treatment for footnotes is to put them at the end of the page to which they refer. Sometimes, editors gather them all at the end of the book. Footnotes are a real formatting problem for an eBook without defined physical pages; there is no agreement between readers about which is the best way to render them.
There are three basic ways of rendering footnotes in an e-text:
You can insert them right into the text, in brackets, at the point in the paragraph where they occur, with or without an indication that they were originally footnotes. This is only reasonable in a text with very short footnotes.
You can insert them after the paragraph to which they refer, either contiguous with the paragraph or as a new "paragraph" of their own, as I am doing with this one. If the text contains any footnotes longer than a line, [1] you should not try to just append them to the paragraph; you should make a new "paragraph" of them, with a blank line before and after.
[1] Some footnotes can go on not only for several lines, but for several pages!
You can gather all footnotes at the end of the e-text, or to the end of the chapter to which they refer.
Of these three, gathering all footnotes to the end of the chapter or the end of the whole text is probably the friendliest option, since it preserves the original intention of allowing the reader to continue reading the main text without interruption. However, it may involve some renumbering and general note-keeping on your part, and may not be needed where there are only a few short footnotes. You can see an ideal example of this kind of footnote marking in our edition of Darwin's "The Voyage of the Beagle", file vbgle10.txt from 1997, Etext number 944, which you can get from:
V.104. My book leaves a space before punctuation like semicolons, question marks, exclamation marks and quotes. Should I do the same?
No.
If you look closely at these "spaces", you will see that they are not as wide as a normal space—they tend to be half to three-quarters as wide. These don't actually represent spaces as such; they were just a convention used by typesetters to make the text feel less cramped, and they did not express any specific intent on the part of the author.
OCR software tends to see them as full spaces, and one of the jobs you typically have to do when editing a text that has been OCRed is to remove them.
In some texts, this also happens following an opening quote, so your OCR might read a sentence as:
" Hello ! How are you to-day ? "
which you should correct to:
"Hello! How are you to-day?"
Samples of this can be seen in the images used for the FAQ "Why am I getting a lot of mistakes in my OCRed text?" [S.17]
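If your OCR output is full of these, a short script can do most of the tidying. The following Python sketch is only a starting point under stated assumptions: the name tighten_punctuation is my own, it handles only semicolons, question marks and exclamation marks, and it treats a straight quote as "opening" only at the very start of a line and as "closing" only at the very end, because a straight quote in mid-line is ambiguous.

```python
import re


def tighten_punctuation(line: str) -> str:
    """Remove typesetters' pseudo-spaces that OCR read as full spaces."""
    # Drop any space before a semicolon, question mark or exclamation mark.
    line = re.sub(r"\s+([;?!])", r"\1", line)
    # Drop the space after a quote that opens the line ...
    line = re.sub(r'^"\s+', '"', line)
    # ... and the space before a quote that closes the line.
    line = re.sub(r'\s+"$', '"', line)
    return line


print(tighten_punctuation('" Hello ! How are you to-day ? "'))
```

Quotes in the middle of a line still need a human eye, so check the result against the page images rather than trusting the script.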
V.105. My book leaves a space in the middle of contracted words like "do n't", "we 'll" and "he 's". Should I do the same?
Unlike the pseudo-spaces before punctuation, these really were intended as spaces indicating the break between words—that is, where we would nowadays contract two words into one, the author or editor has made the contraction, but left them as two separate words.
Since this effect was intended, it is usual to leave the spaces in. Some people who really do n't like this style of spelling do remove them, but generally volunteers want to preserve the text as printed.
V.106. How should I handle tables?
Just line up the information neatly in columns. If you use a non-proportional font [W.5] you will be able to do this reliably. You can also use the dash character "-", the underscore "_" and the pipe character "|" to make borders if you really need to, but it's usually better to omit them. It is, though, often good to indent your table a little, to set it off from the main text, and to avoid the danger of having it automatically wrapped by some converter later. For example, from "The Albert N'Yanza, Great Basin of the Nile" by Sir Samuel White Baker:

  TABLE No. 1.

  Table for Increased Reading of Thermometer, using 0 degrees 80 as the
  Result of Observations for its Error.

  Month.         1861.   1862.   1863.   1864.   1865.
  January. . .      --   0'143   0'314   0'487   0'659
  February . .      --    '157    '328    '501    '673
  March . . .    0'000    '172    '344    '516    '688
  April . . .     '014    '186    '358    '530    '702
  May . . . .     '028    '200    '372    '544    '716
  June . . . .    '043    '214    '387    '559    '730
  July . . . .    '057    '228    '401    '573    '744
  August . . .    '071    '243    '415    '587    '758
  September. .    '086    '257    '430    '602    '772
  October . .     '100    '271    '444    '616    '786
  November . .    '114    '285    '458    '630   0'800
  December . .   0'129   0'300   0'473   0'645      --
V.107. How should I format letters or journal entries?
Make them look like they are in the printed book. If the signature is indented in the book, indent it in the letter. For example:

  "Sir,
  No consideration would induce me to
  change my resolve in this matter, but I am
  willing to engage your services as my agent
  for a fee of 100 pounds.
                              "H. Middleton"
When a letter appears in the middle of lots of prose, using shorter lines for the letter is an effective way of making the letter stand out, without resorting to indenting the whole thing.
When the book is largely composed of letters or entries, as happens in an epistolary novel or the publication of somebody's letters or journal, you might reasonably leave two or three blank lines between entries (whichever you choose, keep it consistent throughout the book!) to give the reader a visual cue that the next is not just a new paragraph, but a new entry, for example:
10 pm.—I have visited him again and found him sitting in a corner brooding. When I came in he threw himself on his knees before me and implored me to let him have a cat, that his salvation depended upon it.
I was firm, however, and told him that he could not have it, whereupon he went without a word, and sat down, gnawing his fingers, in the corner where I had found him. I shall see him in the morning early.
20 July.—Visited Renfield very early, before attendant went his rounds. Found him up and humming a tune. He was spreading out his sugar, which he had saved, in the window, and was manifestly beginning his fly catching again, and beginning it cheerfully and with a good grace.
I looked around for his birds, and not seeing them, asked him where they were. He replied, without turning round, that they had all flown away. There were a few feathers about the room and on his pillow a drop of blood. I said nothing, but went and told the keeper to report to me if there were anything odd about him during the day.
11 am.—The attendant has just been to see me to say that Renfield has been very sick and has disgorged a whole lot of feathers. "My belief is, doctor," he said, "that he has eaten his birds, and that he just took and ate them raw!"
11 pm.—I gave Renfield a strong opiate tonight, enough to make even him sleep, and took away his pocketbook to look at it. The thought that has been buzzing about my brain lately is complete, and the theory proved.
This is different from the case mentioned in the FAQ [V.102] "How do I handle a break from one scene to another, where the book uses blank lines, or a row of asterisks?". In that case, we added a row of asterisks because future reformatting or conversion could cause confusion about the scene break that was explicitly signalled by the blank lines on paper. In this case, each new letter or journal entry cannot be mistaken by a careful reader, so we don't need asterisks or dashes to signal that; we're just adding a bit of extra space to make it more readable.
V.108. What can I do with the British pound sign?
The British pound sign cannot be expressed in ASCII, but is very common in the works of English novelists. It evolved as a stylized version of the letter L (from the Latin "libra"), and it's entirely appropriate to represent it as such, either like:
The horse cost L8 12s. 6d.
or
The horse cost 8l. 12s. 6d.
This works particularly well where an amount is expressed in pounds, shillings and pence (librae, solidi, denarii).
Where there is a simple number of pounds, you may prefer just to use the word:
She was a handsome widow with 500 pounds a year.
V.109. What can I do with the degree symbol?
Just type out the word "degrees" or the abbreviation "deg."—for example:
By the time we reached Cairo it was 115 degrees in the shade.
Geographical degrees are more awkward, but should be handled the same way:
It was at 30 deg. 15' E, 14 deg. 45' N.
In general, any symbol can be represented in words.
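If you are starting from a text that already contains these symbols, the two substitutions described above can be sketched in a few lines of Python. This is an illustration only: the function name to_plain_ascii is invented, it uses the "L before the number" and "deg." forms from the examples above, and it will not suit every book (for a simple sum you may still prefer the word "pounds").

```python
def to_plain_ascii(text: str) -> str:
    """Replace the pound and degree signs with plain-ASCII renderings."""
    # Pound sign becomes the letter L, as in "L8 12s. 6d."
    text = text.replace("\u00a3", "L")
    # Degree sign becomes " deg.", as in "30 deg. 15' E".
    text = text.replace("\u00b0", " deg.")
    return text


print(to_plain_ascii("The horse cost \u00a38 12s. 6d."))
print(to_plain_ascii("It was at 30\u00b0 15' E, 14\u00b0 45' N."))
```

Read the result through afterwards; a sentence that ends with a degree sign, for instance, will need its final period tidied by hand.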
V.110. How should I handle . . . ellipses?
Just as I did above . . . and here! Leave one space before and after each dot. Do not break an ellipsis over the end of a line. In principle, an ellipsis is one symbol, like an em-dash, and should not be broken at line end.
A special case arises when an ellipsis follows a sentence instead of being in the middle. . . . In this case, put the period after the last letter of the sentence, as you normally would, then follow the usual format for ellipses. You end up with four dots, with spaces everywhere except before the first.
V.111. How should I handle chapter and section headings?
For a standard novel, you can choose either four blank lines before the chapter heading and two lines after, or three lines before and one line after, but whichever you use, do try to keep it consistent throughout.
Normally, you should move chapter headings to the left rather than try to imitate the centering that is used in some books.
V.112. My book has advertisements at the end. Should I keep them?
Most people seem to think "no", and "no" is the safe choice, but opinions vary.
The typical arguments are: "The ads are not part of the author's intent, so you should remove them." vs. "They give a flavor of the original book, so you should keep them". This latter is particularly cogent when the ads are for other books by the same author.
Decide which of these statements best fits your own views in the case you're looking at; after that, it's up to you!
V.113. Can I keep Lists of Illustrations, even when producing a plain text file?
Yes. As in the case of the Table of Contents, there is no point in including page numbers when your text doesn't have them, but the list of illustrations itself may go in.
V.114. Can I include the captions of Illustrations, even when producing a plain text file?
Yes.
You can format them as short paragraphs of their own, in brackets, with the word Illustration: followed by the caption, something like:
[Frontispiece: A Flash of Light]
or
[Illustration: Goldsmith at Trinity College]
Don't interrupt a paragraph to insert one, unless the reader really needs to know that the original illustration was in the middle of the paragraph; place the note between paragraphs instead.
V.115. Can I include images with my text file?
Yes, as I have done with the zipped version of the plain-text format of this FAQ, but in general it makes much more sense, if you want to include images, to make a HTML version of the book and include them there, where they are anchored into the text in a predictable way, and leave them out of the text version. But there are exceptional cases, such as this—I included images with this plain-text FAQ because I wanted you to be able to experiment with them using your own OCR package.
If you do include images with plain text, they will be included in the ZIP file, but will not be downloadable separately alongside the plain text file. For example, if your file gets named abcde10.txt, and you include images pic1.gif, pic2.gif and pic3.gif, then abcde10.zip will include all four files, but only abcde10.zip and abcde10.txt will be posted, so the images will be available only within the zip file. So, even if you are including images, don't assume that the reader will be able to see them.
If you do include images with plain text, be sure to mention them by filename in a note at the appropriate places in the text file; otherwise readers may not even realize they're there. For example:
[Illustration: Goldsmith at Trinity College—see goldtrin.gif]
If you do include images with a text file, don't make them too big. Readers downloading zip files of plain text expect them to be relatively small; don't burden them with huge downloads they don't want. Use the same kind of rules and processing that you would for a HTML file, or better still, include the images only with the HTML version.
About formatting poetry:
V.116. I'm producing a book of poetry. How should I format it?
Make it look like the original.
The only formatting change that you might consider is to limit the amount of centering. Often, in a poetry book, the title of a poem may be centered, when the body of the verse isn't. This can work on paper, particularly when the page is narrow, but "centering" the title on a 70-column line can mean that the title ends up far to the right of the body of the poem, which looks untidy. And even if you center the title correctly over the body of this poem, the next poem may have longer lines, and so its title may not have the same center as the first poem, and the title of one will be off-center with the title of the next!
If you have this kind of formatting in your book, you should consider
moving all of the poem titles to the left margin rather than try to
keep compensating for different line centers. It's more consistent,
and easier to read, if you just left-align all titles. To see a
not-quite-successful attempt at centering the titles over the poems,
take a look at the Poems of Emily Dickinson, available from
In that case, it would have been better to left-align the numbers and titles. Centering isn't really an effective formatting choice in etexts.
V.117. I'm producing a novel with some short quotations from poems. How should I format them?
As nearly as possible like they look in the book, with the exception that you should indent the whole verse anywhere between 1 and 4 spaces from the left. This is to give a signal to automatic conversion programs that these lines should not be wrapped.
For an example of a novel with many differently formatted quotations
embedded, see the "a" version of Clotel, file clotl10a.txt, Etext
number 2046, from the year 2000, which you can find at
Some of these quotations touch the left-hand column; today, we would think it better to insert at least one space before every line.
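To show what indenting a quoted verse amounts to, here is a minimal Python sketch. It assumes the verse is held as a list of lines; indent_verse is a hypothetical helper, and it keeps whatever relative indentation the lines already have.

```python
def indent_verse(lines, spaces=2):
    """Indent every non-blank verse line by 1 to 4 spaces, keeping any
    existing relative indentation between the lines."""
    if not 1 <= spaces <= 4:
        raise ValueError("indent should be between 1 and 4 spaces")
    pad = " " * spaces
    return [pad + line if line.strip() else "" for line in lines]


verse = [
    "Now hatred is by far the longest pleasure;",
    "  Men love in haste, but they detest at leisure.",
]
print("\n".join(indent_verse(verse)))
```

Because no line then touches the left-hand column, a conversion program has a consistent signal that these lines should not be rewrapped.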
About formatting plays:
V.118. How should I format Act and Scene headings?
Pretty much like chapter headings. You can use 4 blank lines between acts, and 3 blank lines between scenes, or 3 between acts and 2 between scenes. If your book has "END OF ACT/SCENE" footers, leave them in the etext.
You may center act/scene headers and footers if they are centered in the book, but it's usually best to left-align them, for the same reasons it's usually best to left-align poem titles in poetry.
V.119. How should I format stage directions?
Generally, in brackets.
In printed texts, it is common to show stage directions as italics inside brackets. You don't have the option of italics in plain text, and you shouldn't need to use _underscores_ or /slants/, and certainly not CAPITALS, to indicate italics for stage directions. Normal text within the brackets is all you need. It will be immediately clear to a reader that bracketed text consists of stage directions.
[Square brackets] are most common for stage directions, but (round) or {curly} brackets will work too, if there's a reason why they are preferable in the case of your text. Just make sure that you use the same kind of brackets consistently and only for stage directions—don't use round brackets for stage directions if characters' speeches also contain text in round brackets.
Some printed plays follow the convention of not closing brackets when the direction is at the end of a speech or scene. For example: [Exeunt.
Where the book doesn't close the bracket in a case like this, you shouldn't either.
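If your OCR or first pass has already marked italic stage directions with underscores, a sketch like the following could strip them while leaving the brackets alone. It assumes square brackets with the underscore directly inside the opening bracket; strip_direction_italics is a made-up name, and other italics in the speeches are left untouched.

```python
import re

# "[_" followed by the direction text and its closing underscore. The closing
# bracket is not part of the pattern, so unclosed directions also match.
_ITALIC_DIRECTION = re.compile(r"\[_([^_\[\]]+)_")


def strip_direction_italics(text):
    """Remove underscore italics markup from bracketed stage directions."""
    return _ITALIC_DIRECTION.sub(r"[\1", text)


print(strip_direction_italics("JACK. Oh! [_ALGERNON is laughing._] What is it?"))
print(strip_direction_italics("[_Exeunt._"))
```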
V.120. How should I format blank verse?
Just like normal verse in poetry. Make it look like the printed book. Left-align it, and make one line of etext the same length as one line of print.
Sometimes in blank verse, a speech may start mid-line, and the print reflects that by leaving a space on the left, and starting mid-way. In a case like that, do the same in the etext.
About some typical formatting issues:
V.121. Sample 1: Typical formatting issues of a novel.
Look at the image novel.tif. It shows a page of a novel, with several typical formatting decisions to be made.
We note that there is no end-quote on the first paragraph, but that's OK, since the second paragraph is a continuation by the same speaker, so the first paragraph doesn't need a closequote. There is also an italicized "I", which will end up with underscores, but there is nothing else to give us any difficulty.
In the second paragraph, we have an ellipsis, an italicized French word with an accented letter, the British pound symbol, and an italicized "Here".
The ellipsis is simple.
Let's assume we're making this into a 7-bit text, so we're going to convert the non-ASCII character a-circumflex and the pound sign. The a-circumflex just goes to an "a", but we have several choices we can make about the pound sign.
The italicized "Here" is clearly for emphasis, so we will mark that up. The word "flaneur" is italicized because it is not English, but possibly also for emphasis . . . if the sentence had read "The Major is a _fool_", with the word "fool" italicized, it would clearly be emphasis. As it stands, we don't know whether emphasis is intended. This doesn't matter if we are just using _underscores_ or /slants/ to render italics, but if we use CAPITALS, we're going to have to impose our best guess on one side or the other.
The third paragraph shows some vaguely familiar squiggles—Greek
letters! We hit the PG transliteration guide at
We then have a note, which we will format a little differently from the main text to help it stand out, and a new chapter heading.
We should certainly indent the second line of the Byron quotation to preserve its original form, but we have the option whether or not to indent the first line a little to signal to any future automatic converter that this is not to be rewrapped.
In the first paragraph of the new chapter, we need to get rid of the hyphenation of "Wentworth" at line-end and fix the two em-dashes.
In the second paragraph of the new chapter, we have a long dash between "d" and "l", clearly meant to denote "devil", so we will fill it in with three dashes, and we see a three-em-dash after "Lord H", so we can use six, or possibly four, dashes for that.
Finally, we have a table, a list of money values against names.
Depending on the standards we've chosen to use throughout the book, we could render these details in a variety of ways. For illustration, here are two acceptable possibilities:
"I shall go down to Wokingham", said Middleton, "a few days before the election, and the Major will stay here. I understand that there will be no other candidate, and _I_ shall take the seat.
"The Major is a . . . _flaneur_. He has no interest beyond his own advancement. I can buy him for a hundred pounds. _Here_ is his answer."
Wallace wondered at the _hubris_ of his friend, and examined the note Middleton thrust upon him.
"Sir, No consideration would induce me to change my resolve in this matter, but I am willing to engage your services as my agent for a fee of 100 pounds. H. Middleton"
  Now hatred is by far the longest pleasure;
    Men love in haste, but they detest at leisure.
                                          -- BYRON
On hearing of Middleton's visit, Mr. Wentworth began his preparations. Meeting with Thomas Lake and Riley at the back of the tap-room of The Bull--where the landlord saw to it that they remained undisturbed--he laid out their plan of campaign.
"That d---l Middleton shall not have the seat," he raved, "not for Lord H------; no, nor for a hundred Lords! We shall see to it that every man's hand is turned against him when he arrives."
Lake unfolded a paper from his vest-pocket and smoothed it on the table. "Here are the expenses we should undertake."

    Doran         L13 10s.
    Titwell       L 8  7s. 6d.
    St. Charles   L25
* * * * *
"I shall go down to Wokingham", said Middleton, "a few days before the election, and the Major will stay here. I understand that there will be no other candidate, and I shall take the seat.
"The Major is a . . . flaneur. He has no interest beyond his own advancement. I can buy him for L100. HERE is his answer."
Wallace wondered at the hubris of his friend, and examined the note Middleton thrust upon him.
"Sir, No consideration would induce me to change my resolve in this matter, but I am willing to engage your services as my agent for a fee of L100. H. Middleton"
  Now hatred is by far the longest pleasure;
    Men love in haste, but they detest at leisure.
                                          -- Byron
On hearing of Middleton's visit, Mr. Wentworth began his preparations. Meeting with Thomas Lake and Riley at the back of the tap-room of The Bull--where the landlord saw to it that they remained undisturbed--he laid out their plan of campaign.
"That d---l Middleton shall not have the seat," he raved, "not for Lord H----; no, nor for a hundred Lords! We shall see to it that every man's hand is turned against him when he arrives."
Lake unfolded a paper from his vest-pocket and smoothed it on the table. "Here are the expenses we should undertake."

    Doran         13l. 10s.
    Titwell        8l.  7s. 6d.
    St. Charles   25l.
V.122. Sample 2: Typical formatting issues of non-fiction
While non-fiction is not in principle any more difficult to format than fiction, many non-fiction books have lots of features like illustrations, tables, section sub-headings and footnotes, that require some extra work on the part of the producer. If the illustrations are essential, you should consider adding a HTML format file to allow you to present them.
See the page image nonfic.tif. This presents many formatting changes: the centered title will go to the left; the italicized chapter contents will become regular text, and the em-dashes will become "--"; the degree symbol needs to be replaced with ASCII "deg.", and of course we need to render the table readably. After all that, we have to deal with the footnote.
Here is a reasonable rendering of this page:
Strait of Magellan--Port Famine--Ascent of Mount Tarn--
Forests--Edible Fungus--Zoology--Great Sea-weed--
Leave Tierra del Fuego--Climate--Fruit-trees and
Productions of the Southern Coasts--Height of Snow-line
on the Cordillera--Descent of Glaciers to the Sea--
Icebergs formed--Transportal of Boulders--Climate and
Productions of the Antarctic Islands--Preservation of
Frozen Carcasses--Recapitulation.
An equable climate, evidently due to the large area of sea compared with the land, seems to extend over the greater part of the southern hemisphere; and, as a consequence, the vegetation partakes of a semi-tropical character. Tree-ferns thrive luxuriantly in Van Diemen's Land (lat. 45 degrees), and I measured one trunk no less than six feet in circumference. An arborescent fern was found by Forster in New Zealand in 46 degrees, where orchideous plants are parasitical on the trees. In the Auckland Islands, ferns, according to Dr. Dieffenbach [82] have trunks so thick and high that they may be almost called tree-ferns; and in these islands, and even as far south as lat. 55 degrees in the Macquarrie Islands, parrots abound.
On the Height of the Snow-line, and on the Descent of
the Glaciers in South America.

[For the detailed authorities for the following table,
I must refer to the former edition:]
                                  Height in feet
Latitude                          of Snow-line      Observer
---------------------------------------------------------------------
Equatorial region; mean result    15,748            Humboldt.
Bolivia, lat. 16 to 18 deg. S.    17,000            Pentland.
Central Chile, lat. 33 deg. S.    14,500 - 15,000   Gillies, and
                                                    the Author.
Chiloe, lat. 41 to 43 deg. S.     6,000             Officers of the
                                                    Beagle and the
                                                    Author.
Tierra del Fuego, 54 deg. S.      3,500 - 4,000     King.
In Eyre's Sound, in the latitude of Paris, there are immense glaciers, and yet the loftiest neighbouring mountain is only 6200 feet high. Some of the icebergs were loaded with blocks of no inconsiderable size, of granite and other rocks, different from the clay-slate of the surrounding mountains. The glacier furthest from the pole, surveyed during the voyages of the Adventure and Beagle, is in lat. 46 degrees 50 minutes, in the Gulf of Penas. It is 15 miles long, and in one part 7 broad and descends to the sea-coast. But even a few miles northward of this glacier, in Laguna de San Rafael, some Spanish missionaries encountered "many icebergs, some great, some small, and others middle-sized," in a narrow arm of the sea, on the 22nd of the month corresponding with our June, and in a latitude corresponding with that of the Lake of Geneva!
In this case, I made some decisions. I made the lines in the contents at the top a bit shorter than usual, to help them stand out. I decided to use the full word "degrees" rather than "deg." where I could, but not in the table, where I shortened the entries as much as possible while preserving the sense. Since I was using the full word "degrees", I decided to go the whole hog and use the word "minutes" for the minutes symbol as well, (though the minutes symbol, a single quote, is in the ASCII set) since it seemed to make the text more readable than using the word degrees with the minutes symbol. I also made a choice about the table layout.
You might prefer different choices in some of these cases, and, as in our example of fiction above, there was more than one way to do it. However, this is a reasonable rendering.
What happened to the footnote? and how did it become [82] rather than the [1] of the original? In this case, I decided to put all footnotes at the end of the whole text, and renumber them accordingly. So the footnote on this page became number 82 in the overall text, and down at the end of the whole text, I would put:
[82] See the German Translation of this Journal; and for the other facts, Mr. Brown's Appendix to Flinders's Voyage.
I could also have transcribed this as:
. . . Forster in New Zealand in 46 degrees, where orchideous plants are parasitical on the trees. In the Auckland Islands, ferns, according to Dr. Dieffenbach [*] have trunks so thick and high that they may be almost called tree-ferns; and in these islands, and even as far south as lat. 55 degrees in the Macquarrie Islands, parrots abound.
[*] See the German Translation of this Journal; and for the other facts, Mr. Brown's Appendix to Flinders's Voyage.
if I chose to put each footnote with its own paragraph.
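The renumbering described above, where a page's footnote [1] becomes [82] in the whole text, can be sketched in Python. This is an illustration only: it assumes each page is held as its text plus a dictionary of that page's notes, that markers look like [1], and that each marker appears once; renumber_footnotes is a hypothetical name.

```python
import re

_MARKER = re.compile(r"\[(\d+)\]")


def renumber_footnotes(pages):
    """Renumber page-local footnote markers into one sequence for the whole
    text and collect the notes, in order, for placing at the end.

    pages: list of (text, notes), where notes maps the page-local number
    to the footnote text.
    """
    body, endnotes = [], []
    for text, notes in pages:
        def replace(match, notes=notes):
            endnotes.append(notes[int(match.group(1))])
            return "[%d]" % len(endnotes)
        body.append(_MARKER.sub(replace, text))
    end_block = ["[%d] %s" % (number, note)
                 for number, note in enumerate(endnotes, start=1)]
    return body, end_block


pages = [
    ("First claim [1] and second [2].", {1: "Note A.", 2: "Note B."}),
    ("According to Dr. Dieffenbach [1] have trunks", {1: "Note C."}),
]
body, end_block = renumber_footnotes(pages)
print("\n".join(body + [""] + end_block))
```

Bracketed numbers that are not footnote markers would also be caught by this pattern, so the output needs a read-through.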
V.123. Sample 3: Typical formatting issues of poetry
Poetry is easy to format: just be sure to use a non-proportional font, and make it look as much like the text as possible. To avoid ragged-looking centering, left-align titles.
In a whole book of poetry, there is no need to leave an indentation before every line; unlike a verse lost in fields of prose, there is little danger that someone will wrap it by mistake.
Look at the image poetry.tif. On this page, we have an enlarged first letter to start each poem, and capitals following—we can remove all that. The titles are centered, so we will move them left.
There are line-numbers at every fifth line, and these are common in poetry, especially where footnotes reference lines. We will keep these out on the right-hand margin.
The third poem obviously intends the centering of its last lines in each verse as a feature, so we will keep that as best we can.
The resulting etext looks like:
Mistress Mary
Mistress Mary, quite contrary,
How does your garden grow?
With cockle-shells, and silver bells,
And pretty maids all in a row.
Ozymandias.
I met a traveller from an antique land
Who said: Two vast and trunkless legs of stone
Stand in the desert. . . . Near them, on the sand,
Half sunk, a shattered visage lies, whose frown,
And wrinkled lip, and sneer of cold command,            5
Tell that its sculptor well those passions read
Which yet survive, stamped on these lifeless things,
The hand that mocked them, and the heart that fed:
And on the pedestal these words appear:
'My name is Ozymandias, king of kings:                  10
Look on my works, ye Mighty, and despair!'
Nothing beside remains. Round the decay
Of that colossal wreck, boundless and bare
The lone and level sands stretch far away.

NOTE:
9 these words appear: in some editions: this legend clear.
The Rosary.
The hours I spent with thee, dear heart,
Are as a string of pearls to me;
I count them over, every one apart,
              My rosary.

Each hour a pearl, each pearl a prayer,           5
To still a heart in absence wrung;
I tell each bead unto the end--and there
            A cross is hung.

Oh, memories that bless--and burn!
Oh, barren gain--and bitter loss!                 10
I kiss each bead, and strive at last to learn
            To kiss the cross,
              Sweetheart,
            To kiss the cross.
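For the line numbers kept out on the right-hand margin in this sample, a minimal Python sketch might look like the following. It assumes the poem is a list of lines with blank lines between verses, and that short centered lines count as lines of the poem; number_lines and its column parameter are made-up names, and the column should be wider than your longest line.

```python
def number_lines(lines, every=5, column=56):
    """Append the running line number, at a fixed column, to every
    fifth non-blank line of a poem."""
    out = []
    count = 0
    for line in lines:
        if not line.strip():
            out.append("")          # blank lines between verses are not counted
            continue
        count += 1
        if count % every == 0:
            out.append(line.ljust(column) + str(count))
        else:
            out.append(line)
    return out


poem = ["line one", "line two", "line three", "line four", "line five"]
print("\n".join(number_lines(poem, column=20)))
```

View the result in a non-proportional font, since the padding is done with spaces.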
V.124. Sample 4: Typical formatting issues of plays
Look at the image play.tif. Stage directions are indicated by italics and square brackets. We don't have to do much special work with this—lose the italics, but keep the square brackets. The setting for scene I, act II is also italicized, but without square brackets. If we wanted to emphasize this, we could use shorter lines or add square brackets, but it probably isn't necessary here. We're using 4 blank lines between acts and 3 between scenes, so we mark these accordingly. We leave one blank line between speeches. And following these simple conventions, we get:
JACK. There's a sensible, intellectual girl! the only girl I ever cared for in my life. [ALGERNON is laughing immoderately.] What on earth are you so amused at?
ALGERNON. Oh, I'm a little anxious about poor Bunbury, that is all.
JACK. If you don't take care, your friend Bunbury will get you into a serious scrape some day.
ALGERNON. I love scrapes. They are the only things that are never serious.
JACK. Oh, that's nonsense, Algy. You never talk anything but nonsense.
ALGERNON. Nobody ever does.
[JACK looks indignantly at him, and leaves the room. ALGERNON lights a cigarette, reads his shirt-cuff, and smiles.]