Questions and Tips about Structure

1. Using DIVs to group like things
2. Using GROUP instead of BODY for several texts with common title page or other common front or back matter
3. DIVs and LETTER tags
4. Songs embedded in plays
5. Using Q for "raisins in oatmeal"
6. OPENERs and CLOSERs as holdalls
7. Dialogues and Catechisms: Questioner and Responder
8. When pages are in the wrong order

1. Using DIVs to group like things

It is quite permissible and often useful to create DIVs to group like things, even when the grouping has no heading to justify it. In a book containing the works of writer x, for example, you might find:

Three verse treatises, followed by two tragedies, followed by 109 sonnets in a sequence, followed by 4 letters to people of eminence. Each treatise has a heading, each sonnet has a heading, and each letter has a heading, but there is no common heading for all the sonnets or all the letters. Yet it wouldn't hurt to tag as follows:

or even like this:

since this reflects the real, albeit implicit, structure of the book.
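
The two taggings suggested above might be sketched as follows. The DIV levels and TYPE values are assumptions chosen for illustration, not prescribed values. The first sketch groups only the sonnets and the letters, which lack a common heading:

<DIV1 TYPE="treatise"> -- one DIV1 for each of the three treatises
<DIV1 TYPE="tragedy"> -- one DIV1 for each of the two tragedies
<DIV1 TYPE="sonnets"> -- grouping DIV with no <HEAD> of its own
   <DIV2 TYPE="sonnet"> -- one DIV2 for each of the 109 sonnets, each with its <HEAD>
</DIV1>
<DIV1 TYPE="letters"> -- grouping DIV with no <HEAD> of its own
   <DIV2 TYPE="letter"> -- one DIV2 for each of the 4 letters, each with its <HEAD>
</DIV1>

The second sketch goes further and groups every kind of work:

<DIV1 TYPE="treatises">
   <DIV2 TYPE="treatise"> -- x 3
</DIV1>
<DIV1 TYPE="tragedies">
   <DIV2 TYPE="tragedy"> -- x 2
</DIV1>
<DIV1 TYPE="sonnets">
   <DIV2 TYPE="sonnet"> -- x 109
</DIV1>
<DIV1 TYPE="letters">
   <DIV2 TYPE="letter"> -- x 4
</DIV1>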

2. Using GROUP instead of BODY for several texts with common title page or other common front or back matter

Problem: a group of texts, each (or at least one of them) with its own title page, preceded by a common title page or table of contents that lists all the texts, or followed by an index to all the texts, or an errata covering all the texts, etc.

Solution: make the book one <TEXT>, placing the shared title page or other front matter in the <FRONT> of that <TEXT>, and using <GROUP> instead of <BODY>. Within <GROUP>, each of the parts becomes its own <TEXT> and can therefore have its own <FRONT>, while the shared title page remains in the <FRONT> of the whole book.

<TEXT> -- top level element containing the whole book
<FRONT> -- includes title page and preface that refer to whole book
<GROUP> -- in place of the usual <BODY>, use <GROUP>.
   <TEXT> -- the first of the subordinate works within the book
     <FRONT> -- includes the title page of the 1st subordinate work if any
     <BODY> -- the body of the subordinate work
     <BACK> -- ...
   <TEXT> -- the second of the subordinate works within the book
     <FRONT> -- includes the title page of the 2nd subordinate work
     <BODY> -- the body of the 2nd subordinate work
     <BACK>
</GROUP>
<BACK> -- includes, say, an index to ALL the texts.
</TEXT>

3. DIVs and LETTER tags

If a DIV coincides with the LETTER element, one or the other has to go (usually the latter, i.e. the LETTER element).
For example, given:
<DIV2 TYPE="letter">
<HEAD>A LETTER FROM SPAIN</HEAD>
<LETTER>
<OPENER><SALUTE>Dear Sir,</SALUTE></OPENER>
<P>...
<P>...
<CLOSER><SIGNED>Your obedient servant, P.F.S.</SIGNED></CLOSER>
</LETTER>
</DIV2>

We should simply drop the <LETTER> tag, which is intended for letters quoted within something else, not <DIV>s of TYPE="letter".
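
Applied to the example above, dropping <LETTER> would leave something like this (a sketch only; the <P> content is elided as in the example):

<DIV2 TYPE="letter">
<HEAD>A LETTER FROM SPAIN</HEAD>
<OPENER><SALUTE>Dear Sir,</SALUTE></OPENER>
<P>...
<P>...
<CLOSER><SIGNED>Your obedient servant, P.F.S.</SIGNED></CLOSER>
</DIV2>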

4. Songs embedded in plays

Many plays descend (ascend?) occasionally into song. I refer to the common situation in which a player, or a group of players, or a chorus, or some unnamed set of players, breaks into song, usually in a set of stanzas set off from the rest of the play (e.g. by italics) and often supplied with a head:

<HEAD>Drinking Song</HEAD>

Sometimes the rest of the play is in prose, sometimes in verse. Sometimes the song is part of one character's speech, sometimes it stands apart.

If the lyrics are not printed, "Song" usually appears as a <STAGE> direction, which is easy enough.
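
A minimal sketch of that case, assuming the direction reads simply "Song." in the book (the actual wording will vary):

<STAGE>Song.</STAGE>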

But if the lyrics *do* appear, it is usually difficult or impossible to tag them as a <DIV>: they are alien nuggets embedded in a <DIV TYPE="scene"> and often within a <SP> as well. In that case, it is usually easiest to tag them as <LG>s, and to tag any stanzas belonging to the song as nested <LG>s.

<SP>
<SPEAKER>Minstrel</SPEAKER>
<P>Shall I sing a merry jest for you, Gentlemen?</P>
<LG><HEAD>Song.</HEAD>
<LG N="1"><HEAD>I.</HEAD>
    <L></L>
    <L></L>
    <L></L>
</LG>
<LG N="2"><HEAD>II.</HEAD>
    <L></L>
    <L></L>
    <L></L>
    <L></L>
</LG>
</LG>
<P>Wasn't that fine?</P></SP>

Rarely, the song will have internal structure (e.g. it will itself contain speeches) of a kind that is not allowed within <LG>. In that case, you will need to treat the song as just another example of a "raisin in oatmeal": that is, you will probably need to tag it as an embedded <TEXT> containing <BODY><DIV1 TYPE="song">. The <TEXT> tag can be most readily inserted by enclosing it in a <Q> or <P>.
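
A rough sketch of such an embedded song, assuming it stands between speeches rather than inside one, and that it is divided between two singers. The heading, speaker names, and lines are elided placeholders:

<Q><TEXT><BODY>
<DIV1 TYPE="song">
<HEAD>...</HEAD>
<SP>
<SPEAKER>...</SPEAKER>
<LG>
<L>...</L>
</LG>
</SP>
<SP>
<SPEAKER>...</SPEAKER>
<LG>
<L>...</L>
</LG>
</SP>
</DIV1>
</BODY></TEXT></Q>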

The rationale for resorting by preference to LG is that:

(1) It is not *wrong*. LG can mean any kind of group of lines. Its default type for us is "stanza" but it can be used for other, preferably specified, groupings as well.

(2) In some situations, viz., mixed verse and prose, we've instructed the keyers to use LG loosely to mark the beginning of a section of verse, just as they use P to mark the beginning of a section of prose.

(3) <Q><TEXT><BODY><DIV1 TYPE="song">
will also work much of the time. We've used it, and sometimes have
little choice but to use it, but it suffers from two drawbacks: the
songs are not usually genuine quotations at all (so we're abusing
<Q>), and Q cannot in any case appear directly within SP
(should that be necessary).

Here's a simple one-stanza song within a speech:

<SP>
<P>Pox! This is a pretty Musical business; but
this will not make a man merry—I'll sing you a Song:
Fill the Glasses first. Come on. When I sing Down, down,
Then you must all drink—</P>
<LG TYPE="song">
<HEAD>SONG.</HEAD>
<L><HI>I</HI> Love some body, I love no body,</L>
<L>Some body, no body dearly:</L>
<L>I love some body, <HI>&c.</HI></L>
<L>Be she black, or be she brown,</L>
<L>She's the best in all the Town,</L>
<L>So she keep her Belly down.</L>
<L>Down, down, down down:</L>
<L>There's no fault to be found,</L>
<L>So she keep her Belly down.</L>
</LG>

<P>Hah! I think this is well, hah!</P>

</SP>

And here is a multi-stanza song, not within a speech:

<LG TYPE="song">
<HEAD>SONG.</HEAD>
<LG>
<L>LOve thee till there shall be an end of matter,</L>
<L>So long, till Courtiers leave in Courts to flatter;</L>
<L>While empty Courtlings shall laugh, jeer, and jibe,</L>
<L>Or till an old lean Judge refuse a Bribe.</L>
</LG>
<LG>
<L><PB N="9" REF="8">Till Young'men Women bate, I will love thee;</L>
<L>Till greedy Lawyers shall renounce a Fee,</L>
<L>And till Decrepit Misers Money hate,</L>
<L>Or Statesmen leave to juggle in a State.</L>
</LG>
<LG>
<L>While Priests Ambition troubles Common-wealths,</L>
<L>Till Whores grow chaste and Thieves forsake their Stealths;</L>
<L>Till Tradesmen leave to Cozen and to Lye,</L>
<L>Till there's a Worthy Flatt'rer, or Brave Spie.</L>
</LG>
<LG>
<L>Till Honest Valiant Men can be afraid,</L>
<L>Till Kings by Favourites are not betray'd;</L>
<L>Till all Impossibles do meet in one,</L>
<L>I'll love thee <HI>Phillis,</HI> and love thee alone.</L>
</LG>
</LG>

5. Using Q for "raisins in oatmeal"

This example contains items that seem to be authorities of various kinds (mostly canons from various ecumenical councils) that the author is quoting at length, and then translating into English in a parallel column arrangement. Structurally, they fall into the category that internally and privately we refer to as "raisins", since they float in the text like raisins in oatmeal. (<DIV>s, on the other hand, are more like grapefruit segments!).

They can stand on their own; they have internal structure; they are quoted from an external source. All of this makes it easiest to handle them using the <Q> tag, making use of the fact that <Q> can contain <TEXT>, which can in turn contain the whole hierarchy of TEI elements: <BODY><DIV1> etc.

The "paragraph" material between the quotations is thus captured simply using <P> and the <DIV> structure of the main text. The paired columns are captured as <Q>, each <Q> continuing till the chunk of quoted material ends and the "paragraph" material resumes. Within the <Q>, each column receives its own <DIV>, one for the Latin text, one for the English translation.

<P>And in the Appendix of the same Councell I find this Constitution.</P>
<Q><TEXT><BODY>
<DIV1 LANG="lat">
<P>
In via quilibet ince|dens pudicis oculis, cum modestia et gravitate, ad loca minus honesta non vadat, nec ad spectacula publica, choreas, ludos, hastiludia, torneamenta, et alia hujusmodi. Nemo lu|dat, aut familiares suos ad taxillos, vel alios ludos in|honestos ludere patiatur.</P>
</DIV1>
<DIV1 LANG="eng">
<P>Every one walking in the way with chast eyes, with modesty
and gravity may not goe to dishonest places, nor yet to publike
spectacles, dances, Playes, tiltings, jests, and such like
sports. Let none play, nor yet suffer his familiars to play
at dice, or other dishonest games.</P>
</DIV1>
</BODY></TEXT></Q>

6. OPENERs and CLOSERs as holdalls

Example: "To the tune of The Merry Cuckold" at the beginning of a song:

Such instructions or comments at the beginning of a song, ballad, etc., can be tagged with the all-purpose <OPENER> tag, on the grounds that it is a recognizable and conventional opening element of printed ballads, etc.
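
A minimal sketch of that first example, assuming the ballad is tagged as a DIV. The DIV level and TYPE value are illustrative, and the heading and stanza are elided placeholders:

<DIV1 TYPE="ballad">
<HEAD>...</HEAD>
<OPENER>To the tune of The Merry Cuckold</OPENER>
<LG>
<L>...</L>
</LG>
</DIV1>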

Example: a situation where there is introductory scriptural material to a following prayer. This material consists of a quotation from Ecclesiastes, a one-sentence paraphrase of it, and a cross-reference in a different typeface.

We tagged the whole mess as <OPENER>. Within <OPENER>, we tagged the quotation and reference to Ecclesiastes as <EPIGRAPH>, left the paraphrase as simple character data, and treated the cross-reference as a <NOTE>. OPENERs and CLOSERs can sometimes do duty as general holdalls for "stuff at the top" and "stuff at the bottom" respectively.
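
The tagging of the prayer example might be sketched roughly as follows. The enclosing DIV, the use of <Q> and <BIBL> inside <EPIGRAPH>, and the order of the parts are assumptions made for illustration; the text itself is elided:

<DIV2 TYPE="prayer">
<OPENER>
<EPIGRAPH><Q>...</Q> <BIBL>...</BIBL></EPIGRAPH> -- quotation from Ecclesiastes and its reference
... -- the one-sentence paraphrase, left as plain character data
<NOTE>...</NOTE> -- the cross-reference
</OPENER>
<P>...
</DIV2>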

7. Dialogues and Catechisms: Questioner and Responder

Example: The text appears in blocks headed by "Quest." then "Answ."

There are a number of ways to approach books that consist of exchanges, such as catechisms (question/answer), polemics (statement/refutation), and dialogues (speaker1/speaker2). Which one you choose will depend mostly on:

what kind of headings are attached to the sections
what kind of exchange it is
how consistently structured it is
how ambitious you want to be

Method 1 (usually recommended) is the easiest and you can always fall back on it: just record the text in paragraphs. The headings ("Obj." "Answ.") become simply highlighted text at the beginning of the paragraph. If the text goes in and out of exchange-mode, or is otherwise complex or incoherent, or if there is some other impediment to using a more complicated system, this is usually the best way to go. If you have any doubts about being able to carry out the other methods successfully, use this one.
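
A minimal sketch of Method 1, using the "Quest." and "Answ." labels from the example above, with the text of each paragraph elided:

<P><HI>Quest.</HI> ...</P>
<P><HI>Answ.</HI> ...</P>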

The only problem with this method is that it makes dealing with alternating typefaces more complicated. Exchanges frequently print the question and the answer in contrasting typefaces (e.g. questions in italic and answers in roman), and the most satisfactory way of dealing with this is to mark as <HI> only exceptions to those types. But this rather stretches the usual rules for use of <HI>.

Method 2 is almost as easy as the plain-text method, since you can go into and out of it at will. That is: treat the text as a dialogue and use the drama tags <SP> (speech) and <SPEAKER>. This is usually the best solution if the two sides are actually two people, even two idealized people (e.g. an argument between characters named "Quaker" and "Baptist").
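
A minimal sketch of Method 2, using the two character names mentioned above and eliding the speeches themselves:

<SP>
<SPEAKER>Quaker.</SPEAKER>
<P>...</P>
</SP>
<SP>
<SPEAKER>Baptist.</SPEAKER>
<P>...</P>
</SP>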

Method 3 is to use <DIV>s for the individual chunks. This method is in some ways the most satisfactory, since it captures the real structure of the book and allows you to cope easily with things like alternating typefaces for the different parts, but usually only works if the text is rigorously and consistently structured. Otherwise, you are left with stuff that doesn't easily fit into either the question <DIV> or the answer <DIV>.

Method 3(a) treats each question and each answer as a DIV. Best if the questions are long.
Method 3(b) treats each question as a <HEAD> to a DIV, of which the answer forms the body. Best if the questions are consistently short.
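
The two variants of Method 3 might be sketched as follows. The DIV level and the TYPE values are assumptions chosen for illustration, as is the treatment of the "Quest." and "Answ." labels as <HEAD>s in 3(a).

Method 3(a), each question and each answer as its own DIV:

<DIV2 TYPE="question">
<HEAD>Quest.</HEAD>
<P>...
</DIV2>
<DIV2 TYPE="answer">
<HEAD>Answ.</HEAD>
<P>...
</DIV2>

Method 3(b), the question as the <HEAD> and the answer as the body:

<DIV2 TYPE="question">
<HEAD>Quest. ...</HEAD>
<P>Answ. ...
</DIV2>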

8. When pages are in the wrong order

Question: In this text the pages on images 50-52 are in the wrong order (or rather one leaf is in the wrong place). Apex had put in missing page gaps, but nothing is actually missing. Do I change the order of the data so it makes sense or do I leave it as it is? I suppose either way could be confusing for the user.

Though some eyebrows have been raised at the practice, we have always
said--even insisted--that in matters like this the need to create an
intelligible, coherently structured text overrides any putative need to
conform to the errors of the image set, the underlying microfilm, or
even the original. So in cases of manifest disarrangement of pages, we
have been rearranging the data into the correct order--without changing,
of course, the REF values of the <PB> tags. The only consequence is
that the REF values of the PB tags will no longer be in sequence.
Sometimes the disarrangement is extreme: a page may be passed over
and then filmed later, so that its image appears at the end of the
set, or a series of pages may be badly filmed and then refilmed at
the end. In such cases, place the correct pages, or the better-filmed
pages, into the text at the spot where they belong, all the while
leaving each REF pointing to the image that it points to now.

Our aim is to represent, so far as the
images allow, the BOOK--even at the risk of representing a slightly
idealized, "as intended," form of the book. We should be free
to rearrange page images, even pick and choose the best images if
we are faced with sets of duplicates, so as to create a coherent
book in the original, intended sequence. This will sometimes
create strange <PB> sequences: <PB REF="1" N="1"><PB REF="32" N="2">
<PB REF="3" N="3"> etc.