Some general notes on tagging using specific texts as examples:
P&R STC 23954 Here begynneth the lyfe of the blessed martyr Saynte Thomas [1520].
1. This book is unpaginated, so far as I can tell. Supplying the "N" value of the <PB> tag from the quire signature is unnecessary. Quire signatures should be ignored.
2. <DIV1 TYPE="book"> is not necessary; <BODY> is correct and sufficient. The rule (as given in the instructions) is that <DIV>s are generally used only when there is more than one of something, that is, when the larger item is actually DIVIDED into two or more parts.
3. I believe that we're going to be radically simplifying the rules for illegible and indecipherable text. Till we come up with new rules, you need know only this: "$" is sufficient to indicate an unidentified character. You do not need to use this: <UNCLEAR>$</UNCLEAR>. <UNCLEAR> is intended for large spans of text that are difficult to read. We will probably be phasing it out completely soon.
4. Virgules between words ("/") should have a space before and after them, for convenience of reading (so change "man/and toke" to "man / and toke").
5. Periods used to set off numbers should be spaced like this: ".lxiii." or ".23.", not like this: ". lxiii." or ". 23."; that is, they should "hug" the number on front and back, without spaces.
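For anyone who wants to check a file mechanically against points 3 to 5, here is a minimal sketch in Python. The function names (space_virgules, hug_number_periods, simplify_unclear) are my own and hypothetical; nothing here is part of the official instructions. The number rule is only a heuristic: it treats any run of lowercase roman-numeral letters or digits between two periods as a number, so a word such as "mix" would be caught too, and the results need a human eye.

```python
import re

# Splitting on this keeps the tags (odd positions) apart from the text runs.
TAG = re.compile(r'(<[^>]*>)')

def space_virgules(text):
    """Put one space on each side of a virgule that sits between two words.

    Only text runs are touched, so the "/" in closing tags is left alone.
    A virgule directly next to a tag is deliberately not handled.
    """
    parts = TAG.split(text)
    for i in range(0, len(parts), 2):  # even positions are text, not tags
        parts[i] = re.sub(r'(?<=\S)\s*/\s*(?=\S)', ' / ', parts[i])
    return ''.join(parts)

def hug_number_periods(text):
    """Close up spaces between a number and the periods that set it off.

    Heuristic: lowercase roman-numeral letters or arabic digits only.
    """
    return re.sub(r'\.\s*([ivxlcdm]+|\d+)\s*\.', r'.\1.', text)

def simplify_unclear(text):
    """Reduce <UNCLEAR>$</UNCLEAR> around a single "$" to the bare "$"."""
    return re.sub(r'<UNCLEAR>\s*\$\s*</UNCLEAR>', '$', text)
```

Run on the examples given in the notes, space_virgules('man/and toke') gives 'man / and toke' and hug_number_periods('. lxiii.') gives '.lxiii.'.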
P&R STC 15070 John Knox. The first blast of the trumpet against the monstruous regiment of women [1558]. [misnumbered 1070]
1. <DIV1 TYPE="book"> is not necessary; <BODY> is correct and sufficient. The rule (as given in the instructions) is that <DIV>s are generally used only when there is more than one of something, that is, when the larger item is actually DIVIDED into two or more parts.
2. The second <DIV2>, tagged TYPE="chapter", would probably be better labelled TYPE="text"; "chapter" is perhaps more appropriate when there is more than a single part (and also when the part is actually *called* a chapter (cap., capitulum, etc.) in the book itself).
3. There should be no REND attributes for any tag (except occasionally MILESTONE). In STC15070, REND attributes were found wrongly attached to <HI>, <BIBL>, and <P> tags. The general rule is: don't supply attributes unless specifically instructed to do so. (We'll probably be removing some of the surplus attributes from the DTD to remove the temptation!)
4. Notes in italics need not be captured with a <NOTE><HI> combination: the <NOTE> tag is sufficient, since italic seems to be the default ("predominant") typeface of the notes. It is only the exceptions (i.e., the text in upright roman type) that need to be flagged as <HI>.
5. The ANCHORED attribute is not needed in <NOTE> tags. Same comment as with #3 above.
6. <FRONT> matter should be tagged as such, with the title page a <DIV> within that, i.e.,
<FRONT><DIV1 TYPE="title page">
Only minimum tagging should be necessary within that division; blocks of text on the title page can usually be safely tagged simply as <P>aragraphs. There is rarely if ever a need for <HEAD> on the title page.
7. <NOTE> tags in STC15070 (which are marginal notes in a prose text) seem generally to be placed exactly where they appear on the page, whereas the instructions suggest recording them as closely as possible to the point in the text to which they relate. In particular, in prose texts without note pointers (footnote numbers, etc.), the instructions suggest placing notes at the end of the nearest sentence if no more obviously appropriate spot appears.
8. Some pieces of text tagged as a single <NOTE> are in fact two or more notes run together. It is not always easy to see the point at which notes should divide when they are crowded together in the margin. In that case, you admittedly have little choice but to treat the whole block as a single note: in general it is better to leave things together unless you are sure both that they should be split and where.
9. The final line, tagged <P>, should be tagged <TRAILER>.
10. As with the previous book, <UNCLEAR> tags are not needed in instances of small illegibilities--diacritics, letters, and words that are difficult to read or to decipher. In these cases, the "$" is sufficient (and further clarification is forthcoming).
11. Notes that begin on one page and carry over to a second should not be broken into two <NOTE>s but combined and inserted as a single <NOTE> at the point to which it relates. E.g., a note beginning "Priuate examples" begins at the bottom of p.10[left] and continues ("do not bre+ake the ge+nerall or|dinance.") at the top of p.10 [right]; it was tagged as two separate <NOTE>s.
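Points 3 and 5 (no REND, no ANCHORED) are easy to enforce by script. What follows is only a sketch in Python under my own assumptions: the function name strip_attributes and its parameters are hypothetical, tags are assumed to contain no ">" inside attribute values, and MILESTONE is skipped because REND is occasionally legitimate there.

```python
import re

# An opening tag: name, then everything up to the closing ">".
# Closing tags (</HI>) do not match because a letter must follow "<".
OPEN_TAG = re.compile(r'<([A-Za-z][A-Za-z0-9]*)([^<>]*)>')

def strip_attributes(text, names=("REND", "ANCHORED"), keep_on=("MILESTONE",)):
    """Remove the named attributes from every opening tag except those in keep_on."""
    attr = re.compile(
        r'\s+(?:%s)\s*=\s*(?:"[^"]*"|\S+)' % "|".join(names),
        re.IGNORECASE,
    )

    def fix(match):
        name, rest = match.group(1), match.group(2)
        if name.upper() in keep_on:
            return match.group(0)  # leave MILESTONE (etc.) exactly as found
        return "<%s%s>" % (name, attr.sub("", rest))

    return OPEN_TAG.sub(fix, text)
```

For instance, strip_attributes('<HI REND="italic">x</HI>') gives '<HI>x</HI>', while other attributes on the same tag are left in place.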
P&R STC 18909 Thomas Overbury. Sir Thomas Ouerburie his wife [1611].
1. Unlike the previous book, <FRONT> matter here is properly tagged as such, with the title page a <DIV> within that. The title and most of what follows can be tagged with <P>, but neither <HEAD> nor <BIBL> is needed. Just treat the pieces of the t.p. as prose to the extent that that is possible.
2. There should be no REND attributes for any tag. In STC18909, REND attributes were found wrongly attached to <HI>, <BIBL>, <SIGNED>, <TRAILER>, <DIV1>, and <P> tags.
3. Again, <UNCLEAR> tags are not needed in instances of small illegibilities--diacritics, letters, and words that are difficult to read; in these cases, the "$" is sufficient.
4. In STC18909, p.23[left], "of" is repeated in the line "The <HI>Face</HI> we may the seat of of <HI>Beauty</HI> call,". The second "of" is lacking in the text as captured, presumably because it has been scratched out as extraneous. But we want to capture the whole text as originally printed, erroneous or not; if the deletion marks make the "of" hard to read, the extra "of" could have been included as follows: "The <HI>Face</HI> we may the seat of <UNCLEAR CAUSE="del">of</UNCLEAR> <HI>Beauty</HI> call,".
5. The <DIV2> tags under the <DIV1> that is headed "Characters" are numbered (N="1", etc.), presumably because there are faintly pencilled numbers in the original. This is contrary to the instructions not to record any handwritten material.
S-22539 Arcadia (1590)
1. Arguments at the beginnings of chapters should be enclosed within <ARGUMENT> tags rather than broken down into individual notes.
2. The signature of a dedication ought to be treated as <SIGNED> or <CLOSER><SIGNED> rather than as a paragraph (<P>).
3. The salutation of a dedication can be captured as <HEAD>, but the more exact <OPENER><SALUTE> is preferable.
4. In the case of a letter captured within prose narrative, the beginning of it was captured as follows:
<Q><L>Philanax his letter to Basilius.</L></Q></P>
<P>Most redoubted & beloued prince...
instead of the more correct form, as follows:
<LETTER>
<HEAD>Philanax his letter to Basilius.</HEAD>
<P>.....</P>
</LETTER>
</P>
5. <PB>s occurring between <DIV>s should be placed just inside the beginning of the new <DIV>, not just inside the close of the old <DIV>. Each <PB> tag should be equipped with a REF attribute supplying the image number of the source. E.g. <PB N="312" REF="318">
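
The two <PB> rules in the last Arcadia note can also be checked by script. This is a rough sketch in Python, not an official tool: the names pb_missing_ref and move_pb_into_new_div are hypothetical, and the second function assumes the <PB> stands immediately before the closing </DIV> tag(s) with nothing but whitespace in between.

```python
import re

PB = re.compile(r'<PB\b[^>]*>', re.IGNORECASE)

def pb_missing_ref(text):
    """Return every <PB> tag that lacks a REF attribute."""
    return [tag for tag in PB.findall(text)
            if not re.search(r'\bREF\s*=', tag, re.IGNORECASE)]

# A <PB>, then one or more closing </DIVn> tags, then the next opening <DIVn>.
PB_BEFORE_CLOSE = re.compile(
    r'(<PB\b[^>]*>)((?:\s*</DIV\d*>)+\s*)(<DIV\d*\b[^>]*>)',
    re.IGNORECASE,
)

def move_pb_into_new_div(text):
    """Move a <PB> from just inside the close of the old <DIV>
    to just inside the beginning of the new <DIV>."""
    return PB_BEFORE_CLOSE.sub(r'\2\3\1', text)
```

For instance, pb_missing_ref('<PB N="1"><PB N="312" REF="318">') returns ['<PB N="1">'], which is the list of tags still needing an image number.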