Production files

This directory contains files used to document the creation of SGML/XML-encoded text under the auspices of the Text Creation Partnership (EEBO-TCP, Evans-TCP, ECCO-TCP), at the University of Michigan Library. These are working files not intended for public distribution.

VENDOR DOCUMENTATION

Keying/encoding instructions, version 3 (partial revision 2004)
Detailed guidelines for capturing the textual information in the EEBO items. (Version 1 and Version 2 are still available.)
Sample pages
Index to 25+ sample pages from potential EEBO items, each presented as a page image or pair of page images (in .pdf) and a corresponding transcription (in SGML).
Calculating EEBO error rates
Documentation of sampling procedures and error-rate calculations
Examples of errors
Examples of “excusable” and “inexcusable” character-level transcription errors
“Illegible” ($) overused (1)
Examples of text unnecessarily marked as illegible
“Illegible” ($) overused (2)
More examples of text unnecessarily marked as illegible
“Illegible” ($) overused (3) and (4)
Yet more examples of text unnecessarily marked as illegible
The other extreme: guessing
Examples of text captured without sufficient warrant in the damaged original
More of the same
More examples of creative capture
Roman numerals
Two special problems with roman numerals: overlining and backwards-c
TEI guidelines
TEI P3 documentation, including element-by-element descriptions
EEBO tagging “cheat sheet”
Supplies a summary description of each of the elements of the EEBO tag set (prepared for purposes of internal training)
DIV TYPEs
List of common and preferred values for the TYPE attribute
Decorated initials
Page of sample decorated initials (and non-decorated large initials for comparison)
Apothecaries’ symbols
Capture of apothecaries’ symbols (ounce, dram, scruple, etc.) as found in medical recipes.
Alchemical symbols
Some samples of alchemical symbols, with suggestions for capture (draft)
Unusual symbols used as note markers
Suggested capture for notes that use unusual symbols as markers.
Noting subtle font changes
Examples of subtle typeface changes to be marked with <HI> or <Q> (etc.).
Inverted letters
Examples of letters accidentally printed upside-down.
Sample alphabets from Caxton’s press: his ‘type 1’ font; his ‘type 2’ font; his ‘type 3’ font; his ‘type 4’ font; his ‘type 5’ font; his ‘type 6’ font
Alphabets and letter combinations extracted from Caxton’s type fonts, with tentative instructions on capture (to be revised as various letter combinations are seen in context in the books themselves).
Additional symbols
A supplement to the main keying instructions.
Character capture issues (March 2005)
Five proposed areas of change and innovation in character capture:

When there’s nothing there…
Quick summary of the treatment of blanks and things missing.

INTERNAL (REVIEWERS’) DOCUMENTATION

[in progress] All characters list
Experimental list of all available character entities, with pictures. [will never be as up to date as the auto-generated charent list on which it is based.]
All character entities
List of all available charents (both TCP-created and ISO sets) with displayable forms as used in derivative XML version of texts [auto-generated from character map file (see below)].
Additional symbols/charents
Growing list of symbols for reviewers to recognize and supply beyond those in vendor instructions
More odd uses of symbols and characters
Especially math
Overview of review process
Basic guide to the in-house review process as a whole
How to proof
Step-by-step guide to the proofing stage (preparing and proofing sample)
How to review
Step-by-step guide to the tag-review stage (reviewing and correcting book)
How to end
Step-by-step guide to the final stage (checking in and reporting)
More Latin abbrevs
Further examples of Latin abbreviations, etc.
Ambiguous abbreviations
Draft policy on ambiguous characters and symbols, with examples. (Also includes more examples of apothecaries’ measures.)
Anglo-Saxon type
[now moved to vendor area]
Greek type and ligatures
Early modern Greek type and its characteristic forms and ligatures: introduction and a few unorganized samples
Hijacked symbols
Some thoughts on symbols pressed into duty against their will.

Reviewers’ questions and tips relating to …

Structure
Using DIVs to group like things; Using GROUP instead of BODY for several texts with common title front and/or back matter; DIVS and LETTER tags; Songs embedded in plays; Using Q for “raisins in oatmeal”; OPENERs and CLOSERs as holdalls; Dialogues and Catechisms: Questioner and Responder. When pages are in the wrong order.
Notes and Milestones
Note markers; Note placement; Handling endnotes; STAGE and NOTE combined; Use of MILESTONE unit attribute; MILESTONEs with illegible values; Multiple notes with a single reference
Captions, Headings, and Quotations
Captions in figures; ARGUMENTS in verse; Quotations on title pages; Authorial interjections in quotations; Changing &startq; into <Q> and <HI>; <Q>s broken by <P>s. Q+BIBL inside HEAD. Q+BIBL inside TRAILER. Using running header for division header. Placement of epigraphs.
Letters
New tag: POSTSCRIPT; DIV versus LETTER; SALUTE and SIGNED; Use of DATELINE and DATE (DATELINE and SIGNED, DATELINE without DATE, Including dating system within DATE); Sample CLOSERs with problems; Correct sample CLOSERs and SIGNEDs; Lists of signatories.
Matters philosophical
Correcting illegibilities; Counting in/excusable errors; Purpose of DIV types; Printer’s errors.
Matters miscellaneous
Superscripts, including superscript o; Clarifying UNCLEAR; Long or short lines in verse; Abbreviations and abbreviation entities; Tagging “Explicit”s; editing TABLEs; Acrostic poem; “Spoken by…” in plays; letters for rubricator.
Title Page matters
Proofing the title page; Handling epigraphs on title pages; Imprimaturs, approbations, licenses
Software tips (esp. TextPad)
TextPad clip libraries; TextPad upgrades; TextPad syntax file (for color-coding tags); downloading EEBO pdfs.
Divisions (DIVs)
Assigning div types; Sample div types; Use of “N” attribute alongside “TYPE”
Lists
Lists with curly braces; Genealogies as lists; Tables of Contents and Indexes as lists; Changes to the model of LIST; Syllogisms as lists
Character capture issues
Z and yogh in Scottish texts; Other uses of z; I/J; Illegibilities

Code

For internal use only

Vendors’ coding and capture queries (all very old)

  1. (No. A1) Re: Drama tags (<SP>, <SPEAKER>) in non-dramatic dialogs. Marginal notes and numbers in prose texts. Page-level illegibility (see now P12 instead).
  2. (No. A2) Re: Milestones.
  3. (No. A3) Re: Musical notation.
  4. (No. A6) Re: Single table, illustr., etc. spanning multiple pages.
  5. (No. P1) Re: Marginal notes and numbers in prose texts. Strange “q”-like character in Latin passage.
  6. (No. P2) Re: Odd characters: stars, pointing fingers, and dot-triplets.
  7. (No. P3) Re: Braces; <STAGE> directions; marginal notes IMPLICITLY linked to asterisks in the text.
  8. (No. P4) Re: Interlinear numbers in a “puzzle” poem; <SPEAKER> tags; <SPEAKER>s identified only by number.
  9. (No. P5) Re: ee and oo ligatures with acute accent marks
  10. (No. P7) Re: Numbers appearing usually (but not always) at beginnings of <P>s; specialized vs. default (fallback) tagging; blocks of text after FINIS (<BACK> matter).
  11. (No. P8) Re: identifying <LETTER>s buried in running text.
  12. (No. P9) Re: missing t.p.; verse paragraphs; poetic letters; analytical summary table of contents; list vs. table; lapidary inscriptions; fractions; mismatched catchwords (missing pages?)
  13. (No. P10) Re: text attached to figures; acrostics printed at an angle.
  14. (No. P11) Re: in-line figures; overlining (of roman numerals).
  15. (No. P12) Re: damaged and illegible text; out-of-sequence pages
  16. (No. P13) Re: song lyrics interspersed with musical notation
  17. (No. P15) Re: duplicate pages: capture both or one & if the latter, which one?
  18. (No. P16) Re: right-justified words at ends of verse lines
  19. (No. P17) Re: multiple typefaces used concurrently, partly to mark quotations
  20. (No. T1) Re: miscellaneous tagging problems exemplified.
Question log (1) regarding the bidding process
Questions (with answers) received from data conversion firms, as well as updates and announcements.
Question log (2) regarding setup and production
Questions (with answers) received from data conversion firms, as well as updates and announcements.

Accumulated Wisdom garnered by the Oxford staff

N.B.: this section appeared originally on the web site of Oxford’s Bodleian Library, and represented (mostly) a compilation of email responses to particular issues in the capture and encoding of early modern books.

Encoding

<ADD>

Mistaken use of ADD tags in LETTERs;
Use of ADD tags for typewritten material;
ADD with handwritten material;
ADD in CLOSERs;
<CLOSER>

Addressee in CLOSER;
Notary signatures;
Use of POSTSCRIPT;
Problem CLOSERs;
Signatures tagged as SALUTE;
DATELINE tagged as SIGNED;
Correctly tagged CLOSER with DATELINE;
CLOSER or TRAILER?;
CLOSER, SIGNED, for “quoth x”;
Unusual CLOSERs in LETTERs;
DATELINE or SIGNED for descriptions of person or place?;
“directions” after CLOSER in letters;
DIV types

EEBO DIV TYPEs; ECCO DIV TYPEs;
DIV TYPE="list of authors";
DIV type for lists of Scripture references;
DIV TYPE="envoy" (1);
DIV TYPE="register";
DIV type for quoted remarks;
DIV type for kings;
DIV TYPE="corroboration";
Q at the end of DIV; envoy (2);
DIV TYPE="imprimatur";
DIV TYPE="title";
DIV TYPE="publisher’s advertisement";
DIV TYPE="testimonial";
DIV TYPE="approbation";
DIV TYPE="envoy" (3);
DIV TYPE="docket title";
DIV TYPE="attestation";
DIV TYPE="advertisement" etc;
DIV TYPE="mittimus" etc;
DIV TYPE="versions";
DIV TYPE="index" vs DIV TYPE="table of contents";
DIV TYPE or N?
Drama

Placement of STAGE directions;
Use of STAGE for heads of speeches;
Use of drama tags in other types of material;
Use of STAGE for heads of speeches;
“spoken by” as STAGE or OPENER;
“spoken by” in STAGE not BYLINE;
STAGE used in non-drama;
STAGE used in musical texts;
“enter chorus” as STAGE and SPEAKER;
use of Q within dialogues and plays;
STAGE in musical texts?;
<FIGURE>

Captions functioning as HEADs;
Uncaptured material in captions;
FIGURE before HEAD;
Tags allowed in FIGURE;
Two related thoughts on FIGUREs;
Placing FIGURE inside HEAD;
Multiple PBs in fold-out maps;
FIGURE appearing at the end of text;
FIGURE before HEAD?;
Porphyry’s tree diagram;
TRAILER type="illustration";
Use of FIGURE for title page borders?;
Scheme attribute of FIGDESC;
Capturing printers’ devices;
Guide letters in rubrication;
Sample tagging of frontispiece with FIGURE, FIGDESC, and BYLINE;
<GAP>

Correcting illegibles;
“Intruder” GAPs;
Missing material of less than a page;
Another intruder GAP;
“Blank” GAPs;
Mdash or blank gap?;
Material omitted in print;
<HEAD>

Half titles captured as HEADs;
“The text” appearing before an EPIGRAPH;
ARGUMENT in HEAD;
<LETTER>

Marginal quotes;
Letters broken by comments;
Complex openings in LETTERs;
Superscription in LETTER;
<LG>

Acrostic poems;
Songs in drama;
LGs with indented text;
Obsolete type attributes for LGs;
Nested LGs for songs in drama;
Short or Long Verse Lines?;
Verse numbers in metrical psalms;
Lines entirely in HI;
Short or long lines? (3);
<LIST>

TRAILERS in LISTs;
Complex nested LISTs;
Simplifying an index;
Simplifying LISTs with nesting;
Simplifying LISTs with LABEL and ITEM;
More on LABEL and ITEM;
LISTs with just one ITEM;
Syllogisms;
ROLE="label" attribute for LABEL and ITEM;
LISTs with curly braces;
Syllables alongside syllogisms;
Nested LISTs used for genealogies;
Options for table of contents tagging;
LIST type="sum";
Mixture of LISTs and verse;
Music

ij;
Hyphenated words in music;
Vocal parts in music;
Musical part book;
<NOTE>

Endnotes;
Multiple NOTE sets;
Interlinear NOTEs;
Endnotes and markers in the text;
Index fingers tagged as MILESTONE;
UNIT values in MILESTONEs;
LETTERs broken by comments;
Linking Endnotes;
Difference between REF and PTR;
Automatically moving NOTEs next to markers;
Endnote or marginal note?;
Correct use of endnotes;
Strung out marginal notes;
<OPENER>

Tagging “to the tune of”;
OPENER and CLOSER as holdalls;
DATELINE without DATE;
Complex OPENERs in songs;
“To the honour of God” as OPENER;
Complex opening to dedication;
Where to split HEADs and OPENERs;
DIV TYPE="ballad"; more on HEADs and OPENERs in songs;
<Q>

Q versus HI REND="marginal quotes";
&startq; and &endq; to mark dialogue;
when to use HI REND="marginal quotes";
one-off dilemma;
Q and BIBL within HEAD;
Two Ps within one Q, or two Qs?;
authorial interjections in quotations;
Use of Q and BIBL in HEAD (2);
Q or TRAILER at the end of chapters;
Use BIBL where there are no Q tags?;
Thus far in BIBL;
Structure

Replacing BODY with GROUP;
Wrong PB numbers
<TABLE>

Parallel text or TABLE?;
Use of TABLEs for paired quotations;
TABLE tagging: rowspans;
Point-by-point structure in parallel text;
Title Pages

Extra title page known as a “cancel”;
Engraved or handwritten title pages;

Transcription

Abbreviations and Ligatures

Esq;;
Ambiguous ae/oe forms;
German eszett/ss/tz;
Barred q symbol meaning “qui” or “quam”?;
More on barred q;
E hook, eogon, and ae ligature;
G with an abbreviation stroke;
Caxton’s barred double-l;
Abbreviations for Christ;
Removing ABBR round barred double-l;
Abbreviation for “qui”;
Caxton’s “d-flourish”;
Abbreviation for “quartern(e)”;
Fonts

Anglo-Saxon mistranscribed;
Dealing with Gaelic/Irish fonts;
Anglo-Saxon P and wyn;
z/yogh; p/thorn;
Scottish yogh and sz;
Foreign alphabets

Greek symbol made up of o and u;
Standardization of Hebrew character entities;
Breathings in Greek;
Problems with capturing Greek;
Breathings in Greek (2);
Miscellaneous

Where to find the latest charent file;
Blocks of upside down characters;
Upper-case Z in the middle of words;
Whitespace capture;
Punctuation

Half slashes captured as commas;
Use of &punc; within a word;
Full stops in the shape of crosses;
Space between elided article and noun in French;
Mdash or blank gap?;
ct ligature for ampersands;
End-of-line hyphens
Unresolved Queries

Unresolved query;
Another query;
Symbols
Handwritten Symbols;
Jupiter v. Recipe;
Date symbols;
Stand-alone symbols;
X as denarius;
Infinity symbol;
Squares and Quadrines;
Use of generic and specific cross entity references;
Rotated index fingers;
Reversed section symbol;
Rx symbol;
Fractions;
Note-markers;

Technical

DTD and image sets
Other

Miscellaneous

Matters philosophical
Correcting illegibilities; Counting in/excusable errors; Purpose of DIV types; Printer’s errors.
Matters miscellaneous
Superscripts, including superscript o; Clarifying UNCLEAR; Long or short lines in verse; Abbreviations and abbreviation entities; Tagging “Explicit”s; Editing TABLEs; Acrostic poem; “Spoken by…” in plays; Letters for rubricator.