There are contending schools of thought, but the consensus is:
We favor the searcher, at least to this extent: if I am sure what the letter should be (95% sure); if there is something there; and if what is there does not conflict with my interpretation -- then I'm willing to put in the "right" letter. If I am really not quite sure; or if there is effectively nothing there at all -- then I leave it as illegible.
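The three conditions above can be read as a small decision rule. The sketch below is only an illustration: the function name and its parameters are hypothetical, and returning None simply stands for "leave it as illegible"; the 95% threshold is the one quoted above.

```python
def resolve_doubtful_letter(guess, confidence, something_visible, conflicts_with_visible):
    """Return the letter to supply, or None to leave the character as illegible.

    All parameter names are hypothetical; confidence is a fraction between 0 and 1.
    """
    # Supply the "right" letter only when all three conditions hold.
    if confidence >= 0.95 and something_visible and not conflicts_with_visible:
        return guess
    # Not quite sure, effectively nothing there, or the ink contradicts the guess.
    return None
```

For instance, a confident reading of a faint but compatible letter is supplied, while the same reading over a blank space is left illegible.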
(a) the five categories of error--
(1) inexcusable character-errors (mistranscriptions)
(2) excusable character-errors (mistranscriptions)
(3) COMPLETELY unwarranted use of $ or $word$ etc.
(4) missing letters supplied
(5) spacing problems
--are mutually exclusive. That is, if you count an error as belonging under (3), you shouldn't count the same error under (1).
(b) the error count to be used in deciding whether to accept a book is the sum of (1) and (3). You can consider the others as indicators of the general quality of the text, but we don't count them against the 1:20k spec. [In books received before 8/01, count only (1).]
(c) in considering "$"s for potential inclusion under (3), we are biased in favor of the vendor in several respects: (1) we assume the burden of proof and award doubtful cases to the vendor; (2) we limit ourselves to visual evidence only (excluding our 'reading' knowledge); and (3) we honor the vendor's own criteria: e.g., in the case of Apex, if there are more than two contiguous letters doubtful enough to be called "$", we allow them to call the whole word $word$, annoying though this is.
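The counting rule in (b) can be sketched in a few lines. This is an illustration under stated assumptions, not an official tool: the class, function, and parameter names are hypothetical, and reading "1:20k" as "at most one counted error per 20,000 characters checked" is our assumption about the spec.

```python
from dataclasses import dataclass


@dataclass
class ErrorTally:
    """Per-book counts for the five mutually exclusive categories."""
    inexcusable: int = 0          # category (1)
    excusable: int = 0            # category (2)
    unwarranted_dollar: int = 0   # category (3)
    missing_supplied: int = 0     # category (4)
    spacing: int = 0              # category (5)


def counted_errors(tally, received_before_8_01=False):
    """Errors that count toward acceptance: (1) + (3), or (1) alone for older books."""
    if received_before_8_01:
        return tally.inexcusable
    return tally.inexcusable + tally.unwarranted_dollar


def meets_spec(tally, chars_checked, received_before_8_01=False, chars_per_error=20_000):
    """Assumption: the spec allows at most one counted error per 20,000 characters."""
    return counted_errors(tally, received_before_8_01) * chars_per_error <= chars_checked


tally = ErrorTally(inexcusable=2, excusable=5, unwarranted_dollar=1, spacing=3)
print(counted_errors(tally))        # 3: categories (2), (4) and (5) are ignored
print(meets_spec(tally, 60_000))    # True: 3 counted errors in 60,000 characters
```

Categories (2), (4) and (5) stay in the tally as indicators of general quality, but they never enter the comparison against the spec.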
(1) Such a DIV1 is superfluous. It may nearly always be safely removed except when doing so would leave the BODY without any DIVs at all.
(2) The only two places I can think of where we commonly insert such a DIV1 deliberately are (1) within the <Q><TEXT><BODY> construction, where we frequently use a <DIV1> of a particular TYPE in order to specify the nature of a quoted text (e.g. <Q><TEXT><BODY><DIV1 TYPE="document">); and (2) when the BODY would otherwise have no DIVs at all.
(3) Such a DIV1 is basically harmless. If removing it requires that we change the level of many subordinate DIVs, we usually leave it in place.
(4) Such a DIV1 usually receives a TYPE attribute of "text", rather than (say) "book" or "body", just by tradition; in retrospect, a TYPE="body" would probably make more sense and be more consistent with our other practice.
(5) We usually do *not* use the TYPE attribute of such a DIV1 as a backdoor way to indicate the genre of the book as a whole. Such generic classification is better left to the information in the bibliographic record, and therefore in the header, where it is under cataloguers' control, than to us. That is, we do *not* say <BODY><DIV1 TYPE="treatise">...</DIV1></BODY> or <BODY><DIV1 TYPE="almanac">...</DIV1></BODY>.
(6) There is, however, no harm done by using the TYPE attribute in this way. It is just that it takes us into a kind of subject cataloguing that is not our job and can easily become quite challenging. In a few well-defined genres, in which we may someday wish to create restricted searches, it may even provide some benefit to retrieval (e.g.: type="play" and type="letter"), but not enough to make up for the cost it would impose if we chose to carry it through consistently.
So feel free to say DIV1 TYPE="play" or DIV1 TYPE="letter" (and maybe a few more common genres that occur to you), even when the whole book consists of the play or letter in question--but if the genre does not fall into one of these obvious categories, just use TYPE="text" rather than searching for an appropriate genre.
(7) The most important function of the TYPE attributes on DIVs in general is to clarify the relationship of parts of the book to each other and to the book as a whole. I realize that this too can involve some generic labelling, but this is only a kind of byproduct of the primary purpose of explicating relationships.
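Points (1), (3) and (6) above amount to two small decisions, sketched here for illustration only. The function names, the set of "obvious" genres, and especially the numeric threshold standing in for "many subordinate DIVs" are our assumptions; the guidance itself gives no number.

```python
# Genres named above as obvious enough to use; extend with care (assumption).
OBVIOUS_GENRES = {"play", "letter"}


def wrapper_div1_type(genre=None):
    """Pick the TYPE for a DIV1 that wraps the whole BODY."""
    # Use the genre only when it is one of the obvious ones; otherwise "text".
    return genre if genre in OBVIOUS_GENRES else "text"


def keep_wrapper_div1(subordinate_divs, many=20):
    """Decide whether to leave a superfluous wrapper DIV1 in place.

    subordinate_divs: number of DIVs nested inside the wrapper.
    many: hypothetical cutoff for "many subordinate DIVs".
    """
    if subordinate_divs == 0:
        # Removing the wrapper would leave the BODY without any DIVs at all.
        return True
    # Removal means changing the level of every nested DIV; not worth it if many.
    return subordinate_divs >= many


print(wrapper_div1_type("play"))      # play
print(wrapper_div1_type("treatise"))  # text
print(keep_wrapper_div1(3))           # False: safe to remove
```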
We would go with "fonne", although the printer would not thank us for perpetuating his mistake. The only situations in which we normally correct the printed copy have to do with attribute values: if the sequence of chapters is clear, but one is misnumbered (e.g.: LI, LII, LIII, LIV, LV, LIV, LVII), we set the value of the "N" attribute to the underlying, rather than to the printed, number (N="54" N="55" N="56"), while leaving the literal text as it is. We occasionally deal with mismatched footnote numbers the same way, by using the 'correct' number as the value of the N attribute and ignoring the incorrect one. Finally, there are times when a symbol of some kind is used with a meaning quite different from its usual meaning; in that case (if the symbol is based on a letter), we often preserve only the base letter within <ABBR> tags, rather than record a semantically misleading character entity. Thus "Esq;" is captured as <ABBR>Esq</ABBR> rather than as Es&abque;.
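The chapter-numbering case can be sketched as follows. This is an illustration, not a description of any tool we actually run: the function names are hypothetical, and the approach assumes that chapters are meant to run consecutively and that most of the printed numbers are correct.

```python
from collections import Counter

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'LIV' to an integer."""
    values = [ROMAN[ch] for ch in numeral.upper()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (IV = 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def underlying_numbers(printed):
    """Infer the N values for a run of chapters from their printed numbers."""
    values = [roman_to_int(p) for p in printed]
    # Each printed number votes for the starting number it implies;
    # the most common vote outweighs an isolated misprint.
    votes = Counter(value - i for i, value in enumerate(values))
    start = votes.most_common(1)[0][0]
    return [start + i for i in range(len(printed))]


printed = ["LI", "LII", "LIII", "LIV", "LV", "LIV", "LVII"]
for heading, n in zip(printed, underlying_numbers(printed)):
    # The printed heading is kept as literal text; only N is regularized.
    print(f'N="{n}"  (printed: {heading})')
```

The sixth chapter keeps its printed "LIV" in the text but receives N="56".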