We've been changing these to 4^o, 8^o (that is, 4 and superscripted lower-case
"o"), since the "o" is really just the final "o" of "quarto", etc. We've
done the same in the case of regnal years (e.g. "3^o Hen.8" = third year
of Henry VIII). In both cases, vendors have been known to use °.
° may not be wholly wrong, since one could imagine that the origin
of the "degree" sign is none other than a superscripted "o"; but we have,
perhaps foolishly, tried to reserve "°" for
uses which modern usage makes us think more appropriate, degrees of latitude
or measures of angles, for example.
Question: In this text, there are two lines with numbers above certain words, which have been captured as:
<L>1 2 3 1 2 3</L>
<L>For Streams, Iuice, Balm they are, which que~ch, kils, charms,</L>
<L>1 2 3 1 2 3</L>
<L>Of GOD, Death, Hell, the wrath, the life, the harmes.</L>
How ought they to be captured?
In the past we've resorted to capturing these as simple superscripts, as follows:
<L>For Streams^1, Iuice^2, Balm^3 they are, which que~ch^1, kils^2,
charms^3,</L>
<L>Of GOD^1, Death^2, Hell^3, the wrath^1, the life^2, the harmes^3.</L>
This is a pretty simple example (they can get pretty fancy), but in every
case the numbers are intended to link separated words, sometimes so as to
provide alternative reading sequences. Here they seem simply to identify
words that should be taken together even if not read together: streams
quench, juice kils, balm charms; wrath of God, life of death, harmes of
hell. There are some fancy pointer or CORRESP things that one could do in
TEI, but we have thought simple superscripts served.
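As a rough illustration of what the simple superscripts preserve, the linked words can be recovered from the captured lines by grouping on the number. This is only a sketch in Python: the function name link_words and the pattern for what counts as a "word" (letters plus the "~" abbreviation stroke) are assumptions made for this example, not part of any existing tool.

```python
import re
from collections import defaultdict

# A word followed by a superscript marker, e.g. "Streams^1" or "que~ch^1".
# The word pattern is an assumption for this sketch.
MARKED = re.compile(r"([A-Za-z~]+)\^(\d+)")

def link_words(line):
    """Group the superscripted words of one captured line by their number."""
    groups = defaultdict(list)
    for word, num in MARKED.findall(line):
        groups[int(num)].append(word)
    return dict(groups)

line1 = "For Streams^1, Iuice^2, Balm^3 they are, which que~ch^1, kils^2, charms^3,"
line2 = "Of GOD^1, Death^2, Hell^3, the wrath^1, the life^2, the harmes^3."

for line in (line1, line2):
    print(link_words(line))
```

The first line groups as Streams/que~ch, Iuice/kils, Balm/charms, which matches the reading given above.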
(Larks,
And then with Woodcocks and with
she must rise up and dine:
became:
And then with Woodcocks and with she must rise up and dine: Larks,
Instead of:
And then with Woodcocks and with Larks, she must rise up and dine:
I corrected these, but wasn't sure about putting additional <L>s in. The rhyme scheme indicates four-line verses, and the commas at the end of the lines could mark a caesura rather than a real end of line. However, in the past I think we have put in a new <L> whenever there has been a line break.
We've generally tried to take into account both rhyme scheme and
capitalization (sometimes even recourse to a modern edition) in deciding
whether to capture as short lines or as long lines with caesura. In this case,
the capitalization and rhyme scheme support long lines and should probably
override the existence of the physical line break as an indicator of metre.
We could capture the line above ("And ... dine") as a single verse line (<L>).
If this involves changing the existing tagging, note that often to a large
extent the change can be automated, making use of the case of the first
letter of the second half-line. E.g., replace "</L>\n<L>\([a-z]\)"
with " \1".
There will of course be times when the case is too ambiguous, or
the change too inconvenient (or both) to justify doing anything but leave
in place what has been done.
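The replacement described above can also be tried outside the editor. The following is a minimal Python sketch of the same idea; the function name join_half_lines is hypothetical, and it assumes each <L> starts on its own line with nothing but a newline between </L> and the next <L>.

```python
import re

def join_half_lines(text):
    """Merge an <L> that begins with a lower-case letter into the line before it."""
    # "</L>\n<L>" followed by a lower-case letter becomes a single space,
    # keeping the letter itself (group 1).
    return re.sub(r"</L>\n<L>([a-z])", r" \1", text)

sample = (
    "<L>And then with Woodcocks and with Larks,</L>\n"
    "<L>she must rise up and dine:</L>\n"
    "<L>Next verse line</L>"
)
print(join_half_lines(sample))
```

Lines whose second half begins with a capital are left alone, so the ambiguous cases still need checking by eye.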
The symbol represented by &abis; may better be thought
of as having several possible expansions, including -is, -es, maybe even
-s (though in the last case why not just use "s"?), and may have been used
as a plural marker without necessarily even considering how it would
be expanded if spelt out.
Similarly the "-er-" symbol may often be expanded as "-re-",
&abper; can often more rightly be expanded to "par" than to "per",
&quod; can be expanded as "quoth" as well as "quod", etc.
Question: What should I do with the symbol "ll" (double-ell) meaning "pounds"?
Rather than invent a new entity, follow the usual rules and capture it as "ll" within abbreviation tags, i.e. <ABBR>ll</ABBR>.
We've adopted the following rather brutal strategy for dealing with the
dual forms of "h" in some Caxton books, one of which has a squiggle over
it, and which the vendors have (following directions) either captured as h~
or marked by placing the whole word in <ABBR> tags. The trouble is that none (or
virtually none) of the words
with this form of h seems actually to be an abbreviation of any kind. E.g.
h~im for him, th~is for this, <ABBR>knightes</ABBR> for knightes.
So we've decided (unless someone can prove otherwise) that the strokes on
these funny "h"s should be regarded as superfluous ('otiose strokes' in
paleography talk) and insignificant.
On the other hand, there are a LOT of legitimate <ABBR>s in these
files, mostly arising from the final 'd' with curlicue that stands, supposedly,
for final -e. Our solution was to exclude the abbreviations ending in -d,
then remove all the <ABBR> tags from words containing 'h', as well as
removing ~ following h. Like this:
(1) replace
h~
with h
(2) replace
d\([. ,]*\)</ABBR>
with
d\1</ABBR}
(3) replace
<ABBR>\([^<]*h[^<]*\)</ABBR>
with
\1
(4)
replace </ABBR}
with </ABBR>
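For anyone who would rather run the four replacements as a script than by hand, here is a sketch of the same sequence in Python. It assumes the tags are written in upper case as <ABBR> ... </ABBR>; the function name strip_otiose_h is made up for this example. The temporary "</ABBR}" is the same placeholder used in steps (2) and (4) above.

```python
import re

def strip_otiose_h(text):
    """Apply the four replacements in order to a string of SGML text."""
    # (1) Drop the tilde that follows h.
    text = text.replace("h~", "h")
    # (2) Protect abbreviations ending in d (plus any trailing ". ," characters)
    #     by giving them a temporary, deliberately odd closing tag.
    text = re.sub(r"d([. ,]*)</ABBR>", r"d\1</ABBR}", text)
    # (3) Unwrap every remaining <ABBR> whose content contains an h.
    text = re.sub(r"<ABBR>([^<]*h[^<]*)</ABBR>", r"\1", text)
    # (4) Restore the protected closing tags.
    text = text.replace("</ABBR}", "</ABBR>")
    return text

sample = "h~im and th~is, <ABBR>knightes</ABBR> <ABBR>good</ABBR> <ABBR>had</ABBR>"
print(strip_otiose_h(sample))
```

Note that a word such as "had" keeps its tags even though it contains an h, because the -d rule in step (2) protects it before step (3) runs.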
<HEAD>Hymnus.</HEAD> .... <TRAILER>Explicit hymnus</TRAILER>
<HEAD>Gloss.</HEAD> .... <TRAILER>Explicit comment.</TRAILER>
<HEAD>Here beginneth the Commentary of <HI>Island.</HI></HEAD>
...
<TRAILER>Here endeth the first part of the Commentarie.</TRAILER>
<HEAD>¶ Here bynnen the chapytours of the .ii. boke / </HEAD>
...
<TRAILER>¶ Here endeth the chapytours of y^e seconde boke.</TRAILER>
<HEAD>Here beginnith the boke of the libelle of English policie</HEAD>
...
<TRAILER>Explicit libellus de Politia conseruatiua maris.</TRAILER>
<HEAD>Actus secundus, Scena prima.</HEAD> ...
<TRAILER>Explicit Actus secundus.</TRAILER>
Question: Is there a way to facilitate editing tables?
Paul replies: I'm thinking about whether we can somehow use the true HTML
table model within our dtd (instead of the TEI model). TEI and HTML are
readily convertible at this point. Changing the vendors' dtd would be a nuisance;
changing the data already delivered would be a nuisance. One option might
be to add a third dtd for reviewers that uses the HTML model, and convert back
into eebo2prf.dtd as part of the delivery process.
Less satisfactory, but much easier, would be a two-way temporary conversion
that would allow people to edit tables in HTML, then convert them back to
TEI for reinsertion in the document.
I've had a go at producing this, as follows:
(a) a batch file tab2htm.bat that uses a perl script tab2htm.pl to change all files named table*.sgm to table*.html
(b) a batch file tab2sgm.bat that uses a perl script tab2sgm.pl to change all files named table*.html to table*.sgm
(c) to be used thus: having a file with a nasty table, cut the table out of the file and paste it into a new file. Leave a marker in the original file to show where the table belongs (e.g. {table1}). Name the new file something similar (e.g. table1.sgm). Type tab2htm at the DOS prompt to turn it into an HTML file (e.g. table1.sgm.html). *Edit the HTML file* in TextPad. Right-click periodically and select "display in browser" to see how your changes affect the layout of the table. When it looks ok, save the HTML file. Then type tab2sgm at the DOS prompt. Open the resultant file (e.g. table1.sgm.html.sgm). Cut and paste the table back into the original file in place of the marker {table1}.
It's a bit clumsy, but I just used it to fix two tables, so it can work.
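To give an idea of what such a two-way conversion involves, here is a small Python sketch. It is not the tab2htm.pl or tab2sgm.pl scripts themselves, which are not reproduced here. It assumes the TEI side uses TABLE, ROW and CELL with ROWS and COLS attributes, and maps them to HTML table, tr and td with rowspan and colspan; the function names are invented for this example, and anything else (HI tags, entities) is passed through untouched.

```python
import re

# Assumed correspondences between the TEI and HTML table models.
TAGS = {"TABLE": "table", "ROW": "tr", "CELL": "td"}
ATTRS = {"ROWS": "rowspan", "COLS": "colspan"}

def _swap(text, tags, attrs):
    """Rename table tags and their span attributes; leave other tags alone."""
    def fix_tag(m):
        slash, name, rest = m.group(1), m.group(2), m.group(3)
        if name not in tags:
            return m.group(0)  # not a table tag: keep as is
        for old, new in attrs.items():
            rest = re.sub(r"\b%s=" % old, new + "=", rest)
        return "<%s%s%s>" % (slash, tags[name], rest)
    return re.sub(r"<(/?)(\w+)([^>]*)>", fix_tag, text)

def tei_to_html(text):
    return _swap(text, TAGS, ATTRS)

def html_to_tei(text):
    inv_tags = {v: k for k, v in TAGS.items()}
    inv_attrs = {v: k for k, v in ATTRS.items()}
    return _swap(text, inv_tags, inv_attrs)

sgm = ('<TABLE><ROW><CELL COLS="2">Summe</CELL></ROW>'
       '<ROW><CELL>x</CELL><CELL><HI>y</HI></CELL></ROW></TABLE>')
html = tei_to_html(sgm)
print(html)
print(html_to_tei(html) == sgm)
```

The point of keeping the mapping strictly one-to-one is that the round trip returns exactly the markup that went in, apart from whatever was edited in between.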
Question: There are at least some, and maybe many, bad tables in the
texts already online--bad in the sense that they
display incorrectly and even unintelligibly. How can we fix these and
how can we prevent more bad tables from joining them?
Paul again: Let me know when you spot one. I'll try to fix it. As for preventing them, all I can think of is to make it easier to view and edit tables while doing text review.
<DIV1 TYPE="acrostic poem">
<PB REF="1" MS="y">
<HEAD> ... EDVARDO HIDE ... CARMEN GRATVLATORIVM. an. 1660</HEAD>
<DIV2 TYPE="version" LANG="lat">
<LG>
<L><HI>E</HI>rige depressam Victrix Academia cristam,</L>
<L><HI>D</HI>ii Tutelares rem Tibi nuper agunt:</L>
<L><HI>V</HI>enit lo extorris (praeclarior ostracismo)</L>
<L><HI>A</HI>nchora quassatâ deficiente rati:</L>
<L><HI>R</HI>egibus ante omnes, Patri, Natoque fidelis</L>
<L><HI>D</HI>extera; captivo deliciae populo:</L>
<L><HI>V</HI>irtutìsque supremus apex: hunc
<HI>Anglia</HI>
primùm</L>
<L><HI>S</HI>uscipit in gremium, jam capit <HI>Oxonium</HI>.</L>
</LG>
<LG>
<L><HI>H</HI>erculeus labor! at languens Ecclesia poscit,</L>
<L><HI>I</HI>nque Tuas gestit tradere sacra manus:</L>
<L><HI>D</HI>urius incumbens humeris onus utile Nobis</L>
<L><HI>E</HI>xpulsos revocas, restituisque Deos.</L>
</LG>
<LG>
<L><HI>C</HI>os Musis, cui terra minax, pelagusque pepercit</L>
<L><HI>A</HI>nnue, dans votis vela secunda meis:</L>
<L><HI>N</HI>utriat admissos (sit anus licèt)
ubere pleno</L>
<L><HI>C</HI>asta parens olim, nunc tibi filiola.</L>
<L><HI>E</HI>numerans Mecaenates sis ìpse
patronus</L>
<L><HI>L</HI>iber, ut excellens Bibliotheca crepat:</L>
<L><HI>L</HI>egìs-latores auge, jurisque peritos,</L>
<L><HI>A</HI>uthores nullo codìce deficiant:</L>
<L><HI>R</HI>edde antìqua, & erìs
Nobìs Tu Stator
Apollo;</L>
<L><HI>(J</HI>upiter hoc famam nomine nactus erat.)</L>
<L><HI>V</HI>i cum serus obìs post Te monumenta
relinquens</L>
<L><HI>S</HI>umptibus hic cedant <HI>Seldeniana</HI>
Tuis.</L>
</LG>
</DIV2>
(1) Use <HEAD> as an acceptable fallback, especially if the 'spoken by...' portion is not typographically distinct from the rest of the <HEAD>; and (2) if there is reason (e.g., separate line or other visual distinction) to tag the line separately, then we prefer <STAGE> to <BYLINE>.
Since 'spoken by' lines attribute performance, rather than authorship, we've always been dubious about tagging them as BYLINE (though responsibility for performance is analogous to responsibility for authorship, and there are things to be said in favor of both).
These little letters are instructions to a rubricator to come along and
add hand-drawn large initials in the space--which was never done. Accordingly,
they should
be treated as if they *were* the large 'drop caps' that belong there, and
included as part of the first word on the first indented line. E.g., on
image 21, left column,
<L>nYchole le mostardier</L>
<L>A bon vinaigre</L>
which matches this in the right hand column:
<L>nYcholas the mustardmaker.</L>
<L>Hath <ABBR>good</ABBR> vynegre</L>
Likewise, further down the same page, we find
<L>oLiuier ... and <L>oLyuer ...
and on the right hand page
<L>pyere le bateure de lame</L>
corresponding to the English
<L>pEter the betar of wulle</L>