# Structure

## Replacing BODY with GROUP

Source: email
Date: 15 Mar 2002
File name:
Keywords: BODY, GROUP, TEXT, structure

Query. GROUP with multiple TEXTs did seem appropriate, as they had their own title pages. However, the first title page applies to the group of texts, not just the first text ...

Answer. PFS: Although in the instructions we have rather simplified things for the vendors (offering them only the choice between TEXT and GROUP as the top-level element), in fact the DTD allows more complicated arrangements. In particular, it allows one to substitute GROUP not only for the TEXT tag but also for the BODY tag. This is commonly useful when you have a number of TEXTs (each, say, with its own pagination and title page) bound together in a book with a title page and preface that refer to all the texts. In that case, you can organize the book like this:

<TEXT> -- top level element containing the whole book
  <FRONT> -- includes title page and preface that refer to whole book
  <GROUP> -- in place of the usual <BODY>, use <GROUP>.
    <TEXT> -- the first of the subordinate works within the book
      <FRONT> -- includes the title page of the subordinate work
      <BODY> -- the body of the subordinate work
      <BACK>
    <TEXT> -- the second of the subordinate works within the book
      <FRONT> -- includes the title page of the subordinate work
      <BODY>
      <BACK>
  </GROUP>
  <BACK> -- includes, say, an index to ALL the texts.
</TEXT>
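
As a rough illustration of the nesting rule described above, here is a minimal Python sketch. It assumes a deliberately simplified model in which a TEXT holds an optional FRONT, then either a BODY or a GROUP, then an optional BACK, and a GROUP holds one or more TEXTs. The tuple representation and the names `is_valid_text` and `leaf` are hypothetical, and the real DTD permits more than this.

```python
# Hypothetical, simplified model: each node is a (tag, children) tuple.
# Assumption: TEXT = FRONT? (BODY | GROUP) BACK?, and GROUP = one or more TEXTs.
# This is NOT the full DTD, only the pattern described in the answer above.

def is_valid_text(node):
    """Return True if node is a TEXT shaped like FRONT? (BODY | GROUP) BACK?."""
    tag, children = node
    if tag != "TEXT":
        return False
    rest = list(children)
    if rest and rest[0][0] == "FRONT":
        rest.pop(0)   # optional front matter
    if rest and rest[-1][0] == "BACK":
        rest.pop()    # optional back matter
    if len(rest) != 1:
        return False  # exactly one BODY or GROUP must remain
    core_tag, core_children = rest[0]
    if core_tag == "BODY":
        return True
    if core_tag == "GROUP":
        # A GROUP needs at least one TEXT, each of which must itself be valid.
        return bool(core_children) and all(is_valid_text(c) for c in core_children)
    return False


def leaf(tag):
    """Build an element with no modelled children."""
    return (tag, [])


# The book from the outline: shared front matter, two subordinate TEXTs, shared index.
book = ("TEXT", [
    leaf("FRONT"),
    ("GROUP", [
        ("TEXT", [leaf("FRONT"), leaf("BODY"), leaf("BACK")]),
        ("TEXT", [leaf("FRONT"), leaf("BODY"), leaf("BACK")]),
    ]),
    leaf("BACK"),
])

print(is_valid_text(book))
```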

## Wrong PB numbers

Source: email
Date: 07 Jun 2004
File name:
Keywords: PB, structure

Query. ... the vendors have got the PBs wrong - ref 16 should be ref 15, ref 17 should be ref 16, etc. Is there a way of globally replacing REF="n" with REF="n-1"?

Answer. A. If the sequence has no omissions, and is the usual EEBO sequence of 1, 1, 2, 2, 3, 3, 4, 4, etc., then you can simply replace all the REF values globally using TextPad. The only trick arises from the need to double every image number, because each image contains two pages. I manage this by first changing REF to something ending in an incrementing number, then changing the REFs ending in an odd number and the REFs ending in an even number separately, creating two alternating sequences of incrementing numbers.

(Even if there is a break in the sequence or two, you can do the same thing to each 'chunk' of the sequence, so long as you are careful to restart the REF values at the correct number.)

Details:

• (1) Change

REF="[^"]*" to REF\i=""

(This gets rid of the existing REF number and changes the REF attribute label itself to a string beginning with REF and ending with an incrementing number, e.g. REF1 REF2 REF3 REF4 etc.)

• (2) Change

REF[0-9]*[13579]="" to REF="\i"

(This changes all the new ODD ref strings to a proper REF="1" REF="2", etc.)

• (3) Change

REF[0-9]*[02468]="" to REF="\i"

(This changes all the new EVEN ref strings to a proper REF="1" REF="2" etc. with the result that you now will have:

REF="1" REF="1" REF="2" REF="2"

If you need the sequence to start at some place other than REF="1", simply put that number in parentheses after the "\i" in replacements (2) and (3) above. E.g. REF="\i(4)" will start the sequence at 4 instead of 1.)
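
For comparison, the end result of the three TextPad replacements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `renumber_doubled` and the sample `<PB REF="...">` string are made up for the example, and, like the TextPad recipe, it rewrites every REF attribute in the text, not only those on PB tags.

```python
import re


def renumber_doubled(sgml, start=1):
    """Rewrite REF values in document order as start, start, start+1, start+1, ..."""
    count = 0

    def replace(match):
        nonlocal count
        value = start + count // 2  # each image number is used for two pages
        count += 1
        return f'REF="{value}"'

    # Discards the old value entirely, as step (1) of the TextPad recipe does.
    return re.sub(r'REF="[^"]*"', replace, sgml)


# Illustrative input only; real files will carry other attributes as well.
sample = '<PB REF="2"><PB REF="2"><PB REF="3"><PB REF="3">'
print(renumber_doubled(sample))
print(renumber_doubled(sample, start=4))
```

The `start` parameter plays the same role as the number in parentheses after "\i" in the TextPad replacement.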

Answer. B. Much faster, and the *only* way that I know of if the REF values need to be out of sequence, or repeat unpredictably, or are messy in any way, is to use Perl, as follows:

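• (1) Copy the following four lines into a new text file (as written, the script adds 1 to each REF value; to subtract 1 instead, as in the query above, change $1+1 to $1-1):

while (<>) {
s,REF="(\d+)",q{REF="} . ($1+1) . q{"},ge;
print;
}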

• (2) Save the file in your working directory with a memorable name and a .pl extension, e.g. refplus1.pl

• (3) Run the script on your file, redirecting the output to a new file, by typing this at the command prompt (modifying as necessary to reflect the actual location of your installed copy of Perl):

C:\apps\perl\bin\perl refplus1.pl S12345.tech.sgm > S12345.fixed.sgm

that is, more abstractly:

[path to Perl.exe]\perl [name of script.pl] [oldfile.sgm] > [newfile.sgm]

I just tried this on a couple of files and it works magically.

pfs
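
For readers without Perl to hand, the approach in Answer B can be sketched in Python as well. This is a minimal illustration, not the script from the email: the function name `shift_refs` and the sample string are assumptions made for the example. Unlike the TextPad method, it keeps each existing REF value and shifts it by a fixed amount, so gaps and repeats in the sequence are preserved.

```python
import re


def shift_refs(sgml, delta):
    """Add delta to every numeric REF value, leaving all other text untouched."""
    return re.sub(
        r'REF="(\d+)"',
        lambda m: f'REF="{int(m.group(1)) + delta}"',
        sgml,
    )


# Illustrative input only: the query asked for 16 -> 15, 17 -> 16, and so on.
sample = '<PB REF="16"><PB REF="16"><PB REF="17">'
print(shift_refs(sample, -1))
```

Passing `delta=1` instead reproduces the behaviour of the Perl script as given.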