The Text Creation Partnership worked with three major commercial providers of digitally imaged historical books. Rather than start from scratch, the project built on these enormous existing databases of page images and focused its energy and funds on transcription and markup.
The three text corpora were keyed from the three databases in question:
- Early English Books Online (published by ProQuest)
- Eighteenth Century Collections Online (published by Gale Cengage)
- Evans Early American Imprints (published by the Readex division of NewsBank)
The result is a corpus of more than 70,000 transcribed and encoded historical texts, totaling more than a billion words, most of which can now be searched online. (All will soon be released from all restrictions on use and reuse.) The scope of the effort is unprecedented and unmatched among digitization and text encoding projects of its kind, and it represents a significant contribution both to primary-source history and to the documentation of the language itself.
Explore the three digital collections
Early English Books Online TCP (EEBO-TCP)
Phase 1 — comprising 25,000 texts available to everyone
Phase 2 — 35,000 texts available only to EEBO-TCP partners (until the end of 2020)
Eighteenth Century Collections Online TCP (ECCO-TCP)
Full text of about 3,000 books freely available to everyone
Evans Early American Imprints TCP (Evans-TCP)
Full text of about 5,000 books available to everyone