Week 8, 10/18

Text Encoding and the TEI

Mon: Austen v. Austen: Project Gutenberg file formats and a Technical Introduction to the Jane Austen Fiction Manuscripts

Summary Points:

  1. orthographic (spelling) variation, minuscule/majuscule patterns, original diction and syntax, author edits, abbreviations, etc.: all of these can be lost in an edited (i.e., normalized) transcription of a text
  2. what do readers gain from normalized versions of texts?
  3. is it possible to produce a version of a text that offers both a diplomatic transcription and a normalized version?
  4. TEI-XML (see Hafgeirs example and underlying source code; TEI video; & TEI by Example)
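
As a rough illustration of point 3, here is a minimal sketch of how one encoded source might yield both a diplomatic and a normalized reading. It assumes the standard TEI elements <choice>, <orig>/<reg>, <abbr>/<expan>, <del>, and <add>. The sample sentence is invented for illustration and is not a transcription from the Austen manuscripts. The names SAMPLE, SKIP, render, and read are hypothetical helpers, not part of any TEI tool.

```python
import xml.etree.ElementTree as ET

# Invented sample line (NOT a real manuscript transcription).
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">My dear '
    '<choice><orig>freind</orig><reg>friend</reg></choice> sent '
    '<del>a</del> <add>the</add> letter to '
    '<choice><abbr>Mr</abbr><expan>Mister</expan></choice> Smith.</p>'
)

# Elements whose content is dropped in each view (an assumption of this sketch).
SKIP = {
    "diplomatic": {"reg", "expan"},          # keep what the writer wrote
    "normalized": {"orig", "abbr", "del"},   # keep the regularized reading
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.split("}")[-1]


def render(elem, view):
    """Collect the text of elem for the given view, walking child elements."""
    out = [elem.text or ""]
    for child in elem:
        name = local_name(child.tag)
        if name in SKIP[view]:
            pass  # this reading does not belong to the requested view
        elif name == "del" and view == "diplomatic":
            out.append("[" + render(child, view) + "]")  # show the deletion
        else:
            out.append(render(child, view))
        out.append(child.tail or "")  # text following the child is always kept
    return "".join(out)


def read(xml_text, view):
    """Return one view of the encoded text with whitespace collapsed."""
    root = ET.fromstring(xml_text)
    return " ".join(render(root, view).split())


if __name__ == "__main__":
    print(read(SAMPLE, "diplomatic"))
    print(read(SAMPLE, "normalized"))
```

The same file thus serves both kinds of reader: the encoder records the original form and the regularized form side by side, and the choice between them is made at display time rather than at transcription time.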

For Wed, I’ll continue our discussion of the Jane Austen materials, but we’ll also begin a formal discussion of text encoding. Please begin reading the Introduction to TEI by Example, review the TEI P5 example in the examples section, and plan to take the introductory quiz.

1. TEI by Example: Module 0. Introduction (tutorial, example [scroll down to 6. TEI P5 XML], test)
2. TEI by Example: Module 1. Common Structure, Elements, and Attributes (tutorial, example, test)

