TEI-XML and Drupal

A question* about Drupal and TEI recently arose on TEI-L, and one list member put my name forward as someone whose work includes an instance of Drupal-TEI integration.

In short, I (along with my colleagues Garrick Bodine, Helmut Doll, and Elaine Gustus) have been responsible for the research, design and development of the TEI-EJ publishing model and publishing platform (I presented a related talk at DH 2010; see the abstract here). Below I offer a very brief description of how the TEI-EJ platform leverages both Drupal and TEI-XML.

My team’s Drupal-TEI integration is built on the XML Content module (if you’re working in Drupal, I suspect this is all you really need to know). The module describes itself as follows:

“XML Content is an XML entry, XSL transformation, and XML validation module that leverages PHP xml and xsl support, and the drupal output filter system. With XML Content, you can save XML inside the body of any node type, and have it display differently with XSL, or validated against a preconfigured schema.”

What does this mean in general? It means that via the Drupal XML Content module, virtually any text content on a Drupal site can be authored in TEI-XML.

What does this mean for journal developers and/or editors interested in leveraging TEI in their Drupal sites? The best way for me to answer this is by explaining how TEI-EJ utilizes the module. TEI-EJ’s Drupal site infrastructure was customized to distinguish three different Content Types:

  1. Journal content: Editorials, articles, essays, interviews, reviews
  2. Community-driven content: Blog posts; featured projects; tutorials; comments; teaching and learning resources
  3. Static, informational content: ‘About’ page, submission guidelines, etc.

And the site is designed to accept and publish these content types in various Formats, including:

  1. Text-based: TEI-XML as well as .txt and .doc formats (which are converted to TEI-XML)
  2. Media-based: e.g. mp3 audio, mp4 video, image
  3. Web-based: HTML

Because the journal in particular is an excellent place to exploit TEI, any text-based manuscript content published within the journal section of the site is encoded in TEI-XML, and this content is managed by the XML Content module.

When an author submits a manuscript encoded in TEI-XML using the online submission form, it is automatically validated against a Roma-generated custom schema. Authors can also submit in other formats (e.g. .txt or .doc attachment) which can then be encoded in TEI-XML by editors, pasted into the submission form, and validated against the schema (which can be updated or modified as needed).
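
To make the validation step concrete, here is a minimal Python sketch of the kind of check a submission goes through. It is not the module’s code (XML Content is a PHP module, and TEI-EJ validates against a Roma-generated schema). This stand-in only tests well-formedness and a few structural expectations using the standard library. The function name check_submission and the particular checks are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def check_submission(xml_text):
    """Return a list of problems found; an empty list means the checks passed.

    This is a simplified stand-in for schema validation, not a
    replacement for validating against a real Roma-generated schema.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # A document that is not well-formed cannot be checked further.
        return [f"not well-formed: {err}"]

    problems = []
    if root.tag != f"{{{TEI_NS}}}TEI":
        problems.append("root element is not TEI in the TEI namespace")
    if root.find(f"{{{TEI_NS}}}teiHeader") is None:
        problems.append("missing teiHeader")
    if root.find(f"{{{TEI_NS}}}text/{{{TEI_NS}}}body") is None:
        problems.append("missing text/body")
    return problems
```

An editor pasting an encoded manuscript into the form would, in this sketch, see an empty list for an acceptable document and a list of messages otherwise.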


The display of the TEI-XML content reflects custom XSLT and CSS styling (thanks to my colleague Garrick Bodine at Penn State). At present, the transformation ignores the TEI header, displaying instead a transformation of the TEI-XML document body as well as title, author, and other relevant metadata which authors (or editors) have entered into a database field via the online submission form.
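
The display logic described above can be sketched roughly as follows. TEI-EJ itself uses custom XSLT and CSS, so this Python version is only an illustration of the idea: skip the TEI header, render the document body, and take the title and author from values entered on the submission form. The function name render_article, the byline class, and the choice to handle only head and p elements are assumptions made for brevity.

```python
import xml.etree.ElementTree as ET
from html import escape

# The TEI P5 namespace in ElementTree's {uri}tag notation.
TEI = "{http://www.tei-c.org/ns/1.0}"


def render_article(xml_text, title, author):
    """Render a TEI document body as simple HTML.

    The teiHeader is never consulted: title and author come from the
    caller (standing in for the form-supplied database fields).
    """
    root = ET.fromstring(xml_text)
    body = root.find(f"{TEI}text/{TEI}body")

    parts = [
        f"<h1>{escape(title)}</h1>",
        f'<p class="byline">{escape(author)}</p>',
    ]
    if body is not None:
        # Walk the body in document order; inline markup inside a
        # heading or paragraph is flattened to plain text.
        for el in body.iter():
            text = "".join(el.itertext()).strip()
            if el.tag == f"{TEI}head":
                parts.append(f"<h2>{escape(text)}</h2>")
            elif el.tag == f"{TEI}p":
                parts.append(f"<p>{escape(text)}</p>")
    return "\n".join(parts)
```

Because the header is skipped entirely, a title recorded in the teiHeader has no effect on the page. Only the form-supplied metadata and the body text appear.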

Underlying XML and Display Example

As a publishing project, TEI-EJ is interested in contributions from a broad range of readers, including those from outside the TEI community. For this reason, community-driven sections of the TEI-EJ site (including tutorials, featured projects, blog posts, comments, and teaching and learning resources) do not require contributors to submit in TEI-XML. However, whether for experimental, editorial, or any other reason, editors can adjust the XML Content module settings so that TEI-XML is permitted or required for text-based content in these areas, or so that such content is converted to TEI-XML. The same holds for the static, informational areas and for the journal.

Thus, the XML Content module offers editors crucial levels of flexibility and granularity in determining which Drupal node types are to be written in TEI-XML.
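
One way to picture this per-node-type flexibility is as a small policy table. The sketch below is purely illustrative: the node type names, the TeiPolicy values, and the accepts function are hypothetical and do not reflect the module’s actual settings screen or TEI-EJ’s real configuration.

```python
from enum import Enum


class TeiPolicy(Enum):
    """How a node type treats TEI-XML (hypothetical categories)."""
    NOT_USED = "not used"
    PERMITTED = "permitted"
    REQUIRED = "required"


# Hypothetical node types and policies, chosen only for illustration.
POLICY_BY_NODE_TYPE = {
    "journal_article": TeiPolicy.REQUIRED,
    "tutorial": TeiPolicy.PERMITTED,
    "blog_post": TeiPolicy.NOT_USED,
    "static_page": TeiPolicy.NOT_USED,
}


def accepts(node_type, is_tei):
    """Return True if a submission of this kind fits the node type's policy.

    Unknown node types default to NOT_USED.
    """
    policy = POLICY_BY_NODE_TYPE.get(node_type, TeiPolicy.NOT_USED)
    if is_tei:
        # TEI-XML is fine wherever it is permitted or required.
        return policy is not TeiPolicy.NOT_USED
    # Non-TEI content is fine everywhere except where TEI is required.
    return policy is not TeiPolicy.REQUIRED
```

Changing a single entry in the table is the analogue of an editor deciding that, say, tutorials should now be written in TEI-XML.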

Although TEI-EJ is in moratorium just now, I hope to make the (in-progress) publishing platform available for open preview shortly. The platform will be open-sourced to support publishing projects in general. If you’re interested in access even sooner, please don’t hesitate to contact me at sschlitz@bloomu.edu.

*The original question, as I understood it, was rooted in the possibility of harnessing tools such as the newly released Anthologize, as well as TEI-XML, to advance social media objectives. I’m reading Anthologize as a WordPress plugin, not a Drupal module (though I’d love to be wrong about this and find that it’s usable in both). It does promise export in TEI, and it will certainly prove a lot of fun to experiment with in blogging and blogging-based pedagogical contexts.