Text Encoding (TEI)

What Is TEI?

TEI is a system for digitally describing texts in the humanities using XML (eXtensible Markup Language), a computer markup language.

The Text Encoding Initiative (TEI) Guidelines are an international and interdisciplinary standard that helps libraries, museums, publishers, and individual scholars represent a variety of literary and linguistic texts for online research, teaching, and preservation.

Background

The TEI was established in 1987 to develop, maintain, and promulgate hardware- and software-independent methods for encoding humanities data in electronic form. Over nearly three decades the TEI has been extraordinarily successful at achieving its objective, and it is now widely used by scholarly projects and libraries around the world.

When the Text Encoding Initiative (TEI) was originally established, scholarly projects and libraries attempting to take advantage of digital technology seemed to be faced with an overwhelming obstacle to creating sustainable and shareable archives and tools: the proliferating systems for representing textual material. These systems seemed almost always to be incompatible, often poorly designed, and multiplying at nearly the same rapid rate as the electronic text projects themselves. This situation was inhibiting the development of the full potential of computers to support humanistic inquiry by erecting barriers to access, creating new problems for preservation, making the sharing of data (and theories) difficult, and making the development of common tools impractical.

In November 1987 a meeting at Vassar College was convened to address these problems. Sponsored by the Association for Computers and the Humanities and funded by the National Endowment for the Humanities, it brought together a diverse group of scholars from many different disciplines, representing leading professional societies, libraries, archives, and projects in a number of countries in Europe, North America, and Asia. At this meeting the intellectual foundation for the Text Encoding Initiative was articulated. The organization of the actual work of developing the TEI Guidelines was then undertaken by the three TEI sponsoring organizations: the Association for Computers and the Humanities, the Association for Literary and Linguistic Computing, and the Association for Computational Linguistics.

The initial phase resulted in the release of the first draft (known as "P1") of the Guidelines in June 1990. A second phase, involving an additional 15 working groups making revisions and extensions, began immediately and released its results throughout 1990–1993. Then, after another round of revisions, extensions, and supplements, the first official version of the Guidelines ("P3") was released in May 1994. Early in this process a number of leading humanities textbase projects adopted the Guidelines as their encoding scheme (while they were still very much a moving target of rapidly changing drafts), identifying problems and needs and contributing proposed solutions. In addition, workshops and seminars were conducted to introduce the wider community to the Guidelines and ensure a steady source of experience to support continuing development. As more scholars became acquainted with the Guidelines, comments, corrections, and requests for extensions arrived from around the world. In the end there were nearly 200 scholars from many disciplines, professions, and countries in the core group that was developing the TEI Guidelines.

The impact of the TEI on digital scholarship has been enormous. Today, the TEI is internationally recognized as a critically important tool, both for the long-term preservation of electronic data, and as a means of supporting effective usage of such data in many subject areas. It is the encoding scheme of choice for the production of critical and scholarly editions of literary texts, for scholarly reference works and large linguistic corpora, and for the management and production of detailed metadata associated with electronic text and cultural heritage collections of many types.

The TEI's recommendations have been endorsed by many organizations, including the US National Endowment for the Humanities, the UK's Arts and Humanities Research Board, the Modern Language Association, the European Union's Expert Advisory Group for Language Engineering Standards, and many other agencies around the world that fund or promote digital library and electronic text projects. Recognizing its importance in the emerging digital library community, the Library of Congress has produced guidelines for best practice in applying the TEI metadata recommendations for interoperability with other standards.

The TEI Consortium

In January of 1999, the University of Virginia and the University of Bergen (Norway) presented a proposal to the TEI Executive Committee for the creation of an international membership organization, to be known as the TEI Consortium, which would maintain, continue developing, and promote the TEI. This proposal was accepted by the TEI Executive Committee, and shortly thereafter, Virginia and Bergen added two other host institutions with longstanding ties to the TEI: Brown University and Oxford University.

The goal of establishing the TEI Consortium was to maintain a permanent home for the TEI as a democratically constituted, academically and economically independent, self-sustaining, non-profit organization. In addition, the TEI Consortium was intended to foster a broad-based user community with sustained involvement in the future development and widespread use of the TEI Guidelines. Toward both of these goals the creation of the Consortium has proven a positive step. Just as the original goal of the TEI was to promote collaborative research on electronic texts by making the encoding system no longer an obstacle to such work, the Consortium's efforts are directed towards making the TEI encoding system as effective a tool as possible for creating, archiving, and sharing textual data. For its members, the TEI Consortium provides services to assist them in the creation and use of digital resources, and to help them stay abreast of rapidly changing technologies and practices.

The TEI Guidelines

The TEI Guidelines for Electronic Text Encoding and Interchange define and document a markup language for representing the structural, renditional, and conceptual features of texts. They focus (though not exclusively) on the encoding of documents in the humanities and social sciences, and in particular on the representation of primary source materials for research and analysis. These guidelines are expressed as a modular, extensible XML schema, accompanied by detailed documentation, and are published under an open-source license. The Guidelines are maintained and developed by the TEI Consortium, through its Council and editors, with the support and participation of the TEI community.
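
To make the three kinds of features concrete, here is a minimal sketch, not an official TEI example. The letter, its title, and the name "Jane Doe" are invented for illustration, and the helper `summarize` is hypothetical. The element names (`teiHeader`, `div`, `head`, `p`, `hi`, `persName`) and the namespace `http://www.tei-c.org/ns/1.0` come from the TEI Guidelines. The sample marks structure with `div`, `head`, and `p`, rendition with `hi rend="italic"`, and a conceptual feature (a personal name) with `persName`. It is then read back with Python's standard-library XML parser.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A tiny, invented TEI document (assumption: content is illustrative only).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter (hypothetical)</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="letter">
        <head>To a friend</head>
        <p>I have just finished reading <hi rend="italic">Middlemarch</hi>,
           which <persName>Jane Doe</persName> lent me.</p>
      </div>
    </body>
  </text>
</TEI>"""


def summarize(tei_xml: str) -> dict:
    """Collect a few structural, renditional, and conceptual features."""
    root = ET.fromstring(tei_xml)
    return {
        # Metadata from the header.
        "title": root.find(".//tei:titleStmt/tei:title", TEI_NS).text,
        # Structural: headings of text divisions.
        "headings": [h.text for h in root.iterfind(".//tei:div/tei:head", TEI_NS)],
        # Renditional: phrases marked as italic in the source.
        "italic": [h.text for h in root.iterfind(".//tei:hi[@rend='italic']", TEI_NS)],
        # Conceptual: personal names identified by the encoder.
        "people": [p.text for p in root.iterfind(".//tei:persName", TEI_NS)],
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

Because the markup records what each phrase is rather than only how it looks, the same file can drive a reading view, a name index, or a search tool without being re-encoded.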

Hot News! Cool Tools!

Got a TEI project that needs to be documented? Got authors working on an encoding project that you are trying to coordinate?

This new open source tool from the University of Rochester is designed just for you! Check out the Data Dictionary Generator.