Introduction to the field of Digital Editing1
- Basic Overview of the digital editing process (see below).
- Acronyms and Terms (a smaller version to keep open as you read)
- Introductory Essay:2
Basic Overview of Digital Editing
This textbook includes a glossary defining many
of the terms used here in addition to the explanations offered in
each tutorial. If you are new to coding, I suggest reading this page slowly,
reminding yourself of the definitions of terms and acronyms by clicking on
the dictionary icon as you
go and/or by keeping open a smaller window
containing the alphabetical list of terms, off to the side.
Be patient: you are learning a new language. Take this tutorial slowly, making
sure that you understand what is being said, even if ultimately some of the
definitions of terms start to sound tautologous and/or seem to end in a
dissatisfying place. Also, take lots of breaks to allow yourself to process
information. You are learning more than you know as you read, and
eventually, all these terms will seem so obvious to you as to need no
definition.
As we all know, printed books are made up of parts with
which we are very familiar and know almost instinctively how to
use.
We also almost instinctively repress any knowledge
that the physical and paratextual parts of a printed book contribute to
its meaning.3
And, of course, just as there are parts of a printed
book, digital editions are comprised of parts:
No single edition has to have all these parts, of
course. What is almost more important than the parts seen by users —
readers of the digital edition — are the parts of it at work behind
the scenes. Categorizing data into code elements manipulated by programming,
search categories, and/or database fields channels interpretive
possibilities.4
After our experience in the aughts with digital editions
that no longer work because code has changed, we want to make sure that
the digital edition is encoded for longevity. It needs to be "archival
quality," by which I mean only that libraries could take the digital
edition into their collections as a digital asset.
Beginning with the California Digital Libraries and the
Brown Women Writers Project (now at Northeastern University), scholars have
agreed upon, used, and together developed code for digitizing texts in order to
preserve documents crucial to understanding our cultural heritage.
The TEI code used to create digital editions has been
developed by the Text Encoding Initiative Consortium. The group has created tag, element, and attribute names
written in XML code (eXtensible Markup Language).
Why did digital humanists decide to create tag names and encode documents in XML?5 XML code is readable by humans
as well as machines (computers executing programs),
which means that, even if no
one is alive who knows how the code works, future readers will be able to understand
what it contains and, better yet, easily re-purpose it for use in
newer technologies.
XML code consists of words, letters, and numerals that
are most often not arbitrary — that make sense semantically to
human readers. Therefore, in order to distinguish the content of the
edition (poetry, drama, etc.) from code, codes are enclosed in angle
brackets.
There are three types of code inside angle brackets:
- element or tag
- attribute
- value
A value is text that adds information to the
element. In some cases, the attribute value is a Path.
The image above is slightly misleading insofar as, in XML,
tags must be opened and then closed using a forward slash:
<div type="essay">...some portion of the digital edition typed here...</div>
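If you would like to see a machine pull these pieces apart, here is a minimal sketch in Python. Python is a general-purpose programming language that is not part of the DigEd system, and you do not need to learn it; I use it here only because its built-in XML reader makes the parts easy to inspect. The variable names (snippet, element_name, and so on) are my own inventions for illustration.

```python
import xml.etree.ElementTree as ET

# One element, copied from the div example above.
snippet = '<div type="essay">...some portion of the digital edition typed here...</div>'

# Ask Python's built-in XML reader to parse the snippet.
element = ET.fromstring(snippet)

element_name = element.tag         # the element (tag) name: "div"
attributes = dict(element.attrib)  # each attribute name paired with its value
content = element.text             # the text between the open and close tags

print("element:", element_name)
print("attributes:", attributes)
print("content:", content)
```

The element name and the attribute with its value come from inside the angle brackets; the content comes from between the open and close tags.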
Library quality, scholarly digital editions are built in XML
(eXtensible Markup Language).
A minimal XML document contains a
root element
with another element or tag nested inside.
Here you see a div tag, short for
"division." In XML, every tag is opened and then closed. Close tags are
just like open tags—in both cases, angle brackets enclose the
element name. But close tags are distinct insofar as they also contain a
forward slash just before the element name. The content, the text of your edition, goes inside—not
inside the angle brackets, but rather between the open and close
tags.
To be clear: the element name that is part of the XML code goes
inside angle brackets:
<title>. The chunk of text marked by the element goes between the
open and close tags:
<title>The Handmaid's Tale</title>
XML code must also be properly nested, as explained in
detail in the textbook dictionary.
To the left, the code says that a division (div) contains
a poem with a title at the head of it made up of line groups (lg) that
are stanzas.
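The nesting just described can be checked by machine. What follows is a rough sketch in Python (not part of the DigEd system), assuming a made-up poem: the wording of the head and the lines is hypothetical, while div, head, lg, and l are real TEI element names. The helper functions outline and is_well_formed are my own, written only for this illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical poem: a division containing a heading and two stanzas.
poem = """<div type="poem">
  <head>A Hypothetical Poem</head>
  <lg type="stanza">
    <l>First line of the first stanza</l>
    <l>Second line of the first stanza</l>
  </lg>
  <lg type="stanza">
    <l>First line of the second stanza</l>
  </lg>
</div>"""


def outline(element, depth=0):
    """Return one indented entry per element, showing what is nested in what."""
    entries = ["  " * depth + element.tag]
    for child in element:
        entries.extend(outline(child, depth + 1))
    return entries


def is_well_formed(xml_text):
    """Return True if Python's XML reader accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


tree_outline = outline(ET.fromstring(poem))
print("\n".join(tree_outline))

# Overlapping tags: <lg> is closed before the <l> inside it has been closed.
overlapping = "<lg><l>A line of verse</lg></l>"
print(is_well_formed(poem))
print(is_well_formed(overlapping))
```

The outline prints each element indented beneath the element that contains it. The overlapping example is rejected because a tag opened inside another tag must also be closed inside it.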
Here is what a bit of XML code — coded according to
the TEI Guidelines — looks like in the oXygen software
often used by digital editors.
XML code
describes features of the text that it represents;
it doesn't care at all about presenting a pleasing, readable view of it
to readers.6 As anyone knows
who has ever used WordPerfect, our apps and software obsolesce as soon
as the companies that create them go out of business. Because XML is
not owned by any company, and because it is purely descriptive, it
requires the use of programming to transform it into something
presentable, readable, and usable for the software of the moment: right
now, browsers (Chrome, Safari, etc.).
The code used to create web pages,
while in the same family, differs from XML. But using a computer program to transform
XML into another language for the sake of presenting it online seems
like a waste of time.
Why not just code your digital edition in the code
that browsers can read and present to viewers, skipping altogether the
laborious process of coding in XML? Why do libraries prefer storing TEI/XML,
and why do granting agencies often require it?
Because in 20, 30, or 100 years, instead of reading
web pages on the Internet, we may be inserting chips into our brains. XML
code can be transformed by a computer program into the code on those chips,
whatever it may be.

You may already know that your browser, whether Chrome, Safari,
Firefox, or Edge, is a piece of software.
It reads code and then transforms that code into the pages you see when
searching the web.
You can open any files that live on your own
computer using your browser, but it won't make much sense of them unless
they are written in Hypertext Markup Language or HTML.7 Immediately below, you can see the HTML code for an edition
of Robert Bloomfield's letters and poems: code on a server, identified by
a URL or web address, is displayed in a browser for
user-readers.
Yes, Bloomfield did spell imagination with two m's!! The screen
shows a web page because your browser is reading the HTML from a server that
is accessible to the Internet.
Although the two markup languages are in the same family, HTML code differs from XML code: the
only HTML tag indicating that a group of words constitutes a title
appears in the head of the HTML document. This "title" appears on a
browser tab, not in the document itself. The content that will appear ON
the web page is in its "body."
But instead of "title" and "author" tags inside the
HTML "body," we have only a heading one (h1) code telling
the browser to make the words inside the tag appear as the biggest words on
the page, and a heading two (h2) code telling the browser to
make the words big but not as big as h1. Furthermore, the
heading tags (h1, h2, h3, h4, etc.) may contain ANYTHING. A program searching
the web will have no idea that either of these tags contains a title or an
author.
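To make the contrast concrete, here is a small sketch in Python (again, not part of the DigEd system). Both snippets are hypothetical: the HTML is a stripped-down page of my own, not the Bloomfield edition's actual code, and the TEI-style fragment uses the bibl, title, and author elements only as an illustration. The HTML can be read with an XML reader here only because I have written it so that every tag is closed.

```python
import xml.etree.ElementTree as ET

# A hypothetical web page: the title sits in the head, headings in the body.
html_page = """<html>
  <head><title>The Handmaid's Tale</title></head>
  <body>
    <h1>The Handmaid's Tale</h1>
    <h2>Margaret Atwood</h2>
  </body>
</html>"""

# The same information in a descriptive, TEI-style fragment (also hypothetical).
tei_fragment = """<bibl>
  <title>The Handmaid's Tale</title>
  <author>Margaret Atwood</author>
</bibl>"""

html_root = ET.fromstring(html_page)
tei_root = ET.fromstring(tei_fragment)

# In the HTML body there is no <title> element at all, only headings.
titles_in_body = html_root.findall("./body//title")
headings = [(el.tag, el.text) for el in html_root.find("body")]

# In the TEI-style fragment, the element names say what the words ARE.
described = {el.tag: el.text for el in tei_root}

print(titles_in_body)
print(headings)
print(described)
```

A program reading the HTML body learns only that some words are bigger than others; a program reading the TEI-style fragment learns which words are the title and which are the author.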
The browser software makes the HTML code above look like
this image (to the right) whether you open the web page — that is,
a page that ends with .html8 — on your computer, as I have
done here, or whether you, an anonymous user, go to the HTML / web page
on the Internet.
Stick with me; we're coming down the home stretch. Only two more
components of creating digital editions remain to be discussed.
The plain web page for The Handmaid's Tale,
above right, looks pretty awful. There is a kind of code for adding
color and style: Cascading Style Sheets or CSS. To the right, you can see the effects of the CSS code I
wrote for this web page.
If you go to the CSS Zen Garden website, you will find a host of web pages that
look VERY different:



As hard as it is to believe, each one of these pages
is made from EXACTLY the same HTML code and contents. Only the CSS code
differs. The site is designed to let you download all the CSS files to see
the code created by the web designers, and I have stolen bits from here and
there. But using other people's code, when it is credited in the code itself, is a
compliment.
The LAST component needed for creating an
archival-quality digital edition is the programming that will transform
the XML document created using TEI tags into the HTML document which
will "call" the CSS files you have created to style it.9
The programming language typically used is XSLT
(eXtensible Stylesheet Language Transformations), a language which makes use of XPath.
XSLT can transform an XML page into HTML, text files,
metadata records, spreadsheets, ePubs, and PDFs.
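To give a feel for what such a transformation does, here is a rough sketch. Please note loudly: this is NOT XSLT, and it is not the code used by the DigEd system. It is a stand-in written in Python, which walks a tiny, hypothetical TEI-style fragment and rebuilds it as an HTML page that "calls" a CSS file. The function name tei_div_to_html, the file name style.css, the poem's wording, and the choice of HTML tags (h1, p, span, br) are all my own assumptions for illustration; a real TEI document would also carry a header and a namespace, both omitted here.

```python
import xml.etree.ElementTree as ET

# A tiny TEI-style fragment (hypothetical content; no TEI header or namespace).
tei = """<div type="poem">
  <head>A Hypothetical Poem</head>
  <lg type="stanza">
    <l>First line of the stanza</l>
    <l>Second line of the stanza</l>
  </lg>
</div>"""


def tei_div_to_html(tei_text, css_href="style.css"):
    """Rebuild a TEI-style <div> as a small HTML page that links a CSS file."""
    div = ET.fromstring(tei_text)
    heading = div.findtext("head", default="")

    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = heading
    # The HTML page "calls" the stylesheet through a <link> element.
    ET.SubElement(head, "link", {"rel": "stylesheet", "href": css_href})

    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = heading
    for lg in div.findall("lg"):
        # Each line group becomes a paragraph named after its type.
        p = ET.SubElement(body, "p", {"class": lg.get("type", "lg")})
        for line in lg.findall("l"):
            # Each line of verse becomes a span followed by a line break.
            ET.SubElement(p, "span", {"class": "line"}).text = line.text
            ET.SubElement(p, "br")
    return ET.tostring(html, encoding="unicode", method="html")


page = tei_div_to_html(tei)
print(page)
```

The descriptive names (head, lg, l) stay in the XML master; the transformation decides, separately, how they should look on today's web page. An XSLT stylesheet expresses the same kind of rules, matching elements with XPath instead of Python loops.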
This textbook is introductory: you need only learn TEI
encoding in order to use the DigEd system to publish your digital edition.
In other words, you do NOT have to learn to code
in HTML and CSS, nor to create XSL Transforms, for three reasons:
- The software that we will use for coding in TEI, namely oXygen,
contains built-in XSLTs, and this textbook teaches you how to use them.
- This open access textbook is written for people who do not want to become web designers or programmers, offering CSS and XSLT files (already made) to use and/or tweak for your digital edition (the "DigEd system").
- The textbook comes with support from its author: when the code and
programs in the DigEd system don't work as you and I expected, I will
troubleshoot them with you.
That said, this textbook teaches XSLT programming as well as coding in TEI/XML, HTML, and CSS. A beginner might simply wish to use the codes and programs provided here without trying to understand how they work.10 An intermediate learner may wish to go through the tutorials so that they can learn to "read" the code and programming, and perhaps tweak it for their own needs, but not actually write any of it from scratch. But you can, based upon these tutorials, write in TEI / XML, HTML, CSS, XPath, and XSLT. And expertise can be achieved through practice as well as making use of the many resources available on the Internet for learning to code and program, discussed in the Introductory Essay.
To summarize, the image below shows the full digital editing workflow: a
TEI / XML master document is created which represents the physical documents
that you are editing. The TEI master is then transformed into HTML, the web
page displayed to viewers visiting your website, rendered aesthetically
pleasing with the help of CSS.

1. For information about how this textbook and the DigEd publishing system
work, see About. The digital editions
created using this DigEd system and textbook are documentary editions insofar as
we duplicate in code everything we see in front of us on the pages of the book
or manuscript or letter. For more information, see the Association for
Documentary Editing; Mary-Jo Kline and Linda Johanson, A Guide to Documentary Editing,
2nd ed. (Baltimore, MD: Johns
Hopkins University Press, 1998); and Further
Reading.
2. The introduction has been written in TEI, the TEI code available by clicking on the TEI icon at the top right of the document, or here.
3. On the difficulty of "remembering" mediation of any kind, see Katherine Bode, "What's the Matter with Computational Literary Studies," Critical Inquiry 49.4 (Summer 2023): 507-529, p. 515.
4. Johanna Drucker, Graphesis: Visual Forms of Knowledge Production (Cambridge, Mass.: Harvard University Press, 2014); Yanni Alexander Loukissas, All Data Are Local: Thinking Critically in a Data-Driven Society (Cambridge, Mass.: MIT Press, 2019).
5. TEI was originally written in SGML (Standard
Generalized Markup Language). The consortium decided to use XML for the
release of P4 of the TEI Guidelines.
6. In reality, it is impossible to separate description from presentation. See Alan Liu, "Transcendental Data: Toward a Cultural History and Aesthetics of the New Encoded Discourse," Critical Inquiry 31 (August 2004): pp. 49-84.
7. Nowadays browsers can handle PDFs and other document types as well,
but they do so through browser extensions.
8. To see file extensions (a necessity for creating a digital edition), you
will need to change the default settings in your Finder window or, if on a PC,
under "view" in File Explorer (see the instructions under the term "File
Extensions"). To open an .html file on
your computer, instead of going to a web page on the Internet, click on "File"
in the top toolbar, next to the browser's name, and then select "Open File."
9. I'm lying: this textbook will also show you how to: 1) add a search
engine; 2) encode names and places in a way that optimizes the capacity of web search
engines such as Google or Bing to find your documents; and 3) add the capacity
for users to annotate your editions for themselves using hypothes.is, which you can see in action here.
10. In the TEI section of this textbook, I show you how to use LEAF-Writer, which makes encoding archival-quality digital editions even easier than does oXygen.