Introduction

to the field of Digital Editing1



Basic Overview of Digital Editing

This textbook includes a glossary defining many of the terms used here in addition to the explanations offered in each tutorial. If you are new to coding, I suggest reading this page slowly, reminding yourself of the definitions of terms and acronyms by clicking on the dictionary icon (a link to the term's definition) as you go and/or by keeping open a smaller window containing the alphabetical list of terms, off to the side.

Be patient: you are learning a new language. Take this tutorial slowly, making sure that you understand what is being said, even if ultimately some of the definitions of terms start to sound tautologous and/or seem to end in a dissatisfying place. Also, take lots of breaks to allow yourself to process information. You are learning more than you know as you read, and eventually, all these terms will seem so obvious to you as to need no definition.




As we all know, printed books are made up of parts with which we are very familiar and which we know almost instinctively how to use.

parts of a printed text

physical parts of a book


We also almost instinctively repress any knowledge that the physical and paratextual parts of a printed book contribute to its meaning.3


paratext


And just as there are parts of a printed book, digital editions are composed of parts:


parts of a digital edition


No single edition has to have all these parts, of course. What is almost more important than the parts seen by users — readers of the digital edition — are the parts of it at work behind the scenes. Categorizing data into code elements manipulated by programming, search categories, and/or database fields channels interpretive possibilities.4



After our experience in the aughts with digital editions that no longer work because code has changed, we want to make sure that the digital edition is encoded for longevity. It needs to be "archival quality," by which I mean only that libraries could take the digital edition into their collections as a digital asset.

an image of library servers

Beginning with the California Digital Libraries and the Brown Women Writers Project (now at Northeastern University), scholars have agreed upon, used, and together developed code for digitizing texts in order to preserve documents crucial to understanding our cultural heritage.


The TEI code used to create digital editions has been developed by the Text Encoding Initiative Consortium. The group has created tag, element, and attribute names written in XML code (eXtensible Markup Language).

homepage for the TEI consortium


Why did digital humanists decide to create tag names and encode documents in XML?5 XML code is readable by humans as well as by machines (computers executing programs), which means that, even if no one is alive who knows how the code works, future readers will be able to understand what it contains and, better yet, easily re-purpose it for use in newer technologies.


XML code consists of words, letters, and numerals that are most often not arbitrary — that make sense semantically to human readers. Therefore, in order to distinguish the content of the edition (poetry, drama, etc.) from code, codes are enclosed in angle brackets.



There are three types of code inside angle brackets:
element or tag
attribute
value
A value is text, assigned to an attribute, that adds information to the element. In some cases, the attribute value is a Path.
a diagram showing elements, attributes, and values in xml code

The image above is slightly misleading insofar as, in XML, tags must be opened and then closed using a forward slash:

<div type="essay">...some portion of the digital edition typed here...</div>

Library-quality scholarly digital editions are built in XML (eXtensible Markup Language). A minimal XML document contains a root element with another element or tag nested inside.
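A minimal document of this kind might look like the sketch below. The root element name "edition" is made up for illustration and is not a TEI element name; the point is only to show one element nested inside a root.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- "edition" is a hypothetical root element name used only for this sketch -->
<edition>
  <!-- The div element is nested: it opens and closes before the root closes -->
  <div type="essay">...some portion of the digital edition typed here...</div>
</edition>
```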


Here you see a div tag, short for "division." In XML, every tag is opened and then closed. Close tags are just like open tags: in both cases, angle brackets enclose the element name. But close tags are distinct insofar as they also contain a forward slash just before the element name. The content, the text of your edition, goes inside: not inside the angle brackets, but rather between the open and close tags.


an image of a root element with a div element inside


To be clear: the element name that is part of the XML code goes inside angle brackets: <title>. The chunk of text marked by the element goes between the open and close tags:
<title>The Handmaid's Tale</title>


XML code must also be properly nested, as explained in detail in the textbook dictionary.


To the left, the code says that a division (div) contains a poem, which has a title at its head and is made up of line groups (lg) that are stanzas.
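In text form, an encoding of that kind could be sketched roughly as follows. This is an illustration rather than the code in the image: the title and lines are placeholders, the type values "poem" and "stanza" are assumed, and each verse line is wrapped in the TEI l element.

```xml
<!-- A sketch only: the title and lines are placeholders, not a real poem -->
<div type="poem">
  <!-- The title sits at the head of the division -->
  <head>Title of the Poem</head>
  <!-- Each line group (lg) is one stanza; each l is one line of verse -->
  <lg type="stanza">
    <l>First line of the first stanza</l>
    <l>Second line of the first stanza</l>
  </lg>
  <lg type="stanza">
    <l>First line of the second stanza</l>
    <l>Second line of the second stanza</l>
  </lg>
</div>
```

Notice that every element closes inside the element that contains it: each l closes before its lg, and each lg closes before the div.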



Here is what a bit of XML code, coded according to the TEI Guidelines, looks like in the oXygen software often used by digital editors.



Word Perfect for DOS, 1980s

XML code describes features of the text that it represents; it doesn't care at all about presenting a pleasing, readable view of it to readers.6 As anyone knows who has ever used WordPerfect, our apps and software obsolesce as soon as the companies that create them go out of business. XML, by contrast, is not owned by any company. But because it is purely descriptive, it requires the use of programming to transform it into something presentable, readable, and usable for the software of the moment: right now, browsers (Chrome, Safari, etc.).



The code used to create web pages, while in the same family, differs from XML. But using a computer program to transform XML into another language for the sake of presenting it online seems like a waste of time.

writing a program to pass the salt


Why not just code your digital edition in the code that browsers can read and present to viewers, skipping altogether the laborious process of coding in XML? Why do libraries prefer storing TEI/XML, and why do granting agencies often require it?


Because in 20, 30, or 100 years, instead of reading web pages on the Internet, we may be inserting chips into our brains. XML code can be transformed by a computer program into the code on those chips, whatever it may be.

a picture composed for Gibson, Neuromancer


You may already know that your browser, whether Chrome, Safari, Firefox, or Edge, is a piece of software. It reads code and then transforms that code into the pages you see when searching the web.

You can open any files that live on your own computer using your browser, but it won't make much sense of them unless they are written in Hypertext Markup Language or HTML.7 Immediately below, you can see the HTML code for an edition of Robert Bloomfield's letters and poems: code on a server, identified by the URL or web address, is displayed in a browser for user-readers.


Server sending code to a desktop computer



Yes, Bloomfield did spell imagination with two m's!! The screen shows a web page because your browser is reading the HTML from a server that is accessible to the Internet.


Although the two markup languages are in the same family, HTML code differs from XML code: the only HTML tag indicating that a group of words constitutes a title appears in the head of the HTML document. This "title" appears on a browser tab, not in the document itself. The content that will appear ON the web page is in its "body."

html code


But instead of "title" and "author" tags inside the HTML "body," we have only a heading one (h1) code telling the browser to make the words inside the tag appear as the biggest words on the page, and a heading two (h2) code telling the browser to make the words big but not as big as h1. Furthermore, the heading tags (h1, h2, h3, h4, etc.) may contain ANYTHING. A program searching the web will have no idea that either of these tags contains a title or an author.


The browser software makes the HTML code above look like this image (to the right) whether you open the web page (that is, a page that ends with .html8) on your computer, as I have done here, or whether you, an anonymous user, go to the HTML / web page on the Internet.

user view of html code


Stick with me; we're coming down the home stretch. Only two more components of creating digital editions remain to be discussed.


The plain web page for The Handmaid's Tale, above right, looks pretty awful. There is a kind of code for adding color and style: Cascading Style Sheets or CSS. To the right, you can see the effects of the CSS code I wrote for this web page.

a web page styled with css


If you go to the CSS Zen Garden website, you will find a host of web pages that look VERY different:

web pages to see at CSS Zen Garden

one sample at CSS Zen Garden

another sample at CSS Zen Garden


As hard as it is to believe, each one of these pages is made from EXACTLY the same HTML code and contents. Only the CSS code differs. The site is designed to let you download all the CSS files to see the code created by the web designers, and I have stolen bits from here and there. But using other people's code, when it is credited in the code itself, is a compliment.

The LAST component needed for creating an archival-quality digital edition is the programming that will transform the XML document created using TEI tags into the HTML document which will "call" the CSS files you have created to style it.9


The programming language typically used is XSLT (eXtensible Stylesheet Language Transformations), a language which makes use of XPath. XSLT can transform an XML page into HTML, text files, metadata records, spreadsheets, ePubs, and PDFs.
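To give a sense of what such a program looks like, here is a minimal sketch of an XSLT stylesheet. It is not the DigEd system's actual transform. It assumes a TEI document whose body contains a div with a head and line groups; the file name style.css and the class names "stanza" and "line" are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Write the result as HTML rather than XML -->
  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <!-- Start at the top of the TEI document and build the HTML page -->
  <xsl:template match="/">
    <html>
      <head>
        <!-- XPath: find the title recorded in the TEI header -->
        <title>
          <xsl:value-of select="/tei:TEI/tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"/>
        </title>
        <!-- style.css is a hypothetical file name for the CSS that styles the page -->
        <link rel="stylesheet" href="style.css"/>
      </head>
      <body>
        <xsl:apply-templates select="/tei:TEI/tei:text/tei:body"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each TEI div becomes an HTML div that keeps its type as a class -->
  <xsl:template match="tei:div">
    <div class="{@type}"><xsl:apply-templates/></div>
  </xsl:template>

  <!-- The TEI head becomes an HTML heading -->
  <xsl:template match="tei:head">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <!-- Each line group becomes a paragraph -->
  <xsl:template match="tei:lg">
    <p class="stanza"><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Each verse line is followed by a line break -->
  <xsl:template match="tei:l">
    <span class="line"><xsl:apply-templates/></span><br/>
  </xsl:template>

</xsl:stylesheet>
```

Each template pairs an XPath pattern (the match attribute) with the HTML to write out when that pattern is found. The descriptive TEI names stay in the master file, and only the presentation changes.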


This textbook is introductory: you need only learn TEI encoding in order to use the DigEd system to publish your digital edition. In other words, you do NOT have to learn to code in HTML and CSS, nor to create XSL Transforms, for three reasons:

  1. The software that we will use for coding in TEI, namely oXygen, contains built-in XSLTs, and this textbook teaches you how to use them.
  2. This open access textbook is written for people who do not want to become web designers or programmers, offering CSS and XSLT files (already made) to use and/or tweak for your digital edition (the "DigEd system").
  3. The textbook comes with support from its author: when the code and programs in the DigEd system don't work as you and I expected, I will troubleshoot them with you.

That said, this textbook teaches XSLT programming as well as coding in TEI/XML, HTML, and CSS. A beginner might simply wish to use the codes and programs provided here without trying to understand how they work.10 An intermediate learner may wish to go through the tutorials so that they can learn to "read" the code and programming, and perhaps tweak it for their own needs, but not actually write any of it from scratch. But you can, based upon these tutorials, write in TEI / XML, HTML, CSS, XPath, and XSLT. And expertise can be achieved through practice as well as by making use of the many resources available on the Internet for learning to code and program, discussed in the Introductory Essay.



To summarize, the image below shows the full digital editing workflow: a TEI / XML master document is created which represents the physical documents that you are editing. The TEI master is then transformed into HTML, the web page displayed to viewers visiting your website, rendered aesthetically pleasing with the help of CSS.

the workflow for creating digital editions




1. For information about how this textbook and the DigEd publishing system work, see About. The digital editions created using this DigEd system and textbook are documentary editions insofar as we duplicate in code everything we see in front of us on the pages of the book or manuscript or letter. For more information, see the Association for Documentary Editing; Mary-Jo Kline and Linda Johanson, A Guide to Documentary Editing, 2nd ed. (Baltimore, MD: Johns Hopkins University Press, 1998); and Further Reading.

2. The introduction has been written in TEI; the TEI code is available by clicking on the TEI icon at the top right of the document, or here.

3. On the difficulty of "remembering" mediation of any kind, see Katherine Bode, "What's the Matter with Computational Literary Studies," Critical Inquiry 49.4 (Summer 2023): 507-529, p. 515.

4. Johanna Drucker, Graphesis: Visual Forms of Knowledge Production (Cambridge, Mass.: Harvard University Press, 2014); Yanni Alexander Loukissas, All Data Are Local: Thinking Critically in a Data-Driven Society (Cambridge, Mass.: MIT Press, 2019).

5. TEI was originally written in SGML (Standard Generalized Markup Language). The consortium decided to use XML for the release of P4 (Proposal 4).

6. In reality, it is impossible to separate description from presentation. See Alan Liu, "Transcendental Data: Toward a Cultural History and Aesthetics of the New Encoded Discourse," Critical Inquiry 31 (August 2004): pp. 49-84.

7. Nowadays browsers can handle PDFs and other document types as well, but they do so through browser extensions.

8. To see file extensions, a necessity for creating a digital edition, you will need to change the default settings in your Finder window, or, if on a PC, under "view" in File Explorer (see the instructions under the term "File Extensions"). To open an .html file on your computer, instead of going to a web page on the Internet, click on "File" in the top toolbar, next to the browser's name, and then select "Open File."

dropdown menu on browser for opening a file on your computer

9. I'm lying: this textbook will also show you how to: 1) add a search engine; 2) encode names and places in a way that optimizes the capacity of web search engines such as Google or Bing to find your documents; and 3) add the capacity for users to annotate your editions for themselves using hypothes.is, which you can see in action here.

10. In the TEI section of this textbook, I show you how to use LEAF-Writer, which makes encoding archival-quality digital editions even easier than does oXygen.