
Updating Textpattern and Solving a Rewrite Problem

Posted: Mar 29, 2017 18:03;
Last Modified: Mar 29, 2017 18:03

Tags: , , ,

---

Just updated my CMS (Textpattern) to the latest version (6.4.2). I had to because the University just updated the PHP on the server and this broke the old install.

Everything worked great except for one thing: the site loaded if I entered the full URL to the index page (i.e. http://people.uleth.ca/~daniel.odonnell/index.php), but not if I entered just the top-level directory (i.e. http://people.uleth.ca/~daniel.odonnell/). Links to other pages also didn’t work.

The error I got came from Zope, the server, and it said that it couldn’t find http://people.uleth.ca/People.

This looked like a rewrite error. After Googling around and experimenting, I found that the issue was in my .htaccess file, which is provided by Textpattern. Basically I did the following (the relevant lines are sketched after the list):

  1. uncomment #RewriteBase
  2. replace Path/To/Site (or similar) in the same line with the top-level directory for my site (i.e. what comes after people.uleth.ca; in my case, /~daniel.odonnell)
  3. save and reload
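
For the record, the relevant lines in my .htaccess now look something like this (the commented placeholder is roughly what Textpattern ships; the path is specific to my site):

# as shipped (commented out; the exact placeholder path may differ between versions):
# RewriteBase /relative/web/path/

# after the edit, uncommented and pointing at my site’s top-level directory:
RewriteBase /~daniel.odonnell
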
----  

Cædmon Citation Network - Week 5

Posted: Jun 17, 2016 10:06;
Last Modified: Jun 17, 2016 10:06

Tags: , , , , , , , , , , , , ,

---

Hi all!

Painfully short blog entry this week, I’m afraid. A lot has been accomplished this week, but there is not a lot to report.

The bibliography has been completed, with the final count being approximately 700 pieces of Cædmon scholarship. This number may increase or decrease as I read through the actual works. Some may have nothing to do with Cædmon (I erred on the side of having too much rather than too little), and others may point me in the direction of something I might have missed.

I have also begun to search out access to the pieces that make up the bibliography. This week I have been finding most things on JSTOR, but I expect that I will be requesting a lot of inter-library loans next week! Once I have found all I can find online I can start reading while I wait for the inter-library loans to come in. As the loans come in I will be splitting my days between scanning the loans and reading. (Note to self: locate that book scanner Dan told you about.)

The database to record what I find while I read is in the works as well. I should have an update from Garret early next week, so I will have more info on that in next week’s blog!

Until then!

Colleen

----  

Problems with Cisco Anyconnect on Ubuntu 14.04 (Breaks Internet Connections)

Posted: Jan 11, 2015 22:01;
Last Modified: Jan 11, 2015 22:01

Tags: , , , ,

---

This blog is about resolving an issue I had after installing Cisco Anyconnect, the U of L’s VPN client.

This is an aide memoire for me, but it might be useful to others. The information comes from a couple of online sources, the first of which was the most useful for this particular case.

Contents

The symptoms

The U of L uses Cisco Anyconnect as its VPN client. I installed it two days ago (stupidly, while travelling). This produced a problem where I couldn’t access the internet: I could connect to a wireless network (SSID), but couldn’t ping any sites, and none of my web browsers could resolve or connect to any hosts.

The diagnosis

The problem is that anyconnect rewrites /etc/resolv.conf.

The original /etc/resolv.conf is a link to /run/resolvconf/resolv.conf, which contains a local address nameserver (in my case 127.0.1.1; others report 127.0.0.1).

Anyconnect backs this file up (whew!) as /etc/resolv.conf.vpnbackup and replaces it with a new resolv.conf that contains a number of different nameservers in the uleth domain (i.e. 142....).

The solution

Things that don’t work

These are the things I tried that didn’t work (in the order I tried them).

What works

Because anyconnect backs things up, all you need to do is the following (a consolidated sketch of the commands follows the list):

  1. cd to /etc/
  2. check that the situation matches what I’m reporting (i.e. that there are two resolv.conf files, resolv.conf and resolv.conf.backupvpn or similar).
  3. rename the current resolv.conf: mv resolv.conf resolv.conf.CISCO
  4. rename the current resolv.conf.backupvpn (or similar): mv resolv.conf.backupvpn resolv.conf
  5. check that the (now) current resolv.conf is a link to /run/resolvconf/resolv.conf by running ls -l resolv.conf in /etc/ (if it is a link, the line will include an arrow showing what it is pointing at).
  6. check that the nameserver in resolv.conf is a local address (127...).
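
Put together, the fix is roughly the following (a sketch only: use whatever backup file name actually appears on your machine, and note that you will need sudo to rename files in /etc):

cd /etc/
ls -l resolv.conf*                          # confirm which files are actually there
sudo mv resolv.conf resolv.conf.CISCO       # set aside the file anyconnect wrote
sudo mv resolv.conf.backupvpn resolv.conf   # restore the backed-up original
ls -l resolv.conf                           # should be a link pointing at /run/resolvconf/resolv.conf
cat resolv.conf                             # the nameserver should be a local 127.x.x.x address
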
----  

Web browsers

Posted: Nov 30, 2014 16:11;
Last Modified: Nov 30, 2014 16:11

Tags: , , ,

---

Web browsers are (quite literally) the defining feature of the World Wide Web, which was invented when Tim Berners-Lee completed the first version of his browser (WorldWideWeb) on Christmas Day 1990. In other words, they are what makes the web the web.

For a variety of historical reasons, users tend to treat web browsers as utility-grade software—a part of the operating system they expect their devices to come with rather than a piece of software they choose to install and run. But more than one kind of browser exists, and there are differences between them. Sometimes one browser is better than another for certain tasks or sites. You should know what browser you are using, and you should make sure you have some alternates installed.

Contents

What browser am I using?

It is entirely possible that you don’t know what browser you are using to access the web (including this page). If you don’t, you can find out here: whatbrowser.org. This page will tell you which browser you are using, whether it is up-to-date and what other options (if any) exist for your operating system.

What browsers are available?

Browsers can be programs you have to start up on your own or built into other programs or applications. The number of choices available to you depends on the operating system you are using (i.e. Mac OS, Windows, Linux, Android, iOS, and so on). The most common full-service browsers are:

What does a web browser do?

A web browser is a program that allows you to retrieve, display, and traverse (i.e. navigate) information resources on the World Wide Web (see the Wikipedia article on this).

We think of this as a completely transparent activity, but if you think about it for a second, it is actually a pretty complex idea: a typical web page will have text, images, tables, and perhaps animations, videos, sound clips, and so on. If you were composing these things, you’d expect to use different software for each: a word processor like LibreOffice Writer or Word for your text files, a graphics program like GIMP or Photoshop for graphics, a video or sound editor like Audacity for your sound and video files; you wouldn’t expect to have a single “maker program” that allowed you to do everything all at the same time. And yet when you view a web page (really a complex set of text and links to different files and locations on the web) you expect them all to appear in the right place on the screen simultaneously in the way the web designer expected. (Although the very first web browser—Tim Berners-Lee’s WorldWideWeb—could display text and images together, the earliest widely available browsers—such as Lynx or Gopher—could not. One of the reasons Netscape Navigator proved so popular in the early days of the web was the fact that it could display images and text.)

Browsers do what they do through a number of rendering engines and interpreters. When you “visit” a web page (a metaphor we use for the act of downloading files via our browser), the browser reads through the main page at that location and follows links to the other resources it is supposed to include: text, images, fragments of text from other sites (i.e. the ads), and meta information about the text and resources it is to display. It then processes these through interpreters and rendering engines and displays them on your screen. It can also place small fragments of data (known as cookies) on your computer to help it remember things about you.

If browsers are free, how do they make money?

Given how complicated browsers are, it is kind of odd that we don’t have to pay for them. After all, we expect to pay for other kinds of software, including our operating systems and programs like Word or Photoshop. And even when we don’t expect to pay (as with open-source operating systems and programs like Linux, OpenOffice, or GIMP), we recognise that as a deliberate alternative business model.

The original business model for browsers assumed that people would pay for them: Netscape Navigator, for example, was initially meant to be sold to businesses. At the time, browsers for other markup languages (such as SGML) had to be purchased.

This business model failed to take off, however, as the major browser makers competed for market share (the idea, presumably, being to kill off the competition before coming back to charge for use). This led to the browser wars period, in which browsers competed against each other in terms of the features they offered (this was worse than it seems: they competed in part by trying to come up with “proprietary features”—i.e. features that only worked with their software. Had this continued to its natural conclusion, the web would have been divided into sites that worked exclusively with one browser or another).

Nowadays, most browsers are funded by donations (e.g. Mozilla, the non-profit organisation behind Firefox) or by selling default access to search engine companies (i.e. the search service your browser uses when you type text into the URL bar). Because the different search engines sell ads, and because most people don’t really think about what happens when they type text into their browsers, it is in the search companies’ interests to ensure that the default setting in your browser points you at them.

Why should I care what browser I am using? Why should I consider having more than one on my system?

Because they are so ubiquitous and so crucial to what we do with our computers, most people don’t give their choice of browser much thought. They use the one that is easiest to access on their computer—usually the one built by the OS manufacturer. This means Internet Explorer on Windows, Safari on Apple, and Chrome on Android and Chromebooks.

There is nothing wrong with this. But it is a good idea to have more than one on your system.

This is because different browsers have different rendering and interpretation engines and understand web pages in slightly different ways. Some browsers are more efficient than others (meaning web pages will load more quickly), some are more standards compliant (meaning they will work better with a variety of different sites), and some are better at rendering certain kinds of content than others. Some have more or better plugins. Having more than one browser on your system will give you some options if a website seems sluggish or looks funny. In some cases (thankfully quite rare nowadays) a site will be designed to work with the features of only one browser.

Which is the best?

Each browser has its own strengths and weaknesses. Internet Explorer is famously buggy and non-standards compliant, but many commercial applications are written specifically for it (my university’s financial software, for example, used to work exclusively in Internet Explorer). Firefox is often said to have the most add-ons and plug-ins (Zotero, for example, a citation manager I shall be recommending in a future post, runs best in Firefox). Chrome and Opera are both fast, good-looking, and easy-to-use browsers that use the same rendering engine. Safari, the default browser on Apple devices, is said to be weaker than the others in terms of its speed, accuracy, and the number of plugins and addons it supports.

Personally, I prefer using Firefox, Chrome, or Opera. My default browser is usually Firefox, but occasionally I get tired of it and use Chrome. I generally recommend that students stay away from Internet Explorer and Safari: in my experience they tend to be noticeably poorer than browsers that cannot rely for market share on being made by the same people who wrote your computer’s operating system. If you do any web programming at all, you should definitely avoid using Internet Explorer as your default, as it is often extremely buggy—but you should also check how your site appears in Explorer for the same reason.

Tips for getting the most out of my browser?

There are various small techniques for getting the most out of your browser, whichever one you use. Search for “[your browser name] tips” to find quick ways of closing all tabs but the one you are looking at, recovering recent history, organising bookmarks, and so on. Some of these can be worth learning.

Some useful background material is available at techmadeplain.com.

Other useful sites include SunSpider (a benchmark you can use to compare the speed of the web browsers on your computer) and “What browser” (http://whatbrowser.org/), which will tell you what you are using, whether it is up-to-date, and what other options are available.

----  

Essential computer tools and skills for humanities students

Posted: Nov 30, 2014 15:11;
Last Modified: Dec 27, 2014 22:12

Tags: , , , , , ,

---

The Digital Humanities is a hot new field within the Arts. Its practitioners are often at the forefront of developing new topics within ICT itself.

But what if you are not interested in the Digital Humanities? Or are interested in them, but don’t consider yourself particularly computer literate? What are the computer skills you need to thrive in the traditional humanities or get started in DH?

This is the first in what I hope will be a series of tutorials on basic computer skills and tools for students of the Humanities. It should be of use to those just beginning their undergraduate careers, to graduate students hoping to professionalise their research and study, and to researchers and teachers who have other things to do than follow the latest trends and software.

Contents

What kind of thing can I learn from this series?

The focus of this series is going to be on basic tools. It is going to assume you know nothing other than how to turn on a computer and get on the internet. It will make some recommendations about basic software, starting with such simple things as browsers. It will also cover some basic techniques: how to use styles in word processors, how to use a citation manager or spreadsheet.

How often will they appear?

I’m going to mark this as a special cluster in my blog (using a special tag, basic computer skills). But I’ll publish them irregularly, as the mood strikes and I have the time. I’m also hoping to get some guest authors involved, mostly students who have done presentations on these things in my classes.

What if I have an idea for a tutorial? What if I disagree with you?

If you have an idea for an article in this series, I’d love to hear from you. If you have already written something on a topic I’m covering and would like me to know about it or link to you, please let me know as well!

Articles in this series

The following are links to the other articles in this series. You can also find them using the tag basic computer skills

----  

Code for table of contents in Textpattern

Posted: Nov 01, 2014 15:11;
Last Modified: Dec 27, 2014 15:12

Tags: , , ,

---

Use the following to put a table of contents in a Textpattern page.

<div id="TOC">
   <txp:soo_toc label="Contents" labeltag="h3"/>
</div>

The code will build a TOC entry for every header that has an ID. An example would look like this: h3(#thisIsTheID).

----  

Managing class webpages and mailing lists at the University of Lethbridge

Posted: Aug 26, 2014 11:08;
Last Modified: Sep 16, 2015 12:09

Tags: , , , , ,

---

For years, every class at the University of Lethbridge has been given webspace and a mailing list. They now also get a Moodle space. While the mailing list and Moodle space are well known to instructors (the list is the “XXXXNNNNx@uleth.ca” address that you use to make announcements to the class as a whole), the webspace is far less well known. This document (mostly a reminder to myself) shows you how you can use online tools to manage these resources.

Contents

If you do nothing

The first thing is to realise what happens if you do nothing. A student who has found your course online through the registrar’s office and wants to know more about your section goes through the following depressing sequence:

Default Sequence of Class Websites

The thing to realise is that this is bad for everybody. It tells the student nothing, meaning they might decide not to take your course (and even if they do, poor websites leave a bad impression). But if they persist, it is going to mean more work for you: the only thing they can do to find out what they were looking for is back up one page, then follow the links for the instructor until they find your email address, and then send you an email asking about something you could easily have posted online.

So it is a good idea to get in the habit of fixing this space… even (and perhaps especially) if you have a class webspace elsewhere on the internet. This is a first port of call for many students. You can easily make it a helpful one.

Login

To manage your classes, you first need to log in to the classes.uleth.ca admin page: classes.uleth.ca/ClassAdmin

There you will see the following login page:

Class Administration Portal

A successful login will take you to a splash page which, apparently, shows you the current (or most recent) and upcoming semesters:

Class Administration Splash

It is from this page that you will manage your mailing lists and class webpages.

Managing your class webpage

The first thing to do is manage your class webpage.

You have three options here:

  1. delegate it to somebody else on campus (a student, the department administrator, etc.)
  2. redirect it to some other URL (e.g. an off-campus blog or your on-campus personal space, people.uleth.ca/~$USERNAME)
  3. default to the current page (in which case you will add something to the current space)

Class Administration Web Page Management

Upload pages to your default webspace

The most difficult is the third option. This will require you to upload individual HTML pages to the space for this one class—and do it again year after year. If you want to post a PDF there, then you have to upload at least two files (and maintain them by hand): an HTML page explaining something about the site and containing a link to the PDF, and the PDF itself. This is very 1995 and so not something you want to get started on.

You so don’t want to do this, that I’m not even going to say how. If you really want to, call 2490 and ask IT for help. But seriously, you don’t want to do this.

Delegate to somebody else

This is really easy: you simply enter the uleth.ca username of the person you want to maintain the site (i.e. the bit before the @ in a uleth.ca email address). When you click save, this person can now manage your site for you.

This is just punting the problem, of course: the big difference is that now your delegate has to decide whether to upload a single page (which they probably still shouldn’t do, even if it is no longer your problem) or redirect somewhere else.

Redirect to another webspace

This is probably the best option: point the class space to somewhere else where it is easier to manage things. This could be an external blog that you use to manage your teaching (e.g. at wordpress.com or some other blog site), your personal uleth webspace (i.e. at people.uleth.ca/~$USERNAME), or even your class Moodle or Turnitin site.

Mailing list management

You can also manage your mailing list from here. You can change the posting permissions and the membership.

Class Administration Mailing List Page

Posting permissions

Your options here are

  1. Anybody on the entire internet can post to your class mailing list
  2. Anybody who subscribes to your class mailing list (normally the instructor(s), T.A.s, and all registered students) can post to the list
  3. Only Instructors can post to the list

The first option is an invitation to spammers and should only be used under very special circumstances—so special in fact that I can’t think of any.

The second option is the default option and it works well for most.

The third option makes sense if you have trouble with students misbehaving on the list (e.g. sending spam or unauthorised messages) or if you want to deemphasise the list in favour of some other communication platform (e.g. the blog and forum capabilities in Moodle). If you select this, then the list becomes a one-way channel, useful for announcements for which you don’t want any feedback.

Subscription options

This is the important set of options. You can use this to add people to the default subscription list for your class (i.e. the teacher(s), T.A.(s), and registered students).

You have two options here:

  1. add additional teachers
  2. add additional students

The first option adds subscribers to the list who will have “teacher” privileges. This is only meaningful if you have set the posting privileges above to “teachers only.” Under those circumstances, any email addresses you add here will still be able to post. You might want to use this to add additional T.A.s (perhaps unofficial ones) or guest speakers to the list.

The second option is the one you are likely to use more often. This is where you can add additional, unregistered students (e.g. friends, members of the community, etc.).

If you keep the default permissions (i.e. that anybody subscribed to the list can post), then it actually doesn’t matter to which category you add people. The important thing is simply that you can add people to this tool.

Adding TAs to Moodle

Another task you may need to do early on in the semester is adding TAs to Moodle. The instructions for doing that are here.

In short, however, the method is as follows:

  1. Go to the Moodle space for the class you want to add a TA to (i.e. log in to Moodle and select the class you want for your TA).
  2. Once you are inside the class, click on “Users” in the “Settings” block. On the University of Lethbridge’s default installation, this block is on the left hand side, bottom (in the default view) or second from the bottom (if editing is on).
  3. Clicking on “Users” expands the menu item. Under “Users” you will see “Enrolled Users.” Choose that.
  4. On the “Enrolled Users” dialogue screen, you will see a small button, “Enroll User” at the top of the form on the right hand side. Click that.
  5. In the dialogue that appears, select the type of user you are trying to enrol (in this case, that means Basic TA or Advanced TA) then using the search form, look for your TA’s name (they must be in the U of L’s system).
  6. After you click “search,” all users matching your search term will show up in the window. Find your TA and click on the “Enroll” button to the right of their name.
  7. Repeat the previous two steps for each TA you want to add.

When should you do this?

The best time to do this is just before the registration period opens for next semester. This is when students are going through the registrar site, looking for classes and the time when an appropriate redirect will have the maximum benefit.

----  

Two tips that will improve the lives of all students and researchers in the Humanities and Social Sciences

Posted: Aug 16, 2014 13:08;
Last Modified: Aug 16, 2014 13:08

Tags: , , , , , ,

---

Introduction

A recent question on LinkedIn asked how important the formatting guides for journals are in preparing submissions.

Although this question was about submitting to journals, its context is relevant to all students and researchers in the Social Sciences and Humanities (although the problem also exists in the sciences, the solutions there are in some cases different). Humanities and Social Science study in University is largely about the collection of bibliography and the presentation of findings in written form. And that invariably involves questions of formatting: different disciplines and even different journals (or for students, instructors) within a discipline can require work to be submitted in quite different styles.

Contents

The bad old days

Twenty-five years ago, when I was an undergraduate, keeping track of and implementing these styles was a major problem: most students still typed their essays (some still even wrote them out by hand); and what wordprocessors there were were quite primitive (the first popular version of Windows, for example, and with it, the first really successful version of Word, came out the year after I went to Graduate School). Public library catalogues were still largely on paper or microfiche, and, most crucially, there was no World Wide Web.

In those days, ensuring your essay or article submission followed the correct format was a very time-consuming task. And things got a lot worse if you needed to reformat something for submission elsewhere—e.g. sending a rejected article to a different journal or reformatting an essay for use as a dissertation chapter or for journal submission. Moreover, authors needed to know a number of different citation formats: APA, MLA, Harvard, and so on. There were few if any tools to help you automate this task (or if there were, I didn’t know about them). The only way of doing it accurately was to consult the relevant style guide and look for examples of the type of work you wanted to format (e.g. single-author monographs, chapters in edited collections, etc.).

Modern tools and practices

Things are a lot different today. Wordprocessors are much better and (free) tools exist to take care of your bibliographic management. If you still find yourself stressed by formatting tasks, it means you are doing things wrong.

The next two posts will explain two basic practices and tools that anybody who works or studies in the Social Sciences and Humanities should know about—and use if they would prefer to spend their time researching and writing rather than formatting their work.

The first post, on using wordprocessor styles, addresses the issue of formatting text for submission to instructors or journals for publication. It shows you how you can use the “style” function found in all popular contemporary wordprocessors to ensure consistency across the entire document: make sure all your headings are formatted the same way (or, if you have different levels of headings, that the headings at each level are formatted the same way); make sure that all block quotations have the same margins; that all paragraphs have the same first line indentations. And then change all of these across the entire document automatically and in seconds, if you discover that a particular instructor or journal wants things formatted differently.

The second post, on using citation managers, addresses the more specialised issue of collecting bibliography and formatting citations correctly. It shows you how a citation manager can take over this task almost entirely. Modern citation managers allow you to collect bibliography directly and automatically from library catalogues, many journal articles, and sites like Amazon.com (some even allow you to add books by photographing the barcodes on their dust jackets). They then integrate with your wordprocessor to allow you to add citations as you write: you can go through the material you have collected looking for the items you want and then, once you have found them, insert the citation into the text and bibliography of your essay at the click of a button. As with wordprocessor styles, moreover, citation managers also automatically handle the tediously detailed work of making sure your bibliographic entries conform to the format demanded by your instructor or journal: with a click of a button, you can change from APA to MLA to Chicago, or even, in some cases, design your own format.

Why you should care

Too many students and even professional researchers in the humanities and social sciences waste time doing unnecessary formatting tasks. By using wordprocessing styles and citation managers, it is possible to reduce the amount of effort these basic tasks require to almost nothing. If you are an advanced student or researcher, adopting these approaches will improve your efficiency several fold as soon as you can get used to the new way of working. This is especially true if you are working on a book or thesis that will involve maintaining consistency of format and citation style across a number of different chapters (in fact, if you are in that situation, I’d recommend stopping what you are doing and taking a couple of days to implement these right now—they improve things that much).

If you are just beginning your time as a university student, I recommend adopting them even more strongly: now is the time to get used to good habits that will save you time further down the road and you might be surprised how often you end up reusing citations and bibliography you acquire this year for the first time.

----  

How to do a table of contents in text pattern

Posted: Feb 19, 2014 15:02;
Last Modified: Feb 19, 2014 16:02

Tags: , , ,

---

My teaching pages are served out using Textpattern, a relatively light CMS that uses Textile, a wiki-like markup language.

Because adding an excerpt by hand wrecks the syndication of this site through Wordpress to my other blog, I don’t usually add a text summary. Instead, I do something similar to Wikipedia or Wordpress: I begin articles with an abstract-like first paragraph, then include a table of contents, then have the rest of the body.

I used to make up these tables of content by hand, cursing all the time that Textile wasn’t XML. Then I discovered soo_toc, a Textpattern plugin that builds tables of contents dynamically. Joy!

Of course, now I need to remember to add the template that calls the TOC to each page (as I type this, I wonder if there might not be a simple variable I could develop that does this, but that’s for later). In the interests of remembering, here it is:

<div id="TOC">
   <txp:soo_toc label="Contents" labeltag="h3"/>
</div>

The only downside of this plugin is that you need to have IDs attached to every header you want to show up in the TOC. You add an ID in Textile like this: h2(#IDREF). If you are in a rush, you can always just use (#A), (#B), (#C) (remember, IDs can’t start with a number in XML).

----  

Mounting University of Lethbridge "P," "R," and "W" drives under Linux

Posted: Feb 19, 2014 14:02;
Last Modified: Jul 20, 2016 16:07

Tags: , , , , , , , , ,

---

Here’s how to mount “P” (Personal), “R” (shared research), “W” (web), and department/committee drives at the University of Lethbridge.

Contents

“P” drives

Your “P” drive is the windows share that represents your standard network desktop (i.e. the thing you see if you log into a classroom or other computer on campus).

The address is ulhome.uleth.ca/$USER, where $USER is your account username (the same as the lefthand side of your uleth email address; in my case, daniel.odonnell).

ulhome is a CIFS drive. To mount it, you seem to have to use the commandline (I can’t find the right protocol to use with the GUI that comes with the file navigator in Ubuntu). I found instructions that worked for me here: http://www.tonido.com/support/display/cloud/Howto+properly+mount+a+CIFS+share+on+Linux+for+FileCloud

And, to solve the permissions problem that first arose, http://ubuntuforums.org/showthread.php?t=1409720

One-time mount

To mount the drive by hand for a single session, do the following (a filled-in example follows the list):

  1. Make sure cifs-utils is installed
  2. Choose a mount point. This can be an existing directory (if the directory has local content, it will not be available while the network drive is mounted). Or you can create a custom mount point. I did the latter: mkdir ~/ulhome
  3. Mount the remote drive. sudo mount -t cifs -o username=$USER,password=$PWORD,rw,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777 //$REMOTEURL $MOUNTPOINT (where $USER = username; $PWORD = password; $REMOTEURL = url of CIFS drive; and $MOUNTPOINT = the directory you chose or created in step 2. Note: your IT department may not give you the full remote URL, since Windows can use the first part of the subdomain; at the U of L, for example, IT tell you the share is called \\ULHOME. I guessed it is probably in the University’s main domain and was correct: \\ULHOME is the same as //ulhome.uleth.ca/)
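
To make that concrete, here is roughly what the command looked like for my P drive (the username and mount point are illustrative; substitute your own, and note that if you leave the password= option out, mount.cifs will prompt for it):

sudo mount -t cifs -o username=daniel.odonnell,rw,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777 //ulhome.uleth.ca/daniel.odonnell ~/ulhome
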

Automount

To permanently mount the drive you need to create a password file and use that in /etc/fstab:

1. Create a file /root/.smbcredentials with the following content:

username=$USER
password=$PWORD

2. Change the permissions so that only root can read the file:

sudo chmod 700 /root/.smbcredentials

3. Now add the following line to the /etc/fstab file:

//$REMOTEURL $MOUNTPOINT cifs default,uid=1000,gid=1000,credentials=/root/.smbcredentials,rw,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777 0 0

NB: the default,uid=1000,gid=1000 part comes from http://ubuntuforums.org/showthread.php?t=1409720; it addresses the problem of the share mounting read-only (RO). When I tried this out, I then had to go into the directory on my local machine and manually change the permissions from access-only to read and create.

4. Test whether the line added to the fstab file works:

# sudo mount -a

5. The remote share should now be mounted at the mount point you chose (in my case, ~/ulhome).

Your “R” drive

The “R” or research drive is a shared drive you can use for collaborative research projects. It is found at uleth.ca/research/$DRIVENAME where $DRIVENAME is the name IT gives the space (e.g. genee_students).

You access this using smb (Microsoft’s workgroup protocol)

  1. In Nautilus, choose “Connect to Server”
  2. In the dialogue that pops up enter the network name, prefixed by the smb protocol (smb://uleth.ca/research/$DRIVENAME).
  3. In the authentication dialogue, your username is your (full) uleth email address; password is the same as your uleth network password.
  4. That’s it.

Your “W” drive

The “W” or public drive (the drive your web files are on) is found at files.uleth.ca. This can be ssh’d into and so is a lot easier to use.

ssh $USER@files.uleth.ca
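
Because this is ordinary ssh, you can also copy files up with scp. A hypothetical example (the remote path will depend on how your web space is laid out, so adjust it to suit):

scp index.html $USER@files.uleth.ca:
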

Department and committee drives

A third kind of drive is the department or committee drive. These are often made by IT with spaces in the name (grrr). An example might be: cifs://uldept.uleth.ca/ResearchServices/BoGRC Committee

There are different ways of handling spaces in file names depending on how you are mounting things. For one-off mounting from the commandline, several of the normal escapes (e.g. \ , \040, and %20) don’t seem to work. What does seem to work is putting the whole directory name containing the space in quotation marks. So in the above example, the following works (where $USER is your uleth username and BOG is the local mount point):

sudo mount.cifs //uldept.uleth.ca/ResearchServices/"BoGRC Committee" BOG -o username=$USER

----  

Bibliophilia: Why books don't mean what they used to

Posted: Dec 25, 2013 14:12;
Last Modified: Jan 25, 2014 14:01

Tags: , , , , , , , ,

---

My wife, Inge Genee, and I have moved house nine times in our life together.

In most cases, this involved a move across water: New Haven to Amsterdam, Amsterdam to Baton Rouge, Baton Rouge back to Amsterdam, a summer in Toronto, Amsterdam to York, York to Lethbridge, and twice within Lethbridge.

Especially during our transatlantic moves, we became very adept at minimal packing. Nobody was paying for our moves and, since we had no money, we moved each time with what we could bring within our airline baggage allowance: 32 kilos per bag, two bags per person (most of our moves were in the 1990s, when this was standard across all airlines).

We had a basic procedure we followed, especially on moves from Amsterdam to North America (we had a long-term lease on an apartment in Amsterdam and so, apart from a few things I had in storage in Toronto, we left our bulky things there): we’d fill our four suitcases with 128 kilos of books, throw some absolutely necessary clothes into our carry-on luggage, and head off to the airport. When we got to the other end, we’d rent a furnished apartment and then head to the local Walmart or similar type of store to buy any additional things we needed.

In other words, our most important possessions were our books. We were both graduate students at the time and over long careers as students and bibliophiles, we had both built up significant collections. These included (since we were both working with medieval languages) many of the most important dictionaries, manuscript handbooks, and other reference works we needed for our dissertation research. Although we usually had library access in our new location, we could never know exactly what books would be available to us at the other end (it was difficult to check library catalogues from overseas in those days), or precisely what privileges we would have. Even if we could access the same dictionaries, grammars, and atlases in the local university library, moreover, they would usually be non-circulating. A personal second copy meant we could work at home as well.

A few years ago, the issue of our libraries came up again.

We had refinanced our house and were planning a major renovation of our kitchen and toilets—all the plumbing in the house, in fact—as well as some structural work. Because you can’t live in a house with no water, we had to move out for the duration of the renovations. But because the work was going to be restricted to only a few rooms and our basement, we did not have to move all our stuff as well. We could move our furniture and books into a couple of rooms in which the contractor would not be working, seal them up against the dust with some plastic and tape, and rent the furnished house of a colleague who was heading off on sabbatical.

This time, we didn’t really think we needed to move our books, or anything else. We’d both been working at the University of Lethbridge for some time and had established offices with many of the books we needed for our day-to-day work located there. And if we discovered after we moved that we needed anything at all from our place (books, furniture, or pots and pans), we could always come back to the house and get them.

At the very last minute, however, this changed. The day we were to hand the house over to the contractor, we got a call from our insurance agent, who had been double-checking our policy and discovered that our house would be classified as a “construction site” while it was uninhabited for the renovations. This meant that our insurance would cover the building’s fabric—its walls, floors, roof, etc.—but not its contents. Anything we left inside the building would be, in effect, uninsured. And anything we wanted our insurance to cover had to come with us or be put in offsite storage.

This threw off all our assumptions. We’d thought we were just borrowing a house for the summer so we’d have somewhere to stay. Now it looked like we were going to have to arrange a move of everything—and since the call from the insurance agent came about 4pm, we didn’t have a lot of time to do it. We’d need to find and rent a moving truck and then figure out what to do with the things we were removing: rent some storage space or figure out a way of cramming them into our friends’ already furnished house.

Our insurance agent said she would look around to see if she could find us a different policy. In the meantime she advised us to prioritise: that is, to decide which things we could absolutely not leave uninsured and which things we could afford to risk, if only while we arranged storage or otherwise found a way to get them safely out of the house.

Inge and I sat down to decide what we would take that night and what we would leave behind. Initially, we figured the choice was easy. We’d moved eight times before with only our reference books. Presumably if we were going to do it again, the books were the things we had to take—especially since, after more than a decade working as faculty members at the University of Lethbridge, we had expanded our already large libraries considerably.

As we thought more about it, however, we began to think that we should actually be thinking more about the furniture. Although both of us had large libraries, there were no longer that many books that were absolutely essential for our work. I have a manuscript catalogue that I would take with me anywhere (it contains notes on manuscripts I have personally examined and so is a unique repository), and there were a few other books that had similar additional value. But, we began to realise, it was simply no longer the case that our books were irreplaceable. It would be terrible to lose them and many had great sentimental value. But if the worst thing happened and our libraries were lost, it would now be much easier—and in some ways less essential—to rebuild them.

This was not primarily because much of the information they contained could now be found on the internet—although this was definitely a factor. Rather, it was because we realised that everything was simply much easier to replace now than it had been 20 years ago.

When we had started building our libraries in the 1980s and 1990s, books were hard to find. A large collection of (relatively) obscure grammars and dictionaries such as we had each built said something about you as a person: that you knew where to go to find things; that you knew what the essential books of your discipline were; that you had the patience and experience to keep looking until you found them; that you had bought them to ensure you had the right resources at your fingertips. A large scholarly library, in other words, was an index of your learnedness: good scholars had large libraries because they had the disciplinary skill and patience to build one up and a practical need of access to the volumes they contained. You could not build a large disciplinary library quickly. If you had one, it was evidence that you were a scholar; and if you lost it, it was a life-changing setback.

What we realised when the insurance company told us we had to decide what we needed to take with us was that this was no longer true. A large library was no longer necessarily evidence of great disciplinary effort. Now that nearly half of all books are sold online, finding rare books is simply a matter of having good search skills (interestingly, our upstairs neighbour in Amsterdam in the early 1990s was something of a pioneer in this regard: he ran an email-only “online” medical antiquarian bookshop). Had our libraries been destroyed now, it would have been very unpleasant; but it would not have been impossible to build the most essential parts of our collections back up relatively quickly. What 20 years ago would have been a setback of catastrophic proportions would today be an unpleasant and expensive inconvenience. Or, in other words: as much as we love our libraries, they are simply no longer the irreplaceable foundation of our academic careers.

This, I think, is by far the biggest change in our information world. Twenty years ago, when I read of an antiquarian bookstore closing, I saw the loss as a loss of access to books I might not otherwise be able to find. Now, when I read of the same thing, I see it simply as the loss of a fun place to hang out. Places that were once essential to my work are now really just fun places to visit.

----  

MySQL cheatsheet

Posted: Aug 27, 2012 13:08;
Last Modified: Aug 27, 2012 14:08

Tags: , , , , , ,

---

From http://www.patrickpatoray.com/?Page=30

MySQL Dump/Restore

Dump ALL MySQL Databases

mysqldump --user=XXXXXXXX --password=XXXXXXX -A > /PATH/TO/DUMPFILE.SQL

Dump Individual or Multiple MySQL Databases

mysqldump --user=XXXXXXXX --password=XXXXXXX --databases DB_NAME1 DB_NAME2 DB_NAME3 > /PATH/TO/DUMPFILE.SQL

Dump only certain tables from a MySQL Database

mysqldump --user=XXXXXXXX --password=XXXXXXXX --databases DB_NAME --tables TABLE_NAME > /PATH/TO/DUMPFILE.SQL

Restoring a database

Use the following procedure to reload the contents of a database (a filled-in example follows):

  1. Unzip the backup file you wish to use.
  2. Open it up and pull out only the information that you will need.
  3. Save this text file.
  4. Use the following command to feed the contents of the text file back in:

mysql --verbose --user=XXXXXXXX --password=XXXXXXXX DB_NAME < /PATH/TO/DUMPFILE.SQL
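
For instance, a filled-in (entirely hypothetical) restore of a single database might look like this; leaving the value off --password makes mysql prompt for it:

mysql --verbose --user=admin --password workflow < /backups/workflow-2012-08-26.sql
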
----  

Yii Basic Steps

Posted: Apr 06, 2012 16:04;
Last Modified: May 23, 2012 18:05

Tags: , , , , ,

---

This is just a reminder to myself about setting up a Yii install. There are much more detailed examples on the web.

Build database

Build a database (I use MySQL with the InnoDB engine). The InnoDB engine is important because it allows you to record foreign key associations.

Copy yii to top level of webdirectory

If you haven’t already done so, get a copy of the latest version of yii, uncompress it, and install it in a webdirectory immediately below the web root.

In my case, I tend to put it under /var/www/ and call the directory yii.

Use yiic to build initial site

Navigate to the root of your web directory. Assuming yii is an immediate child of this directory, use the following to build the initial skeleton of your website, where $site is the directory name you want to use for your site and $path is the full path to the current directory:

yii/framework/yiic webapp $site

The response should be: Your application has been created successfully under $path/$site

Prepare for building the scaffolding

change directory to $path/$site/protected/config/

edit main.php (a sketch of the relevant stanzas follows the list):

  1. remove the comments surrounding the section on the gii tool (you’ll want to put these back in production)
  2. make sure you put a password in
  3. (for MySQL) remove the comments surrounding the MySQL stanza and modify/supply the required information (usernames, passwords, servers, etc.)
  4. fix the admin email.
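
For reference, the stanzas in question look roughly like this once edited (the values are placeholders and the exact skeleton varies a little between Yii releases):

// in $path/$site/protected/config/main.php
'modules'=>array(
	'gii'=>array(
		'class'=>'system.gii.GiiModule',
		'password'=>'pick-a-gii-password',   // step 2: set a password
	),
),
// ...
'components'=>array(
	'db'=>array(
		'connectionString' => 'mysql:host=localhost;dbname=mydatabase',   // step 3: the MySQL stanza
		'emulatePrepare' => true,
		'username' => 'dbuser',
		'password' => 'dbpassword',
		'charset' => 'utf8',
	),
	// ...
),
// ...
'params'=>array(
	'adminEmail'=>'you@example.com',   // step 4: fix the admin email
),
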

Use gii to establish the basic scaffolding

  1. point your webbrowser at the site (e.g. http://localhost/$site/ or http://example.com/$site/)
  2. click on the “home” link in the menubar to get the top URL (it will be something like this: http://$domain/$site/index.php?r=site/index)
  3. in the location bar, replace site/index with gii
  4. enter the database root password

Develop models

Most if not all tables will have a model in the system.

  1. Select “Model Generator”.
  2. enter * in the “Table name” field to generate models for all tables; otherwise type the table name in the box for a specific table.
  3. click on preview.
  4. if everything is good, click on generate (you need to make sure file permissions allow generation).

Develop CRUD

  1. select “crud generator”
  2. enter Model names for the models for which you want to have specific forms and actions (not all tables will require this: some will be written automatically or through relations)
----  

Adding an attribute for title to a yii widget

Posted: Feb 24, 2012 15:02;
Last Modified: May 23, 2012 18:05

Tags: , , , , ,

---

The Yii file view.php by default uses the zii.widgets.CDetailView widget to display the attributes of a given model. In the standard scaffolding produced by the gii utility, this widget consists of references to attributes of the model without any further information (e.g. attribute names and the like):

<?php $this->widget('zii.widgets.CDetailView', array(
	'data'=>$model,
	'attributes'=>array(
		'editorialInstance_id',
		'journal.shortTitle', // a reference to a relational attribute
		'type',
		'LastNamesFirstName', // a reference to a compound attribute
	),
)); ?>

In this minimalist form, Yii will calculate an appropriate label for the attribute on the basis of the attribute name: so, for example, in this case, editorialInstance_id will appear in the view labelled “Editorial Instance” because Yii understands camelCase naming conventions and knows to strip off _id (it’s that good!).

A problem with this, however, is that we also provide customised label names as part of the attributeLabels() method in our model class. Since that method allows arbitrary names, and since CDetailView attempts to calculate labels on the basis of the attribute name, it is highly likely that the labels for different attributes will get out of synch in different places in your site. To give an example: in this particular case, the model for editorialInstance might have defined the label for editorialInstance_id as “ID” rather than “Editorial Instance”; since CDetailView doesn’t check what you had in attributeLabels() in the model class, switching from an edit view to an index will mean that the label of the attribute switches.

Ideally what you want to do is keep CDetailView and attributeLabels() in synch automatically. I’m sure there must be a way of doing this. In the meantime, however, here’s how you can set an arbitrary label in the widget (I’ve used it on the first and last):

<?php $this->widget('zii.widgets.CDetailView', array(
	'data'=>$model,
	'attributes'=>array(
		array('value' => $model->editorialInstance_id, 'label' => 'ID'),
		'journal.shortTitle',
		'type',
		array('value' => $model->person->lastNameFirstNames, 'label' => 'Person'),
	),
)); ?>
----  

Understanding Relation Models in Yii

Posted: Feb 24, 2012 13:02;
Last Modified: May 23, 2012 18:05

Tags: , , , , , ,

---

The core of any database-driven website is its ability to handle table relations (if that sentence didn’t mean anything to you, you should first do some reading about relational databases, database design, and normalising data: an introduction aimed at textual editors can be found in my article “What digital editors can learn from print editorial practice,” Literary and Linguistic Computing 24 (2009): 113-125).

One of the really useful things about the Yii MVC framework is the extent to which it allows you to systematise and automate the process of establishing these relations.

The relations() method

The most important part of this system is in the Yii model classes. When you first scaffold a new website in Yii (see the Yii website for the extremely easy to implement details of how this is done), the gii utility will build a series of standard model classes, each of which corresponds to a table in your database. A core method, included in every one of these models by default, is relations() (note: the following is how an empty relation method looks):

/**
	 * @return array relational rules.
	 */
	public function relations()
	{
		// NOTE: you may need to adjust the relation name and the related
		// class name for the relations automatically generated below.
		return array(
                );
        }

To indicate that the table represented by this model is related to other tables in your database, you construct a number of relation key => value pairs using a series of pre-defined terms (see these sections of the Yii Blog Tutorial and of the Yii documentation for details).

You can do this quite easily by hand. But if your database is designed using an engine that supports explicit information about relations (such as MySQL’s InnoDB engine, but not the default MyISAM), Yii’s scaffolding utility gii will do much of the work of populating this method automatically.

Here’s an example of a set of relations, built for a table in one of my databases (the model is called Journal and describes a table containing information about journals in a publishing workflow):

public function relations()
	{
		// NOTE: you may need to adjust the relation name and the related
		// class name for the relations automatically generated below.
		return array(
			'articles' => array(self::HAS_MANY, 'Article', 'journal_id'),
			'editorialInstances' => array(self::HAS_MANY, 'EditorialInstance', 'journal_id'),
		);
}

In human terms, this is what the method is indicating:

  1. the journal table is directly related to two other tables in my database: article and editorialInstance (in my database, tables are named using camelCase starting with an initial lowercase letter; Yii’s naming convention is that Model Classes [i.e. the models that describe tables] begin with a capital letter: so Article is the model for the database table article).
  2. the relationship between journal and these two tables is
    1. parent to child (journal HAS article and editorialInstance)
    2. one to many (journal HAS_MANY article and editorialInstance)
  3. the key names in the relations array, articles and editorialInstances (note the plural s), are themselves arrays of all the possible values in these child tables
  4. both the child tables contain journal_id as a foreign key (FK)

How the relations() method makes your life easier

The great thing about this relations() method is that it turns relations into attributes of the model itself. That is to say, attributes of the related tables can be accessed directly from the model in which they are declared.

This is easiest to see with the BELONGS_TO (many-to-one child-to-parent) relation, which isn’t instanced above. Here’s an example from the EditorialInstance model, however: i.e. one of the children of Journal in my database:

public function relations()
	{
		// NOTE: you may need to adjust the relation name and the related
		// class name for the relations automatically generated below.
		return array(
			'journal' => array(self::BELONGS_TO, 'Journal', 'journal_id'),
			'person' => array(self::BELONGS_TO, 'Person', 'person_id'),
		);
	}

In this case, you can see that EditorialInstance is the child of two tables (that is to say, it BELONGS_TO them).

When a BELONGS_TO relationship is declared in a model, the attributes of the parent table are treated exactly like the attributes of the child table in the declaring model. For example, say the editorialInstance table has an attribute called type that we reference in a view as $data->type; we can also access all the attributes of the parent tables through this same model using the same syntax, so the lastName attribute on person would be referenced in this same context as $data->person->lastName.
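
A minimal sketch of what this looks like in practice, using the models above (the attribute names are the ones from my own schema):

<?php
// $model is an EditorialInstance record loaded in a controller action
echo $model->type;                  // an attribute of the editorialInstance table itself
echo $model->journal->shortTitle;   // an attribute of the parent journal, reached through the relation
echo $model->person->lastName;      // an attribute of the parent person, reached through the relation
?>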

Relational queries

----  

Creating compound attributes in Yii

Posted: Feb 24, 2012 12:02;
Last Modified: May 23, 2012 18:05

Tags: , , , , , ,

---

Let’s say you have a database table called persons with separate attributes (fields) for lastName and firstNames. Elsewhere in your website, you want to refer to the underlying record in this table using the person’s whole name as a single entity, e.g. to provide a link: <a href="http://example.com/index.php?r=Person/view&id=1">Jane Q. Public</a>.

Sometimes, you might be able to refer to the two attributes separately. For example, if you simply wanted to echo the content in a view somewhere you could use code like this:

<?php echo CHtml::encode($data->person->firstNames) . ' ' . CHtml::encode($data->person->lastName); ?>

This is a little inefficient if you are doing it a lot throughout your site, because you need to keep re-entering the same code correctly over and over again (and avoiding this is a main reason for going with an MVC framework like Yii in the first place). But the real trouble comes if you want to use the attributes in a context where the method you are invoking expects a single attribute—as is the case, for example, with the yii linking method CHtml::link.

The way round this is to create a new compound attribute in your model. This means taking the two underlying attributes in your table and combining them into a single attribute, already arranged the way you want, that you can then invoke in other methods. Or in this case, adding something like this to the Person model:

public function getFirstNamesLastName()
    {
        return $this->firstNames . ' ' . $this->lastName;
    }

Once this function is declared, you can invoke it, in exactly the same way you would have invoked the “original” attributes lastName or firstNames, using the attribute name FirstNamesLastName: e.g.

<?php echo 
CHtml::link(CHtml::encode($data->person->FirstNamesLastName),
array('person/view', 'id'=>$data->person->person_id)); ?>

(Note: there is a much more efficient way of encoding this last example using the same principle. See this post).

----  

Yii Ensuring that key terms are always linked

Posted: Feb 24, 2012 11:02;
Last Modified: May 23, 2012 18:05

Tags: , , , , , ,

---

As we build our workflow manager, we are discovering that the interface is more intuitive if some terms are always hyperlinked and point to a standard presentation of the relational information.

One example of this might be names of people associated with the workflow (editors, authors, copyeditors, production assistants). An intuitive internal navigation method seems to be to have the names of these people always hyperlinked, with the hyperlink pointing to the person’s profile page.

One way of doing this in Yii would be to modify the views associated with each table in the site so that every time a name is called, you get a link. This is contrary to the spirit of the MVC model, however, since it means you are using a view to present logic about the underlying model. It is also prone to error, since it means you need to a) find every possible invocation of the name in all your various views and b) avoid making a mistake as you enter the same code over and over again in all these different views.

The better approach is to add this functionality to the underlying datamodel that supplies the information to the entire site in the first place—that is, to the model for the database that is providing the name information and the page you want to link to in the end.

Here’s some code for your model that will allow you to produce a linked name anywhere in your Yii site (for simplicity’s sake in this example, I am wrapping a single attribute from my database in a hyperlink. This post shows you how to use a similar method to first make compound attributes):

public function getLastNameLink()
    {
    return CHtml::link(CHtml::encode($this->lastName),
    array('person/view', 'id'=>$this->person_id));
    }

Here are some underlying premises behind this code:

  1. There is a table in my database called person
  2. I have written a view for this table (either by hand or using the gii utility to build one automatically): person/view is the URL fragment CHtml::link will use to build the link to the profile page for a given person (note: it is tempting to just use view for the URL because we are already in the person model; however, you should use the “full” Yii path because you will be invoking this throughout the site from views associated with all sorts of other models)
  3. The table person has an attribute (column) called person_id.

Once this has been added to my Person model, I can call the code (and add a link to the person’s profile) in any view simply by invoking the method: from now on, LastNameLink functions as an attribute of the Person model and can be used in exactly the same way as actual, table-based attributes. For example, in a different model’s view:

<?php echo $data->person->LastNameLink; ?>

This code will produce a link to index.php?r=person/view&id=n where n is the id number of a given record in the table. If I hadn’t added the above code to the Person model, the code required to do the equivalent would have been:

<?php echo 
CHtml::link(CHtml::encode($data->person->lastName),
array('person/view', 'id'=>$data->person->person_id)); ?>
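
The same pattern can be combined with the compound-attribute method from the previous post so that the link text is the person’s whole name. A sketch, assuming the same firstNames and lastName attributes (the method name getFullNameLink is my own):

public function getFullNameLink()
    {
    // Assumes firstNames and lastName columns on the person table, as in the
    // compound-attribute post.
    return CHtml::link(CHtml::encode($this->firstNames . ' ' . $this->lastName),
    array('person/view', 'id'=>$this->person_id));
    }

It could then be invoked in any view as <?php echo $data->person->FullNameLink; ?>.
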
----  

Yii Authentication

Posted: Feb 19, 2012 23:02;
Last Modified: May 23, 2012 18:05

Tags: , , ,

---

Admin authentication

In the controllers established by gii, Yii’s scaffolding tool, there is a standard method called accessRules() that defines which users can perform which actions. A common set is:

public function accessRules()
	{
		return array(
			array('allow',  // allow all users to perform 'index' and 'view' actions
				'actions'=>array('index','view'),
				'users'=>array('*'),
			),
			array('allow', // allow authenticated user to perform 'create' and 'update' actions
				'actions'=>array('create','update'),
				'users'=>array('@'),
			),
			array('allow', // allow admin user to perform 'admin' and 'delete' actions
				'actions'=>array('admin','delete'),
				'users'=>array('admin'),
			),
			array('deny',  // deny all users
				'users'=>array('*'),
			),
		);
	}

The comments explain what each array means. An interesting question, however, is how you get a user to count as an ‘admin’. Is there some method or class somewhere that stores this information? And if so, how do I get into it?

If you create new users you might devote a considerable amount of time trying to get them into the admin class, all to no avail. As far as I can tell, ‘admin’ refers to a username rather than a class of user. So if your username is ‘admin’ you can perform the restricted actions. If it isn’t, you can’t.

There are a couple of choices here. One is to keep a user whose name is ‘admin’: this has the virtue of simplicity and, since Yii generates this condition every time you generate a new site, it also means you’ll not have to go changing every controller in your new site as well.

The other choice is to change the code to allow some other determinant. One approach, modified very slightly from an example in Larry Ullman’s blog, is to change ‘admin’ to ‘@’ (i.e. logged-in users) and then add an expression for some other condition:

array('allow',
    'actions'=>array('admin','delete'),
    'users'=>array('@'),
    'expression'=>'isset($user->role) && ($user->role==="editor")'
),

You could even just hardwire in somebody’s email address or some other attribute:

array('allow',
    'actions'=>array('admin','delete'),
    'users'=>array('@'),
    'expression'=>'isset($user->email) && ($user->email==="admin@example.com")'
),
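
For expressions like these to work, the property you test ($user->role, $user->email, etc.) has to exist on the user component; it is not there by default. A minimal sketch of one way to supply it, assuming a User active-record model with username, password, role, and email columns (all of these names are assumptions), is to store the values as states in your UserIdentity class when the user logs in:

class UserIdentity extends CUserIdentity
{
	public function authenticate()
	{
		// Look the user up by username (the User model and column names are assumed).
		$record = User::model()->findByAttributes(array('username'=>$this->username));
		if ($record === null) {
			$this->errorCode = self::ERROR_USERNAME_INVALID;
		} elseif ($record->password !== $this->password) { // plain-text comparison for illustration only
			$this->errorCode = self::ERROR_PASSWORD_INVALID;
		} else {
			// Anything stored with setState() becomes available as Yii::app()->user->role, etc.,
			// which is what the 'expression' rules above are reading.
			$this->setState('role', $record->role);
			$this->setState('email', $record->email);
			$this->errorCode = self::ERROR_NONE;
		}
		return $this->errorCode === self::ERROR_NONE;
	}
}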

For more discussion, see

----  

Set date and time from the command line

Posted: Nov 22, 2011 21:11;
Last Modified: Jun 07, 2012 13:06

Tags: , , ,

---

http://codeghar.wordpress.com/2007/12/06/manage-time-in-ubuntu-through-command-line/

----  

How to "clone" a test in Moodle 2.0

Posted: Mar 27, 2011 21:03;
Last Modified: May 23, 2012 19:05

Tags: , , , , , , ,

---

Here’s how to clone a test in Moodle 2.0 (i.e. make an exact copy so that both appear in the course; this is useful for making practice tests or copying a basic test format so that it can be reused later in the course):

  1. Backup the test. Exclude all user data but include activities, blocks, and filters.
  2. Select “Restore.” Your backup should be listed under user private backups. Simply restore the file to create a second instance.
  3. Treat one of the instances as your clone: move it, edit it, change its titles and questions. It is a completely independent version of the original file.
----  

Organising Quizzes in Moodle 2.0

Posted: Mar 27, 2011 21:03;
Last Modified: May 23, 2012 19:05

Tags: , , , , , , ,

---

Moodle 2.0 allows quiz designers to divide a quiz’s questions across multiple pages. But while this introduces great flexibility, it can be quite a cumbersome system to use at first. Here’s a method for making it more efficient:

  1. When you first build a test, put all questions on one page.
  2. Once you have the questions in the order you want, divide the test into pages by selecting the last question for each page and clicking “Begin new page after selected question.”

This will cut down on your server calls (and hence time) immensely.

----  

Differences between Moodle and Blackboard/WebCT short answer questions

Posted: Mar 27, 2011 20:03;
Last Modified: May 23, 2012 19:05

Tags: , , , , , , , , , ,

---

There is an important difference between Moodle and Blackboard (WebCT) short answer questions that instructors should be aware of, namely that Moodle short answer questions allow only one answer field.

This means, for example, that you can’t easily import Blackboard questions of the type “Supply the part of speech, person, tense, and number for the following form.” In Blackboard, you can present the student with four blanks for them to fill in, each with a different answer. When these are imported into Moodle, the question is converted into a form in which there is a single blank that has four possible correct answers.

There are various ways of asking the same kinds of questions in Moodle. The easiest when you are dealing with imported questions is to ask for a single quality in each answer. So instead of one question asking for part of speech, person, tense, and number, you might have four different questions: one for part of speech, another for person, a third for tense, and a fourth for number.

A second way of asking this kind of question in Moodle is to use the embedded answers (Cloze) question type. These are harder to write, but are arguably closer to the paper equivalent of the same type of question:

For the following Old English word supply the requested information:

clipode

Part of Speech: ____________
Tense: ____________
Number: ____________

----  

Multiple Choice Questions in Moodle

Posted: Mar 27, 2011 18:03;
Last Modified: May 23, 2012 19:05

Tags: , , , , , , ,

---

Here are some tips for the composition of Multiple Choice Questions in Moodle.

  1. If students are allowed to mark more than one option correct and you intend to include at least one question where none of the offered options are correct, include as a possible answer “None of the listed options.”
    1. Do not call it “none of the above” since if (as you normally should) you have selected “shuffle answers,” you have no guarantee that it will be the final answer in the sequence.
    2. You should include this option in all questions in the set (including those for which some of the options are correct) to avoid giving the answer away when it appears.
    3. When “none of the listed options” is not the right answer, it should be scored at -100%, to avoid a student hedging his or her bets by selecting it and all the other answers.
  2. If you anticipate having a question for which all the answers are correct, you do not need an “All of the listed answers” option, since selecting all the options will give students 100%.
  3. The correct options should be scored so they add up to 100%, of course!
  4. Incorrect options (other than “None of the listed options”) can be scored in a number of different ways:
    1. So that the total for all incorrect options (except “none of the listed options”) is -100% (this stops a student hedging his or her bets by selecting all options); if you do not have a “none of the listed options” answer, you almost certainly should score this way.
    2. So that each negative is the reciprocal of a correct answer, regardless of whether all the incorrect answers add up to -100%. Use this if you don’t mind that a student selecting everything except a “None of the listed options” might end up with part marks.
----  

How to build a randomised essay/translation question in Moodle 2.0

Posted: Mar 20, 2011 16:03;
Last Modified: May 23, 2012 19:05

Tags: , , , , , , ,

---

In my courses I often use a question of the following format:

  1. Common introduction
  2. Two or more sample passages or questions requiring an essay response
  3. A common form field for the answer to the student’s choice from #2.

Here is an example:

Write a modern English translation of one of the following passages in Old English in the space provided below.

1. Hæfst þū ǣnige ġefēran?
2. Hwæt māre dēst þū? Hæfst þū ġīet māre tō dōnne?

[Essay answer box for translation].

The point of this format is to provide the student with a choice of topics. If students all write their essays or translations at the same time, you can build your choice of topics by hand and write them into a single question. The challenge comes if you want to allow your students to write the test asynchronously, as is common with Learning Management Software. In such cases you want to be able to draw your essay topics or translation passages randomly from a test bank.

All the basic elements you would need to do this are available in Moodle, both 1.x and 2.0+. You can use the “description” question type to put in the general instructions at the beginning; you can use the essay format question to provide the answer box. And you can use Moodle’s ability to assign random questions to draw your topics or translation passage from your test bank.

But there are also some problems:

  1. Description questions are unnumbered, meaning your introduction will not start with the question number
  2. Although there was some discussion before the release of Moodle 2.0 about allowing description questions to be randomised, this appears not to have been implemented. All questions that can be randomised must have an action associated with them. This means that every topic or translation passage must ask the student to do something. And also that each topic or translation will have a number.

What I do is the following:

  1. I write the introduction as a description question (and just accept that it has no number assigned).
  2. I write my translation passage or topics as “true / false” questions. Each consists of the topic or passage, followed by the question “I am writing on this topic/passage…” as the prompt for a true/false answer.
  3. I use the essay question type to provide the common answer box. Since you need to have some text in an essay question, I use an anodyne instruction like “Write your essay/translation in the following space” to fill out the question.
  4. I assign a grade value of 0 to the two random topic/passages and assign the full grade value of the question to the essay answer box. The result is not elegant, but it works.
----  

How to setup a signup sheet in Moodle

Posted: Mar 15, 2011 14:03;
Last Modified: May 23, 2012 19:05

Tags: , , , , , , ,

---

You can create a signup sheet for Moodle using the “Choice” activity.

A video showing how to do this can be found here: https://ctl.furman.edu/main/index.php?option=com_content&task=view&id=78&Itemid=90

In brief, however, here’s how to do it:

  1. Go to the section of your course in which you want the signup sheet to appear.
  2. With editing on, select the “Choice” activity.
  3. Fill in the title and description information.
  4. If you are restricting attendance, set the “Limit the number of responses allowed” option under “Limit” to “enabled.” Setting this allows you to specify how many people can choose any one option. If it is disabled, any number of participants may sign up for any particular session.
  5. Each “Option” represents an entry on the signup sheet. Write in the date and time (or anything else you require) in the “Option” field and, if you have enabled limits, the maximum number of participants for the entry in the “limit” field. If you need more than the standard five options, select “Add three more options” after you’ve filled in the first five.
----  

Byte me: Technological Education and the Humanities

Posted: Dec 20, 2008 14:12;
Last Modified: May 23, 2012 19:05

Tags: , , , , ,

---

Note: Published in Heroic Age 12 (http://www.heroicage.org/issues/12/em.php).

I recently had a discussion with the head of a humanities organisation who wanted to move a website. The website was written using Coldfusion, a proprietary suite of server-based software that is used by developers for writing and publishing interactive web sites (Adobe nd). After some discussion of the pros and cons of moving the site, we turned to the question of the software.

Head of Humanities Organisation: We'd also like to change the software.
Me: I'm not sure that is wise unless you really have to: it will mean hiring somebody to port everything and you are likely to introduce new problems.
Head of Humanities Organisation: But I don't have Coldfusion on my computer.
Me: Coldfusion is software that runs on a server. You don't need it on your computer. You just need it on the server. Your techies handle that.
Head of Humanities Organisation: Yes, but I use a Mac.

I might be exaggerating here—I can't remember if the person really said they used a Mac. But the underlying confusion we faced in the conversation was very real: the person I was talking to did not seem at all to understand the distinction between a personal computer and a network server—the basic technology by which web pages are published and read.

This is not an isolated problem. In the last few years, I have been involved with a humanities organisation that distributes e-mail by cc:-list to its thirty-odd participants because some members believe their email system can't access listservers. I have had discussions with a scholar working on a very time-consuming web-based research project who was intent on inventing a custom method for indicating accents because they thought Unicode was too esoteric. I have helped another scholar who wrote an entire edition in a proprietary word-processor format and needed to recover the significance of the various coloured fonts and type faces he had used. And I have attended presentations by more than one project that intended to do all their development and archiving in layout-oriented HTML.

These examples all involve basic technological misunderstandings by people actively interested in pursuing digital projects of one kind or another. When you move outside this relatively small subgroup of humanities scholars, the level of technological awareness gets correspondingly lower. We all have colleagues who do not understand the difference between a blog and a mailing list, who don't know how web-pages are composed or published, who can't insert foreign characters into a word-processor document, and who are unable to backup or take other basic precautions concerning the security of their data.

Until very recently, this technological illiteracy has been excusable: humanities researchers and students, quite properly, concerned themselves primarily with their disciplinary work. The early Humanities Computing experts were working on topics, such as statistical analysis, the production of concordances, and building the back-ends for dictionaries, that were of no real interest to those who intended simply to access the final results of this work. Even after the personal computer replaced the typewriter, there was no real need for humanities scholars to understand technical details beyond such basics as turning a computer on and off and starting up their word-processor. The principal format for exchange and storage of scholarly information remained paper, and in the few areas where paper was superseded—such as in the use of email to replace the memo—the technology involved was so widely used, so robust, and above all so useful and so well supported that there was no need to learn anything about it: if your email and word-processor weren't set up at the store when you bought a computer, you could expect this work to be done for you by the technicians at your place of employment or over the phone by the Help Desk at your Internet Service Provider: nothing about humanities scholars' use of the technology required special treatment or distinguished them from the University President, a lawyer in a one-person law office... or their grandparents.

In the last half-decade, this situation has changed dramatically. The principal exchange format for humanities research is no longer paper but the digital byte—albeit admittedly as represented in PDF and word-processor formats (which are intended ultimately for printing or uses similar to that for which we print documents). State agencies are beginning to require open digital access to publicly-funded research. At humanities conferences, an increasing number of sessions focus on digital project reports and applications. And as Peter Robinson has recently argued, it is rare to discover a new major humanities project that does not include a significant digital component as part of its plans (Robinson 2005). Indeed some of the most interesting and exciting work in many fields is taking advantage of technology such as GPS, digital imaging, gaming, social networking, and multimedia digital libraries that was unheard of or still very experimental less than a decade ago.

That humanists are heavily engaged with technology should come, of course, as no real surprise. Humanities computing as a discipline can trace its origins back to the relatively early days of the computer, and a surprising number of the developments that led to the revolution in digital communication over the last decade were led by people with backgrounds in humanities research. The XML specification (XML is the computer language that underlies all sophisticated web-based applications, from your bank statement to Facebook) was edited under the supervision of C. Michael Sperberg-McQueen, who has a PhD in Comparative Literature from Stanford and was a lead editor of the Text Encoding Initiative (TEI) Guidelines, the long-standing standard for textual markup in the humanities, before he moved to the W3C (Sperberg-McQueen 2007). Michael Everson, the current registrar and a co-author of the Unicode standard for the representation of characters for use with computers, has an M.A. from UCLA in Indo-European linguistics and was a Fulbright Scholar in Irish at the University of Dublin (Evertype 2003-2006). David Megginson, who has also led committees at the W3C and was the principal developer of SAX, a very widely used processor for XML, has a PhD in Old English from the University of Toronto and was employed at the Dictionary of Old English and the University of Ottawa before moving to the private sector (Wikipedia Contributors 2008).

Just as importantly, the second generation of interactive web technology (the so-called "Web 2.0") is causing the general public to engage with exactly the type of questions we research. The Wikipedia has turned the writing of dusty old encyclopedias into a hobby much like ham-radio. The social networking site Second Life has seen the construction of virtual representations of museums and libraries. Placing images of a manuscript library or museum's holdings on the web is a sure way of increasing in-person traffic at the institution. The newest field for the study of such phenomena, Information Studies, is also one of the oldest: almost without exception, departments of Information Studies are housed in and are extensions of traditional Library science programmes.

The result of this technological revolution is that very few active humanists can now truthfully say that they have absolutely no reason to understand the technology underlying their work. Whether we are board members of an academic society, working on a research project that is considering the pros and cons of on-line publication, instructors who need to publish lecture notes to the web, researchers who are searching JSTOR for secondary literature in our discipline, or the head of a humanities organisation that wants to move its web-site, we are all increasingly involved in circumstances that require us to make basic technological decisions. Is this software better than that? What are the long-term archival implications for storing digital information in format x vs. format y? Will users be able to make appropriate use of our digitally-published data? How do we ensure the quality of crowd-sourced contributions? Are we sure that the technology we are using will not become obsolete in an unacceptably short period of time? Will on-line publication destroy our journal's subscriber base?

The problem is that these are not always questions that we can "leave to the techies." It is true that many universities have excellent technical support and that there are many high-quality private contractors available who can help with basic technological implementation. And while the computer skills of our students are often over-rated, it is possible to train them to carry out many day-to-day technological tasks. But such assistance is only as good as the scholar who requests it. If the scholar who hires a student or asks for advice from their university's technical services does not know in broad terms what they want or what the minimum technological standards of their discipline are, they are likely to receive advice and help that is at best substandard and perhaps even counter-productive. Humanities researchers work on a time-scale and with archival standards far beyond those of the average client needing assistance with the average web-site or multimedia presentation. We all know of important print research in our disciplines that is still cited decades after the date of original publication. Not a few scholarly debates in the historical sciences have hinged on questions of whether a presentation of material adequately represents the "original" medium, function, or intention. Unless he or she has special training, a technician asked by a scholar to "build a website" for an editorial project may very well not understand the extent to which such questions require the use of different approaches to the composition, storage, and publication of data than those required to design and publish the athletic department's fall football schedule.

Even if your technical assistant is able to come up with a responsible solution for your request without direction from somebody who knows the current standards for Digital Humanities research in your discipline, the problem remains that such advice almost certainly would be reactive: the technician would be responding to your (perhaps naive) request for assistance, not thinking of new disciplinary questions that you might be able to ask if you knew more about the existing options. Might you be able to ask different questions by employing new or novel technology like GPS, serious gaming, or social networking? Can technology help you (or your users) see your results in a different way? Are there ways that your project could be integrated with other projects looking at similar types of material or using different technologies? Would your work benefit from distribution in some of the new publication styles like blogs or wikis? These are questions that require a strong grounding in the original humanistic discipline and a more-than-passing knowledge of current technology and digital genres. Many of us have students who know more than we do about on-line search engines; while we might hire such students to assist us in the compilation of our bibliographies, we would not let them set our research agendas or determine the contours of the projects we hire them to work on. Handing technological design of a major humanities research project over to a non-specialist university IT department or a student whose only claim to expertise is that they are better than you at instant messaging is no more responsible.

Fortunately, our home humanistic disciplines have had to deal with this kind of problem before. Many graduate, and even some undergraduate, departments require students to take courses in research methods, bibliography, or theory as part of their regular degree programmes. The goal of such courses is not necessarily to turn such students into librarians, textual scholars, or theorists—though I suppose we wouldn't complain if some of them discovered a previously unknown interest. Rather, it is to ensure that students have a background in such fundamental areas sufficient to allow them to conduct their own research without making basic mistakes or suffering unnecessary delays while they discover by trial-and-error things that might far more efficiently be taught to them upfront in the lecture hall.

In the case of technology, I believe we have now reached the stage where we need to be giving our students a similar grounding. We do not need to produce IT specialists—though it is true that a well-trained and knowledgeable Digital Humanities graduate has a combination of technological skills and experience with real-world problems and concepts that are very easily transferable to the private sector. But we do need to produce graduates who understand the technological world in which we now live—and, more importantly, how this technology can help them do better work in their home discipline.

The precise details of such an understanding will vary from discipline to discipline. Working as an Anglo-Saxonist and a textual critic in an English department, I will no doubt consider different skills and knowledge to be essential than I would if I were a church historian or theologian. But in its basic outlines such an orientation to the Digital Humanities probably need not vary too much from humanities department to humanities department. We simply should no longer be graduating students who do not know the basic history and nature of web technologies, what a database is and how it is designed and used, the importance of keeping content and processing distinct from each other, and the archival and maintenance issues involved in the development of robust digital standards like Unicode and the TEI Guidelines. Such students should be able to discuss the practical differences (and similarities) of print vs. web publication; they should be able to assess intelligently from a variety of different angles the pros and cons of different approaches to basic problems involving the digitisation of text, two and three-dimensional imaging, animation, and archival storage and cataloguing; and they should be acquainted with basic digital pedagogical tools (course management and testing software; essay management and plagiarism detection software) and the new digital genres and rhetorics (wikis, blogs, social networking sites, comment boards) that they are likely to be asked to consider in their future research and teaching.

Not all humanists need to become Digital Humanists. Indeed, in attending conferences in the last few years and observing the increasingly diverging interests and research questions pursued by those who identify themselves as "Digital Humanists" and those who define themselves primarily as traditional domain specialists, I am beginning to wonder if we are not seeing the beginnings of a split between "experimentalists" and "theorists" similar to that which exists today in some of the natural sciences. But just as theoretical and experimental scientists need to maintain some awareness of what each branch of their common larger discipline is doing if the field as a whole is to progress, so too must there remain an interaction between the traditional humanistic and digital humanistic domains if our larger fields are also going to continue to make the best use of the new tools and technologies available to us. As humanists, we are, unavoidably, making increasing use of digital media in our research and dissemination. If this work is to take the best advantage of these new tools and rhetorics—and not inadvertently harm our work by naively adopting techniques that are already known to represent poor practice, we need to start treating a basic knowledge of relevant digital technology and rhetorics as a core research skill in much the same way we currently treat bibliography and research methods.

Works Cited

Adobe. nd. "Adobe Coldfusion 8." http://www.adobe.com/products/coldfusion/

Evertype 2003-2006. "Evertype: About Michael Everson." http://www.evertype.com/misc/bio.html

Robinson, Peter. 2005. "Current issues in making digital editions of medieval texts—or, do electronic scholarly editions have a future?" DM 1.1 (2005): http://www.digitalmedievalist.org/journal/1.1/robinson/

Sperberg-McQueen, C. M. 2007. "C.M. Sperberg-McQueen Home Page." http://www.w3.org/People/cmsmcq/

Wikipedia contributors. 2008. "David Megginson." Wikipedia. http://en.wikipedia.org/w/index.php?title=David_Megginson&oldid=257685665

----  

Digital Plagiarism

Posted: Dec 15, 2008 13:12;
Last Modified: Mar 04, 2015 05:03

Tags: , , , , , , , ,

---

Essay and test management software

I have recently started using plagiarism detection software. Not so much for the ability to detect plagiarism as for the essay submission- and grading- management capabilities it offered. Years ago I moved all my examinations and tests from paper to course management software (WebCT originally, now Blackboard, and soon Moodle). I discovered in my first year using that software that simply delivering and correcting my tests on-line—i.e. without making any attempt to automate any aspect of the grading—reduced the time I spent marking exams by an immediate 50%: it turned out that I had been spending as much time handling tests (sorting, adding, copying grades, etc.) as I had marking them—more, in fact, if you included the in-class time lost to proctoring and returning corrected work to students.

I long wondered whether I could capture the same kind of efficiencies by automating my essay administration. Here too, I thought that I spent a lot of time handling paper rather than engaging with content. In this case, however, I was not sure I would be able to gain the same kind of time-saving. While I was sure that I could streamline my workflow, I was afraid that marking on screen might prove much less efficient than pen and paper—to the point perhaps of actually hurting the quality and speed of my essay-grading.

My experience this semester has been that my fears about lack of efficiency in the intellectual aspects of my correction were largely unfounded. And that my hopes for improving my administrative efficiency closely reflected the actual possibilities. The amount of time I spend handling a given set of essays has now dropped by approximately the expected 50%. While marking on screen is slower than marking with a pencil (a paper that used to take me 20 minutes to mark now will take 24 to 25 minutes), the difference is both smaller than I originally feared and more than compensated by the administrative time-savings, again including the class time freed up from the need to collect and redistribute papers.

Detecting plagiarism

Although I use it primarily for essay management, plagiarism detection software such as turnitin, the system I use, was, of course, originally designed to detect plagiarism—which means that I too can use it to check my students’ originality. The developers remind users that a lack of originality is not the same thing as plagiarism: plagiarism is a specific type of lack of originality and even good pieces of work will have numerous passages in common with other texts in the software’s database. Obvious examples of this include quotations from works under discussion and bibliographic entries. It is also quite common to see the occasional short phrase or clause flagged in otherwise original work, especially at the beginning of paragraphs or in passages introducing or linking quotations. Presumably there are only so many ways of saying “In Pride and Prejudice, Jane Austen writes…”. In shorter papers, in fact, it is not unusual to see non-plagiarised student papers with as much as 30%-40% of their content flagged initially as “non-original.”

Some students, however, actually do plagiarise—which I understand to mean the use of arguments, examples, or words of another as if they were one’s own. When marking by hand, I’ve generally considered this to be a relatively small problem. In twelve years at the University of Lethbridge, I’ve caught probably less than ten students whose work was significantly plagiarised. Obviously I’ve never been able to say whether this was because my methods for discovering such work were missing essays by more successful plagiarists or because the problem really wasn’t that significant. Using plagiarism detection software gave me the opportunity of checking how well I had been doing catching plagiarists the old fashioned way, when I was marking by hand.

To the extent that one semester’s data is a sufficient sample, my preliminary conclusions are that the problem of plagiarism, at least in my classes, seems to be more-or-less as insignificant as I thought it was when I graded by hand, and that my old method of discovering plagiarism (looking into things when a paper didn’t seem quite “right”) seemed to work.1 This past semester, I caught two people plagiarising. But neither of them had particularly high unoriginality scores: in both cases, I discovered the plagiarism after something in their essays seemed strange to me and caused me to go through the originality reports turnitin provides on each essay more carefully. I then went through the reports for every essay submitted by that class (a total of almost 200), to see if I had missed any essays that turnitin’s reports suggested might be plagiarised. None of the others showed the same kind of suspicious content that had led me to suspect the two I caught. So for me, at least, the “sniff test” remains apparently reliable.

How software improves on previous methods of detecting plagiarism

Even though it turns out that I apparently can still rely on my ability to discover plagiarism intuitively, there are two things about plagiarism detection software that do mark an improvement over previous methods of identifying such problems by hand. The first is how quickly such software lets instructors test their hunches. In the two cases I caught this semester, confirming my hunch took less than a minute: I simply clicked on the originality report and compared the highlighted passages until I discovered a couple that were clearly copied by the students without acknowledgement in ways that went beyond reasonable use, unconscious error, or unrealised intellectual debt. Working by hand would have required me to Google specific phrases from the paper one after the other and/or go to the library to find a print source for the offending passages. In the past it has often taken me hours to make a reasonable case against even quite obvious examples of plagiarism.

The second improvement brought on by plagiarism detection software lies in the type of misuse of sources it uncovers. Although I became suspicious about the originality of the two papers I caught this semester on my own rather than through the software’s originality report, the plagiarism I uncovered from the originality report was in both cases quite different from anything I have seen in the past. Instead of the wholesale copying from one or two sources I used to see occasionally when I marked by hand, the plagiarism I found this year with turnitin involved the much more subtle use of unacknowledged passages, quotations, and arguments at key moments in the students’ papers. In the old days, my students used to plagiarise with a shovel; these students were plagiarising with a scalpel. I’m not completely sure I would have been able to find the sources for at least some of this unacknowledged debt if I had been looking by hand.

A new kind of plagiarism

This is where my title comes in. It is of course entirely possible that students always have plagiarised in this way and that I (and many of my colleagues) simply have missed it because it is so hard to spot by hand. But I think that the plagiarism turnitin caught in these two essays this semester actually may represent a new kind of problem involving the misappropriation of sources in student work—a problem that has different origins, and may even involve more examples of honest mistake, than we used to see when students had to go to the library to steal their material. Having interviewed a number of students in the course of the semester, I am in fact fairly firmly convinced that what turnitin found is a symptom of new problems in genre and research methodology that are particular to the current generation of students—students who are undergoing their intellectual maturation as young adults in a digital culture that is quite different from that of even five years ago. What they were doing was still culpable—the great majority of my students were able to avoid misappropriating other people’s ideas in their essays. But new technologies, genres, and student approaches to note-taking are making it easier for members of the current generation to “fall into” plagiarism without meaning to in ways that previous generations of students would not. In the old days, you had to positively decide to plagiarise an essay by buying one off a friend or going to the library and actually typing text out that you were planning to present as your own. Nowadays, I suspect, students who plagiarise the way my two students did this semester do so because they haven’t taken steps to prevent it from happening.

Digital students, the essay, and the blog

The first thing to realise about how our students approach our assignments has to do with genre. For most (pre-digital) university instructors, the essay is self-evidently the way one engages with humanistic intellectual problems. It is what we were taught in school and practiced at university. But more importantly, it was almost exclusively how argument and debate were conducted in the larger society. The important issues of the day were discussed in magazines and newspapers by journalists whose opinion pieces were also more-or-less similar to the type of work students were asked to do at the university: reasoned, original, and polished pieces of writing in which a single author demonstrated his or her competence by the focussed selection of argument and supporting evidence. The value of a good essay—at the university or in the newspaper—lay in the author’s ability to digest arguments and evidence and make it his or her own: find and assimilate the most important material into an original argument that taught the reader a new way of understanding the information and opinions of others.

For most contemporary students, however, the essay is neither the only nor the most obviously appropriate way of engaging with the world of ideas, politics, and culture. Far more common, certainly numerically and, increasingly, in influence, is the blog—and making a good blog can often involve skills that are anathema to the traditional essay. While it is possible to publish essays using blog software, the point of blogs, increasingly, is less to digest facts and arguments than to accumulate and react to them. Political blogs—like Ed Morrisey’s Captain’s Quarters (now at Hot Air) or Dan Froomkin’s Whitehouse Watch—tend to consist of collections of material from other on-line sources interspersed with opinion. The skill an accomplished blogger brings to this type of material lies in the ability to select and organise these quotations. A good blog, unlike a good essay, builds its argument and topic through the artful arrangement and excerpting of usually verbatim passages from other people’s work—in much the same way that some types of music are based on the original use and combination of digitised sound samples from earlier recordings.

In other forums this method of “argument by quotation” is the norm: every video worth anything on YouTube has at least one response—a companion video where somebody else picks up on what the original contributor has done and answers back, usually with generous visual or verbal quotation. Professional examples include the various Barack Obama tributes that were a defining feature of the 2008 Democratic Primary in the U.S. (examples include the work of Obama Girl and will.i.am). But amateur examples are also extremely common—as was the case with the heavy amateur response to the question of whether the “lonelyGirl15” series of 2005 was actually a professional production.

The real evidence of the evolving distinction between the essay and the blog as methods of argumentation and literary engagement, however, can be seen in the blogs that newspapers are increasingly asking their traditional opinion columnists to write. It is no longer enough to write essays about the news, though the continued existence and popularity of the (on-line and paper) newspaper column shows that there is still an important role for this kind of work. Newspapers (and presumably their readers) also now want columnists to document the process by which they gather the material they write about—creating a second channel in which they accumulate and react to facts and opinions alongside their more traditional essays. Among the older journalists, an example of this is Nicholas Kristof at the New York Times, who supplements his column with a blog and other interactive material about the subjects he feels most passionate about. In his column he digests evidence and makes arguments; in his blog he accumulates the raw material he uses to write his columns and presents it to others as part of a process of sharing his outrage.

In the case of our students, the problem this generic difference between the blog and the essay causes is magnified by the way they conduct their research. On the basis of my interviews, it appears to me that most of my first year students now conduct their research and compile their notes primarily by searching the Internet, and, when they find an interesting site, copying and pasting large sections of verbatim quotation into their word processor. Often they include the URL of this material with the quotations; but because you can always find the source of a passage you are quoting from the Internet, it is easy for them to get sloppy. Once this accumulation of material is complete, they then start to add their own contribution to the collection, moving the passages they have collected around and interspersing them with their opinions, arguments, and transitions.

This is, of course, how bloggers, not essayists, work. Unfortunately, since we are asking them to write essays, the result if they are not careful is something that belongs to neither genre: it is not a good blog, because it is not spontaneous, dynamic, or interactive enough; and it is not a good traditional essay, because it is more pastiche than an original piece of writing that takes its reader in a new direction. The best students working this way do in the end manage to overcome the generic mismatch between their method of research and their ultimate output, producing something that is more controlled and intellectually original than a blog. But less good students, or good students working under mid- or end-of-term pressure, are almost unavoidably leaving themselves open to producing work that is, in a traditional sense at least, plagiarised—by forgetting to distinguish, perhaps even losing track of the distinction, between their own comments and opinions and those of others, or by collecting and responding exclusively to passages mentioned in the work of others rather than finding new and original passages that support their particular arguments.

This is still plagiarism: it is no more acceptable to misrepresent the words and ideas of others as your own in the blogging world than it is in the world of the traditional essay. And in fact it is more invidious than the older style of plagiarism that involved copying large chunks out of other people’s work: in the new, digital plagiarism, the unacknowledged debt tends to come in the few places that really matter in a good essay: the interesting thesis, the bold transition, the surprising piece of evidence that makes the work worth reading. Because it is so closely tied to new genres and research methods, however, this type of plagiarism may also have as much a cultural as a “criminal” motivation. In preventing it, instructors will need to take into account the now quite different ways of working and understanding intellectual argument that the current generation of students bring with them into the classroom.

Advice to the Digital Essayist

So how can the contemporary student avoid becoming a Digital Plagiarist?

The first thing to do is realise the difference between the essay and the blog. When you write an essay, your reader is interested in your ability to digest facts and arguments and set your own argumentative agenda. A blog that did not allow itself to be driven by current events, incidents, and arguments in its field of endeavour—whether this is an event in the blogger’s personal life or the ebb and flow of an election campaign—would not be much of a blog. Essays are not bound by this constraint, however: they can be about things nobody is talking about and make arguments that don’t respond to anybody. Even when, as is more normal and probably better, essays do engage with previous arguments and topics that are of some debate, the expectation is that the essayist will digest this evidence and these opinions and shape the result in ways that point the reader in new directions—not primarily to new sources, but rather to new claims and ideas that are not yet part of the current discourse.

The second thing to realise is just how dangerous the approach many students take to note-taking is in terms of inviting charges of plagiarism. In a world of Google, where text is data that can be found, aggregated, copied, and reworked with the greatest of ease, it is of course very tempting to take notes by quotation. When people worked with paper, pens, and typewriters, quotation was more difficult and time-consuming: when you had to type out quotations by hand, writing summaries and notes was far quicker. Nowadays, it is much easier and less time-consuming to quote something than it is to take notes: when you find an interesting point in an on-line source, it uses far fewer keystrokes (and less intellectual effort) to highlight, copy, and paste the actual verbatim text of the source in a file than it does to turn to the keyboard and compose a summary statement or note. And if you are used to reading blogs, you know that this method can be used to summarise even quite long and complex arguments.

There are two problems, however. The first is that this method encourages you to write like a blogger rather than an essayist: your notes are set up in a way that makes it easier to write around your quotations (linking, organising, and responding to them) than to digest what they are saying and produce a new argument that takes your reader in unexpected directions.

The second problem is that it is almost inevitable that you will end up accidentally incorporating the words and ideas of your sources in your essay without acknowledgement. It is easy, in reworking your material, to drop a set of quotation marks, or to start paraphrasing something and then end up editing it back into an almost verbatim quotation—without realising what you’ve done. And it is even easier to get sloppy in your initial note-taking—forgetting to put quotation marks around passages you’ve copied or losing the source URL. Once you add your own material to this collection of quotations in the file that will eventually become your essay, you will discover that it is almost impossible to remember or distinguish between what you have added and what you got from somebody else.

One way of solving this is to change the way you take notes, doing less quoting and more summarising. Doing this might even help you improve the originality of your essays by forcing you to internalise your evidence and arguments. But cutting and pasting from digital sources is so easy that you are unlikely ever to stop doing it completely—and even if you do, you are very likely to run into trouble again the moment you face the pressure of multiple competing deadlines.

A better approach is to develop protocols and practices that help you reduce the chances that your research method will cause you to commit unintentional plagiarism. In other words to find a way of working that allows you to keep doing the most important part of what you currently do (and are going to continue to do no matter what your instructors say), but in a fashion that won’t lead you almost unavoidably into plagiarising from your sources at some point in your career.

Perhaps the single most important thing you can do in this regard is to establish a barrier between your research and your essay. In a blog, building your argument around long passages of text that you have cut and pasted into your own document is normal and accepted; in essay writing it isn’t. So when you come to write an essay, create two (or more) files: one for the copying and pasting you do as part of your research (or even better, one file for each source from which you copy and paste or make notes), and, most importantly, a separate file for writing your essay. In maintaining this separate file for your essay, you should establish a rule that nothing in this file is to be copied directly from an outside source. If you find something interesting in your research, you should copy this material into a research file; only if you decide to use it in your essay should you copy it from your research file into your essay file. In other words, your essay file is focussed on your work: in that file, the words and ideas of others appear only when you need them to support your already existing arguments.

An even stricter way of doing this is to establish a rule that nothing is ever pasted into your essay file: if you want to quote a passage in your text, you can decide that you will only type it out by hand. This has the advantage of discouraging you from over-quoting or building your essay around the words of others—something that is fine in a blog, but bad in an essay. If this rule sounds too austere and difficult to enforce, at least make it a rule that you paste nothing into your essay before you have composed the surrounding material—i.e. the paragraph in which the passage is to appear and the sentence that is supposed to introduce it. Many professional essayists, especially those who learned to write before there were word-processors, actually leave examples and supporting quotations out of their earliest drafts—using place holders like “{{put long quotation from p35 here}}” to represent the material they are planning to quote until they have their basic argument down.

Another thing you could try is finding digital tools that will make your current copy-and-paste approach to note-taking more valuable and less dangerous. In the pre-digital era, students often took notes on note cards or in small notebooks. They would read a source in the library with a note card or notebook in front of them. They would begin by writing basic bibliographic information on this card or notebook. Then, when they read something interesting, they would write a note on the card or in the notebook, quoting the source if they thought the wording was particularly noteworthy or apt. By the time they came to write their essays, they would have stacks of cards or a series of notebooks, one dedicated to each work or idea.

There are several ways of replicating (and improving on) this method digitally. One way is to use new word-processor files for each source: every time you discover a new source, start a new file in your word-processor, recording the basic information you need to find the source again (URL, title, author, etc.). Then start pasting in your quotations and making your notes in this file. When you are finished you give your file a descriptive name that will help you remember where it came from and save it.

Using your word-processor for this method will be cumbersome (you’ll spend a lot of time opening and closing files), difficult to use when you come to write (in a major essay you might end up with tens of files open on your desktop alongside the new file for your essay), and difficult to oversee (unless you have an excellent naming system, you will end up with a collection of research files with cryptic sounding names of which you have forgotten the significance). And if you can’t remember the specific source of a given quotation or fact, it will be hard to find later without special tools or opening and closing each file.

But other tools exist that allow you to implement this basic method more easily. Citation managers such as Endnote or Refworks, for example, tie notes to bibliographic entries. If you decide to try one of these, you start your entry for a new source (i.e. the equivalent of your paper notebook or note card) by entering it in the bibliographic software (this will also allow you to produce correctly formatted bibliographies and works cited lists quickly and automatically later on when you are ready to hand your paper in). You then use the “notes” section as the place for pasting quotations and adding comments and notes that you might want to reuse in your paper. There is no problem with naming files (your notes are all stored under the relevant bibliographic entry in a single database), with moving between sources (you call up each source by the bibliographic reference), and in most cases you will be able to use a built-in search function to find passages in your notes if you forget which particular work you read them in.

Bibliographic databases and citation managers are great if all your notes revolve around material from text-based sources. But what if you also need to record observations, evidence, interviews, and the like that cannot easily be tied to specific references? In this case, the best tool may be a private wiki—for example at PbWiki (or if you are computer literate, and have access to a server, a private installation of MediaWiki, the software that runs the Wikipedia).

We tend to think of wikis as being primarily media for the new type of writing that characterises collaborative web applications like the Wikipedia or Facebook. In actual fact, however, wikis have a surprising amount in common with the notebooks or stacks of note cards students used to bring with them to the library. Unlike an entry in citation management software, wiki entry pages are largely free-form space on which you can record arbitrary types of information—a recipe, an image (more accurately a link to an image rather than the image itself), pasted text, bibliographic information, tables of numerical data, and your own annotations and comments on any of the above. As with an index card, you can return to your entry whenever you want in order to add or erase things (though a wiki entry, unlike an index card, preserves all your original material as well), or let others comment on it. And as with note cards, you can shuffle and arrange your entries in various ways depending on your needs—using the category feature, you can create groupings that collect all the pages you want to use in a given essay, or that refer to a specific source, or involve a particular topic. Of course, unlike note cards, which had to be sorted physically, wiki entries can simultaneously belong to more than one grouping; and because they are stored in a database, you can search your wiki automatically, looking for passages and ideas even if you don’t remember where you saw them.

However you decide to solve this problem, the most important thing is to avoid the habit which is most likely to lead you into (unintentionally) plagiarising from your sources: starting an essay by copying and pasting large passages of direct quotation into the file that you ultimately intend to submit to your instructor. In an essay, unlike a blog, the point is for your reader to hear what you have to say.


1 I now take back the claim that this is as insignificant as I thought. In the year-end papers, I found a surprisingly large number of papers with plagiarised passages in them (five or six out of sixty, with perhaps one or two doubtful cases). At the same time, a paper-by-paper review of the originality reports still seems to confirm that one can rely on one’s hunches—I’ve not yet found plagiarism in a paper that didn’t seem right as I was reading it. The larger number of hits is coming from the ability turnitin gives me to check my hunches more easily and quickly. The pattern I describe above of writing between large quotations and paraphrases still seems to be holding true, however, as does the age or generational difference: my senior students are not nearly as likely to write essays like this.

----  

Using Oxygen and Subversion client

Posted: Aug 20, 2008 15:08;
Last Modified: May 23, 2012 18:05

Tags: , , , , , , ,

---

Here are instructions for using Oxygen for accessing the Littlechief Project Subversion server.

1) Open Oxygen. It should look something like this (if you’ve used it before there may be files loaded already in the main window):

2) Select Tools then SVN Client in order to open the SVN Client. You should then be presented with a screen that looks something like this:

3) The two panels we need to use are in the top and bottom left (Repositories and Working Copy). In Repositories we will place the address of our SVN repository; in Working Copy we will put the directory on our local machine (i.e. the computer we are using) where we want the files to be stored.

a) Check if the SVN repository address for the Littlechief project is listed as it is here:

If it isn’t, install the repository by selecting Repository > New Repository Location. You will be presented with a small dialogue like this:

Enter the repository address you have received separately in the blank and click O.K.

b) Check that the working directory you wish to use is loaded in the bottom left panel.

If it isn’t, add a new working directory by selecting Repository and then Check Out. A working directory dialogue like this will open:

Browse to an appropriate directory or create a new folder for the repository. Click on O.K. and Oxygen will start downloading the files to your local machine.

4) To edit a document in Oxygen, select the file you want in the Working Copy panel in SVN Client (bottom left—all our working xml files are in the folder 3_workingXML). Right click on the file and select “Open in Oxygen”:

Normally Oxygen will then pop up on your screen with the file loaded. If it doesn’t, select Oxygen from the taskbar (along the bottom of the desktop in Windows).

----  

An Anglo-Saxon Timeline

Posted: Jul 20, 2008 11:07;
Last Modified: May 23, 2012 19:05

Tags: , , , , , , , ,

---

The following link is to an experiment in constructing a timeline of the Anglo-Saxon period: http://people.uleth.ca/~daniel.odonnell/Anglo-Saxon_Kings.xml It is very much a work in progress at the moment. The ultimate goal will be to have a synoptic overview and index that will allow students to click on major events, persons, or cultural artefacts and then see how they fit in with other milestones.

At the moment, the chart only includes kings, and even then in fairly rough fashion.

----  

Transcription Guidelines

Posted: Nov 19, 2007 12:11;
Last Modified: May 23, 2012 19:05

Tags: , , , , , , , , , , ,

---

The following is a list of typographical conventions to use when transcribing medieval manuscripts in my classes.



deletion

Strikethrough indicates the physical deletion of text in a witness. Deletion may be by any method (underlining, punctum delens, erasure, overwriting, etc). You should indicate the precise method of deletion by a note at the end of your transcription. The deleted text is recorded whenever possible. If deleted text cannot be recovered, it is replaced by colons.

You indicate strikethrough in HTML as follows: <strike>Text struck through</strike>.


\addition/

Upward sloping brackets indicate that the enclosed text has been added above the manuscript line. If a caret was used, this is indicated with a preceding comma or caret symbol (⁁): ⁁\addition above the line/.

|addition|

Vertical brackets indicate that the enclosed text has been inserted between existing characters within the manuscript line. Insertion is distinguished from overwriting (i.e. the conversion of one character to another or the addition of a new character in the space left by a previously deleted form).

{addition}

Brackets indicate that the enclosed text has been added over some pre-existing form. This addition may involve the conversion of one letter to another (for example, by the addition of an ascender), or the addition of new text in the place of a previous erasure. The overwritten text is treated as a deletion.

/addition\

Downward sloping brackets indicate that the enclosed text has been added below the manuscript line.

addition| or |addition

A single vertical bar indicates that the text has been added at the beginning or end of a manuscript line. Text preceded by a single vertical bar has been added at the end of a manuscript line. Text followed by a single vertical bar has been added at the beginning of a manuscript line. Text between two vertical bars has been added “within the line” (i.e. between pre-existing letters or words).

damage

Underlining indicates that text has been damaged. When damaged text is unclear or illegible, additional symbols are used.

In HTML, you indicate text is underlined as follows: <u>Underlined text</u>.


〈unclear〉

Angle brackets indicate that the enclosed text is unclear for some physical reason (e.g. rubbing, flaking, staining, poorly executed script).

In HTML, there is a distinction between the angle brackets (〈 and 〉) and the greater-than and less-than signs (> and <). If you use the greater-than and less-than signs, your text will not appear, as the browser will treat it as an HTML tag.
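As a minimal illustration of the difference (my own sketch, not part of the original guidelines): if you do type the keyboard signs, they have to be escaped into character entities before a browser will display them literally, whereas the angle-bracket characters display as themselves.

import html

typed = "<unclear>"            # keyboard less-than and greater-than signs
print(html.escape(typed))      # prints &lt;unclear&gt;, which a browser renders as <unclear>
print("〈unclear〉")             # the angle-bracket characters need no escaping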


[supplied] or [emended]

Square brackets indicate that the enclosed text is being supplied or emended. “Supplied text” refers to the hypothetical restoration of original readings from a specific witness that have become lost or illegible due to some physical reason. “Emended text” refers to the replacement of legible text from extant witnesses by a modern editor or transcriber.

::

Colons represent text that is completely effaced or illegible. The number of colons used corresponds roughly to the number of letters the transcriber believes are missing. Note that colons are used for text that was in the manuscript but is now physically missing due to erasure or other damage. They are not used to indicate text that has not been copied into the manuscript but appears in other versions.

----  

Back to the future: What digital editors can learn from print editorial practice.

Posted: Feb 09, 2007 18:02;
Last Modified: May 23, 2012 19:05

Tags: , , , , , ,

---

A version of this essay was published in Literary and Linguistic Computing.

Digital Editing and Contemporary Textual Studies

The last decade or so has proven to be a heady time for editors of digital editions. With the maturation of the digital medium and its application to an ever increasing variety of cultural objects, digital scholars have been led to consider their theory and practice in fundamental terms (for a recent collection of essays, see Burnard, O’Keeffe, and Unsworth 2006). The questions they have asked have ranged from the nature of the editorial enterprise to issues of academic economics and politics; from problems of textual theory to questions of mise-en-page and navigation: What is an Edition? What kinds of objects can it contain? How should it be used? Must it be critical? Must it have a reading text? How should it be organised and displayed? Can intellectual responsibility be shared among editors and users? Can it be shared across generations of editors and users? While some of these questions clearly are related to earlier debates in print theory and practice, others involve aspects of the production of editions not relevant to or largely taken for granted by previous generations of print-based editors.

The answers that have developed to these questions at times have involved radical departures from earlier norms1. The flexibility inherent to the electronic medium, for example, has encouraged editors to produce editions that users can manipulate interactively, displaying or suppressing different types of readings, annotation, and editorial approaches, or even navigate in rudimentary three-dimensional virtual reality (e.g. Railton 1998-; Foys 2003; O’Donnell 2005a; Reed-Kline 2001; Ó Cróinín nd). The relatively low production, storage, and publication costs associated with digital publication, similarly, have encouraged the development of the archive as the de facto standard of the genre: users of digital editions now expect to have access to all the evidence used by the editors in the construction of their texts (assuming, indeed, that editors actually have provided some kind of mediated text): full text transcriptions, high-quality facsimiles of all known witnesses, and tools for building alternate views of the underlying data (e.g. Kiernan 1999/2003; Robinson 1996). There have been experiments in editing non-textual objects (Foys 2003; Reed-Kline 2001), in producing image-based editions of textual objects (Kiernan 1999/2003), and in recreating digitally aspects of the sensual experience users might have had in consulting the original objects (British Library nd). There have been editions that radically decenter the reading text (e.g. Robinson 1996), and editions that force users to consult their material using an editorially imposed conceit (Reed-Kline 2001). Even elements carried over from traditional print practice have come in for experimentation and redesign: the representation of annotation, glossaries, or textual variation, for example, are rarely the same in any two electronic editions, even in editions published by the same press (see O’Donnell 2005b, § 5)2.

Much of the impetus behind this theoretical and practical experimentation has come from developments in the wider field of textual and editorial scholarship, particularly work of the book historians, new philologists, and social textual critics who came into prominence in the decade preceding the publication of the earliest modern digital editorial projects (e.g. McKenzie 1984/1999; McGann 1983/1992; Cerquiglini 1989; Nicols 1990; for a review see Greetham 1994, 339-343). Despite significant differences in emphasis and detail, these approaches are united by two main characteristics: a broad interest in the editorial representation of variance as a fundamental feature of textual production, transmission, and reception; and opposition to earlier, intentionalist, approaches that privileged the reconstruction of a hypothetical, usually single, authorial text over the many actual texts used and developed by historical authors, scribes, publishers, readers, and scholars. Working largely before the revolution in Humanities Computing brought on by the development of structural markup languages and popularity of the Internet, these scholars nevertheless often expressed themselves in technological terms, calling for changes in the way editions were printed and organised (see, for example, the call for a loose leaf edition of Chaucer in Pearsall 1985) or pointing to the then largely incipient promise of the new digital media for representing texts as multiforms (e.g. McGann 1994; Shillingsburg 1996).

Digital Editing and Print Editorial Tradition

A second, complementary, impetus for this experimentation has been the sense that digital editorial practice is, or ought to be, fundamentally different from and even opposed to that of print. This view is found to a greater or lesser extent in both early speculative accounts of the coming revolution (e.g. McGann 1994; the essays collected in Finneran 1996 and Landow and Delaney 1993) and subsequent, more sober and experienced discussions of whether digital practice has lived up to its initial promise (e.g. Robinson 2004, 2005, 2006; Karlsson and Malm 2004). It is characterised by a sense both that many intellectual conventions found in print editions are at their root primarily technological in origin and that the new digital media offer what is in effect a tabula rasa upon which digital editors can develop new and better editorial approaches and conventions to accommodate the problems raised by textual theorists of the 1980s and 1990s.

Of course in some cases, this sense that digital practice is different from print is justified. Organisational models such as the Intellectual Commons or Wiki have no easy equivalent in print publication (O’Donnell Forthcoming). Technological advances in our ability to produce, manipulate, and store images cheaply, likewise, have significantly changed what editors and users expect editions to tell them about the primary sources. The ability to present research interactively has opened up rhetorical possibilities for the representation of textual scholarship difficult or impossible to realise in the printed codex.

But the sense that digital practice is fundamentally different from print has also at times been more reactionary than revolutionary. If digital theorists have been quick to recognise the ways in which some aspects of print editorial theory and practice have been influenced by the technological limitations of the printed page, they have also at times been too quick to see other, more intellectually significant aspects of print practice as technological quirks. Textual criticism in its modern form has a history that is now nearly 450 years old (see Greetham 1994, 313); seen more broadly as a desire to produce “better” texts (however “better” is defined at the moment in question), it has a history stretching back to the end of the sixth century BCE and is “the most ancient of scholarly activities in the West” (Greetham 1994, 297). The development of the critical edition over this period has been as much an intellectual as a technological process. While the limitations of the printed page have undoubtedly dictated the form of many features of the traditional critical edition, centuries of refinement—by trial-and-error as well as outright invention—also have produced conventions that transcend the specific medium for which they were developed. In such cases, digital editors may be able to improve upon these conventions by recognising the (often unexpressed) underlying theory and taking advantage of the superior flexibility and interactivity of the digital medium to improve their representation.

The Critical Text in a Digital Age

Perhaps no area of traditional print editorial practice has come in for more practical and theoretical criticism than the provision of synthetic, stereotypically eclectic, reading texts3. Of course this criticism is not solely the result of developments in the digital medium: suspicion of claims to definitiveness and privilege is, after all, perhaps the most characteristic feature of post-structuralist literary theory. It is the case, however, that digital editors have taken to avoiding the critical text with a gusto that far outstrips that of their print colleagues. It is still not unusual to find a print edition with some kind of critical text; the provision of similarly critical texts in digital editions is far less common. While most digital projects do provide some kind of top-level reading text, few make any strong claims about this text’s definitiveness. More commonly, as in the early ground breaking editions of the Canterbury Tales Project (CTP), the intention of the guide text is, at best, to provide readers with some way of organising the diversity without making any direct claim to authority (Robinson nd):

We began… work [on the CTP] with the intention of trying to recreate a better reading text of the Canterbury Tales. As the work progressed, our aims have changed. Rather than trying to create a better reading text, we now see our aim as helping readers to read these many texts. Thus from what we provide, readers can read the transcripts, examine the manuscripts behind the transcripts, see what different readings are available at any one word, and determine the significance of a particular reading occurring in a particular group of manuscripts. Perhaps this aim is less grand than making a definitive text; but it may also be more useful.

There are some exceptions to this general tendency—both in the form of digital editions that are focussed around the provision of editorially mediated critical texts (e.g. McGillivray 1997; O’Donnell 2005a) and projects, such as the Piers Plowman Electronic Archive (PPEA), that hope ultimately to derive such texts from material collected in their archives. But even here I think it is fair to say that the provision of a synthetic critical text is not what most digital editors consider to be the really interesting thing about their projects. What distinguishes the computer from the codex and makes digital editing such an exciting enterprise is precisely the ability the new medium gives us for collecting, cataloguing, and navigating massive amounts of raw information: transcriptions of every witness, collations of every textual difference, facsimiles of every page of every primary source. Even when the ultimate goal is the production of a critically mediated text, the ability to archive remains distracting4.

In some areas of study, this emphasis on collection over synthesis is perhaps not a bad thing. Texts like Piers Plowman and the Canterbury Tales have such complex textual histories that they rarely have been archived in any form useful to the average scholar; in such cases, indeed, the historical tendency—seen from our post-structuralist perspective—has been towards over-synthesis. In these cases, the most popular previous print editions were put together by editors with strong ideas about the nature of the textual history and/or authorial intentions of the works in question. Their textual histories, too, have tended to be too complex for easy presentation in print format (e.g. Manly and Rickert 1940). Readers with only a passing interest in these texts’ textual history have been encouraged implicitly or explicitly to leave the question in the hands of experts.

The area in which I work, Old English textual studies, has not suffered from this tendency in recent memory, however. Editions of Old English texts historically have tended to be under- rather than over-determined, even in print (Sisam 1953; Lapidge 1994, 1991). In most cases, this is excused by the paucity of surviving witnesses. Most Old English poems (about 97% of the known canon) survive in unique manuscripts (O’Donnell 1996a; Jabbour 1968; Sisam 1953). Even when there is more primary material, Anglo-Saxon editors work in a culture that resists attempts at textual synthesis or interpretation, preferring parallel-text or single-witness manuscript editions whenever feasible and limiting editorial interpretation to the expansion of abbreviations, word-division, and metrical layout, or, in student editions, the occasional normalisation of unusual linguistic and orthographic features (Sisam 1953). One result of this is that print practice in Anglo-Saxon studies over the last century or so has anticipated to a great extent many of the aspects that in other periods distinguish digital editions from their print predecessors.

Cædmon’s Hymn: A Case Study

The scholarly history of Cædmon’s Hymn, a text I have recently edited for the Society of Early English and Norse Electronic Texts series (O’Donnell 2005a), is a perfect example of how this tendency manifests itself in Old English studies. Cædmon’s Hymn is the most textually complicated poem of the Anglo-Saxon period, and, for a variety of historical, literary, and scholarly reasons, among the most important: it is probably the first recorded example of sustained poetry in any Germanic language; it is the only Old English poem for which any detailed account of its contemporary reception survives; and it is found in four recensions and twenty-one medieval manuscripts, a textual history which can be matched in numbers, but not complexity, by only one other vernacular Anglo-Saxon poem (the most recent discussion of these issues is O’Donnell 2005a).

The poem also has been well studied. Semi-diplomatic transcriptions of all known witnesses were published in the 1930s (Dobbie 1937)5. Facsimiles of the earliest manuscripts of the poem (dating from the mid-eighth century) have been available from various sources since the beginning of the twentieth century (e.g. Dobiache-Rojdestvensky 1928) and were supplemented in the early 1990s by a complete collection of high quality black and white photos of all witnesses in Fred C. Robinson and E.G. Stanley’s Old English Verse Texts from Many Sources (1991). Articles and books on the poem’s transmission and textual history have appeared quite regularly for over a hundred years. The poem has been at the centre of most debates about the nature of textual transmission in Anglo-Saxon England since at least the 1950s. Taken together, the result of this activity has been the development of an editorial form and history that resembles contemporary digital practice in everything but its medium of production and dissemination. Indeed, in producing a lightly mediated, witness- and facsimile-based archive, constructed over a number of generations by independent groups of scholars, Cædmon’s Hymn textual criticism even anticipates several recent calls for the development of a new digital model for collective, multi-project and multi-generational editorial work (e.g. Ore 2004; Robinson 2005).

The print scholarly history of the poem anticipates contemporary digital practice in another way as well: until recently, Cædmon’s Hymn had never been the subject of a modern critical textual edition. The last century has seen the publication of a couple of student editions of the poem (e.g. Pope and Fulk 2001; Mitchell and Robinson 2001), and some specialised reconstructions of one of the more corrupt recensions (Cavill 2000, O’Donnell 1996b, Smith 1938/1978, Wuest 1906). But there have been no critical works in the last hundred years that have attempted to encapsulate and transmit in textual form what is actually known about the poem’s transmission and recensional history. The closest thing to a standard edition for most of this time has been a parallel text edition of the Hymn by Elliot Van Kirk Dobbie (1942). Unfortunately, in dividing this text into Northumbrian and West-Saxon dialectal recensions, Dobbie produced an edition that ignored his own previous and never renounced work demonstrating that such dialectal divisions were less important than other distinctions that cut across dialectal lines (Dobbie 1937)6.

The Edition as Repository of Expert Knowledge

The problem with this approach—to Cædmon’s Hymn or any other text—should be clear enough. On the one hand, the poem’s textual history is, by Anglo-Saxon standards, quite complex and the subject of intense debate by professional textual scholars. On the other, the failure until recently to provide any kind of critical text representing the various positions in the debate has all but hidden the significance of this research—and its implications for work on other aspects of the Hymn—from the general reader. Instead of being able to take advantage of the expert knowledge acquired by editors and textual scholars of the poem over the last hundred years, readers of Cædmon’s Hymn have been forced either to go back to the raw materials and construct their own texts over and over again or to rely on a standard edition that misrepresents its own editor’s considered views of the poem’s textual history.

This is not an efficient use of these readers’ time. As Kevin Kiernan has argued, the textual history of Cædmon’s Hymn is not a spectacle for casual observers (Kiernan 1990), and most people who come to study Cædmon’s Hymn are not interested in collating transcriptions, deciphering facsimiles, and weighing options for grouping the surviving witnesses. What they want is to study the poem’s sources and analogues, its composition and reception, its prosody, language, place in the canon, significance in the development of Anglo-Saxon Christianity, or usefulness as an index in discussions of the position of women in Anglo-Saxon society—that is, all the other things we do with texts when we are not studying their transmission. What these readers want—and certainly what I want when I consult an edition of a work I am studying for reasons other than its textual history—is a text that is accurate, readable, and hopefully based on clearly defined and well-explained criteria. They want, in other words, to be able to take advantage of the expert knowledge of those responsible for putting together the text they are consulting. If they don’t like what they see, or if the approach taken is not what they need for their research, then they may try to find an edition that is better suited to their particular needs. But they will not—except in extreme cases I suspect—actually want to duplicate the effort required to put together a top-quality edition.

The Efficiency of Print Editorial Tradition

The failure of the print editors of Cædmon’s Hymn over the last hundred years to provide a critical-editorial account of their actual knowledge of the poem is very much an exception that proves the rule. For in anticipating digital approaches to textual criticism and editorial practice, textual scholars of Cædmon’s Hymn have, ironically, done a much poorer job of supplying readers with information about their text than the majority of their print-based colleagues have of other texts in other periods.

This is because, as we shall see, the dissemination of expert knowledge is something that print-based editors are generally very good at. At a conceptual level, the approaches print editors have developed over the last several hundred years to the arrangement of editorial and bibliographic information in the critical edition form an almost textbook example of the parsimonious organisation of information about texts and witnesses. While there are technological and conventional limitations to the way this information can be used and presented in codex form, digital scholars would be hard pressed to come up with a theoretically more sophisticated or efficient organisation for the underlying data.

Normalisation and Relational Database Design

Demonstrating the efficiency of traditional print practice requires us to make a brief excursion into questions of relational database theory and design7. In designing a relational database, the goal is to generate a set of relationship schemas that allow us to store information without unnecessary redundancy but in a form that is easily retrievable (Silberschatz, Korth, and Sudarshan 2006, 263). The relational model organises information into two-dimensional tables, each row of which represents a relationship among associated bits of information. Complex data commonly requires the use of more than one set of relations or tables. The key thing is to avoid complex redundancies: in a well designed relational database, no piece of information that logically follows from any other should appear more than once8.

The process used to eliminate redundancies and dependencies is known as normalisation. When data has been organised so that it is free of all such inefficiencies, it is usually said to be in third normal form. How one goes about doing this can be best seen through an example. The following is an invoice from a hypothetical book store (adapted from Krishna 1992, 32):

Invoice: JJSmith0001
Customer ID: JJS01
Name: Jane J. Smith
Address: 323 Fifteenth Street S., Lethbridge, Alberta T1K 5X3.
ISBN | Author | Title | Price | Quantity | Item Total
0-670-03151-8 | Pinker, Stephen | The Blank Slate: The Modern Denial of Human Nature | $35.00 | 1 | $35.00
0-8122-3745-5 | Burrus, Virginia | The Sex Lives of Saints: An Erotics of Ancient Hagiography | $25.00 | 2 | $50.00
0-7136-0389-5 | Dix, Dom Gregory | The Shape of the Liturgy | $55.00 | 1 | $55.00
Grand Total: $140.00

Describing the information in this case in relational terms is a three-step process. The first step involves identifying what it is that is to be included in the data model by extracting database field names from the document’s structure. In the following, parentheses are used to indicate information that can occur more than once on a single invoice:

Invoice: invoice_number, customer_id, customer_name, customer_address, (ISBN, author, title, price, quantity, item_total), grand_total

The second step involves extracting fields that contain repeating information and placing them in a separate table. In this case, the repeating information involves bibliographical information about the actual books sold (ISBN, author, title, price, quantity, item_total). The connection between this new table and the invoice table is made explicit through the addition of an invoice_number key that allows each book to be associated with a specific invoice9:

Invoice: invoice_number, customer_id, customer_name, customer_address, grand_total

Invoice_Item: invoice_number, ISBN, author, title, price, quantity, item_total

The final step involves removing functional dependencies within these two tables. In this database, for example, information about a book’s author, title and item_price are functionally dependent on its ISBN: for each ISBN, there is only one possible author, title, and item_price. Likewise customer_id is associated with only one customer_name and customer_address. These dependencies are eliminated by placing the dependent material in two new tables, Customer and Book, which are linked to the rest of the data by the customer_id and ISBN keys respectively.

At this point the data is said to be in third normal form: we have four sets of relations, none of which can be broken down any further:

Invoice: invoice_number, customer_id, grand_total

Invoice_Item: invoice_number, ISBN, quantity, item_total

Customer: customer_id, customer_name, customer_address

Book: ISBN, author, title, price
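To make the gain from normalisation concrete, the following is a minimal sketch of the same four relations using Python’s built-in sqlite3 module. It is my own illustration rather than anything in Krishna 1992: the table and column names follow the relations above, the rows are the invoice reproduced earlier, and the closing join shows that the original document can be reassembled from the normalised tables without loss.

import sqlite3

# Build the four third-normal-form relations described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer     (customer_id TEXT PRIMARY KEY, customer_name TEXT, customer_address TEXT);
CREATE TABLE Book         (ISBN TEXT PRIMARY KEY, author TEXT, title TEXT, price REAL);
CREATE TABLE Invoice      (invoice_number TEXT PRIMARY KEY, customer_id TEXT, grand_total REAL);
CREATE TABLE Invoice_Item (invoice_number TEXT, ISBN TEXT, quantity INTEGER, item_total REAL);
""")

# Each fact is recorded exactly once.
conn.execute("INSERT INTO Customer VALUES ('JJS01', 'Jane J. Smith', '323 Fifteenth Street S., Lethbridge, Alberta T1K 5X3')")
conn.executemany("INSERT INTO Book VALUES (?, ?, ?, ?)", [
    ("0-670-03151-8", "Pinker, Stephen", "The Blank Slate: The Modern Denial of Human Nature", 35.00),
    ("0-8122-3745-5", "Burrus, Virginia", "The Sex Lives of Saints: An Erotics of Ancient Hagiography", 25.00),
    ("0-7136-0389-5", "Dix, Dom Gregory", "The Shape of the Liturgy", 55.00),
])
conn.execute("INSERT INTO Invoice VALUES ('JJSmith0001', 'JJS01', 140.00)")
conn.executemany("INSERT INTO Invoice_Item VALUES (?, ?, ?, ?)", [
    ("JJSmith0001", "0-670-03151-8", 1, 35.00),
    ("JJSmith0001", "0-8122-3745-5", 2, 50.00),
    ("JJSmith0001", "0-7136-0389-5", 1, 55.00),
])

# The printed invoice is simply a join across the four tables.
for row in conn.execute("""
    SELECT c.customer_name, b.ISBN, b.author, b.title, b.price, it.quantity, it.item_total
    FROM Invoice i
    JOIN Customer c      ON c.customer_id = i.customer_id
    JOIN Invoice_Item it ON it.invoice_number = i.invoice_number
    JOIN Book b          ON b.ISBN = it.ISBN
    WHERE i.invoice_number = 'JJSmith0001'
"""):
    print(row)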

Normalising Editorial Data

The normalisation process becomes interesting when one applies it to the type of information editors commonly collect about textual witnesses. The following, for example, is a simplified version of a sheet I used to record basic information about each manuscript witness to Cædmon’s Hymn:

Shelf-Mark: B1 Cambridge, Corpus Christi College 41
Date: s. xi-1
Scribe: Second scribe of the main Old English text.
Location: Copied as part of the main text of the Old English translation of the Historia ecclesiastica (p. 332 [f. 161v], line 6)
Recension: West-Saxon eorðan recension
Text: Nuweherigan sculon

heofonrices weard metodes mihte

&hismod ge þanc weorc wuldor godes

[etc]

From the point of view of the database designer, this sheet has what are essentially fields for the manuscript sigil, date, scribe, location, and, of course, the text of the poem in the witness itself, something that can be seen, on analogy with our book store invoice, as itself a repeating set of (largely implicit) information: manuscript forms, normalised readings, grammatical and lexical information, metrical position, relationship to canonical referencing systems, and the like.

As with the invoice from our hypothetical bookstore, it is possible to place this data in normal form. The first step, once again, is to extract the relevant relations from the manuscript sheet and, in this case, the often unstated expert knowledge an editor typically brings to his or her task. This leads at the very least to the following set of relations10:

Manuscript: shelf_mark, date, scribe, location, (ms_instance, canonical_reading, dictionary_form, grammatical_information, translation)

Extracting the repeating information about individual readings leaves us with two tables linked by the key shelf_mark:

Manuscript: shelf_mark, date, scribe, location

Text: shelf_mark, ms_instance, canonical_reading, dictionary_form, grammatical_information, translation

And placing the material in third normal form generates at least one more:

Manuscript: shelf_mark, date, scribe, location

Text: shelf_mark, ms_instance, canonical_reading

Glossary: canonical_reading, dictionary_form, grammatical_information, translation

At this point, we have organised our data in its most efficient format. With the exception of the shelf_mark and canonical_reading keys, no piece of information is repeated in more than one table, and all functional dependencies have been eliminated. Of course, in real life there would be many more tables, and even then it would probably be impossible—and certainly not cost-effective—to treat all editorial knowledge about a given text as normalisable data.

What is significant about this arrangement, however, is the extent to which our final set of tables reflects the traditional arrangement of information in a stereotypical print edition: a section up front with bibliographic (and other) information about the text and associated witnesses; a section in the middle relating manuscript readings to editorially privileged forms; and a section at the end containing abstract lexical and grammatical information about words in the text. Moreover, although familiarity and the use of narrative can obscure this fact in practice, much of the information contained in these traditional sections of a print edition actually is in implicitly tabular form: in structural terms, a glossary entry is best understood as the functional equivalent of a highly structured list or table row, with information presented in a fixed order from entry to entry. Bibliographical discussions, too, often consist of what are, in effect, highly structured lists that can easily be converted to tabular format: one cell for shelf-mark, another for related bibliography, provenance, contents, and the like11.

Database Views and the Critical Text

This analogy between the traditional arrangement of editorial matter in print editions and normalised data in a relational database seems to break down, however, in one key location: the representation of the abstract text. For while it is possible to see how the other sections of a print critical edition might be rendered in tabular form, the critical text itself—the place where editors present an actual reading as a result of their efforts—is not usually presented in anything resembling the non-hierarchical, tabular form a relational model would lead us to expect. In fact, the essential point of the editorial text—and indeed the reason it comes in for criticism from post-structuralists—is that it eliminates non-hierarchical choice. In constructing a reading text, print editors impose order on the mass of textual evidence by privileging individual readings at each collation point. All other forms—the material that would make up the Text table in a relational database—are either hidden from the reader or relegated, usually only as a sample, to small type at the bottom of the page in the critical apparatus. Although it is the defining feature of the print critical edition, the critical text itself would appear to be the only part that is not directly part of the underlying, and extremely efficient, relational data model developed by print editors through the centuries.

But this does not invalidate my larger argument, because we build databases precisely in order to acquire this ability to select and organise data. If the critical text in a print edition is not actually a database table, it is a database view—that is to say a “window on the database through which data required for a particular user or application can be accessed” (Krishna 1992, 210). In computer database management systems, views are built by querying the underlying data and building new relations that contain one or more answers from the results. In print editorial practice, editors build critical texts by “querying” their knowledge of textual data at each collation point in a way that produces a single editorial reading. In this understanding, a typical student edition of a medieval or classical text might be understood as a database view built on the query “select the manuscript or normalised reading at each collation point that most closely matches paradigmatic forms in standard primers.” A modern-spelling edition of Shakespeare can be understood as the view resulting from a database query that instructs the processor to replace Renaissance spellings for the selected forms with their modern equivalents. And an edition like the Kane-Donaldson Piers Plowman can be understood as a view built on the basis of a far more complex query derived from the editors’ research on metre, textual history, and scribal practice. Even editorial emendations are, in this sense, simply the result of a query that requests forms from an unstated “normalised/emended equivalent” column in the editors’ intellectual understanding of the underlying textual evidence: “select readings from the database according to criteria x; if the resulting form is problematic, substitute the form found in the normalised/emended_equivalent column.”12.
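The analogy can be made concrete with another small sqlite3 sketch. Nothing in it comes from an actual edition: the schema follows the Manuscript, Text, and Glossary relations worked out above (with a word-position column added so that reading order can be recovered), the sample rows are illustrative placeholders loosely based on the B1 manuscript sheet, and the “critical text” is simply a view defined by a query over the Text table.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Manuscript (shelf_mark TEXT PRIMARY KEY, date TEXT, scribe TEXT, location TEXT);
-- position is an addition to the model described above, so that word order can be recovered
CREATE TABLE Text       (shelf_mark TEXT, position INTEGER, ms_instance TEXT, canonical_reading TEXT);
CREATE TABLE Glossary   (canonical_reading TEXT PRIMARY KEY, dictionary_form TEXT,
                         grammatical_information TEXT, translation TEXT);
""")

conn.execute("INSERT INTO Manuscript VALUES ('B1', 's. xi-1', 'Second scribe of the main Old English text', 'OE Historia ecclesiastica, p. 332 (f. 161v), line 6')")

# Illustrative placeholder readings only, not a real collation.
conn.executemany("INSERT INTO Text VALUES (?, ?, ?, ?)", [
    ("B1", 1, "Nu", "nu"),
    ("B1", 2, "we", "we"),
    ("B1", 3, "herigan", "herigean"),
    ("B1", 4, "sculon", "sculon"),
])

# A "critical text" as a database view: here the trivial query
# "select the readings of witness B1 in manuscript order".
conn.executescript("""
CREATE VIEW b1_reading_text AS
    SELECT position, ms_instance AS reading
    FROM Text
    WHERE shelf_mark = 'B1'
    ORDER BY position;
""")

print(" ".join(reading for _, reading in conn.execute("SELECT * FROM b1_reading_text")))
# -> Nu we herigan sculon

A different editorial approach is then just a different query over the same tables: a normalised student text, for example, could select canonical_reading instead of ms_instance, or join against Glossary to privilege readings that meet some grammatical criterion.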

How Digital Editors can Improve on Print Practice

If this understanding of the critical text and its relationship to the data model underlying print critical practice is correct, then digital editors can almost certainly improve upon it. One obvious place to start might seem to lie in formalising and automating the process by which print editors process and query the data upon which their editions are based. Such an approach, indeed, would have two main advantages: it would allow us to test others’ editorial approaches by modelling them programmatically; and it would allow us to take advantage of the inherent flexibility of the digital medium by providing users with access to limitless critical texts of the same work. Where, for economic and technological reasons, print editions tend to offer readers only a single critical approach and text, digital editions could now offer readers a series of possible approaches and texts built according to various selection criteria. In this approach, users would read texts either by building their own textual queries, or by selecting pre-made queries that build views by dynamically modelling the decisions of others—a Kane-Donaldson view of Piers Plowman, perhaps, or a Gabler reading text view of Ulysses.

This is an area of research we should pursue, even though, in actual practice, we are still a long way from being able to build anything but the simplest of texts in this manner. Certain processes can, of course, be automated and even improved upon electronically—we can use computers to collate readings from different witnesses, derive manuscript stemma, automatically normalise punctuation and spelling, and even model scribal performance (see Ciula 2005; O’Donnell 2005c). And it is easy to see how we might be able to build databases and queries so that we could model human editorial decisions in relatively simple cases—reproducing the flawed dialectal texts of Cædmon’s Hymn discussed above, perhaps, or building simple student editions of small poems.

Unfortunately, such conceptually simple tasks are still at the extreme outer limits of what it is currently possible, let alone economically reasonable, to do. Going beyond this and learning to automate higher-level critical decisions involving cultural, historical, or literary distinctions is beyond the realm of current database design and artificial intelligence, even for people working in fields vastly better funded than textual scholarship. Thus, while it would be a fairly trivial process to generate a reading text based on a single witness from an underlying relational database, building automatically a best text edition—that is to say, an edition in which a single witness is singled out automatically for reproduction on the basis of some higher-level criteria—is still beyond our current capabilities. Automating other distinctions of the type made every day by human editors—distinguishing between good and bad scribes, assessing difficilior vs. facilior readings, or weighing competing evidence of authorial authorisation—belongs as yet to the realm of science fiction.13.

This doesn’t let us off the hook, however. For while we are still far away from being able to truly automate our digital textual editions, we do need to find some way of incorporating expert knowledge into digital editions that are becoming ever more complex. The more evidence we cram into our digital editions, the harder it becomes for readers to make anything of them. No two witnesses to any text are equally reliable, authentic, or useful for all purposes at all times. In the absence of a system that can build custom editions in response to naïve queries—“build me a general interest text of Don Juan”, “eliminate unreliable scribes”, or even “build me a student edition”—digital editors still need to provide readers with explicit expert guidance as to how the at times conflicting data in their editions is to be assessed. In some cases, it is possible to use hierarchical and object-oriented data models to encode these human judgements so that they can be generated dynamically (see note 14 above). In other cases, digital editors, like their print predecessors, will simply have to build critical texts of their editions the old-fashioned way, by hand, or run the risk of failing to pass on the expert knowledge they have built up over years of scholarly engagement with the primary sources.

It is here, however, that digital editors can improve the most, both theoretically and practically, on traditional print practice. For if critical reading texts are, conceptually understood, the equivalent of query-derived database views, then there is no reason why readers of critical editions should not be able to entertain multiple views of the underlying data. Critical texts, in other words—as post-structuralist theory has told us all along—really are neither right nor wrong: they are simply views of a textual history constructed according to different, more or less explicit, selection criteria. In the print world, economic necessity and technological rigidity imposed constraints on the number of different views editors could reasonably present to their readers—and encouraged them in pre-post-structuralist days to see the production of a single definitive critical text as the primary purpose of their editions. Digital editors, on the other hand, have the advantage of a medium that allows the much easier inclusion of multiple critical views, a technology in which the relationship between views and data is widely known and accepted, and a theoretical climate that encourages an attention to variance. If we are still far from being at the stage in which we can produce critical views of our data using dynamic searches, we are able even now to hard-code such views into our editions in unobtrusive and user-friendly ways.14. By taking advantage of the superior flexibility inherent in our technology and the existence of a formal theory that now explains conceptually what print editors appear to have discovered by experience and tradition, we can improve upon print editorial practice by extending it to the point that it begins to subvert the very claims to definitiveness we now find so suspicious. By being more like our print predecessors, by ensuring that our expert knowledge is carefully and systematically encoded in our texts, we can, ironically, use the digital medium to offer our readers a greater flexibility in how they use our work.

Conclusion

And so in the end, the future of digital editing may lie more in our past than we commonly like to consider. While digital editorial theory has tended to define its project largely in reaction to previous print practice, this approach underestimates both the strength of the foundation we have been given to build upon and the true significance of our new medium. For the exciting thing about digital editing is not that it can do everything differently, but rather that it can do some very important things better. Over the course of the last half millennium, print editorial practice has evolved an extremely efficient intellectual model for the organisation of information about texts and witnesses—even as, in the last fifty years, we have become increasingly suspicious of the claims to definitiveness this organisation was often taken to imply. As digital editors, we can improve upon the work of our predecessors by first of all recognising and formalising the intellectual strength of the traditional editorial model and secondly reconciling it to post-structuralist interest in variation and change by implementing it far more fully and flexibly than print editors themselves could ever imagine. The question we need to answer, then, is not whether we can do things differently but how doing things differently can improve on current practice. But we won’t be able to answer this question until we recognise what current practice already does very very well.

Works Cited

Bart, Patricia R. 2006. Controlled experimental markup in a TEI-conformant setting. Digital Medievalist 2.1 <http://www.digitalmedievalist.org/article.cfm?RecID=10>.

British Library, nd. Turning the Pages. <http://www.bl.uk/onlinegallery/ttp/ttpbooks.html>.

Cavill, Paul. 2000. The manuscripts of Cædmon’s Hymn. Anglia 118: 499-530.

Cerquiglini, Bernard. 1989. Éloge de la variante: Histoire critique de la philologie. Paris: Éditions de Seuil.

Ciula, Arianna. 2005. Digital palaeography: Using the digital representation of medieval script to support palaeographic analysis. Digital Medievalist 1.1 <http://www.digitalmedievalist.org/article.cfm?RecID=2>

Dobbie, Elliott Van Kirk. 1937. The manuscripts of Cædmon’s Hymn and Bede’s Death Song with a critical text of the Epistola Cuthberti de obitu Bedæ. Columbia University Studies in English and Comparative Literature, 128. New York: Columbia University Press.

───, ed. 1942. The Anglo-Saxon minor poems. The Anglo-Saxon Poetic Records, a Collective Edition, 6. New York: Columbia University Press.

Dobiache-Rojdestvensky, O. 1928. Un manuscrit de Bède à Léningrad. Speculum 3: 314-21.

Finneran, Richard J., ed. 1996. The literary text in the digital age. Ann Arbor: University of Michigan Press.

Foys, Martin K., ed. 2003. The Bayeux Tapestry: Digital Edition. Leicester: SDE.

Greetham, D.C. 1994. Textual Scholarship. New York: Garland.

Jabbour, A. A. 1968. The memorial transmission of Old English poetry: a study of the extant parallel texts. Unpublished PhD dissertation, Duke University.

Karlsson, Lina and Linda Malm. 2004. Revolution or remediation? A study of electronic scholarly editions on the web. HumanIT 7: 1-46.

Kiernan, Kevin S. 1990. Reading Cædmon’s Hymn with someone else’s glosses. Representations 32: 157-74.

───, ed. 1999/2003. The electronic Beowulf. Second edition. London: British Library.

Krishna, S. 1992. Introduction to database and knowledge-base systems. Singapore: World Scientific.

Landow, George P. and Paul Delaney, eds. 1993. The digital word: text-based computing in the humanities. Cambridge, MA: MIT Press.

Lapidge, Michael. 1991. Textual criticism and the literature of Anglo-Saxon England. Bulletin of the John Rylands University Library. 73:17-45.

───. 1994. On the emendation of Old English texts. Pp. 53-67 in: D.G. Scragg and Paul Szarmach (ed.), The editing of Old English: Papers from the 1990 Manchester conference.

Manly, John M. and Edith Rickert. 1940. The text of the Canterbury tales. Chicago: University of Chicago Press.

McGann, Jerome J. 1983/1992. A critique of modern textual criticism. Charlottesville: University of Virginia Press.

───. 1994. Rationale of the hypertext. <http://www.iath.virginia.edu/public/jjm2f/rationale.htm>

McGillivray, Murray, ed. 1997. Geoffrey Chaucer’s Book of the Duchess: A hypertext edition. Calgary: University of Calgary Press.

McKenzie, D.F. 1984/1999. Bibliography and the sociology of texts. Cambridge: Cambridge University Press.

Mitchell, Bruce and Fred C. Robinson, eds. 2001. A guide to Old English. 6th ed. Oxford: Blackwell.

Nicols, Stephen G. Jr., ed. 1990. Speculum 65.

Ó Cróinín, Dáibhí. nd. The Foundations of Irish Culture AD 600-850. Website. <http://www.foundationsirishculture.ie/>.

O’Donnell, Daniel Paul. 1996a. Manuscript Variation in Multiple-Recension Old English Poetic Texts: The Technical Problem and Poetical Art. Unpubl. PhD Dissertation. Yale University.

───. 1996b. A Northumbrian version of “Cædmon’s Hymn” (eordu recension) in Brussels, Bibliothèque Royale MS 8245-57 ff. 62r2-v1: Identification, edition and filiation. Beda venerabilis: Historian, monk and Northumbrian, eds. L. A. J. R. Houwen and A. A. MacDonald. Mediaevalia Groningana, 19. 139-65. Groningen: Egbert Forsten.

───. 2005a. Cædmon’s Hymn: A multimedia study, edition, and archive. SEENET A.8. Cambridge: D.S. Brewer.

───. 2005b. O Captain! My Captain! Using Technology to Guide Readers Through an Electronic Edition. Heroic Age 8. <http://www.heroicage.org/issues/8/em.html>

───. 2005c. The ghost in the machine: Revisiting an old model for the dynamic generation of digital editions. HumanIT 8 (2005): 51-71.

───. Forthcoming. If I were “You”: How Academics Can Stop Worrying and Learn to Love “the Encyclopedia that Anyone Can Edit.” Heroic Age 10.

Ore, Espen S. 2004. Monkey Business—or What is an Edition? Literary and Linguistic Computing 19: 35-44.

Pearsall, Derek. 1985. Editing medieval texts. Pp. 92-106 in Textual criticism and literary interpretation. Ed. Jerome J. McGann. Chicago: U Chicago.

Pope, John C. and R. D. Fulk, eds. 2001. Eight Old English poems. 3rd ed. New York: W. W. Norton.

Railton, Stephen, ed. 1998-. Uncle Tom’s Cabin and American Culture. Charlottesville: University of Virginia. Institute for Advanced Technology in the Humanities. <http://www.iath.virginia.edu/utc/>.

Reed Kline, Naomi, ed. 2001. A Wheel of Memory: The Hereford Mappamundi. Ann Arbor: University of Michigan Press.

Robinson, Fred C. and E. G. Stanley, eds. 1991. Old English verse texts from many sources: a comprehensive collection. Early English Manuscripts in Facsimile, 23. Copenhagen: Rosenkilde & Bagger.

Robinson, Peter. nd. New Methods of Editing, Exploring, and Reading the Canterbury Tales. <http://www.cta.dmu.ac.uk/projects/ctp/desc2.html>.

───, ed. 1996. The Wife of Bath’s Prologue on CD-ROM. Cambridge: Cambridge University Press.

───. 2004. Where are we with electronic scholarly editions, and where do we want to be? Jahrbuch für Computerphilologie Online at <http://computerphilologie.uni-muenchen.de/ejournal.html>. Also available in print: Jahrbuch für Computerphilologie. 123-143.

───. 2005. Current issues in making digital editions of medieval texts—or, do electronic scholarly editions have a future? Digital Medievalist 1.1 <http://www.digitalmedievalist.org/article.cfm?RecID=6>

───. 2006. The Canterbury Tales and other medieval texts. In Burnard, O’Brien O’Keeffe, and Unsworth. New York: Modern Language Association of America.

Shillingsburg, Peter L. 1996. Electronic editions. Scholarly editing in the computer age: Theory and practice. Third edition.

Silberschatz, Avi, Hank Korth, and S. Sudarshan. 2006. Database system concepts. New York: McGraw-Hill.

Sisam, Kenneth. 1953. Studies in the history of Old English literature. Oxford: Clarendon Press.

Smith, A.H., ed. 1938/1978. Three Northumbrian poems: Cædmon’s Hymn, Bede’s Death Song, and the Leiden Riddle. With a bibliography compiled by M. J. Swanton. Revised ed. Exeter Medieval English Texts. Exeter: University of Exeter Press.

Wuest, Paul. 1906. Zwei neue Handschriften von Cædmons Hymnus. ZfdA 48: 205-26.

Notes

1 In a report covering most extant, web-based scholarly editions published in or before 2002, Lina Karlsson and Linda Malm suggest that most digital editors up to that point had made relatively little use of the medium’s distinguishing features: “The conclusion of the study is that web editions seem to reproduce features of the printed media and do not fulfil the potential of the Web to any larger extent” (2004 abstract).

2 As this list suggests, my primary experience with actual practice is with digital editions of medieval texts. Recent theoretical and practical discussions, however, suggest that little difference is to be found in electronic texts covering other periods.

3 Synthetic here is not quite synonymous with eclectic as used to describe the approach of the Greg-Bowers school of textual criticism. Traditionally, an eclectic text is a single, hypothetical, textual reconstruction (usually of the presumed Authorial text) based on the assumption of divided authority. In this approach, a copy text is used to supply accidental details of spelling and punctuation and (usually) to serve as a default source for substantive readings that affect the meaning of the abstract artistic work. Readings from this copy text are then corrected by emendation or, preferably, from forms found in other historical witnesses. In this essay, synthetic is used to refer to a critical text that attempts to summarise in textual form an editorial position about an abstract work’s development at some point in its textual history. All eclectic texts are therefore synthetic, but not all synthetic texts are eclectic: a best text (single witness) edition is also synthetic if, as the name implies, an editorial claim is being made about the particular reliability, historical importance, or interest of the text as represented in the chosen witness. A diplomatic transcription, however, is not synthetic: the focus there is on reporting the details of a given witness as accurately as possible. For a primer on basic concepts in textual editing, excluding the concept of the synthetic text as discussed here, see Greetham 1994.

4 It is indeed significant that the PPEA—the most ambitious digital critical edition of a medieval text that I am aware of—is at this stage in its development publishing primarily as an archive: the development of critical texts of the A-, B-, and C-text traditions has been deferred until after the publication of individual edition/facsimiles of the known witnesses (Bart 2006).

5 Transcriptions, editions, facsimiles, and studies mentioned in this paragraph in many cases have been superseded by subsequent work; readers interested in the current state of Cædmon’s Hymn should begin with the bibliography in O’Donnell 2005a.

6 While there is reason to doubt the details of Dobbie’s recensional division, his fundamental conclusion that dialect did not play a crucial role in the poem’s textual development remains undisputed. For recent (competing) discussions of the Hymn’s transmission, see O’Donnell 2005a and Cavill 2000.

7 There are other types of databases, some of which are at times more suited to representation of information encoded in structural markup languages such as XML, and to the type of manipulation common in textual critical studies (see below, note 14). None of these other models, however, express information as parsimoniously as does the relational model (see Silberschatz, Korth, and Sudarshan 2006, 362-365).

8 This is a rough rather than a formal definition. Formally, a well-designed relational database normally should be in either third normal form or Boyce-Codd normal form (BCNF). A relation is said to be in third normal form when a) the domains of all attributes are atomic, and b) all non-key attributes are fully dependent on the key attributes (see Krishna 1992, 37). A relation R is said to be in BCNF if, whenever a non-trivial functional dependency X → A holds in R, X is a superkey for R (Krishna 1992, 38). Other normal forms exist for special kinds of dependencies (Silberschatz, Korth, and Sudarshan 2006, 293-298).
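To give a toy illustration (my own example, not drawn from the works cited): in a relation R(witness, word, form, library) with the dependencies {witness, word} → form and witness → library, the second dependency violates BCNF, since witness on its own is not a superkey of R; decomposing R into R1(witness, word, form) and R2(witness, library) removes the violation.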

9 In actual fact, the model for a real bookstore invoice would be more complex, since the example here does not take into account the possibility that there might be more than one copy of any ISBN in stock. A real bookstore would need additional tables to allow it to keep track of inventory.

10 In actual practice, the model would be far more complex and include multiple levels of repeating information (words within lines and relationships to canonical reference systems, for example). This example also assumes that the word is the basic unit of collation; while this works well for most Old English poetry, it may not for other types of literature.

11 Of course, critical editions typically contain far more than bibliographic, textual, and lexical/grammatical information. This too can be modelled relationally, although it would be quixotic to attempt, in this essay, to account for the infinite range of possible material one might include in a critical edition. Thus cultural information about a given text or witness is functionally dependent on the specific text or witness in question. Interestingly, the more complex the argumentation becomes, the less complex the underlying data model appears to be: a biographical essay on a text’s author, for example, might take up but a single cell in one of our hypothetical tables.

12 The critical apparatus in most print and many digital editions is itself also usually a view of an implicit textual database, rather than the database itself. Although it usually is presented in quasi-tabular form, it rarely contains a complete accounting for every form in the text’s witness base.

13 This is not to say that it is impossible to use data modelling to account for these distinctions—simply that we are far from being able to derive them arbitrarily from two dimensional relational databases, however complex. Other data models, such as hierarchical or object-oriented databases can be used to build such distinctions into the data itself, though this by definition involves the application of expert knowledge. In O’Donnell 2005a, for example, the textual apparatus is encoded as a hierarchical database. This allows readers to in effect query the database, searching for relations pre-defined as significant, substantive, or orthographic by the editor. See O’Donnell 2005a, §§ ii.7, ii.19, 7.2-9.
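A schematic sketch may help make this concrete (the grouping and the type values below are invented for illustration and are not the markup actually used in O’Donnell 2005a; the sigla and readings are placeholders):

<app>
 <lem>heofonrices</lem>
 <rdggrp type="substantive">
  <!-- readings that differ in substance from the lemma -->
  <rdg wit="MS-A">...</rdg>
  <rdggrp type="orthographic">
   <!-- spelling-only variants of the reading above -->
   <rdg wit="MS-B">...</rdg>
  </rdggrp>
 </rdggrp>
</app>

Because the classification is built into the data, a reader’s query (or an editor’s stylesheet) can select, say, only substantive variation without the apparatus having to be re-edited.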

14 In the case of my edition of Cædmon’s Hymn, this takes the form of multiple critical texts and apparatus: several reconstructions of the poem’s archetypal form, and various critical views of the poem’s five main recensions and collations. The criteria used to construct these views are indicated explicitly in the title of each page and explained in detail in the editorial introductions. The individual editions were extracted from an SGML encoded text using stylesheets—in essence hard-wired database queries reflecting higher-level editorial decisions—but presented to the reader as a series of progressively abstract views. In keeping with the developing standard for digital textual editions, the edition also allows users direct access to the underlying transcriptions and facsimiles upon which it is based. The result is an edition that attempts to combine the best of the digital and print worlds: the archiving function common to most electronic editions (and traditionally the focus of Cædmon’s Hymn textual research in print), with the emphasis on the presentation of expert knowledge characteristic of traditional print editorial practice.

----  

The Ghost in the Machine: Revisiting an Old Model for the Dynamic Generation of Digital Editions

Posted: Dec 16, 2006 00:12;
Last Modified: May 23, 2012 20:05

Tags: , , , , , ,

---

First Published: HumanIT 8.1 (2005): 51-71. http://www.hb.se/bhs/ith/1-8/dpo.pdf

“The Electronic Cædmon’s Hymn Editorial Method” (1998)

In 1998, a few months into the preparation of my electronic edition of the Old English poem Cædmon’s Hymn (O’Donnell forthcoming), I published a brief prospectus on the “editorial method” I intended to follow in my future work (O’Donnell 1998). Less a true editorial method than a proposed workflow and list of specifications, the prospectus called for the development of an interactive edition-processor by which “users will […] be able to generate mediated (‘critical’) texts on the fly by choosing the editorial approach which best suits their individual research
or study needs” (O’Donnell 1998, ¶ 1).

The heart of the prospectus was a diagram of the “Editorial Process Schema” I intended to follow (figure 1). The edition was to be based on TEI (P2) SGML-encoded diplomatic transcriptions of all twenty-one known witnesses to the poem. Its output was to consist of dynamically generated “HTML/XML” display texts that would allow users access to different views of the underlying textual data depending on their specific interests: e.g. editions containing reconstructions of archetypal texts, student texts based on witnesses showing the simplest vocabulary and grammar, “best text” editions of individual witnesses or recensions, etc. The production of these display texts was to be handled by a series of SGML “filters” or “virtual editions” that would be populated by the
unspecified processor used to format and display the final output. [Begin p. 51]

Figure 1. Editorial Process Schema (O’Donnell 1998)

Goals

The initial impetus for this approach was practical. Although it is quite short, Cædmon’s Hymn has a relatively complex textual history for an Anglo-Saxon poem. Even in print, it has always been edited as a multitext. The standard print edition (Dobbie 1942) reproduces two editorial versions of the poem without commenting on their relative priority. Few other studies have managed to be even this decisive. Dobbie’s text was the last (before my forthcoming edition) to attempt to produce critical texts based on the entire manuscript tradition. Most editions before and
since have concentrated on individual recensions or groups of witnesses1. Anticipating great difficulty in proof-reading an electronic edition that might have several editorial texts and multiple textual apparatus2, I was at this early stage keenly interested in reducing the opportunity for typographical error. A workflow that would allow me to generate a number of [Begin p. 52] different critical texts from a single set of diplomatic transcriptions without retyping was for this reason an early desideratum.

This convenience, however, was not to come at the expense of editorial content: a second important goal of my prospectus was to find an explicit home for the editor in what Murray McGillivray recently had described as a “post-critical” world (McGillivray 1994; see also Ross 1996; McGann 1997). In medieval English textual studies in 1998, indeed, this post-critical world seemed to be fast approaching: the first volume of the Canterbury Tales Project, with its revolutionary approach to electronic collation and stemmatics and a lightly-edited guide text, had been published two years earlier (Robinson 1996). Forthcoming publications from the Piers Plowman Electronic Archive (Adams et al. 2000) and Electronic Beowulf (Kiernan 1999) projects, similarly, promised a much heavier emphasis on the manuscript archive (and less interest in the critical text) than their more traditional predecessors. My initial work with the Cædmon’s Hymn manuscripts (e.g. O’Donnell
1996a; O’Donnell 1996b), however, had convinced me that there was a significant need in the case of this text for both user access to the witness archive and editorial guidance in the interpretation of this primary evidence – or, as Mats Dahlström later would point out, that the two approaches had complementary strengths and weaknesses:

The single editor’s authoritative control in the printed SE [Scholarly Edition], manifested in e.g. the versional prerogative, isn’t necessarily of a tyrannical nature. Conversely, the much spoken-of hypermedia database exhibiting all versions of a work, enabling the user to choose freely between them and to construct his or her “own” version or edition, presupposes a most highly competent user, and puts a rather heavy burden on him or her. Rather, this kind of ultra-eclectic archive can result in the user feeling disoriented and even lost in hyperspace. Where printed SE:s tend to bury rival versions deep down in the variant apparatuses, the document architecture of extreme hypertext SE:s, consequential to the very nature of digitally realised hypertext, threatens to bury the user deep among the mass of potential virtuality. (Dahlström 2000, 17) [Begin p. 53]

Keen as I was to spare myself some unnecessary typing, I did not want this saving to come at the expense of providing access to the “insights and competent judgement” (Dahlström 2000, 17) I hoped to acquire in several years’ close contact with the manuscript evidence. What I needed, in other words, was a system in which the computer would generate, but a human edit, the final display texts presented to the reader.

Theory

In order to accomplish these goals, the prospectus proposed splitting the editorial process into distinct phases: a transcription phase, in which human scholars recorded information about the text as it appeared in the primary sources (the “Witness Archive”); an editorial (“Filtering”) phase, in which a human editor designed a template by which a display text was to be produced from the available textual evidence (“Virtual Editions”); a processing phase, in which a computer applied these filters to the Witness Archive; and a presentation phase, in which the resultant output was presented to the reader. The first and second stages were to be the domains of the human editor; the third and fourth those of the computer. An important element of this approach was the assumption that the human editor, even in traditional print sources, functioned largely as a rules-based interpreter of textual data – or as I (in retrospect unfortunately) phrased it, could be “reduced to a set of programming instructions”3 – in much the same way as a database report extracts and formats specific information from the underlying data table of a database:

In my view, the editor of a critical edition is understood as being functionally equivalent to a filter separating the final reader from the uninterpreted data contained in the raw witnesses. Depending on the nature of the instructions this processor is given, different types of manipulation will occur in producing the final critical edition. An editor interested in producing a student edition of the poem, for example, can be understood to be manipulating the data according to the instructions “choose the easiest (most sensible) readings and ignore those which raise advanced textual problems”; an editor interested in producing the “original” text can be seen as a processor performing the instruction “choose readings from the earliest manuscript(s) when these are available [Begin p. 54] and sensible; emend or normalise readings as required”; and an editor interested in producing an edition of a specific dialectal version of a text is working to the instruction “choose readings from manuscripts belong to dialect x; when these are not available, reconstruct or emend readings from other manuscripts, ensuring that they conform to the spelling rules of the dialect”. (O’Donnell 1998, ¶¶ 4 f.)

Advantages

From a theoretical perspective, the main advantage of this approach was that it provided an explicit location for the encoding of editorial knowledge – as distinct from textual information about primary sources, or formatting information about the final display. By separating the markup used to describe a text’s editorial form from that used to describe its physical manifestation in the witnesses, or its final appearance to the end user, this method made it easier in principle both to describe phenomena at a given level in intrinsically appropriate terms and to modify, reuse, or revise information at each level without necessarily having to alter other aspects of the edition design – in much the same way as the development of structural markup languages themselves had freed text encoders from worrying unduly about final display. Scholars working on a diplomatic transcription of a manuscript in this model would be able to describe its contents without having to ensure that their markup followed the same semantic conventions (or even DTD) as that used at the editorial or display levels.

Just as importantly, the approach was, in theory at least, infinitely extensible. Because it separated transcription from editorial activity, and because it attempted to encode editorial activity as a series of filters, users were, in principle, free to ignore, adapt, add to, or replace the work of the original editor. Scholars interested in statistical or corpus work might choose to work with raw SGML data collected in the witness archive; those interested in alternative editorial interpretations might wish to provide their own filters; those wishing to output the textual data to different media or using different display formats were free to adapt or substitute a different processor. Espen S. Ore recently has discussed how well-made and suitably-detailed transcriptions of source material might be used or adapted profitably by other scholars and projects as the basis [Begin p. 55] for derivative work (Ore 2004); from a theoretical perspective the “Editorial Method” proposed for use in Cædmon’s Hymn offered an early model for how such a process might be built into an edition’s design. Indeed, the method in principle allowed editors of new works to operate in the other direction as well: by building appropriate filters, editors of original electronic editions could attempt to model the editorial decisions of their print-based predecessors, or apply techniques developed for other texts to their own material4.

Implementation (1998)

Despite its theoretical attractiveness, the implementation of this model proved, in 1998, to be technically quite difficult. The main problem was access to technology capable of the type of filtering envisioned at the Virtual Edition level. In the original model, these “editions” were supposed to be able both to extract readings from multiple source documents (the individual witness transcriptions) and to translate their markup from the diplomatic encoding used in the original transcriptions to that required by the new context – as a reading used in the main text of a critical edition, say, or a form cited in an apparatus entry, textual note, or introductory paragraph. This type of transformation was not in and of itself impossible to carry out at the time: some SGML production environments and several computer languages (e.g. DSSSL or, more generally, Perl and other scripting languages) could be used to support most of what I wanted to do; in the days before XSL, however, such solutions were either very much cutting edge, or very expensive in time and/or resources. As a single scholar without a dedicated technical staff or funding to purchase commercial operating systems, I was unable to take full advantage of the relatively few transformation options then available.

The solution I hit upon instead involved dividing the transformation task into two distinct steps (extraction and translation) and adding an extra processing level between the witness and virtual edition levels in my original schema: [Begin p. 56]

Figure 2. Implemented Schema

Instead of acting as the locus of the transformation, the editorial filters in this revised model provided a context for text that had been previously extracted from the witness archive and transformed for use in such circumstances. The text these filters called upon was stored in a textual database as part of the project’s entity extension file (project.ent, see Sperberg-McQueen and Burnard 2004, § 3.3), and hence resident in the project DTD. The database itself was built by extracting marked-up readings from the original witness transcription files (using grep) and converting them (using macros and similar scripts) to entities that could be called by name anywhere in the project. Transformations involving a change in markup syntax or semantics (e.g. from a diplomatic-linguistic definition of a word in witness transcriptions to a syntactic and morphological definition in the edition files) also generally were performed in this DTD extension file. [Begin p. 57]

First two lines of a TEI SGML transcription of Cædmon’s Hymn witness T1:

<l id="t1.1" n="1">
 <seg type="MSWord" id="t1.1a.1">Nu<space extent="0"></seg>
 <seg type="MSWord" id="t1.1a.2"><damage type="stain" degree="moderate">sculon</damage><space></seg>
 <note id="t1.1a.3.n" type="transcription" target="t1.1a.2 t1.1a.4 t1.1b.1 t1.2b.3 t1.3a.1 t1.4a.1 t1.4a.2 t1.4b.1 t1.6a.1 t1.6a.2 t1.7b.1 t1.7b.2 t1.9b.2">&copyOft1.1a.2;…&copyOft1.9b.2;] Large stain obscures some text down inside (right) margin of p. 195 in facsimile. Most text is readable, however.</note>
 <seg type="MSWord" id="t1.1a.3"><damage type="stain" degree="moderate">herigean</damage><space></seg>
 <caesura>
 <seg type="MSWord" id="t1.1b.1"><damage type="stain" degree="light">he</damage>ofon<lb>rices<space></seg>
 <seg type="MSWord" id="t1.1b.2">&wynn;eard<space></seg>
</l>
<l id="t1.2" n="2">
 <seg type="MSWord" id="t1.2a.1">meotodes<space></seg>
 <seg type="MSWord" id="t1.2a.2">me<corr sic="u" cert="50%"><del rend="overwriting">u</del><add rend="overwriting" place="intralinear">a</add></corr>hte<space></seg>
 <note type="transcription" id="t1.2a.2.n" target="t1.2a.2" resp=dpod>&copyOft1.2a.2;] Corrected from <foreign>meuhte</foreign>?</note>
 <caesura>
 <seg type="MSWord" id="t1.2b.1">&tyronianNota;<space extent="0"></seg>
 <seg type="MSWord" id="t1.2b.2">his<space></seg>
 <seg type="MSWord" id="t1.2b.3"><damage type="stain" degree="severe"><unclear reason="stain in facsimile" cert="90%">mod</unclear></damage><damage type="stain" degree="moderate">geþanc</damage><space></seg>
 <note type="transcription" id="t1.2b.3.n" target="t1.2b.3">&copyOft1.2b.3;] <c>mod</c> obscured by stain in facsimile.</note>
</l>

Same text after conversion to entity format (information from the original l, w, caesura, and note elements is stored separately).

<!ENTITY t1.1a.1 'Nu<space type="wordBoundary" extent="0">'>
<!ENTITY t1.1a.2 'sc<damage type="stain" rend="beginning">ulon</damage><space type="wordBoundary" extent="1">'>

[Begin p. 58]

<!ENTITY t1.1a.3 '<damage type="stain" rend="middle">herıgean</damage><space type="wordBoundary" extent="1">'>
<!ENTITY t1.1b.1 '<damage type="stain" rend="end">heo</damage>fon<lb>rıces<space type="wordBoundary" extent="1">'>
<!ENTITY t1.1b.2 '&mswynn;eard<space type="wordBoundary" extent="1">'>
<!ENTITY t1.2a.1 'meotodes<space type="wordBoundary" extent="1">'>
<!ENTITY t1.2a.2 'me<damage type="stain" rend="complete">a</damage>hte<space type="wordBoundary" extent="1">'>
<!ENTITY t1.2b.1 '<abbr type="scribal" expan="ond/and/end">&tyronianNota;</abbr><expan type="scribal">ond</expan><space type="wordBoundary" extent="0">'>
<!ENTITY t1.2b.2 'hıs<space type="wordBoundary" extent="1">'>
<!ENTITY t1.2b.3 '<damage type="stain" rend="beginning"><unclear rend="complete">mod</unclear>geþanc</damage><space type="wordBoundary" extent="1">'>

Same text after conversion to editorial format for use in editions.

<!ENTITY ex.1a.1 'Nu'>
<!ENTITY ex.1a.2 'sculon'>
<!ENTITY ex.1a.3 'herigean'>
<!ENTITY ex.1b.1 'heofonrices'>
<!ENTITY ex.1b.2 '&edwynn;eard'>
<!ENTITY ex.2a.1 'meotodes'>
<!ENTITY ex.2a.2 'meahte'>
<!ENTITY ex.2b.1 'ond'>
<!ENTITY ex.2b.2 'his'>
<!ENTITY ex.2b.3 'modgeþanc'>

Citation from the text of T1 (bold) in an introductory chapter (simplified for demonstration purposes).

<p id="CH6.420" n="6.42">Old English <mentioned lang="ANG">swe</mentioned>, <mentioned lang="ANG">swæ</mentioned>, <mentioned lang="ANG">swa</mentioned> appears as <mentioned rend="postcorrection" lang="ANG">&t1.3b.1;</mentioned> (&carmsx; <mentioned rend="postcorrection" lang="ANG">&ar.3b.1;</mentioned>) in all West-Saxon witnesses of the poem on its sole occurrence in 3b.
The expected West-Saxon development is <mentioned lang="ANG">swæ</mentioned>, found in early West-Saxon.
As in most dialects, however, <mentioned lang="ANG">swa</mentioned> develops irregularly in the later period. [Begin p. 59]
<mentioned lang="ANG">Swa</mentioned> is the usual late West-Saxon reflex (see &hogg1992;, § 3.25, n. 3).</p>

Citation from the text of T1 (bold) in a textual apparatus (simplified for demonstration purposes).

<app id="EX.1A.1.APP" n="1" from="EX.1A.1">
 <lem id="EX.1A.1.LEM" n="1a">&ex.1a.1;</lem>
 <rdggrp>
  <rdggrp>
   <rdggrp>
    <rdg id="T1.1A.1.RDG" wit="T1">&t1.1a.1;</rdg><wit><xptr doc="t1" from="T1.1A.1" n="T1" rend="eorthan"></wit>
    <rdg id="O.1A.1.RDG" wit="O (Pre-Correction)"><seg rend="precorrection">&o.1a.1;</seg></rdg><wit><xptr doc="o" from="O.1A.1" n="O (Pre-Correction)" rend="eorthan"></wit>
   </rdggrp>
  </rdggrp>
  <rdggrp>
   <rdggrp>
    <rdg id="N.1A.1.RDG" wit="N">&n.1a.1;</rdg><wit><xptr doc="n" from="N.1A.1" n="N" rend="eorthan"></wit>
   </rdggrp>
  </rdggrp>
 </rdggrp>
 <rdggrp>
  <rdggrp>
   <rdggrp>
    <rdg id="B1.1A.1.RDG" wit="B1">&b1.1a.1;&b1.1a.2;</rdg><wit><xptr doc="b1" from="B1.1A.1" n="B1" rend="eorthan"></wit>
    <rdg id="TO.1A.1.RDG" wit="To">&to.1a.1;&to.1a.2;</rdg><wit><xptr doc="to" from="TO.1A.1" n="To" rend="eorthan"></wit>
    <rdg sameas="O.1A.1.RDG" wit="O (Post-Correction)"><seg rend="postcorrection">&o.1a.1;&o.1a.2;</seg></rdg><wit><xptr doc="o" from="O.1A.1" n="O (Post-Correction)" rend="eorthan"></wit>
    <rdg id="CA.1A.1.RDG" wit="Ca">&ca.1a.1;&ca.1a.2;</rdg><wit><xptr doc="ca" from="CA.1A.1" n="Ca" rend="eorthan"></wit>
   </rdggrp>
  </rdggrp>
 </rdggrp>
</app>

[Begin p. 60]

Implementation (2005)

The solutions I developed in 1998 to the problem of SGML transformation are no longer of intrinsic interest to Humanities Computing specialists except, perhaps, from a historical perspective. With the publication of the first XSL draft in November 1999, and, especially, the subsequent rapid integration of XSL and XML into commercial and academic digital practice, editors soon had far more powerful languages and tools available to accomplish the same ends.

Where my solutions are valuable, however, is as proof-of-concept. By dividing the editorial process into distinct phases, I was able to achieve, albeit imperfectly, both my original goals: no Old English text from the primary witnesses was input more than once in my edition and I did to a certain extent find in the “Virtual Editions” an appropriate and explicit locus for the encoding of editorial information.

With the use of XSLT, however, it is possible to improve upon this approach in both practice and theory. In practical terms, XSLT functions and instructions such as document() and xsl:result-document eliminate the need for a pre-compiled textual database: scholars using XSLT today can work, as I originally had hoped to, directly with the original witness transcriptions, extracting readings, processing them, and outputting them to different display texts using a single language and processor – and indeed perhaps even a single set of stylesheets.
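By way of illustration, the following minimal XSLT 2.0 sketch shows the shape such a stylesheet might take (it is my reconstruction for this discussion, not a stylesheet from the edition itself; the witness file names t1.xml, o.xml, and n.xml and the seg/@id pattern are assumptions based on the samples printed above):

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <!-- A miniature "virtual edition": readings are fetched from the witness
      transcriptions at transformation time, so no pre-compiled entity
      database is required. -->
 <xsl:template name="main">
  <xsl:for-each select="('t1', 'o', 'n')">
   <xsl:variable name="wit" select="."/>
   <!-- write one small display text per witness -->
   <xsl:result-document href="{$wit}-display.html">
    <html>
     <body>
      <!-- document() loads the transcription; the predicate picks out reading 1a.1 -->
      <p>
       <xsl:value-of select="document(concat($wit, '.xml'))//seg[@id = concat($wit, '.1a.1')]"/>
      </p>
     </body>
    </html>
   </xsl:result-document>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

Run with an XSLT 2.0 processor such as Saxon (invoking main as the initial template), this writes t1-display.html, o-display.html, and n-display.html; a real stylesheet would add the apparatus, editorial matter, and formatting discussed elsewhere in this essay, but the basic division of labour is the same.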

In theoretical terms, moreover, the adoption of XSLT helps clarify an ambiguity in my original proposal. Because, in 1998, I saw the process of generating an edition largely as a question of translation from diplomatic to editorial encoding, my original model distinguished between the first two levels on largely semantic grounds. The Witness Archive was the level that was used to store primary readings from the poem’s manuscripts; the filter or Virtual Edition level was used to store everything else, from transformations necessary to translate witness readings into
editorial forms to secondary textual content such as introductory chapters, glossary entries, and bibliography.

In XSLT terms, however, there is no significant reason for maintaining such a distinction: to the stylesheet, both types of content are simply raw material for the transformation. What this raw material is, where it came from, or who its author is, are irrelevant to the stylesheet’s task of
[Begin p. 61] organisation, adaptation, interpretation, and re-presentation. While poor quality or poorly constructed data will affect the ultimate quality of its output, data composition and encoding remain, in the XSLT world, distinct operations from transformation.

This is significant because it helps us refine our theoretical model of the editorial process and further isolate the place where editorial intelligence is encoded in a digital edition. For organisation, adaptation, interpretation, and re-presentation are the defining tasks of the scholarly editor as much as they are those of the XSLT stylesheet. Change the way a standard set of textual data is interpreted, organised, adapted, or presented, and you change the nature of the final “edition”. Editions of literary works are usually based on very similar sets of primary data – there is only one Beowulf manuscript, after all, and even better attested works usually have a relatively small set of textually significant witnesses, editions, or recensions. The differences that do arise between modern editions of literary texts tend for the most part to hinge on the reinterpretation of existing evidence, rather than any real change in the available data5. In traditional editions, the evidence for this observation can be obscured by the fact that the “editor” also usually is responsible for much of the secondary textual content. That the observation is true, however, is demonstrated by emerging forms of digital editions in which the editorial function is largely distinct from that of content creation: multigenerational and derivative editions such as those discussed by Ore (2004), as well as interactive models such as that proposed by the Virtual Humanities Lab (e.g. Armstrong & Zafrin 2005), or examples in which users reinterpret data in already existing corpora or databases (e.g. Green 2005).

Taken together, this suggests that my 1998 model was correct in its division of the editorial process into distinct tasks, but imprecise in its understanding of the editorial function. [Begin p. 62]

Figure 3. Revised Schema

In the revised version, the original “Witness Archive” is now reconceived more generally as a collection of textual data used in the edition, regardless of source or type. This data is then organised, interpreted, adapted, and prepared for presentation using stylesheets (and perhaps other organisational tools) provided by an “editor” – regardless of whether this “editor” is the person responsible for assembling and/or authoring the original content, an invited collaborator, or even an end user. As in the original model, this reorganisation is then presented using an appropriate display medium.

Conclusion

Technical advances of the last eight years have greatly improved our ability to extract and manipulate textual data – and our ability to build editions in ways simply impossible in print. The model for the editorial [Begin p. 63] process proposed in O’Donnell (1998) represented an early attempt to understand how the new technology might affect the way editors work, and, more importantly, how this technology might be harnessed more efficiently. With suitable modifications to reflect our field’s growing sophistication, the model appears to have stood the test of time, and proven itself easily adapted to include approaches developed since its original publication. From my perspective, however, a real sign of strength is that it continues to satisfy my original two goals: it suggests a method for avoiding reinputting primary source documents, and it provides a description of the locus of editorial activity; in an increasingly collaborative and interactive scholarly world, it appears that the ghost in the machine may reside in the stylesheet.

Daniel Paul O’Donnell is an Associate Professor of English at the University of Lethbridge, Alberta, Canada. He is also director of the Digital Medievalist Project 〈http://www.digitalmedievalist.org/〉 and editor of Cædmon’s Hymn: A Multimedia Study, Edition, and Archive (D.S. Brewer, forthcoming 2005). His research interests include Old English poetry, Textual and Editorial Studies, Humanities Computing, and the History of the Book. E-mail: daniel.odonnell@uleth.ca Web page: http://people.uleth.ca/~daniel.odonnell/ [Begin p. 64]

Notes

1 A bibliography of studies and editions of Cædmon’s Hymn can be found in O’Donnell (forthcoming).

2 In the event, the final text of O’Donnell (forthcoming) has eight critical editions, all of which have several apparatus, and “semi-normalised” editions of all twenty-one witnesses.

3 This choice was unfortunate, as it seems to have led to my model being understood far more radically than I intended (e.g. in Dahlström 2000, 17, cited above). A perhaps better formulation would be that editors (print and digital) function in a manner analogous to (and perhaps reproducible in) programming instructions.

4 In practice, of course, this type of modelling would work best in the case of simple, linguistically oriented exemplars. It becomes increasingly difficult – though still theoretically possible – with more complex or highly eclectic editorial approaches. A rule-based replica of Kane and Donaldson (1988), for example, is probably possible only in theory.

5 While this obviously does not apply in those few cases in which editions are made after the discovery of significant new textual evidence, such discoveries are few and far between. Most editorial differences are the result of a reinterpretation of essentially similar sets of textual data.

[Begin p. 65]

References

[Begin p. 66]

[Begin p. 67]

Appendix: O’Donnell (1998)

The following is a reprint of O’Donnell (1998). It has been reformatted for publication, but is otherwise unchanged from the original text with the exception of closing brackets that were missing from some of the code examples in the original and that have been added here. The Editorial Schema diagram has been redrawn without any deliberate substantive alteration. The original low resolution version can be found at 〈http://people.uleth.ca/~daniel.odonnell/research/caedmon-job.html〉.

The Electronic Cædmon’s Hymn: Editorial Method

Daniel Paul O’Donnell

The Electronic Cædmon’s Hymn will be an archive based, virtual critical edition. This means users will:

The following is a rough schema describing how the edition will work:

[Begin p. 68]

Figure 1.

This schema reflects my developing view of the editing process. The terms (Witness Level, Processor Level, etc.) are defined further below.

In my view, the editor of a critical edition is understood as being functionally equivalent to a filter separating the final reader from the uninterpreted data contained in the raw witnesses. Depending on the nature of the instructions this processor is given, different types of manipulation will occur in producing the final critical edition. An editor interested in producing a student edition of the poem, for example, can be understood to be manipulating the data according to the instructions choose the easiest (most sensible) readings and ignore those which raise advanced textual problems; an editor interested in producing the ‘original’ text can be seen as a processor performing the instruction choose readings from the earliest manuscript(s) when these are available and sensible; emend or normalise readings as required; and an editor interested in producing an edition of a specific dialectal version of a text is working to the instruct[Begin p. 69]tion choose readings from manuscripts belong to dialect x; when these are not available, reconstruct or emend readings from other manuscripts, ensuring that they conform to the spelling rules of the dialect. If editors can be reduced to a set of programming instructions, then it ought to be possible, in an electronic edition, to automate the manipulations necessary to produce various kinds of critical texts. In the above schema, I have attempted to do so. Instead of producing a final interpretation of ‘the text’, I instead divide the editorial process into a series of discrete steps:

Because the critical edition is not seen as an actual text but rather as a simple view of the raw data, different textual approaches are understood as being complementary rather than competing. It is possible to have multiple ‘views’ coexisting within a single edition. Readers will be expected to choose the view most appropriate to the type of work they wish to do. For research requiring a reconstruction of the hypothetical ‘author’s original’, a ‘reconstruction filter’ might be applied; a student can apply the ‘student edition filter’ and get a readable simplified text.
And the oral-formulaicist can apply the ‘single manuscript x filter’ and get a formatted edition of the readings of a single manuscript. Because different things are expected of the different levels, each layer has its own format and protocol. Because all layers are essential to the
development of the text, all would be included on the CDRom containing the edition. Users could program their own filters at the filter level, or change the processing instructions to use other layouts or formats; they could also conduct statistical experiments and the like on the raw
SGML texts in the witness archive or filter level as needed.

[Begin p. 70]

Witness Archive

The witness archive consists of facsimiles and diplomatic transcriptions of all relevant witnesses marked up in SGML (TEI) format. TEI is better for this initial stage of the mark-up because it is so verbose. Information completely unnecessary to formatting – linguistic, historical, metrical,
etc. – can be included for use by search programs and for manipulation by other scholars.

The following is a sample from a marked-up transcription at the witness archive level:

<l id="ld.1" n="1">
 <w>Nu</w>
 <w>&wynn;e</w><space extent=0>
 <w>sceolan</w>
 <w>herian</w>
 <w><del type="underlined">herian</del></w>
 <caesura>
 <w>heo<lb><add hand="editorial" cert="90">f</add>on<space extent=1>rices</w>
 <w>&wynn;eard</w>.<space extent=0>
</l>

Virtual Editions

Virtual Editions are the filters that contain the editorial processing instructions. They are not so much texts in themselves as records of the intellectual processes by which a critical text interprets the underlying data contained in the witness archive. They are SGML (TEI) encoded documents which provide a map of which witness readings are to be used in which critical texts. For most readings in most types of editions, these instructions will consist of empty elements using the ‘sameAs’ and ‘copyOf’ attributes to indicate which witness is to provide a specific reading: e.g. <w copyOf=CaW2></w> where CaW2 is the identifier for the reading of a specific word from manuscript Ca. One of the advantages of this method is that it eliminates one potential source of error (cutting and pasting from the diplomatic transcriptions into the critical editions); it also allows for the near instantaneous integration of new manuscript readings into the finished editions – changes in the witness transcriptions are automatically incorporated in the final texts via the filter.

[Begin p. 71]

In some cases, the elements will contain emendations or normalisation instructions: e.g. <w sameAs=CaW2>þa</w>. The following sample is from a virtual edition. It specifies that line 1 of this critical text is to be taken verbatim from manuscript ld (i.e. the text reproduced above):

<l id="Early.1" n="1" copyOf="ld.1"></l>

Processing Level and Display Texts

The ‘Virtual Editions’ are a record of the decisions made by an editor in producing his or her text rather than a record of the text itself. Because they consist for the most part of references to specific readings in other files, the virtual editions will be next-to-unreadable to the human eye. Turning these instructions into readable, formatted text is the function of the next layer – in which the processing instructions implied by the virtual layer are applied and in which final formatting is applied. This processing is carried out using a transformation type processor – like Jade – in which the virtual text is filled in with actual readings from the
witness archive, and these readings then formatted with punctuation and capitalisation etc. as required. The final display text is HTML or XML. While this will involve a necessary loss of information – most TEI tags have nothing to do with formatting, few HTML tags have much to do with content – it is more than compensated for by the ability to include the bells and whistles which make a text useful to human readers: HTML browsers are as a rule better and more user friendly than SGML browsers. Users who need to do computer analysis of the texts can always use the TEI encoded witness transcriptions or virtual editions.

Here is my guess as to how HTML would display the same line in the final edition (a critical apparatus would normally also be attached at this layer containing variant readings from other manuscripts [built up from the manuscript archive using the ‘copyOf’ attribute rather than by
cutting and pasting]; notes would discuss the various corrections etc. ignored in the reading text of this view):

<P>Nu we sceolan herian heofonrices weard</P>

----  

O Captain! My Captain! Using Technology to Guide Readers Through an Electronic Edition

Posted: Dec 15, 2006 16:12;
Last Modified: May 23, 2012 20:05

Tags: , , , , ,

---

Original Publication Information: Heroic Age 8 (2005). http://www.heroicage.org/issues/8/em.html.

O CAPTAIN! my Captain! our fearful trip is done;
The ship has weather’d every rack, the prize we sought is won;
The port is near, the bells I hear, the people all exulting,
While follow eyes the steady keel, the vessel grim and daring:
  But O heart! heart! heart!
    O the bleeding drops of red,
      Where on the deck my Captain lies,
        Fallen cold and dead.

Walt Whitman, Leaves of Grass

Digital vs. Print editions

§1. Most theoretical discussions of electronic editing attribute two main advantages to the digital medium over print: interactivity and the ability to transcend the physical limitations of the page1. From a production standpoint, printed books are static, linearly organised, and physically limited. With a few expensive or unwieldy exceptions, their content is bound in a fixed, unchangeable order, and required to fit on standard-sized, two dimensional pages. Readers cannot customise the physical order in which information is presented to them, and authors are restricted in the type of material they can reproduce to that which can be presented within the physical confines of the printed page2.

§2. Electronic editions, in contrast, offer readers and authors far greater flexibility. Content can be reorganised on demand in response to changing user needs through the use of links, search programs, and other utilities. The physical limitations of the screen can be overcome in part through the intelligent use of scrolling, dynamically generated content, frames, and other conventions of the electronic medium. The ability to organise and present non-textual material, indeed, has expanded the scope of the edition itself: it is becoming increasingly possible to edit physical objects and intellectual concepts as easily as literary or historical texts.

§3. Not surprisingly, this greater flexibility has encouraged electronic editors to experiment with the conventions of their genre. As McGann has argued, the traditional print-based critical edition is a machine of knowledge (McGann 1995). Its conventions developed over several centuries in response to a complex interplay of intellectual pressures imposed by the requirements of its subject and technical pressures imposed by requirements of its form:

Scholarly editions comprise the most fundamental tools in literary studies. Their development came in response to the complexity of literary works, especially those that had evolved through a long historical process (as one sees in the bible, Homer, the plays of Shakespeare). To deal with these works, scholars invented an array of ingenious machines: facsimile editions, critical editions, editions with elaborate notes and contextual materials for clarifying a work’s meaning. The limits of the book determined the development of the structural forms of these different mechanisms; those limits also necessitated the periodic recreation of new editions as relevant materials appeared or disappeared, or as new interests arose.

With the elimination of (many) traditional constraints faced by their print predecessors, electronic editors have been free to reconceive the intellectual organisation of their work. The ability to construct electronic documents dynamically and interactively has allowed editors to reflect contemporary doubts about the validity of the definitive critical text. Cheap digital photography and the ability to include sound and video clips has encouraged them to provide far more contextual information than was ever possible in print. With the judicious use of animation, virtual reality, and other digital effects, electronic editions are now able to recreate the experience of medieval textuality in ways impossible to imagine in traditional print editions.

Print Convention vs. Electronic Innovation

§4. The increased freedom enjoyed by electronic editors has brought with it increased responsibility. Because they work in a well established and highly standardised tradition, print-based editors are able to take most organisational aspects of their editions for granted. With some minor variation, print-based textual editions are highly predictable in the elements they contain, the order in which these elements are arranged, and the way in which they are laid out on the page (for examples and facsimiles of the major types, see Greetham 1994). In print editions, the textual introduction always appears before the critical text; the apparatus criticus always appears at the bottom of the page or after the main editorial text; glossaries, when they appear, are part of the back matter; contextual information about witnesses or the literary background to the text appears in the introduction. Publishers commonly require these elements to be laid out in a house style; beginning editors can look up the required elements in one of several standard studies (e.g. Greetham 1994, West 1973, Willis 1972).

§5. No such standardisation exists for the electronic editor (Robinson 2005)3. Few if any publishing houses have a strong house style for electronic texts, and, apart from a sense that electronic editions should include high quality colour images of all known witnesses, there are, as yet, few required elements. Electronic editions have been published over the last several years without textual introductions (Kiernan 1999), without critical texts (Solopova 2000), without a traditional textual apparatus (De Smedt and Vanhoutte 2000), and without glossaries (Adams et al. 2000)4. There are few standards for mise en page: some editions attempt to fit as much as possible into a single frameset (Slade 2002); others require users to navigate between different documents or browser tabs (Stolz 2003). Facsimiles can appear within the browser window or in specialised imaging software (cf. McGillivray 1997 vs. Adams et al. 2000): there are as yet few universally observed standards for image resolution, post-processing, or file formats. User interfaces differ, almost invariably, from edition to edition, even among texts issued by the same project or press (cf. Bordalejo 2003 vs. Solopova 2000). Where readers of print editions can expect different texts to operate in an approximately similar fashion, readers approaching new electronic texts for the first time cannot expect their text’s operation to agree with that of other editions they have consulted5.

Technology for Technology’s Sake?

§6. The danger this freedom brings is the temptation towards novelty for novelty’s sake. Freed largely from the constraints of pre-existing convention, electronic editors can be tempted towards technological innovations that detract from the scholarly usefulness of their projects.

Turning the Page (British Library)

§7. Some innovations can be more annoying than harmful. The British Library Turning the Pages series, for example, allows readers to mimic the action of turning pages in a manuscript facsimile (http://www.bl.uk/collections/treasures/digitisation4.html). When users click on the top or bottom corner of the manuscript page and drag the cursor to the opposite side of the book, they are presented with an animation showing the page being turned over. If they release the mouse button before the page has been pulled approximately 40% of the way across the visible page spread, virtual gravity takes over and the page falls back into its original position.

§8. This is an amusing animation, and well suited to its intended purpose as an interactive program that allows museums and libraries to give members of the public access to precious books while keeping the originals safely under glass (http://www.armadillosystems.com/ttp_commercial/home.htm). Scholars interested in the texts as research objects, however, are likely to find the system less attractive. The page-turning system uses an immense amount of memory—the British Library estimates up to 1 GB of RAM for high quality images (http://www.armadillosystems.com/ttp_commercial/techspec.htm)—and the requirement that users drag pages across the screen makes paging through an edition a time- and attention-consuming activity: having performed an action that indicates that they wish an event to occur (clicking on the page in question), users are then required to perform additional complex actions (holding the mouse button down while dragging the page across the screen) in order to effect the desired result. What was initially amusing rapidly becomes a major and unnecessary irritation.

A Wheel of Memory: The Hereford Mappamundi (Reed Kline 2001)

§9. Other innovations can be more harmful to the intellectual usefulness of a given project. A Wheel of Memory: The Hereford Mappamundi uses the Mappamundi as a conceit for the exploration of the medieval collective memory… using our own collective rota of knowledge, the CD-ROM (Reed Kline 2001, I audio). The edition has extremely high production values. It contains original music and professional narration. Images from the map6 and associated documents are displayed in a custom-designed viewing area that is itself in part a rota. Editorial material is arranged as a series of chapters and thematically organised explorations of different medieval Worlds: World of the Animals, World of the Strange Races, World of Alexander the Great, etc. With the exception of four numbered chapters, the edition makes heavy use of the possibilities for non-linear browsing inherent in the digital medium to organise its more than 1000 text and image files.

§10. In this case, however, the project’s innovative organisation and high production values are ultimately self-defeating. [Image: the Hereford map in full, http://www.heroicage.org/issues/8/images/herefordwholemap.png] Despite its heavy reliance on a non-linear structural conceit, the edition itself is next to impossible to use or navigate in ways not anticipated by the project designers. Text and narration are keyed to specific elements of the map and edition and vanish if the user strays from the relevant hotspot: because of this close integration of text and image, it is impossible to compare text written about one area of the map with a facsimile of another. The facsimile of the map itself is also very difficult to study. The customised viewing area is of a fixed size (I estimate approximately 615×460 pixels) with more than half this surface given over to background and navigation: when the user chooses to view the whole map on screen, the 4 foot wide original is reproduced with a diameter of less than 350 pixels (approximately 1/10 actual size). Even then, it remains impossible to display the map in its entirety: in keeping with the project’s rota conceit, the facsimile viewing area is circular, even though the Hereford map itself is pentagonal: try as I might, I was unable ever to get a clear view of the border and image in the facsimile’s top corner.

Using Technology to Transcend Print

§11. The problem with the British Library and Hereford editions is not that they use innovative technology to produce unconventional editions. Rather, it is that they use this innovative technology primarily for effect rather than as a means of contributing something essential to the presentation of the underlying artifact. In both cases this results in editions that are superficially attractive, but unsuited to repeated use or serious study7. The British Library facsimiles lose nothing if the user turns off the “Turning the Page” technology (indeed, in the on-line version, an accessibility option allows users precisely this possibility); leaving the technology on comes at the cost of usability and memory. In the case of the Hereford Mappamundi, the emphasis on the rota navigational conceit and the project’s high production values get in the way of the map itself: the use of the round viewing area and fixed-width browser actually prevents the user from exploring the entire map, while the close integration of text, narration, and images ironically binds readers more closely to the editor’s view of her material than would be possible in a print edition.

Bayeux Tapestry (Foys 2003)

§12. Appropriately used, innovative technology can create editions that transcend the possibilities of print, however. This can be seen in the third edition discussed in this paper, The Bayeux Tapestry: Digital Edition8.

§13. On the one hand, the Bayeux Tapestry edition uses technology in ways that, at first glance, seem very similar to the Hereford Mappamundi and British Library facsimiles. Like the Mappamundi project, the Bayeux edition has very high production values and is presented using a custom-designed user interface (indeed, the Hereford and Bayeux projects both use the same Macromedia presentation software). Like the British Library facsimiles, the Bayeux project uses technology to imitate the physical act of consulting the medieval artifact: users of the Bayeux Tapestry edition, like visitors to the Bayeux Tapestry itself, move along what appears to be a seamless presentation of the entire 68 metre long object.

§14. The difference between The Bayeux Tapestry: Digital edition and the other two projects, however, is that in the Bayeux edition this technology plays an essential role in the representation of the underlying object. I am aware of no medieval manuscript that incorporates the act of turning the page into its artistic design; the Bayeux tapestry, however, was designed to be viewed as a single continuous document. By integrating hundreds of digital images into what behaves like a single facsimile, the Bayeux project allows users to consult the tapestry as its makers originally intended: moving fluidly from scene to scene and pausing every-so-often to examine individual panels or figures in greater detail.

§15. The organisation of the Bayeux edition is similarly well thought out. In contrast to the Hereford Mappamundi project, the Bayeux project is constructed around the object it reproduces. The opening screen shows a section from the facsimile (few screens would be able to accommodate the entire facsimile in reasonable detail) above a plot-line that provides an overview of the Tapestry’s entire contents in a single screen. Users can navigate the Tapestry scene-by-scene using arrow buttons at the bottom left of the browser window, centimetre by centimetre using a slider on the plot-line, or by jumping directly to an arbitrary point on the tapestry by clicking on the plot-line at the desired location. Tools, background information, other facsimiles of the tapestry, scene synopses, and notes are accessed through buttons at the bottom left corner of the browser. The first three types of material are presented in a separate window when chosen; the last two appear under the edition’s plot-line. Where the organisational conceit of the rota prevented users from accessing the entire Hereford map, the structure of the Bayeux edition encourages users to explore the entire length of the Tapestry.

§16. The Bayeux project also does its best to avoid imposing a particular structure on its editorial content. Where the Hereford project proved extremely difficult to navigate in ways not anticipated by its editor, The Bayeux Tapestry contains a slideshow utility that allows users to reorder elements of the edition to suit their own needs. While few readers perhaps will need to use this in their own study, the utility will prove of the greatest benefit to teachers and lecturers who wish to use the edition to illustrate their own work.

Conclusion

§17. The interactivity, flexibility, and sheer novelty of digital media bring with them great challenges for the electronic editor. Where scholars working in print can rely on centuries of precedent in designing their editions, those working in digital media still operate for the most part in the absence of any clear consensus as to even the most basic expectations of the genre. This technological freedom can, on the one hand, be extremely liberating: electronic editors can now produce editions of a much wider range of texts, artifacts, and concepts than was ever possible in print. At the same time, however, this freedom can also lead to the temptation of using technology for its own sake.

§18. The three projects discussed in this column have been produced by careful and technologically sophisticated researchers. The differences among them lie for the most part in the way they match their technological innovation to the needs of the objects they reproduce. The British Library and Hereford Mappamundi projects both suffer from an emphasis on the use of advanced technology for largely decorative purposes; both would be easier to use without much of their most superficially attractive technological features. The Bayeux Tapestry project, on the other hand, succeeds as an electronic text because it uses advanced technology that is well suited to its underlying object and allows it to be presented in a fashion difficult, if not impossible, in any other medium. Users of the British Library and Hereford facsimiles may find themselves wishing for a simpler presentation; few users of the Bayeux tapestry would wish that this edition had been published in book form.

Notes

1 This is a commonplace. For an influential discussion, see McGann 1995. Strictly speaking, print and digital/electronic in this discussion refer to accidentals of display rather than essential features of composition and storage. Texts composed and stored digitally can be displayed in print format, in which case they are subject to the same limitations as texts composed and stored entirely on paper. The importance of this distinction between composition and display is commonly missed in early theoretical discussions, which tend to concentrate exclusively on possibilities for on-screen display. In fact, as recent commercial and scholarly applications of xml are demonstrating, the real advantage of electronic composition and storage is reusability. Properly designed electronic texts can be published simultaneously in a number of different formats, allowing users to take advantage of the peculiar strengths and weaknesses of each. In my view, the most profound difference between electronic and print texts lies in the separation of content and presentation which makes this reuse of electronic texts possible.

2 It is easy to overemphasise the limitations of print and the flexibility of digital display. While books are for the most part physically static and two-dimensional (the main exceptions are books published as loose pages intended for storage in binders and picture books with three-dimensional figures), they are intellectually flexible: readers are free to reorganise them virtually by paging back and forth or using a table of contents or index to find and extract relevant information. In certain cases, e.g. dictionaries and encyclopedias, this intellectual flexibility is an essential feature of the genre. Not surprisingly, these genres were also among the first and most successful titles to be published in electronic format. Screens, for all their flexibility and interactivity, remain two-dimensional display devices subject to many of the same limitations as the printed page.

3 Robinson’s important article came to my attention after this column was in proof.

4 The observation that these examples are missing one or more traditional elements of a print edition is not intended as a criticism. Not all editions need all traditional parts, and, in several cases, the editorial approach used explicitly precludes the inclusion of the missing element. What the observation does demonstrate, however, is that no strong consensus exists as to what must appear in an electronic critical edition. The only thing common to all is the presence of facsimiles.

5 One possible objection to the above list of examples is that I am mixing editions produced using very different technologies over the greater part of a decade (a long time in humanities computing). This technical fluidity is one of the reasons for the lack of consensus among electronic editors, however. Since in most cases, moreover, the technology has aged faster than the editorial content (eight years is a relatively short time in medieval textual studies), the comparison is also valid from a user’s perspective: as a medievalist, I am as likely to want to consult the first disc in the Canterbury Tales Project as I am the most recent.

6 As is noted in the introduction to the edition, the facsimile reproduces a nineteenth-century copy of the Hereford map rather than the medieval Mappamundi itself. The images in the Bayeux disc discussed below are similarly based on facsimiles—albeit in this case photographs of the original tapestry.

7 This is less of a problem in the case of the British Library series, which presents itself primarily as an aid for the exhibition of manuscripts to the general public rather than a serious tool for professional scholars. The intended audience of the Mappamundi project is less certain: it is sold by a university press and seems to address itself to both scholars and students; much of its content, however, seems aimed at the high school level. The design flaws in both texts seem likely to discourage repeated use by scholars and members of the general public alike.

8 In the interests of full disclosure, readers should be aware that I am currently associated with Foys in several on-going projects. These projects began after the publication of Foys 2003, with which I am not associated in any way.

Works Cited

----  

The Doomsday Machine, or, "If you build it, will they still come ten years from now?": What Medievalists working in digital media can do to ensure the longevity of their research

Posted: Dec 15, 2006 13:12;
Last Modified: May 23, 2012 20:05

Tags: , , , , , ,

---

Original Publication Information: Heroic Age 7 (2004). http://www.heroicage.org/issues/7/ecolumn.html.

Yes, but the… whole point of the doomsday machine… is lost… if you keep it a secret!

Dr. Strangelove

It is, perhaps, the first urban myth of humanities computing: the Case of the Unreadable Doomsday Machine. In 1986, in celebration of the 900th anniversary of William the Conqueror’s original survey of his British territories, the British Broadcasting Corporation (BBC) commissioned a mammoth £2.5 million electronic successor to the Domesday Book. Stored on two 12 inch video laser discs and containing thousands of photographs, maps, texts, and moving images, the Domesday Project was intended to provide a high-tech picture of life in late 20th century Great Britain. The project’s content was reproduced in an innovative early virtual reality environment and engineered using some of the most advanced technology of its day, including specially designed computers, software, and laser disc readers (Finney 1986).

Despite its technical sophistication, however, the Domesday Project was a flop by almost any practical measure. The discs and specialized readers required for accessing the project’s content turned out to be too expensive for the state-funded schools and public libraries that comprised its intended market. The technology used in its production and presentation also never caught on outside the British government and school system: few other groups attempted to emulate the Domesday Project’s approach to collecting and preserving digital material, and no significant market emerged for the specialized computers and hardware necessary for its display (Finney 1986, McKie and Thorpe 2003). In the end, few of the more than one million people who contributed to the project were ever able to see the results of their effort.

The final indignity, however, came in March 2003 when, in a widely circulated story, the British newspaper The Observer reported that the discs had finally become “unreadable”:

16 years after it was created, the £2.5 million BBC Domesday Project has achieved an unexpected and unwelcome status: it is now unreadable.

The special computers developed to play the 12in video discs of text, photographs, maps and archive footage of British life are — quite simply — obsolete.

As a result, no one can access the reams of project information — equivalent to several sets of encyclopedias — that were assembled about the state of the nation in 1986. By contrast, the original Domesday Book — an inventory of eleventh-century England compiled in 1086 by Norman monks — is in fine condition in the Public Record Office, Kew, and can be accessed by anyone who can read and has the right credentials. ‘It is ironic, but the 15-year-old version is unreadable, while the ancient one is still perfectly usable,’ said computer expert Paul Wheatley. ‘We’re lucky Shakespeare didn’t write on an old PC.’ (McKie and Thorpe 2003)

In fact, the situation was not as dire as McKie and Thorpe suggest. For one thing, the project was never actually “unreadable,” only difficult to access: relatively clean copies of the original laser discs still survive, as do a few working examples of the original computer system and disc reader (Garfinkel 2003). For another, the project appears not to depend, ultimately, on the survival of its obsolete hardware. Less than ten months after the publication of the original story in The Observer, indeed, engineers at Camileon, a joint project of the Universities of Leeds and Michigan, were able to reproduce most if not quite all the material preserved on the original 12 inch discs using contemporary computer hardware and software (Camileon 2003a; Garfinkel 2003).

The Domesday Project’s recent history has some valuable, if still contested, lessons for librarians, archivists, and computer scientists (see for example the discussion thread to Garfinkel 2003; also Camileon 2003b). On the one hand, the fact that engineers seem to be on the verge of designing software that will allow for the complete recovery of the project’s original content and environment is encouraging. While it may not yet have proven itself to be as robust as King William’s original survey, the electronic Domesday Project now at least does appear to have been saved for the foreseeable future, even if “foreseeable” in this case may mean simply until the hardware and software supporting the current emulator themselves become obsolete.

On the other hand, however, it cannot be comforting to realise that the Domesday Project required the adoption of such extensive and expensive restoration measures in the first place less than two decades after its original composition: the discs that the engineers at Camileon have devoted the last ten months to recovering have turned out to have less than 2% of the readable lifespan enjoyed by their eleventh-century predecessor. Even pulp novels and newspapers published on acidic paper at the beginning of the last century have proved more durable under similarly controlled conditions.1 While digital formats, viewed in the short term, do appear to offer a cheap method of preserving, cataloguing, and especially distributing copies of texts and other cultural material, their effectiveness and economic value as a means of long-term preservation have yet to be demonstrated conclusively.

These are, for the most part, issues for librarians, archivists, curators, computer scientists, and their associations: their solution will almost certainly demand resources, a level of technical knowledge, and perhaps most importantly, a degree of international cooperation far beyond that available to most individual humanities scholars (Keene 2003). In as much as they are responsible for the production of an increasing number of electronic texts and resources, however, humanities scholars do have an interest in ensuring that the physical record of their intellectual labour will outlast their careers. Fortunately there are also some specific lessons to be learned from the Domesday Project that are of immediate use to individual scholars in their day-to-day research and publication.

1. Do not write for specific hardware or software

Many of the preservation problems facing the Domesday Project stem from its heavy reliance on specific proprietary (and often customized) hardware and software. This reliance came about for largely historical reasons. The Domesday Project team was working on a multimedia project of unprecedented scope before the Internet developed as a significant medium for the dissemination of data.2 In the absence of suitable commercial software and any real industry emphasis on inter-platform compatibility or international standards, they were forced to custom-build or commission most of their own hardware and software. The project was designed to be played from a specially designed Philips video-disc player and displayed using custom-built software that functioned best on a single operating platform: the BBC Master, a now obsolete computer system which, with the related BBC Model B, was at the time far more popular in schools and libraries in the United Kingdom than the competing Macintosh, IBM PC, or long-forgotten Acorn systems.3

With the rise of the internet and the development of well-defined international standard languages such as Standard General Markup Language (SGML), HyperText Markup Language (HTML), eXtensible Markup Language (XML), and Hypermedia/Time-based Structuring Language (HyTime), few contemporary or future digital projects are likely to be as completely committed to a single specific hardware or software system as the Domesday Project. This does not mean, however, that the temptation to write for specific hardware or software has vanished entirely. Different operating systems allow designers to use different, often incompatible, shortcuts for processes such as referring to colour, assigning fonts, or referencing foreign characters (even something as simple as the Old English character thorn can be referred to in incompatible ways on Windows and Macintosh computers). The major internet browsers also all have proprietary extensions and idiosyncratic ways of understanding supposedly standard features of the major internet languages. It is very easy to fall into the trap of adapting one’s encoding to fit the possibilities offered by non-standard extensions, languages, and features of a specific piece of hardware or software.
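
By way of illustration (my own hypothetical fragment, not drawn from any of the projects discussed), the portable way to handle a character like thorn in HTML is to use the character references defined by the standard itself rather than a raw byte copied from one operating system’s code page:

    <!-- Standards-based: these references are defined by HTML itself and
         render as "þæt wæs god cyning" in any conforming browser. -->
    <p>&thorn;&aelig;t w&aelig;s god cyning</p>   <!-- named entities -->
    <p>&#254;&#230;t w&#230;s god cyning</p>      <!-- numeric character references -->

    <!-- Platform-dependent: a raw 8-bit byte for thorn taken from a single
         Windows or Macintosh character set may display as a different
         character, or as nothing at all, on another system. -->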

The very real dangers of obsolescence this carries with it can be demonstrated by the history of the Netscape <layer> and <ilayer> tags. Introduced with the Netscape 4.0 browser in early 1997, the <layer> and <ilayer> tags were proprietary extensions of HTML that allowed internet designers to position different parts of their documents independently of one another on the screen: to superimpose one piece of a text over another, to place text over (or under) images, or to remove one section of a line from the main textual flow and place it elsewhere (Netscape Communications Corporation 1997). The possibilities this extension opened up were exciting. In addition to enlivening otherwise boring pages with fancy typographic effects, the <layer> and <ilayer> elements also allowed web designers to create implicit intellectual associations among otherwise disparate elements in a single document. For example, one could use these tags to create type facsimiles of manuscript abbreviations by superimposing their component parts or to create annotated facsimile editions by placing textual notes or transcriptions over relevant manuscript images.

As with the Domesday Project, however, projects that relied on these proprietary extensions for anything other than the most incidental effects were doomed to early obsolescence: the <layer> and <ilayer> tags were never adopted by the other major browsers and, indeed, were dropped by Netscape itself in subsequent editions of its Navigator browser. Thus an annotated manuscript facsimile coded in mid-1997 to take advantage of the new Netscape 4.0 <layer> and <ilayer> tags would, with the release of Netscape 6.0 at the end of 2000, already be obsolete. Users who wished to maintain the presumably intellectually significant implicit association between the designer’s notes and images in this hypothetical case would need either to maintain (or recreate) a working older version of the Netscape browser on their system (an increasingly difficult task as operating systems themselves go through subsequent alterations and improvements) or to convert the underlying files to a standard encoding.
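
A hypothetical sketch of the kind of conversion involved (the image name and annotation text below are my own invention): a note positioned over a manuscript image with Netscape’s proprietary extension can be re-expressed with standard CSS positioning, which any later standards-compliant browser can interpret:

    <!-- Proprietary (Netscape 4.x only): the <layer> extension -->
    <img src="folio-1r.jpg" alt="Folio 1r">
    <layer left="120" top="80">ond his sunu</layer>

    <!-- Standards-based equivalent: CSS positioning -->
    <img src="folio-1r.jpg" alt="Folio 1r">
    <div style="position: absolute; left: 120px; top: 80px;">ond his sunu</div>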

2. Maintain a distinction between content and presentation

A second factor promoting the early obsolescence of the Domesday Project was its emphasis on the close integration of content and presentation. The project was conceived of as a multimedia experience, and its various components (text, video, maps, statistical information) often acquired meaning from their interaction, juxtaposition, sequencing, and superimposition (Finney 1986, “Using Domesday”; see also Camileon 2003b). In order to preserve the project as a coherent whole, indeed, engineers at Camileon have had to reproduce not only the project’s content but also the look and feel of the specific software environment in which it was intended to be searched and navigated (Camileon 2003b).

Here too the Domesday Project designers were largely victims of history. Their project was a pioneering experiment in multimedia organisation and presentation and put together in the virtual absence of now standard international languages for the design and dissemination of electronic documents and multimedia projects — many of which, indeed, were in their initial stages of development at the time the BBC project went to press.4

More importantly, however, these nascent international standards involved a break with the model of electronic document design and dissemination employed by the Domesday Project designers. Where the Domesday Project might be described as an information machine — a work in which content and presentation are so closely intertwined as to become a single entity — the new standards concentrated on establishing a theoretical separation between content and presentation (see Connolly 1994 for a useful discussion of the distinction between “programmable” and “static” document formats and their implications for document conversion and exchange). This allows both aspects of an electronic document to be described separately and, for the most part, in quite abstract terms which are then left open to interpretation by users in response to their specific needs and resources. It is this flexibility which helped in the initial popularization of the World Wide Web: document designers could present their material in a single standard format and, in contrast to the designers of the Domesday Project, be relatively certain that their work would remain accessible to users working with various software and hardware systems — whether this was the latest version of the new Mosaic browser or some other, slightly older and non-graphical interface like Lynx (see Berners-Lee 1989-1990 for an early summary of the advantages of multi-platform support and a comparison with early multi-media models such as that adopted by the Domesday Project). In recent years, this same flexibility has allowed project designers to accommodate the increasingly large demand for access to internet documents from users of (often very advanced) non-traditional devices: web-activated mobile phones, palm-sized digital assistants, and of course aural screen readers and Braille printers.

In theory, this flexibility also means that where engineers responsible for restoring the Domesday Project have been forced to emulate the original software in order to recreate the BBC designers’ work, future archivists will be able to restore current, standards-based electronic projects by interpreting the accompanying description of their presentation in a way appropriate to their own contemporary technology. In some cases, indeed, this restoration may not even require the development of any actual computer software: a simple HTML document, properly encoded according to the strictest international standards, should in most cases be understandable to the naked eye even when read from a paper printout or text-only display.
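
A minimal sketch of what this separation looks like in practice (the file names, class names, and sample text here are hypothetical): the document records what each passage is, while a separate stylesheet records how one particular rendering should look. Strip the stylesheet away and the content remains perfectly legible, which is exactly what makes such files easy to restore or reuse:

    <!-- edition.html: content marked up for what it is, not how it looks -->
    <link rel="stylesheet" href="edition.css">
    <h1>Sample edition page</h1>
    <p class="transcription">Hw&aelig;t! We Gardena in geardagum ...</p>
    <p class="note">Editorial note on the opening formula.</p>

    /* edition.css: one replaceable presentation of the same content */
    .transcription { font-family: serif; font-size: 120%; }
    .note          { font-size: 85%; color: #444444; }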

In practice, however, it is still easy to fall into the trap of integrating content and presentation. One common example involves the use of table elements for positioning unrelated or sequential text in parallel “columns” on browser screens (see Chisholm, Vanderheiden, et al. 2000, § 5). From a structural point of view, tables are a device for indicating relations among disparate pieces of information (mileage between various cities, postage prices for different sizes and classes of mail, etc.). Using tables to position columns, document designers imply in formal terms the existence of a logical association between bits of text found in the same row or column — even if the actual rationale for this association is primarily aesthetic. While the layout technique, which depends on the fact that all current graphic-enabled browsers display tables by default in approximately the same fashion, works well on desktop computers, the same trick can produce nonsensical text when rendered on the small screen of a mobile phone, printed by a Braille output device, or read aloud by an aural browser or screen-reader. Just as importantly, this technique too can lead to early obsolescence or other significant problems for future users. Designers of a linguistic corpus based on specific types of pre-existing electronic documents, for example, might be required to devote considerable manual effort to recognising and repairing content arbitrarily and improperly arranged in tabular format for aesthetic reasons.
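
The difference is easy to see in a small hypothetical example. In the first table the grid records a genuine relation among the cells; in the second, the “row” exists only to force two unrelated passages into visual columns, a job better left to the stylesheet so that small screens, Braille devices, and aural browsers can treat each passage as a continuous whole:

    <!-- Structural use: the grid records a real relation among the cells -->
    <table>
      <tr><th>Poem</th><th>Manuscript</th></tr>
      <tr><td>C&aelig;dmon's Hymn</td><td>Moore Bede</td></tr>
    </table>

    <!-- Layout misuse: unrelated passages yoked into one row for visual effect -->
    <table>
      <tr><td>Main text ...</td><td>Unrelated sidebar ...</td></tr>
    </table>

    <!-- Standards-based alternative: separate blocks positioned by CSS -->
    <div style="float: left; width: 48%;">Main text ...</div>
    <div style="float: right; width: 48%;">Unrelated sidebar ...</div>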

3. Avoid unnecessary technical innovation

A final lesson to be learned from the early obsolescence of the Domesday Project involves the hidden costs of technical innovation. As a pioneering electronic document, the Domesday Project was in many ways an experiment in multimedia production, publication, and preservation. In the absence of obvious predecessors, its designers were forced to develop their own technology, organisational outlines, navigation techniques, and distribution plans (see Finney 1986 and Camileon 2003a for detailed descriptions). The fact that relatively few other projects adopted their proposed solutions to these problems — and that subsequent developments in the field led to a different focus in electronic document design and dissemination — only increased the speed of the project’s obsolescence and the cost and difficulty of its restoration and recovery.

Given the experimental status of this specific project, these were acceptable costs. The Domesday Project was never really intended as a true reference work in any usual sense of the word.5 Although it is full of information about mid-1980s Great Britain, for example, the project has never proved to be an indispensable resource for study of the period. While it was inspired by William the Conqueror’s great inventory of post-conquest Britain, the Domesday Project was, in the end, more an experiment in new media design than an attempt at collecting useful information for the operation of Mrs. Thatcher’s government.

We are now long past the day in which electronic projects can be considered interesting simply because they are electronic. Whether they are searching a Z39.50-compliant library catalogue, consulting an electronic journal on JSTOR, or reading an electronic text edition or manuscript facsimile published by an academic press, users of contemporary electronic projects are now, by and large, more likely to be interested in the quality and range of an electronic text’s intellectual content than in the novelty of its display, organisation, or technological features (Nielsen 2000). The tools, techniques, and languages available to producers of electronic projects are likewise now far more standardised and helpful than those available to the creators of electronic incunabula such as the Domesday Project.

Unfortunately this does not mean that contemporary designers are entirely free of the dangers posed by technological experimentation. The exponential growth of the internet, the increasing emphasis on compliance with international standards, and the simple pace of technological change over the last decade all pose significant challenges to the small budgets and staff of many humanities computing projects. While large projects and well-funded universities can sometimes afford to hire specialized personnel to follow developments in computing design and implementation, freeing other specialists to work on content development, scholars working on digital projects in smaller groups, at less well-funded universities, or on their own often find themselves responsible for both the technological and intellectual components of their work. Anecdotal evidence suggests that such researchers find keeping up with the pace of technological change relatively difficult — particularly when it comes to discovering and implementing standard solutions to common technological problems (Baker, Foys, et al. 2003). If the designers of the Domesday Project courted early obsolescence because their pioneering status forced them to design unique technological solutions to previously unresolved problems, many contemporary humanities projects appear to run the same risk of obsolescence and incompatibility because the difficulty of discovering and implementing best practice encourages them to reinvent solutions to already solved problems (HATII and NINCH 2002, NINCH 2002-2003, Healey 2003, Baker, Foys, et al. 2003, and O’Donnell 2003).

This area of humanities computing has been perhaps the least well served by the developments of the last two decades. While technological changes and the development of well-designed international standards have increased opportunities for contemporary designers to avoid the problems which led to the Domesday Project’s early obsolescence, the absence of a robust system for sharing technological know-how among members of the relevant community has remained a significant impediment to the production of durable, standards-based projects. Fortunately, however, improvements are being made in this area as well. While mailing lists such as humanist-l and tei-l have long facilitated the exchange of information on aspects of electronic project design and implementation, several new initiatives have appeared over the last few years which are more directly aimed at encouraging humanities computing specialists to share their expertise and marshal their common interests. The Text Encoding Initiative (TEI) has recently established a number of Special Interest Groups (SIGs) aimed at establishing community practice in response to specific types of textual encoding problems. Since 1993, the National Initiative for a Networked Cultural Heritage (NINCH) has provided a forum for collaboration and development of best practice among directors and officers of major humanities computing projects. The recently established TAPoR project in Canada and the Arts and Humanities Data Service (AHDS) in the United Kingdom likewise seek to serve as national clearing houses for humanities computing education and tools. Finally, and aimed more specifically at medievalists, the Digital Medievalist Project (of which I am currently director) is seeking funding to establish a “Community of Practice” for medievalists engaged in the production of digital resources, through which individual scholars and projects will be able to pool skills and practice acquired in the course of their research (see Baker, Foys, et al. 2003). Although we are still in the beginning stages, there is increasing evidence that humanities computing specialists are beginning to recognise the extent to which the discovery of standardised implementations and solutions to common technological problems is likely to provide as significant a boost to the durability of electronic resources as the development of standardised languages and client-side user agents in the late 1980s and early 1990s. We can only benefit from increased cooperation.

The Case of the Unreadable Doomsday Machine makes for good newspaper copy: it pits new technology against old in an information-age version of nineteenth-century races between the horse and the locomotive. Moreover, there is an undeniable irony to be found in the fact that King William’s eleventh-century parchment survey has thus far proven itself to be more durable than the BBC’s 1980s computer program.

But the difficulties faced by the Domesday Project and its conservators are neither necessarily intrinsic to the electronic medium nor necessarily evidence that scholars at work on digital humanities projects have backed the wrong horse in the information race. Many of the problems which led to the Domesday Project’s early obsolescence and expensive restoration can be traced to its experimental nature and the innovative position it occupies in the history of humanities computing. By paying close attention to its example, by learning from its mistakes, and by recognising the ways in which contemporary humanities computing projects differ fundamentally from such digital incunabula, scholars can greatly increase the likelihood that their current projects will remain accessible long after their authors reach retirement age.

Notes

1 See the controversy between Baker 2002 and [Association of Research Libraries] 2001, both of whom agree that even very acidic newsprint can survive “several decades” in carefully controlled environments.

2 The first internet browser, “WorldWideWeb,” was finished by Tim Berners-Lee at CERN (Conseil Européen pour la Recherche Nucléaire) on Christmas Day 1990. The first popular consumer browser able to operate on personal computer systems was the National Center for Supercomputing Applications (NCSA) Mosaic (a precursor to Netscape), which appeared in 1993. See [livinginternet.com] 2003 and Cailliau 1995 for brief histories of the early browser systems. The first internet application, e-mail, was developed in the early 1970s ([www.almark.net] 2003); until the 1990s, its use was restricted largely to university researchers and the U.S. military.

3 Camileon 2003; See McMordie 2003 for a history of the Acorn platform.
4 SGML, the language from which HTML is derived, was developed in the late 1970s and early 1980s but not widely used until the mid-to-late 1980s ([SGML Users’ Group] 1990). HyTime, a multimedia standard, was approved in 1991 ([SGML SIGhyper] 1994).

5 This is the implication of Finney 1986, who stresses the project’s technically innovative nature, rather than its practical usefulness, throughout.

Reference List

----  
