Reverse detail from Kakelbont MS 1, a fifteenth-century French Psalter. This image is in the public domain. Daniel Paul O'Donnell


Well that’s that. Solving (?) the VC model and workflow

Posted: Dec 01, 2015 11:12;
Last Modified: Dec 01, 2015 22:12


Yesterday, Dot Porter, one of the leads on the Visionary Cross project, visited Lethbridge for a project meeting (and to speak to my DH class). The main purpose of her visit was to plan the work that needs to happen on the Digital Library side of the project.

This is a core issue for us. As we say on the front page:

The Visionary Cross project is an international, multidisciplinary research project whose principal objective is the development of a new kind of digital archive and edition of texts and objects associated with the Visionary Cross tradition in Anglo-Saxon England.

Taking its cue from recent developments in digital editorial theory and practice, the project takes a data-centric, distributed, and generalisable approach to the representation of cultural heritage texts, objects, and contexts in order to encourage broad scholarly and popular engagement with its material.

The important things here are that it is an archive-edition: it is data-centric, distributed, and (supposed to) be designed to encourage broad scholarly reuse and popular engagement. In our thinking on this, we have been very influenced by the work of Peter Boot and Joris van Zundert on “Just in Time editing” or, especially, the edition as service. In other words, we have understood our work to be not primarily an interface, but rather a service: a source of mediated primary material. This is in keeping with the philosophy of my edition of Caedmon’s Hymn, where the emphasis was on exploiting the power of existing web protocols, infrastructure, standards, and services, rather than custom programming.

In practice, however, this has proved to be something of a block in our progress. Since the very beginning, the Visionary Cross project has approached the problem of editing its objects as an intellectual marketplace. We’ve had several teams in place who have been working with our common material in different ways: as a Serious Game, as a student-centred resource, as raw material for humanities research, as part of a different edition of a related collection. In each case, the participants have been working alongside each other, rather than directly in collaboration or cooperation, in part because the thing the project has been leveraging has been the overlap in their enthusiasm and their interest in the common dataset. We’ve wanted people to want to share resources because they see how this sharing allows them to do their own research in the directions that appeal to them, rather than to try to bend their interests towards a common, lowest-common-denominator, consensus-focussed single interface.

We began this way initially for funding reasons: we didn’t have enough (our first grant awarded us only 25% of our ask) and the only way of getting any work done was to tie our project to the interests of its participants as they worked on other things.

But over the years we began to see this as a virtue as well. By the time we did get all the funding we asked for, this 百花齊放,百家爭鳴 (“Let a hundred flowers bloom, let a hundred schools of thought contend”) approach had become part of the goal of the project: we now wanted our internal workings to exemplify the way we thought our project should be used by others (and, of course, by the time the funding arrived, we were committed to our different streams anyway).

The downside to this approach, however, has been that it has proved difficult to manage: the sub-projects are themselves quite different from each other (though with at times considerable overlap in some aspects), and the result has been that it has been difficult to do the common work. It has also, in some cases, led to minor friction: overlap can, after all, look a bit like competition. As individual projects work on their interfaces, navigation, content, and the like, there’s been little incentive to pay attention to the common aspects of our work; and more importantly, preparing content (objects, annotation, approaches, etc.) has generally involved customised work for specific sub-projects: instead of developing a core set of intellectual objects (metadata, annotation, etc.), we’ve basically had different groups adding custom intellectual objects to a limited set of common core facsimiles (i.e. the 3D models, photography, and, to a limited extent, transcriptions).

This is both why we were having the meeting yesterday and why its results were so important. The goal of the meeting was to lay the groundwork for building the central Digital Library: an appropriate place for projects to feed back into the common body of objects, and a place where generalisable scholarship and mediation could be done, i.e. where we could develop common metadata, commentary, annotation, and the like that could then be distributed to the sub-projects.

The result was the following diagram:

[Diagram not reproduced: the project’s use-cases at the left, an object-organised collection of objects, metadata, and intellectual objects in the middle, and the raw, origin-organised file store at the right.]
The way to read this is that we currently have the situation at the far left and far right. I.e. as at the left, we have a lot of use-cases for our data–a Serious Game, a student-focussed reader, some work on a scholarly edition. And as at the far right, we have a collection of files: raw files, processed files, working copies, etc., all organised by origin (i.e. a dump of the different drives, cameras, scanners, and so on). What we don’t have is the middle: an object-organised collection of facsimiles, metadata, and intellectual objects that can serve as an engine for the projects at the left. And this is where our problems are coming from: since we are missing that engine, the sub-projects are developing their own collections of intellectual objects and processed files.
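In concrete terms, the missing middle could be as simple as an object-organised directory tree. The sketch below is purely hypothetical (none of these names or conventions come from the actual project); the point is only that files are grouped by object rather than by origin:

```
data/
  ruthwell-cross/            # one directory per object, not per camera or drive
    ruthwell-cross.xml       # descriptive metadata for the object
    models/                  # processed 3D models
    images/                  # photography
    transcriptions/          # texts and annotation
  brussels-cross/
    ...
```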

Initially, it was as this middle that I thought we needed a digital library application: i.e., that the solution would be to set up an Omeka, DSpace, or ContentDM installation and put our stuff in it. But we kept getting hung up on the choice of software: was Omeka better or worse than Greenstone for this? How would the interface look? And so on.

What we realised yesterday, however, was that these were actually implementation (i.e. left-hand) issues, more or less the same as the questions about our various viewers and environments. If we really saw the core of the edition as a curated collection of intellectual objects intended primarily for reuse by others, then we needed to focus entirely on the data: providing a simple, easily understood, open, and extremely robust collection that could then be used by others for any other purpose, including putting it in a Digital Library Application.

A model for this is OPenn. This is an example of the “digital library” in its simplest form: a well-thought-out, curated, and minimally skinned series of directories with predictable content and naming conventions… and nothing else. As Dot has shown in her own subsequent work, this approach is in fact extremely powerful: it is easy to add to (I was at Penn for the launch of OPenn this spring, and it has already grown rapidly in the intervening months to include collections from other institutions in the Philadelphia area); and it is easy to use for added-value projects: Dot has used this system to build eBooks, page-turning software, a searchable digital library, and so on.

Moreover, as Dot showed in one of her two lectures yesterday, OPenn originated from what was actually a similar problem: a collection that existed in what we are describing here as a “left-hand” format without a corresponding “middle” (she also indicated that they might have had a similar “right-hand” problem as well, but that’s not important for us at the moment). The Schoenberg Institute at the University of Pennsylvania was created “to bring manuscript culture, modern technology and people together to bring access to and understanding of our cultural heritage locally and around the world.” Penn itself began a digitisation programme as early as the late 1990s, and, I believe, has now fully digitised and released its collection of manuscripts under a Creative Commons licence (CC0, in fact). Like many libraries, it then released this material to the public using a “page turning” interface that allows users to “read” the manuscripts online in much the same way they would the actual codex (this is an interface design loved by museum and library directors, and, reportedly, the general public, but hated by most scholars who work with manuscripts).


The problem, however, was that apart from reading online, there was not much one could do with the material. It was possible to download individual manuscript leaves (and, if you worked at it, all the leaves in a given manuscript, page by page). There was also largely untyped output from the library’s MARC records for each manuscript, which could, as far as I could see, be scraped if you wanted to use it. But there was no easy way of accessing the resource for other purposes or of repurposing the material for other applications.

The solution to this was to develop OPenn. This is, in essence, a refactoring of Penn in Hand in its absolutely simplest form: a directory structure in which objects and associated metadata are grouped together. Each level in the directory can be browsed by a human as a web page with links and images (a system so simple that, apart from the browser-side XSLT processing that is going on, there is nothing that couldn’t be accessed via Netscape Navigator or maybe even Lynx). But more importantly, each level can also be accessed by utilities like wget (allowing you to download entire collections programmatically) or by URL (allowing you to address individual files externally). There are no bells and whistles here (there’s not even a search function, though as Dot showed, you can build one in Viewshare). There is nothing to maintain, other than paying for the server and your ISP. Directories and files are not even core Internet architecture; they are core computing architecture, and they are not going anywhere any time soon.
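Because the entire “interface” is just files in directories, reuse can begin with nothing fancier than a directory walk over a mirrored copy of the collection. Here is a minimal Python sketch of the idea; the layout it assumes (image files sitting beside an XML metadata file in each object’s directory, with these particular extensions) is illustrative, not OPenn’s actual naming conventions:

```python
import os

def index_collection(root):
    """Walk a plain directory-based digital library and record, for each
    directory that holds facsimile images, the images and any sibling
    metadata files. Returns {relative_path: {"images": [...], "metadata": [...]}}."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        images = sorted(f for f in filenames
                        if f.lower().endswith((".jpg", ".tif")))
        metadata = sorted(f for f in filenames
                          if f.lower().endswith(".xml"))
        if images:  # only record directories that actually hold facsimiles
            index[os.path.relpath(dirpath, root)] = {
                "images": images,
                "metadata": metadata,
            }
    return index
```

Every added-value project described below (an eBook builder, a page-turner, a search index) can start from an index like this, built against files fetched with wget.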

But the important thing is that, by reducing their digital library to its absolutely simplest form, OPenn makes it easier to design almost everything else a user might want. In addition to the search interface, Dot showed us how she had used external systems and libraries to build different ways of accessing the Schoenberg material–as eBooks, for example, or even through a page-turning interface library. In other words, by massively–you’re tempted to think almost irresponsibly–simplifying the core service of their project, OPenn is able to do everything Penn in Hand does, and much, much more easily.

So this, I think, is where we have to go with the Visionary Cross: focus on the core aspects of our content–metadata, curation, and high-quality objects–and present it to potential users (and our various subprojects) in as simple a fashion as possible. That means focussing purely on making the data available in a responsible fashion and ignoring the questions of interface, tools, and other things we commonly consider when we think of “digital editions,” in order to do a good job of delivering the data in a form that others can use more easily for their own projects.


Yii: Ensuring that key terms are always linked

Posted: Feb 24, 2012 11:02;
Last Modified: May 23, 2012 18:05


As we build our workflow manager, we are discovering that the interface is more intuitive if some terms are always hyperlinked, with the links pointing to a standard presentation of the related information.

One example of this might be names of people associated with the workflow (editors, authors, copyeditors, production assistants). An intuitive internal navigation method seems to be to have the names of these people always hyperlinked with the hyperlink always pointing to the person’s profile page.

One way of doing this in Yii would be to modify the views associated with each table in the site so that every time a name is called, you get a link. This is contrary to the spirit of the MVC model, however, since it means you are using a view to present logic about the underlying model. And it is also prone to error, since it means you need to a) find every possible invocation of the name in all your various views and b) avoid mistakes as you enter the same code over and over again in all these different views.

The better approach is to add this functionality to the underlying data model that supplies the information to the entire site in the first place: that is, to the model for the database table that provides the name information and the page you want to link to in the end.

Here’s some code for your model that allows you to produce a linked name anywhere in your Yii site (for simplicity’s sake, in this example I am wrapping a single attribute from my database in a hyperlink; this post shows how to use a similar method to first make compound attributes):

public function getLastNameLink()
{
    return CHtml::link(CHtml::encode($this->lastName),
        array('person/view', 'id'=>$this->person_id));
}

Here are some underlying premises behind this code:

  1. There is a table in my database called person
  2. I have written a view for this model (either by hand or using the gii utility to build one automatically): person/view is the URL fragment CHtml::link will use to build the link to the profile page for a given person (note: it is tempting to just use view for the URL because we are already in the person model; you should use the “full” Yii path, however, because you will be invoking this throughout the site from views associated with all sorts of other models)
  3. The table person has an attribute (column) called person_id.

Once this has been added to my Person model, I can call the code (and add a link to the person profile) in any view by just invoking the method: from now on, LastNameLink functions as a virtual attribute of the Person model and can be used in exactly the same way actual direct, table-based attributes can be invoked. For example, in a different model’s view:

<?php echo $data->person->LastNameLink; ?>

This code will produce a link to index.php?r=person/view&id=n where n is the id number of a given record in the table. If I hadn’t added the above code to the Person model, the code required to do the equivalent would have been:

<?php echo CHtml::link(CHtml::encode($data->person->lastName),
array('person/view', 'id'=>$data->person->person_id)); ?>
