Reverse detail from Kakelbont MS 1, a fifteenth-century French Psalter. This image is in the public domain. Daniel Paul O'Donnell


The bird in hand: Humanities research in the age of open data (Digital Science Report)

Posted: Oct 24, 2016 13:10;
Last Modified: Oct 24, 2016 13:10


Originally published as Daniel Paul O’Donnell. 2016. “The Bird in Hand: Humanities Research in the Age of Open Data.” In The State of Open Data: A Selection of Analyses and Articles about Open Data, Edited by Figshare, 34–35. Digital Science Report. London: Digital Science.

Traditionally, humanities scholars have resisted describing their raw material as
“data” 10.

Instead, they speak of “sources” and “readings.” “Primary sources” are the
texts, objects, and artifacts they study; “secondary sources” are the works
of other commentators used in their analyses; “readings” can be either the
arguments that represent the end product of their research or the extracts
and quotations they use for support.

These definitions are contextual. The primary source for one argument can be
the secondary source for another or, as in the case of a “critical edition” of a
historical text, simultaneously primary and secondary. Almost any document,
artifact or record of human activity can be a topic of study. Arguments proposing
previously unrecognized sources (“high school yearbooks, cookbooks, or wear
patterns in the floors of public places”) are valued acts of scholarship. 1

This resistance to “data” is a recognition of real differences in the way humanists
collect and use such material. In other domains, data are generated through
experiment, observation, and measurement. Darwin goes to the Galapagos
Islands, observes the finches, and fills notebooks with what he sees. His notes
(i.e. his “data”) “represent information in a formalized manner suitable for
communication, interpretation, or processing” 2. They are “the facts, numbers,
letters, and symbols that describe an object, idea, condition, situation, or other
factors” 3. Given the extent to which they are generated, it has been argued that
they might be described better as capta, “taken,” than data, “given”. 4

The material of humanities research traditionally is much more datum than
captum, finch than note. Since the humanities involve the study of the meaning
of human thought, culture, and history, such material typically involves other
people’s work. It is often unique and its interpretation is usually provisional,
depending on broader understandings of purpose, context and form that are
themselves open to analysis, argument and modification. In the humanities, we
more often end up debating why we think something is a finch than what we
can conclude from observing it.

Perhaps most telling is the fact that humanities sources, unlike scientific
data, are usually practically as well as theoretically non-rivalrous 5. Humanities
researchers rarely have an incentive (or capability) to prevent others from
accessing their raw material and entire research domains (e.g. Jane Austen
studies) can work for centuries from the same few primary sources. Priority
disputes that occur regularly in the sciences 6 are almost non-existent within
the humanities. 1

The digital age is changing one aspect of this traditional disciplinary difference.
Mass digitization and new tools make it possible to extract material
algorithmically from large numbers of cultural artifacts. Where researchers
used to be limited to sources in archives and libraries to which they had
physical access, digital archives and metadata now make it easier to work
across complete historical or geographic corpora: all surviving periodicals from
19th century England, for example, or every known pamphlet from the Civil
War. In the digital age, humanities resources can be capta as well as data.

Such changes allow for new types of research and improve the efficacy of some
traditional approaches. But they also raise existential questions about long-
standing practices. Traditionally, humanities researchers have tended to work
with details from a limited corpus to make larger arguments: “close readings” of
selected passages in a given text to produce larger interpretations of the work
as a whole; or of passages from a few selected works to support arguments
about larger events, movements or schools. In one famous but far from atypical
example, author Ian Watt uses readings from five novels and three authors as the
main primary sources in his discussion of the Rise of the Novel. 7

In the age of open data, it is tempting to see this as being, in essence, a small-
sample analysis lacking in statistical power. 8 But such data-centric criticism of
traditional humanities arguments can be a form of category error. Humanities
research is as a rule more about interpretation than solution. It is about why
you understand something the way you do rather than why something is
the way it is. It treats its sources as examples to support an argument rather
than phenomena to be observed in the service of a solution. While Watt’s title,
“The Rise of the Novel,” can be understood as implying a historical scope
that his sample cannot support, his subtitle, “Studies in Defoe, Richardson,
and Fielding,” shows that he actually was making an argument about the
interpretation of three canonical authors based on his understanding of
the novel’s early history – an understanding that by definition always will be
provisional and open to amendment.

The real challenge for the humanities in the age of digital open data is
recognizing the value of both types of sources: the material we can now
generate algorithmically at previously unimaginable scales and the continuing
value of the exemplary source or passage. As the raw material of humanities
research begins to acquire formal qualities associated with data in other fields,
the danger is going to be that we forget that our research requires us to be
sensitive to both object and observation, datum and captum, finch and note. In
asking ourselves what we can do with a million books 9, we need to remember
that we remain interested in the meaning of individual titles and passages.

Works cited

1 Borgman, Christine L. 2007. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge, Mass: MIT Press.

2 Consultative Committee for Space Data Systems. 2012. “Reference Model for an Open Archival Information System (OAIS).” CCSDS 650.0-M-2.

3 National Research Council. 1999. A Question of Balance: Private Rights and the Public Interest in Scientific and Technical Databases. Washington:
National Academies Press.

4 Jensen, H. E. 1950. “Editorial Note.” In Through Values to Social Interpretation: Essays on Social Contexts, Actions, Types, and Prospects, vii–xi.
Sociological Series. Duke University Press.

5 Kitchin, Rob. 2014. The Data Revolution. Thousand Oaks, CA: SAGE Publications Ltd.

6 Casadevall, Arturo, and Ferric C. Fang. 2012. “Winner Takes All.” Scientific American 307 (2): 13. doi:10.1038/scientificamerican0812-13.

7 Watt, Ian P. (1957) 1987. The Rise of the Novel: Studies in Defoe, Richardson, and Fielding. London: Hogarth.

8 Jockers, Matthew L. 2013. Macroanalysis: Digital Methods and Literary History. Urbana, IL: University of Illinois Press.

9 Crane, Gregory. 2006. “What Do You Do with a Million Books?” D-Lib Magazine 12 (3). doi:10.1045/march2006-crane.

10 Marche, Stephen. 2012. “Literature Is Not Data: Against Digital Humanities.” Los Angeles Review of Books, October. article/literature-is-not-data-against-digital-humanities/


Cædmon Citation Network - Week 14

Posted: Sep 02, 2016 18:09;
Last Modified: Sep 02, 2016 18:09


Hi all!

I spent this week putting information into the newly updated database. It works much faster than it did before, and is very intuitive to use. Dan mentioned that he would like to see some screenshots, so please enjoy the following images:

Here we see the front page of the database, with two text boxes, one for the Source and one for the Reference.

Options will pop up after you begin typing which makes adding sources and references super quick.

The Location box allows you to type the page number on which you found the reference in your source material (I simply type the number without any “p.” or “pg” preceding it) and the drop down box allows you to choose whether the reference is a Text Quote, Text Mention, Scholarly Reference, or Other Reference.

Clicking on the “View Entries” link allows you to view all of the entries that you have made. They are listed from oldest to newest in one big list.

So far I have had zero problems with the database. However, I have come across a few snags in gathering references from the sources. To use this first article by Lenore Abraham as an example, it is not noted anywhere which edition of Bede’s “History of the English Church and People” she uses; she simply gives the title. I am not sure how to figure this out, but I feel that it is important to know, as the edition cited is the most important piece of information that we are attempting to gather. I am concerned that a lot of other articles might omit this information as well, but I suppose we shall see as the collection continues. I was also curious as to whether or not we count the “about the author” blurbs when adding references. The beginnings of articles will occasionally list other pieces the author has published, and I am not sure whether or not to count these as references. My initial instinct was to ignore them, as they do not necessarily have anything to do with the article in question, and if they are important they will be cited again further on. However, I thought I would bring it up to be sure.

I am excited to continue collecting information. I will be back in Lethbridge for school on Tuesday, so I can start requesting inter-library loans again and keep our project rolling!

Until next week,



Cædmon Citation Network - Week 12+13

Posted: Aug 23, 2016 10:08;
Last Modified: Aug 23, 2016 10:08


Hi all!

Summer is winding to a close, and our project continues to progress. The database is working, and is currently being made faster for even easier use. Books and articles are still being collected and scanned, and I am trying to split my time between scanning sources and collecting data.

At our last meeting Dan and I went over the exact specifications for the references I am collecting. Information is sorted into four types:

Text Quotes (TQ)

Text Mentions (TM)

Scholarly References (SR)

Other References (OR)

Text Quotes and Text Mentions come from editions, facsimiles, translations, and manuscripts, and only refer to Cædmon’s Hymn itself. Quotes are direct quotations from the poem, while mentions are references to other editions.

Scholarly References will consist of references made to anything other than Cædmon’s own words. This can include books and articles about the hymn or other topics, as well as supplementary text from the editions of the hymn.

Other References is simply a catch-all category for anything that does not fit into the previous three categories.
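For anyone joining the data collection, here is a rough sketch of how a single database entry could be modelled, using the four categories above together with the Source, Reference, and Location fields of the entry form. This is only an illustration: the names `ReferenceType`, `Entry`, and `parse_type` are hypothetical, and the actual database Garret built may be organised quite differently.

```python
from dataclasses import dataclass
from enum import Enum


class ReferenceType(Enum):
    """The four categories, keyed by their abbreviations."""
    TEXT_QUOTE = "TQ"
    TEXT_MENTION = "TM"
    SCHOLARLY_REFERENCE = "SR"
    OTHER_REFERENCE = "OR"


@dataclass(frozen=True)
class Entry:
    """One row of collected data (hypothetical layout, not the real schema)."""
    source: str      # the article or book being read
    reference: str   # the work that the source cites
    location: str    # page number only, typed without "p." or "pg"
    ref_type: ReferenceType


def parse_type(code: str) -> ReferenceType:
    """Turn an abbreviation such as 'tq' or ' SR ' into a ReferenceType."""
    return ReferenceType(code.strip().upper())


# Placeholder values, not real project data.
example = Entry(
    source="Example article",
    reference="Example edition",
    location="12",
    ref_type=parse_type("tq"),
)
```

Keeping the category as a fixed set of values, rather than free text, is one way to make sure everyone on the project labels references the same way.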

Unfortunately I have been having laptop issues and had to reinstall the operating system on my computer, losing some programs in the process. I am not sure if this will affect my GLOBUS endpoint, but I will try transferring some files later to determine if I need to figure all of that out again.

My goals for this week are to scan the ILL books that I currently have checked out, transfer all the files I have scanned to GLOBUS, and fine tune the way I collect my data from the books and articles. I have been finding that it is quicker to write down a large chunk of references on paper and then input the info to the database in one go. This may change as Garret makes the database quicker. The faster version of the database should be ready this week, but I do not currently have access to it so today will be a scanning day. I also plan to request another chunk of ILL books and articles from the library.

For next week’s blog I hope to write a sort of how-to guide on collecting information from the sources and inputting the info into the database. As the new semester starts in two weeks, I will have less time to spend on the project and I believe Dan plans on hiring more students to help the collection go faster. The how-to guide should ensure that we are all collecting data in the same way, and should ease any confusion that might cause errors. As the semester progresses I and whoever else might be working on the project can go through the data collection at a steady pace, and I can continue to collect and scan the sources needed to complete the bibliography.

Things seem to be on track, and hopefully the transition into the new semester will be smooth!

Until next week,



Cædmon Citation Network - Week 10

Posted: Jul 25, 2016 10:07;
Last Modified: Jul 25, 2016 10:07


Hi all,

It is week 10 already, and I feel like I am nowhere near where I thought I would be with regards to this project. While the list of the sources we need for our data collection on Zotero is as complete as we can know at present, not everything on the list has been collected yet. I was in high spirits at the beginning of last week thinking that the collection of sources was nearly complete; however, I realised later on that I had missed a good chunk of the list. It turned out that I had some filters set that were omitting a portion of the 700-ish books and articles. To make a long story short, more collection is still needed!

This will mean more inter-library loan books will need to be ordered and scanned, and more articles will need to be transferred to the GLOBUS folder. Thankfully the book scanner is back up and running again! If it holds out it should make the process painless and a good deal quicker than scanning things on the photocopier.

My plan for this week is to:

- finish scanning the inter-library loan books I currently have checked out (there are about four left to scan)

- finish collecting EVERYTHING on the Zotero list, keeping track of how many inter-library loan books are due to come in so I can account for future scanning time.

- transfer all electronic copies of articles and books to the GLOBUS folder (the internet guy is FINALLY coming on Tuesday to hook up my apartment, so I can work on this every night starting tomorrow)

- And then, if by some miracle I finish everything before the end of the week, I will begin data collection.

To be quite honest, I have been very frustrated with myself and the fact that I have not begun the data collection sooner. I suppose that collecting and organising hundreds of articles just takes longer than I imagined. I really have been working at it steadily throughout the summer, trying to maintain a level of organisation that allows information on the project to transfer easily between myself, Dan, Garret, and anyone else that might happen to work on the project. Although scanning and photocopying seem like menial tasks, I think I need to remind myself that such tasks take time and are necessary to keep our project organised and moving forward.

I do hope though that if Dan has any concerns with the pace of the project that he will let me know, as I do not want to drag things out way longer than he was expecting. The project IS moving forward, however slow it may have seemed the past couple of weeks. Although collecting and organising the sources is not the most exciting part of the job to write updates about, you all can be assured that it is almost complete and the data collection will begin very soon! I am very much looking forward to beginning this part of the project, seeing what we find, and facing the challenges that I am sure we will encounter.

Until later,



Cædmon Citation Network - Week 9

Posted: Jul 18, 2016 09:07;
Last Modified: Jul 18, 2016 09:07


Hi all!

I finally get to start reading this week!!! While I am still not 100% complete in my sourcing of all the books and articles, it is looking as though I will definitely be able to start reading by Wednesday if not earlier.

I also have a bunch of books from inter-library loans that I need to scan portions of. That will be part of my job today.

The database will be ready this week as well. Garret says that there will be a few improvements that he will want to make, but I will be able to start using it this week. All the information that I collect will still be available as the database is upgraded.

You may have noticed that I have switched to blogging at the beginning of the week as opposed to the end. I have found that at this point it is more beneficial for me to post at the start of the week outlining some goals and then add an update post sometime during the middle of the week. I am going to continue this model for the next while.

Until next time!



Cædmon Citation Network - Week 7.5

Posted: Jul 07, 2016 13:07;
Last Modified: Jul 07, 2016 13:07


Hi all!
You might have noticed that I forgot to blog last week… This is true and totally my fault. I moved into a new apartment and in the process may have suffered a mild concussion. Oops! I have been keeping up with my work; however, because I was working at random times of the day and night in chunks of a few hours each, I definitely forgot to blog! So here is my update from the last two weeks:

Unfortunately I don’t have much news to report. I have been going through our Zotero bibliography and collecting missing articles through online databases and inter-library loans. It is going well, but it is taking a bit of time.
GLOBUS is now working for me thanks to Gurpreet’s help figuring out what was going on. It was mainly a permissions issue. I can now access our group research folder which is excellent.

The database is also on its way. I am expecting an update from Garret regarding that this weekend.

I will post another blog tomorrow to outline my goals for next week. In the meantime I will continue to pull articles to complete our pool of data.

Until tomorrow!


Cædmon Citation Network - Week 4

Posted: Jun 11, 2016 10:06;
Last Modified: Jun 11, 2016 11:06



This blog comes to you a day later than usual, as Friday’s work ended up taking a lot longer than I thought and I ran out of time! To be honest, this week was spent much like last week: checking our Zotero bibliography against other bibliographies of Cædmon scholarship.

I ended up re-doing a bit of my work from last week, as I learned in my meeting with Dan on Monday that our scope was a bit wider than I had previously thought. I was worried that I had not been considering certain entries in the various bibliographies to be “about Cædmon enough”, so I decided to go through the entries again and add some that I may have missed. It makes sense to add more rather than less, as I can simply remove an article from the list if I read it and realise it has nothing to do with Cædmon. At the moment our bibliography is almost complete, and we have nearly 700 entries!

What are we going to do with this giant list of articles and books? Well, firstly I have to acquire access to each entry, either via JSTOR, inter-library loans, or through one of our library’s other databases. Then I read through EVERYTHING and count each quote and mention of Cædmon and note which of the approximately sixty different editions of the Hymn are cited. We have also decided to try and note every other citation as well. For example if one article about “Cædmon’s Hymn” cites a book about the history of peanut butter sandwiches, I will take note of it, as there may be other pieces of Cædmon scholarship that also cite that book about the history of peanut butter sandwiches. It will be interesting to see if there are identifiable relationships between writing about Cædmon and seemingly unrelated topics – not peanut-butter-sandwich-history obviously, I just haven’t eaten breakfast yet so I am giving you a delicious example.
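To make the "every other citation" idea a bit more concrete, here is a small sketch of how shared references could be found once the entries exist. It assumes the collected data can be reduced to (source, reference) pairs; the function name `shared_references` and the sample pairs are made up for illustration and are not real project data.

```python
from collections import defaultdict


def shared_references(entries):
    """Return references cited by at least two different sources.

    `entries` is an iterable of (source, reference) pairs. The result maps
    each shared reference to a sorted list of the sources that cite it.
    """
    citing = defaultdict(set)
    for source, reference in entries:
        citing[reference].add(source)  # a set ignores repeat citations
    return {
        reference: sorted(sources)
        for reference, sources in citing.items()
        if len(sources) >= 2
    }


# Invented placeholder pairs.
entries = [
    ("Article A", "Edition 1"),
    ("Article A", "Sandwich history"),
    ("Article B", "Sandwich history"),
    ("Article B", "Edition 2"),
    ("Article A", "Sandwich history"),  # cited twice in the same article
]
shared = shared_references(entries)
```

Here `shared` would contain only "Sandwich history", because it is the one reference that both articles cite.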

How am I going to keep track of all this? Good question! We will need a database that I can use to mark down each citation as I come across them in my reading. On Monday Dan and I discussed at length what we will need from this database, and how we would like it to work. At first we were hoping something on Google Forms would do the trick for us, however we discovered as we talked that we need more control over our information than this tool would allow.

One problem emerged when we realised that among our gigantic list of 700 articles (and books, etc) we would find certain works that were actually editions of the Hymn not included in our original list of editions. We would need a way to add this piece to the Editions list… Several other concerns were raised as well, but to be honest I am finding them difficult to explain without drawing you all a little picture. (I should ask Dan how to add images to these blog posts!)

I mentioned at some point that I would pick the brain of my boyfriend, Garret Johnson, who has his degree in Computer Science from the University of Lethbridge and is my go-to person whenever I have a question about these sorts of things. Dan suggested that he could hire Garret to build our database if he would be willing, as someone with a programming background could probably produce what we need a lot faster than either Dan or I working on it ourselves. So that is our current plan! Garret will begin building us a database that will suit our needs and my job for next week will be to start acquiring the 700 articles and books on our list. By the end of next week I am sure I will have thoroughly annoyed the librarians at school with the amount of inter-library loans I will be requesting.

Until next week!



Cædmon Citation Network - Week 3

Posted: Jun 03, 2016 10:06;
Last Modified: Jun 03, 2016 10:06


Hi all!

Another short post this week, but I will try to make up for it by posting more than one blog next week as I get further and further into the project!

Most of this week was spent methodically checking our body of Cædmon scholarship against various databases (all listed in my previous post). I felt a bit bad that it was going so slowly, as I do not want to lollygag in my work at all. Several things seemed to make the task slower than I hoped, however.

First of all, when I started going through the lists I would try and find access to each article or book that was missing from our body of scholarship as I became aware of it. I soon abandoned this practice and decided that I would create a running list of what we are missing FIRST, and then find access to these pieces as my next step.

I also found that many of the works we are missing were in a foreign language, which made my search for them (before I submitted to simply creating a list) more difficult. I will need to ask Dan if we are including foreign language articles in our data. If we are I will also need to figure out how I am going to comb the articles for quotes later on if we do decide to include them. I suppose quotes of the original will be written in Old English, so that is simple enough to pick out, but paraphrasing of the poem in something like Italian or German might prove difficult.

And finally I was a bit impeded by my own inability to figure out how to add new entries to our shared bibliography on Zotero. This is not a huge deal at the moment as I eventually decided to create a running list of missing pieces before finding sources, however I did waste quite a bit of time fighting with the program to add new entries before settling on this. This is something else that I’m sure Dan can help me with. I believe I might just need some sort of permission to add to the shared database.

In any case, next week should provide fodder for some more interesting blog posts! It will be like a mystery: The Search For the Missing Cædmon Articles…

Until then,



Cædmon Citation Network - The Return

Posted: May 19, 2016 10:05;
Last Modified: May 19, 2016 11:05


Hello, Readers of Dan’s Blog!

My name is Colleen Copland, and I am a student of Dan’s who will be working with him on the Cædmon Citation Network which he and Rachel Hanks began work on last summer. I will be blogging here weekly, and thought I’d use this first post to introduce myself and more-or-less explain the project as I understand it so far. I am still familiarizing myself with everything, so my descriptions may fall short of the actual scope of the project or they might be totally off-base altogether, but as I learn more I will let you know all the juicy details!

Little intro on myself: I am an undergraduate student at the University of Lethbridge, majoring in English and hoping to be accepted into the English/Language Arts Education program this fall (cross your fingers for me, internet!). I have taken three courses with Dan in the past two years, Medieval English, Intro to Old English, and Advanced Old English in which we spent an entire semester reading Beowulf. Suffice to say I think Dan is a pretty excellent prof and I am excited to work for him this summer so I can continue to learn from him!

The Cædmon Citation Network (also known as the Cædmon Bibliography Project and possibly a few other names – I will need to ask Dan if there is something he’d like me to call it officially) is a gathering of data on the citations of various editions of Cædmon’s Hymn. The project is interested in tracking how long it takes a new edition of a work to start being cited in studies of said work. Cædmon’s Hymn, since it is such a short piece, has been re-translated and re-published a great many times since 1644, which should allow us to notice some patterns in the way each new edition is cited.
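As I understand it so far, the "how long does it take" question comes down to comparing the year an edition was published with the year of the earliest study that cites it. Here is a minimal sketch of that calculation. The function name `first_citation_lag`, the data layout, and the years are all placeholders I made up, so please do not read any of them as real findings.

```python
def first_citation_lag(edition_years, citations):
    """Years between each edition's publication and its first citation.

    `edition_years` maps an edition label to its publication year.
    `citations` is an iterable of (citing_year, edition) pairs.
    Editions that are never cited map to None.
    """
    earliest = {}
    for year, edition in citations:
        if edition not in edition_years:
            continue  # citation of something not on the editions list
        if year < edition_years[edition]:
            continue  # cited before publication: likely a data-entry error
        if edition not in earliest or year < earliest[edition]:
            earliest[edition] = year
    return {
        edition: (earliest[edition] - published if edition in earliest else None)
        for edition, published in edition_years.items()
    }


# Placeholder editions and years, invented for the example.
edition_years = {"Edition X": 1900, "Edition Y": 1950}
citations = [
    (1910, "Edition X"),
    (1904, "Edition X"),
    (1890, "Edition X"),      # impossible year, skipped
    (1960, "Unlisted work"),  # not an edition, skipped
]
lags = first_citation_lag(edition_years, citations)
```

With these placeholder values, "Edition X" gets a lag of 4 years and "Edition Y" gets None because nothing cites it.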

The project is also interested in looking at the differences between the citing of digital editions of works as opposed to print editions. Many people assume that it takes longer for digital editions to begin being cited, but this project aims to suggest that they are actually cited more quickly. It will be interesting to see what the data shows us.

Where are we right now with regards to the project? Personally, I am becoming oriented with the project’s goals and working to gain access to all of the excellent data collected by Rachel Hanks who worked on the project last year – figuring out where everything was left off and where Dan would like it to go this summer.

I am excited about gathering more information and will share it with you as I progress. It often seems that I gain a better understanding of a project when I explain what is happening to someone else, so I think this blog will be an excellent tool. It will also serve as a good record of what went on at different points during the project for Dan and me. Any questions you might have can be left in the comments section that I believe is located below this post…

Until next week,



World is a better place 3. Career 0.

Posted: Dec 02, 2015 11:12;
Last Modified: Dec 02, 2015 11:12


The last couple of days have been, by any measure, a huge success.

A visit by Dot Porter to Lethbridge got my DH class revved up and also led to a breakthrough in our understanding of the Visionary Cross project and a blog posting yesterday that seems to be making its way around the DHosphere.

Over the weekend, the executive and members of GO::DH led to the development of a report on diversity and intercultural communications issues that also seems to be hitting a nerve.

And finally, there was some cool twitter chatter about my ongoing Unessay research.

Or actually, I shouldn’t say that it was a huge success by “any measure.” In fact, it was a wash, as far as career progress went, since none of these are official citations or refereed publications. Although, as I’ve argued elsewhere, Canadian universities are better than many in their ability to use non-bibliometric measures of success, we’re not that good at it.


Could we design comparative metrics that would favour the humanities?

Posted: Mar 29, 2015 13:03;
Last Modified: Mar 29, 2015 17:03


A quick, and still partially undigested, posting on metrics that might favour the humanities over the sciences in “open” competitions. I’m working this out in response to a discussion I had recently with a senior administrator who argued that the University’s tendency to channel resources disproportionately to the Natural Sciences was simply the result of their comparative excellence as measured in “open” competitions.


For a supposed “Liberal Arts” University, the University of Lethbridge is exceptionally bad at supporting the Humanities

As I’ve pointed out before, for a supposed Liberal Arts University, the University of Lethbridge is exceptionally poor in its support for the Humanities. While the Humanities suffer from a lack of resources and attention in comparison to the Social and especially Natural Sciences at all Universities, the University of Lethbridge is a national outlier in the way it has starved its researchers in this area over the last quarter century.

Thus, for example, while our HSS (Humanities and Social Sciences) researchers score at about the 50th percentile on a field normalised basis in terms of their research impact, we come in fourth-last in terms of our funding success compared to other Humanities and Social Science researchers at Canadian Universities (our natural scientists, in contrast, come in at the top of the bottom third in Canada in terms of both impact and funding success).

Poor performance can be attributed in part to administrative monocultures.

There are probably a number of reasons for this mismanagement. But one of them is almost certainly the fact that the University has for the same amount of time been managed almost entirely without participation from Humanists. In the last quarter-century, only two people with a background in the Humanities have been members of our senior administration—and one of these has been a Historian who has been managing our Faculty of Health Sciences. Two years ago, we appointed a classicist as Dean of Arts and Science. This is the first time in 25 years that a Humanist has been in a position to control a budget that actually affects Humanities research.

My argument has been that this lack of disciplinary breadth in our senior administration is largely responsible for our poor support for the Humanities (there have been more administrators from the Social Sciences and, not surprisingly, I would argue, they have tended to do better than the Humanities in terms of gaining resources). It is a natural impulse to find the things you understand more important than the things you do not and an equally natural impulse to unconsciously favour those who share your background and training. Just as our (almost exclusively) male senior administration has tended to find other men to be the most suitable people for vacancies as they have come up, so too an administration that consists (almost exclusively) of natural scientists has tended to think that those are the areas that could make the best use of resources like Canada Research Chairs and Board of Governors Research Chairs (until two years ago, the University of Lethbridge, almost uniquely in Canada, had never appointed a Canada Research Chair in the Humanities and only one in the Social Sciences; it has never appointed a Humanist to a Board of Governors Research Chair).

Or could it be that our Humanists are simply worse than our scientists?

Recently a member of the Senior Administration suggested to me that my analysis of the problem at the U of L was wrong because Research Chairs and similar resources are now being awarded competitively on the basis of open, University-wide, competitions (they used to be simply assigned by the Vice President Academic). If natural scientists are winning these resources, this person’s argument went, then it was presumably because they were simply better.

Moreover, the committees that make these awards are interdisciplinary. So it is no longer the case that these resources are being assigned solely by scientist-administrators who know nothing about the domain. While we may not have that many Humanists in our administration, the scientists we do have are being careful to overcome their bias by allowing the different disciplines to compete against each other.

There is no such thing as a truly “open” cross-disciplinary competition

But is there such a thing as a truly “open” competition across disciplines? The skills and activities that make you a good English professor, for example, may not be the same as those that make you a good Biologist. And within our different disciplines, we reward people for different kinds of activities (for an excellent discussion of this, see How Professors Think: Inside the Curious World of Academic Judgment by Michèle Lamont). Given this, it is an open question to what extent the outcome of these competitions is being shaped by the criteria that are being used to adjudicate them.

And, in fact, the criteria we usually use in these cases tend to favour the sciences: publication and citation counts, impact factors and h-indices are all measurements that are better suited to measuring activity in a field that moves quickly and deals in largely incremental and linear development. While there are problems with the use of such metrics even within the Sciences, there is no indication that they represent an adequate method for identifying excellence in other domains.

Using the wrong criteria can reward sub-optimal behaviour and hide excellence

Indeed, it is even possible that they might hide excellence or reward sub-optimal behaviour in some domains, even as they recognise and reward excellence in others. Many Humanities disciplines, for example, treat “the book” and/or lengthy articles as a measure of scholarly maturity. Publication counts—which reward scholars for avoiding synthesis by dividing work into minimum publishable units—are going to be a very poor measure of success in such fields. In English, for example, we tend to see books as being evidence of excellence; somebody who wants to beat a scientist in an open competition in terms of publication counts, however, would almost certainly be better off concentrating on Notes, one of our more minor forms of publication.

Could we reverse the tables and create a structural bias in favour of things that make Humanities research excellent?

All this got me thinking: what would it take to reverse the tables on these “open” competitions? I.e. what metrics could I come up with that, while seeming neutral, might actually provide a structural advantage to Humanists over Natural Scientists in head-to-head competitions? In the spirit of “notes for further research,” here are a couple of guesses:

  1. Average length of contribution (the L-index). Anybody who has ever sat on cross-disciplinary promotion committees knows that page count means different things in different disciplines. In many Humanities disciplines, the best work tends to be synthetic: i.e. things that gather together various views and opinions and construct a larger synthesis. This is opposed to many sciences, where short, actual results are privileged. Our current use of publication counts privileges fields in which it is possible to think in terms of “minimal publishable units.” But what if we came up with a measure that privileged synthesis? A person who has published a few long works (i.e. has a high average length per publication) is probably a poor scientist; but they are also probably a stronger humanist. I’d be interested to see how we’d do if we started counting the length of publications alongside their number.
  2. Length of Citation Record. I published my edition of Caedmon’s Hymn 10 years ago. The edition it replaced was published 70 years before that. Both works are still being cited and indeed my edition has recently been the subject of a major review article. This is not the result of any special excellence on my part or the part of the predecessor edition: it is in fact not uncommon in the Humanities to see references to a “recent” study that is ten to fifteen years old. What this suggests, then, is that length of citation record is probably an important measure of Humanities research success. Once again, it is probably a poor measure of scientific research success—except perhaps in the case of a few groundbreaking examples—where the research development is more incremental and linear. But this is also why the h-index (which in practice is a measure of speed of citation rather than longevity of citableness) favours scientists over Humanists.
  3. Diachronic citation trend. My edition of Caedmon’s Hymn is also getting cited more now than it was when it was first published. In fact, in work I am planning to present this summer, I will show that it takes about 15-20 years for an edition to become “standard” (i.e. cited by everybody). This is also probably true of our greatest and most important works of literary theory, history, and philosophy: it takes a while for syntheses to catch on and influence thinking. Once again, this is opposed to much of the sciences. While some work, again largely field-changing, fundamental work, probably does have a long and upwardly rising citation trend, I suspect most science publications (including much of the very best work) have a citation half-life—that is to say that their citations fall off with time as the field moves on. In the Humanities, while this is probably common too, it is not a good sign: the best Humanities work gets cited with increasing frequency through time.
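
To make the three guesses above a little more concrete, here is a rough sketch of how they might be computed next to the h-index. Everything in it is an assumption made for illustration: the Publication record, the function names, and the sample numbers are invented, and treating the "trend" as a simple least-squares slope is only one of several ways it could be measured.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Publication:
    """One work. All values used below are invented illustration data."""
    year: int                # year of publication
    pages: int               # length in pages
    citations_by_year: dict  # calendar year -> citations received that year


def h_index(pubs):
    """Largest h such that h publications each have at least h citations."""
    totals = sorted((sum(p.citations_by_year.values()) for p in pubs), reverse=True)
    return sum(1 for rank, count in enumerate(totals, start=1) if count >= rank)


def l_index(pubs):
    """Guess 1: average length of contribution, in pages."""
    return mean(p.pages for p in pubs)


def citation_record_length(pub):
    """Guess 2: years between publication and the latest year with a citation."""
    cited_years = [y for y, n in pub.citations_by_year.items() if n > 0]
    return max(cited_years) - pub.year if cited_years else 0


def citation_trend(pub):
    """Guess 3: least-squares slope of citations per year (positive = rising)."""
    years = sorted(pub.citations_by_year)
    if len(years) < 2:
        return 0.0
    counts = [pub.citations_by_year[y] for y in years]
    y_bar, c_bar = mean(years), mean(counts)
    num = sum((y - y_bar) * (c - c_bar) for y, c in zip(years, counts))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den


# Hypothetical records: one long edition versus two short notes.
edition = Publication(2005, 280, {2006: 1, 2010: 3, 2015: 6})
notes = [
    Publication(2012, 4, {2013: 5, 2014: 2, 2015: 0}),
    Publication(2013, 3, {2014: 4, 2015: 1}),
]

print("h-index:", h_index([edition]), "vs", h_index(notes))
print("L-index:", l_index([edition]), "vs", l_index(notes))
print("record length:", citation_record_length(edition), "vs", citation_record_length(notes[0]))
print("trend:", citation_trend(edition), "vs", citation_trend(notes[0]))
```

With these made-up numbers the two short notes win on h-index, while the single long edition wins on all three of the alternative measures, which is the kind of reversal the thought experiment is after.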

Maybe the solution is to compare apples to apples

This is all a thought experiment and is for the most part guesswork rather than research-based. But it is fun to wonder what would happen if the U of L redid some of its recent “open” competitions using criteria like the above as the discriminators. Since, I suspect, these criteria are as unfair to scientists as the ones we currently use are to Humanists, I guess the results would be very different.

Of course the better approach is to avoid “open” competitions at all and instead proceed on a discipline-normalised basis.


Four National and International talks by University of Lethbridge Digital Humanities students

Posted: Feb 02, 2015 12:02;
Last Modified: Mar 04, 2015 05:03


A quick catchup post: this semester is shaping up to be a blockbuster in terms of University of Lethbridge Digital Humanities students’ success in national and international refereed conferences.

The semester began strongly with Kayla Ueland’s presentation “Reconciling between novel and traditional ways to publish in the Social Sciences” at the Force 2015 conference in Oxford this past January. Ueland is a graduate student in Sociology and a Research Assistant in the Lethbridge Journal Incubator.

We have also just heard that four students and recent graduates of the University of Lethbridge’s Department of English have had papers accepted at the joint meeting of the Canadian Society for the Digital Humanities/Société canadienne des humanités numériques and the Association for Computers and the Humanities.

The students and their papers are:

Babalola Aiyegbusi is a recent graduate of the department’s M.A. programme (2014). Rawluk and Alexander are both fourth-year undergraduates. Singh is a first-year M.A. student.


Essential computer tools and skills for humanities students

Posted: Nov 30, 2014 15:11;
Last Modified: Dec 27, 2014 22:12


The Digital Humanities is a hot new field within the Arts. Its practitioners are often at the forefront of developing new topics within ICT itself.

But what about if you are not interested in the Digital Humanities? Or are interested in them, but don’t consider yourself particularly computer literate? What are the computer skills you need to thrive in the traditional humanities or get started in DH?

This is the first in what I hope will be a series of tutorials on basic computer skills and tools for students of the Humanities. It should be of use to those just beginning their undergraduate careers, to graduate students hoping to professionalise their research and study, and to researchers and teachers who have other things to do than follow the latest trends and software.


What kind of thing can I learn from this series?

The focus of this series is going to be on basic tools. It is going to assume you know nothing other than how to turn on a computer and get on the internet. It will make some recommendations about basic software, starting with such simple things as browsers. It will also cover some basic techniques: how to use styles in word processors, how to use a citation manager or spreadsheet.

How often will they appear?

I’m going to mark this as a special cluster in my blog (using a special tag, basic computer skills). But I’ll publish them irregularly, as the mood strikes and I have the time. I’m also hoping to get some guest authors involved, mostly students who have done presentations on these things in my classes.

What if I have an idea for a tutorial? What if I disagree with you?

If you have an idea for an article in this series, I’d love to hear from you. If you have already written something on a topic I’m covering and would like me to know about it or link to you, please let me know as well!

Articles in this series

The following are links to the other articles in this series. You can also find them using the tag basic computer skills.


Yet another example of why APC Open Access should be a non-starter

Posted: Oct 04, 2013 10:10;
Last Modified: Jan 25, 2014 14:01


I hope to write something more detailed about the fundamental ethical problems with APC (Article Processing Charges) models of Open Access.

The short version is that they are basically a subscription charge that preserves all the bad things about paywalled access to knowledge and preserves none of the good.

  1. User subscriptions are “pay to play” in the sense that readers need to pay to access the knowledge; APC charges are “pay to play” in the sense that authors need to pay them for the knowledge to get published. Either way, access requires somebody to cross a paywall. Subscriptions are better because they spread the cost more widely (and hence make access cheaper on a unit basis). Moreover, subscription doesn’t prevent the dissemination of knowledge, it only restricts access; APC restricts the dissemination to those who can pay to publish. The second is much worse and far more unethical.
  2. User subscriptions involve trading cash for assets in the sense that libraries that pay subscriptions end up with an asset they can then use—access to the knowledge. APC charges are basically extortion: universities and libraries are told that their authors will not be allowed to publish if they don’t pay somebody to let them. The end result is that the library is poorer in terms of cash and not richer in terms of assets.

But the real evidence that there is a problem with the APC model is the existence of predatory journals. The basic premise—pay us and we’ll publish you—is far more open to corruption than the alternative—pay us and we’ll let you read our content. In a subscription model, the press has an incentive to keep quality high—readers will not pay for garbage; in the APC model, presses have an incentive to lower quality since authors will pay to print garbage.

That subscription models are more ethical than Open Access APC charges does not mean that Open Access itself is unethical or a bad idea. The main issue is how public money is being spent. If libraries and universities that are currently willing to risk public money going to scam artists instead used those funds to support Green Open Access journals we’d be able to have the best of both worlds: free access to freely disseminated research. That would be a good use of public funds—and it is much harder to scam.


Unessay and Standardized Testing

Posted: May 24, 2013 13:05;
Last Modified: May 24, 2013 13:05


In studying the origins of the five-paragraph essay, I stumbled across an article called “Teaching Writing in the Shadow of Standardized Writing Assessment: An Exploratory Study”, by Hunter Brimi. His article begins to dissect the relationship between standardized testing and the writing skills of students. He suggests that the standard format of a five-paragraph essay originated as a marking rubric for the markers of the state-wide tests, to determine the success of the essays written by the students (Brimi 53). And while it appears to have originated as a general standard to assess writing and argumentation skills, it quickly evolved into being the method by which writing and argumentation were taught (Brimi 54). As is typical with standardized testing, teachers begin to teach the material from the test directly to ensure that their students are successful, as well as to make sure they remain free from the trouble that may ensue if their students’ grades fall too far below the standard set by the tests (Brimi 55).

The whole goal of essay writing in schools is to teach argument and critical thinking. While this is a difficult thing to measure, especially under the strictures of standardized testing, in a backwards sort of loop, the attempt to create a model that will test the critical thinking of students in effect diminishes it. The ability of a student to “plug in” the appropriate structure into a given format does not increase his or her writing ability, nor does it promote original or critical thought. Studies provide evidence that critical thinking and argumentation skills are not garnered most effectively from the five-paragraph structure. Rather, discussion and other ways of developing logical and rhetorical skills are what build appropriate responses throughout the school years of a child (Newell et al. 277); a focus on a formulaic structure rather than content, by contrast, inhibits the writing of students.

It is agreed that essays regarding some sort of analysis or interpretation are few and far between in high school. The statistics are stunningly low for the ability of students to correctly interpret or analyze a text, as well as be able to formulate a coherent argument about it, following the recommended structure (Newell et al. 277). This may relate to the fact that teachers themselves receive little instruction in teaching composition (Brimi 66). While teachers undoubtedly do their best to ensure the success of their students, when they themselves have received little instruction in the actual act of teaching writing, their fallback into teaching the marking scheme is understandable.

There are also studies suggesting that the single disciplinary approach to argumentative writing in high school negatively affects the essay writing abilities of students. Teachers agreed that most of the writing their students did related to literature and the analysis of it (Brimi 70). This may diminish the capacity of students to argue persuasively across genres.

As I continue researching, I plan to look more closely into the history of the five-paragraph essay, and its relationship to standardized testing.

Works Cited

Brimi, Hunter. “Teaching Writing in the Shadow of Standardized Writing Assessment: An Exploratory Study.” American Secondary Education 41.1 (2012): 52–77. Web.

Newell, George E. et al. “Teaching and Learning Argumentative Reading and Writing: A Review of Research.” Reading Research Quarterly 46.3 (2011): 273–304. Web.

