May 2014

The Dictionary of Old English has published its annual progress report for 2013. The biggest innovation this year: a large number of technological advancements.

October 2013

The Nerthus Project - Smashing Success or Haphazard Flop?

Nerthus, an online lexical database of Old English, has been in the making since 2007 - time to look back and evaluate the state, progress and benefits of the project. Is Nerthus going to be an indispensable research tool for Anglo-Saxonists or an obscure grammar toy, quickly fading into oblivion?
Homepage of the Nerthus project

Tacitus’ book Germania includes an intriguing account of some Germanic tribes worshipping the powerful fertility goddess Nerthus. When the goddess was brought out of her sanctuary, no one dared leave for war, great festivities ensued and slaves were drowned in a nearby lake as a cleansing ritual. In the 21st century, Nerthus is also the name of a new Old English lexicology project hosted at the University of Rioja in Spain. Nomen est omen: The choice of the title may be symptomatic of the state of the entire project. On the one hand, Nerthus is a beautiful name with great potential for interesting interpretations and etymologies. Scholars have suggested many telling parallels with other deities like the Old Norse god Njörðr. On the other hand, Nerthus could be a vain and cruel goddess. As there is not even a single reference to this deity in the whole Old English corpus, the name may be utterly inappropriate for a project on the Anglo-Saxon language. So, which is it? Is the Nerthus Project going to be an exciting, vibrant research tool for Anglo-Saxonists? Or is it going to sink into obscurity, unusable and useless?

Nerthus is an online lexical database of Old English that has been in the making for several years by the research group in Functional Grammars at the University of Rioja. Its foremost aim is to offer an exhaustive description of the lexicon of Old English. When completed, the database may be used more widely for research on semantics, cognitive linguistics and semantic webs.

As a first step, an extensive headword list was compiled in 2007-2009. It includes c. 30,000 lemmata, their word class and some information on possible inflections. For example, the entry for the verb 'to make' reads, "macian, verb, weak." The wordlist now online has been extended and also features a translation, "to make, form, construct, do; prepare, arrange, cause; use; behave, fare; compare; transform. macian ūp, to put up" and more information on inflectional characteristics, like "Inflectional paradigm: pret[erite] macode."

Since the completion of the initial list, the research team has been busy with the ambitious second phase of the project - the hierarchical organization of the lexicon into lexical paradigms. A lexical paradigm is the set of all words that can be created from a base through word formation processes. Take for instance the Modern English base drink. Its lexical paradigm would include words like drinker, drinkable, drinkability - created with the affixes -er, -able and -ity respectively - (to) drink (the verb), (a) drink (the noun) - linked through zero-derivation - or drink-related, drink-driver, all-you-can-drink - formed through compounding with other words. The researchers collect all the words in their wordlist that are related in one of these three ways, affixation, zero-derivation or compounding, and build word nets for them that specify their relation. These word nets can be displayed with special software called xglore in a three-dimensional space.
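
To make the idea of a lexical paradigm concrete, here is a minimal sketch in Python of how such a word net could be stored as a labelled graph. It uses the Modern English drink example from above. The class name LexicalParadigm, its methods and the labels "drink (v.)" and "drink (n.)" are assumptions made for illustration only; nothing here reflects how Nerthus itself is implemented.

```python
from collections import defaultdict, deque

# The three kinds of relation described above.
PROCESSES = {"affixation", "zero-derivation", "compounding"}


class LexicalParadigm:
    """Hypothetical word net: bases point to derived words, labelled by process."""

    def __init__(self):
        self.links = defaultdict(list)

    def add(self, base, derived, process):
        """Record that `derived` is formed from `base` by `process`."""
        if process not in PROCESSES:
            raise ValueError(f"unknown process: {process}")
        self.links[base].append((derived, process))

    def members(self, base):
        """Return every word reachable from `base`, breadth-first."""
        seen = {base}
        order = [base]
        queue = deque([base])
        while queue:
            word = queue.popleft()
            for derived, _process in self.links.get(word, []):
                if derived not in seen:
                    seen.add(derived)
                    order.append(derived)
                    queue.append(derived)
        return order


net = LexicalParadigm()
net.add("drink (v.)", "drinker", "affixation")
net.add("drink (v.)", "drinkable", "affixation")
net.add("drinkable", "drinkability", "affixation")
net.add("drink (v.)", "drink (n.)", "zero-derivation")
net.add("drink (v.)", "drink-driver", "compounding")

print(net.members("drink (v.)"))
```

Note that drinkability hangs off drinkable rather than off the base directly, so the paradigm is a small tree and not a flat list. That layering is what a graphical word net has to display.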

3D word net for Old English "bēodan" 'command'

In the future, the project might also include an etymological part and semantic and syntactic information for the lexical items in the word list.

This sounds like quite a decent research project, right? So, why should there be any doubt regarding its viability and soundness? Well, once you start digging under the surface, quite a few problems emerge.

The first problem is the excruciatingly slow pace at which the project is making headway. Even after seven years of work only very few of the initial goals have been achieved: Except for a few small examples containing some dozens of words (see here and here), the 3D database has not been made public. The research proposal promises that "layers [of a derivational paradigm] [will] draw a distinction between more productive and less productive derivational processes." But at the moment, it is impossible to assess the productivity and frequency of specific affixes, for instance -hād (Modern English -hood, as in childhood) as opposed to be- (as in Modern English beget, beseech, behold etc.), or of processes like compounding as opposed to zero-derivation.
The headword list is still incomplete. Many of its stems are completely absent at the moment. There is cuma, 'one who comes, a stranger', but not cuman 'to come'; bunda 'something that’s bound, a bundle', but not bindan 'to bind' and many other cases like this.

The second problem comes from the low usability of the tools that the researchers did make public. The word list is actually quite useless. Perhaps one could use it as a dictionary, but it would certainly be a bad one. The translations are given out of context. There are no subheadings. The entries for "Inflectional morphology" are entirely insufficient. For instance, the first noun in the wordlist, āð (> Modern English 'oath'), is assigned to the inflectional class "m.", for masculine. But that information is relevant only for agreement and does not say anything about its actual inflectional behavior. Is it a strong noun forming its plural as āðas, or weak with a plural āðan? Similar problems arise for other word classes. As it stands, the headword list is therefore simply a vastly inferior alternative to other Old English dictionaries like the DOE, Bosworth & Toller or Hall.
One has to wonder also what the planned word nets are supposed to be good for. Is it just a graphical interface toy, a gimmick to play around with? The graphical representation completely ignores crucial information like how common a word is, or in which manuscripts the word appears and how reliable the source is etc. "The 3D representation assures that lexical relations are displayed explicitly and exhaustively," the Nerthus research proposal states. But is this really true? By definition, derivational processes cannot ever be represented "exhaustively" since word-formation processes give rise to the creation of a vast, potentially infinite number of new words. Many Old English derivations may simply not be attested at all. If you wanted to find all attested derivations of the root stinc (> Modern English 'stink'), why not simply look for the string "stinc*" in the DOE? You would find all attested inflectional and suffixed forms, the manuscript source and the context. The result may not look as fancy, but could easily be more informative than the Nerthus analogue.
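
The wildcard lookup suggested above amounts to very little code. The following Python sketch runs the pattern "stinc*" over a tiny hand-typed word list. The list is an illustrative assumption, not DOE output, the function name wildcard_search is made up for this example, and the code does not model the DOE's actual search interface.

```python
from fnmatch import fnmatchcase

# Illustrative sample only: a handful of Old English forms typed in by hand.
SAMPLE_FORMS = ["stincan", "stincende", "stenc", "drīfan", "macian"]


def wildcard_search(pattern, forms):
    """Return the forms that match a shell-style wildcard pattern."""
    return [form for form in forms if fnmatchcase(form, pattern)]


print(wildcard_search("stinc*", SAMPLE_FORMS))
```

A real dictionary search would return each hit together with its manuscript source and context, which is precisely the information the graphical word net leaves out.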

The lexical paradigm of 'stincan' in Nerthus and the entries for "stinc*" in the Dictionary of Old English (DOE)

However, the third and probably biggest problem lies in the many linguistically unmotivated, questionable or even erroneous choices that the researchers made when annotating and linking related words.
Why were verbs given a privileged position to form the basis in the derivational paradigm? After all, Old English verbs are formed with a special ending too, usually -an, or -ian. So, why does stincan (> Modern English 'to stink') form the stem for the noun stenc (> Modern English 'stench')? Why should it not be the other way around? Why isn’t there an independent root, say stink-, from which both words are formed? The verbal over-valuation is entirely unmotivated.
Furthermore, the researchers seem to include only four different word formation processes: prefixation, suffixation, zero-derivation and compounding (see above) and must therefore pigeonhole every example into one of these categories. This leads to a large number of oddities. For example, Nerthus assumes that ābodian 'announce, tell' is a zero-derived form of ābēodan 'command, call out.' But the two roots have been clearly distinct since Proto-Germanic times, namely *budōna and *beudana respectively. The former word has a Modern English descendant in forebode, the latter in to bid (as in to bid in an auction, through conflation with Old English ābiddan). Why then are the two words supposed to be derived from each other at all? Why are they not derived from two independent roots? In fact, not only is the vowel change between o / ēo in the previous example ignored, but all instances of phonological alternations appear to be tagged simply as zero-derivation. It is not very enlightening to claim that drāf 'action of driving' is zero-derived from drīfan 'to drive'. Rather, the morphological process should make reference to the vowel alternation ī / ā and final devoicing (v / f). In short, it is linguistically problematic to join etymologically independent roots together into one derivational paradigm and to skim over important morpho-phonological processes under one general and inappropriate label like "zero-derivation."
Finally, Nerthus contains serious and truly indubitable mistakes: The database makes the absurd claims that cyning 'king' is related to the stem cunnan 'can, be able,' that beon 'to be' is derived from wesan 'to be' rather than an independent stem, that twelfta 'twelfth' and twelfta 'twelfth day' require two independent entries so that they may be derived from different stems, and that the entries micle 'much', miclum 'much', micel 'great', micel 'greatly', miclu 'greatness' etc. are not derived from a common stem. This list is based only on a brief, cursory reading of the Nerthus database - it seems likely that there are many more errors hidden in its depths.

Some errors in the Nerthus database

Nerthus has many problems. Indeed, it’s not looking good for the compilers of the database at all. There is a real danger that the project will turn out to be a largely unusable, impractical and error-ridden thingamabob. However, it may be too early to give up hope completely. If the researchers can remedy most of the mistakes currently found in the database and invest a lot of energy into increasing the usability of the resource, there may yet be a bright future for their brainchild. Only the future will tell whether Nerthus ultimately becomes a valuable research gadget or an utter and complete failure.