Needling the Old Guard: XML in Prosopography

For the last few weeks we have been discussing the ongoing debate in the digital humanities between textual markup and databases. Reading K.S.B. Keats-Rohan’s “Prosopography for Beginners” on her Prosopography Portal, I found it interesting that the tutorial focuses initially and primarily on markup. Essentially, Keats-Rohan outlines three stages to prosopography:
1. “Data modelling”—For Keats-Rohan, this stage is accomplished by marking up texts with XML tags “to define the groups or groups to be studied, to determine the sources to be used from as wide a range as possible, and to formulate the questions to be asked.” It does far more than that, however, since the tags identify the particular features of sources that need to be recorded. Keats-Rohan covers this activity extensively with eleven separate exercises, each with its own page.
2. “Indexing”—This stage calls for the creation of indexes based on the tag set or DTD developed in stage one. These indexes collect specific types of information, such as “names”, “persons” and “sources”. Biographical data is then added to turn the indexes into a “lexicon”, and a “questionnaire” (i.e. a set of questions to query your data points) is applied to it. Ideally, it is suggested, this is done through the creation of a relational database with appropriately linked tables. A single page is devoted to the explanation of this stage, with the following apology:

It is not possible in the scope of this tutorial to go into detail about issues relating to database design or software options. Familiarity with the principles of a record-and-row relational database has been assumed, though nothing more complex that an Excel spreadsheet is required for the exercises.

…11 lengthy exercises for XML, but you’re assumed to appreciate how relational databases work by filling out a few spreadsheets?
3. “Analysis”—This is, of course, the work of the researcher, once the data collection is complete. This section of the tutorial includes a slightly longer page than stage 2 with 4 sample exercises. The exercises are designed to teach users how prosopographical analysis can be conducted.
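To make the hand-off from stage one to stage two concrete, here is a minimal Python sketch of tagged text being turned into a “names” index. The tag and attribute names (`source`, `person`, `name`, `key`, `id`) and the sample text are invented for illustration; they are not Keats-Rohan’s tag set or DTD.

```python
import xml.etree.ElementTree as ET

# Invented sample markup: tag names, attributes and text are assumptions,
# not taken from the tutorial.
SAMPLE = """
<sources>
  <source id="charter-1">
    <person key="p1"><name>Johannes filius Petri</name></person> grants land to
    <person key="p2"><name>Radulfus</name></person>.
  </source>
  <source id="charter-2">
    Witnessed by <person key="p1"><name>John son of Peter</name></person>.
  </source>
</sources>
"""

def build_name_index(xml_text):
    """Return one (person key, name form, source id) row per tagged person."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for source in root.iter("source"):
        for person in source.iter("person"):
            rows.append((person.get("key"), person.findtext("name"), source.get("id")))
    return rows

index = build_name_index(SAMPLE)
for row in index:
    print(row)
```

Each row is already shaped like a record in a database table, which is the point at issue: the markup is an intermediate step on the way to the index.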
It strikes me as incongruous that, for a research method that relies so heavily on the proper application of a relational database model, so little time is devoted to discussing its role in processing data. Instead, Keats-Rohan devotes the majority of her tutorial to formulating an XML syntax that, when all is said and done, only adds an unnecessary level of complexity to processing source data. You could quite easily do away with stage one entirely, create your index categories in stage two as database tables, and process (or “model”) your data at that point, simply by entering it into your database. What purpose does markup serve as a means of organizing your content if you’re just going to reorganize it into a more versatile database structure?
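As a rough sketch of what skipping stage one might look like, the three index categories can be declared directly as linked tables and the data entered as rows. This uses Python’s built-in `sqlite3` module; the table layout, column names and sample records are my own assumptions for illustration, not a schema from the tutorial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces REFERENCES when this is on

# Hypothetical schema: one table per index category named in stage two.
conn.executescript("""
CREATE TABLE persons (person_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE sources (source_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE names (
    name_id   INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES persons(person_id),
    source_id INTEGER NOT NULL REFERENCES sources(source_id),
    name_form TEXT NOT NULL
);
""")

# Invented sample records, entered directly rather than extracted from markup.
conn.executemany("INSERT INTO persons VALUES (?, ?)",
                 [(1, "John son of Peter"), (2, "Ralph")])
conn.executemany("INSERT INTO sources VALUES (?, ?)",
                 [(1, "Charter 1"), (2, "Charter 2")])
conn.executemany(
    "INSERT INTO names (person_id, source_id, name_form) VALUES (?, ?, ?)",
    [(1, 1, "Johannes filius Petri"), (2, 1, "Radulfus"), (1, 2, "John son of Peter")],
)

def attestations(conn, person_id):
    """A one-question 'questionnaire': where is this person attested, and under what name?"""
    cur = conn.execute(
        """SELECT s.title, n.name_form
           FROM names AS n JOIN sources AS s ON s.source_id = n.source_id
           WHERE n.person_id = ?
           ORDER BY s.source_id""",
        (person_id,),
    )
    return cur.fetchall()

result = attestations(conn, 1)
print(result)
```

The join across `names` and `sources` is the kind of question the “questionnaire” asks, and it works on records that were never marked up.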
Keats-Rohan’s focus on markup starkly emphasizes how much more highly humanities scholars value XML than databases. Since the two are useful for quite different purposes, and relational databases have so much to offer humanities scholarship, as prosopographies prove, I am baffled that such a bias persists.
