Trinity College Dublin

1st International Workshop on Computational History - Biographies

Workshop Chair: Declan O'Sullivan

Associate Professor and Head of Intelligent Systems Discipline, School of Computer Science & Statistics, Trinity College Dublin

Declan's research goal is to establish the theoretical, conceptual, algorithmic and human-supported techniques that will enable the achievement of sustainable and manageable access to knowledge across differently modelled information content.

9am - 12.30pm

Session 1: Teaching computers history

In order to apply computational analysis to historical data, the knowledge must be translated into a format that a computer can process and analyse, without losing important semantics. This is not a trivial problem. Historical societies are complex systems, with webs of relationships operating between entities at multiple levels, which can't be adequately described by simple numeric variables. Furthermore, where historical quantitative data is available, it is often unreliable and typically requires close examination by an expert in order to interpret its significance. The knowledge and skill required to properly interpret historical data is often limited to a handful of experts on a particular historical period and society.

While technical innovations have introduced significant opportunities for automating the collection of computer-processable datasets from 'big data', this automation remains error-prone. Manual human input is still required in order to produce higher-quality data and to integrate expert interpretation into the process. Technologies are emerging which allow us to tackle these problems. Semantically rich knowledge models allow us to represent complex domains, with multi-faceted relationships between entities, in interoperable, standardised ways that can be reasoned over by computers. Natural language processing technologies continue to improve, as do sophisticated user interfaces that facilitate expert interpretation. The presentations in this session will focus on practical efforts to capture historical data in formats that can be processed and analysed by computers, drawn from the presenters' experiences on high-impact historical digitisation projects, such as the Down Survey, the Fagel collection, the 1641 depositions and the European Cultura and Cendari projects. The session will also include details of ongoing far-reaching efforts to digitise historical data: the Seshat global history databank project and the Digital Repository of Ireland.


Digitising History – why should historians care?

Historical scholarship typically involves a long process of becoming intimately acquainted with society as it existed at a particular historical place and time. This deep knowledge is not readily reducible to variables or models. However, changes in recorded variables can still reveal important patterns in social behaviour that would not be otherwise visible. This presentation looks at this balancing act from the point of view of a historical scholar.

Prof. Micheál Ó Siochrú

Associate Professor, Department of History, Trinity College Dublin

Prof. Micheál Ó Siochrú's primary research focus is on seventeenth-century Irish political, constitutional, urban and military history, from the Ulster Plantation to the Jacobite Wars, situated in a broad European contextual framework. His most recent book examined the Cromwellian conquest and settlement of Ireland and he was Principal Investigator on the 1641 Depositions project. This is an online fully searchable digital edition of the 1641 Depositions at Trinity College Dublin Library, comprising transcripts and images of all 8,000 depositions, examinations and associated materials in which Protestant men and women of all classes told of their experiences following the outbreak of the rebellion by the Catholic Irish in October, 1641. He is currently part of the editorial team producing a new five-volume edition of Cromwell’s Letters and Papers for Oxford University Press.


Technologies for extracting knowledge from historical records and archives

Historical artefacts, documents and other sources of historical evidence tend to be messy, inconsistent, inaccurate, partial and confusing. However, technologies have emerged which allow useful and rich models of knowledge to be extracted from messy data. The 1641 depositions project created a semantic structure to describe the people and events that were referred to in the judicial records of an alleged sectarian massacre in Ireland in 1641. The Down Survey project created a rich model of land ownership changes based on the records from a land survey in 17th century Ireland. The Fagel project is currently digitising and creating a rich, computer-processable overlay for a 17th century collection of historical maps. Seamus Lawless describes the challenges encountered in this work and the best tools to tackle them.

Prof. Seamus Lawless

Assistant Professor, Knowledge and Data Engineering Group, School of Computer Science & Statistics, Trinity College Dublin

Seamus's research interests are in the areas of information retrieval, information management and digital humanities, with a particular focus on adaptivity and personalisation. Seamus was one of the technical leaders on the implementation of the 1641 depositions and the Down Survey projects.


Digital Repository of Ireland and the importance of trusted digital preservation

The Digital Repository of Ireland (DRI) is Ireland’s national trusted digital repository for social and cultural data, with a mandate to archive, preserve, link, and provide access to a wealth of content across the humanities and social sciences. On 14 May 2014, DRI launched a pilot version of the repository to a select group of stakeholders, and plans are underway for a public launch in the autumn of 2014. Currently, there are collections from a range of Irish institutions, spanning a broad historical time period from the past to the present, and incorporating a variety of file formats and metadata standards. This presentation will introduce the DRI’s work, and discuss the importance of a trusted preservation infrastructure for data-driven humanities research.

Dr. Natalie Harrower

Digital Repository of Ireland, Outreach & Education Manager

Dr. Natalie Harrower is the Manager of Education and Outreach for the Digital Repository of Ireland, located at the Royal Irish Academy. She develops and delivers a skills training programme for DRI stakeholders and the broader digital preservation community in Ireland, and connects with the public through workshops, presentations, and social media. Natalie is also involved in various leveraged projects at the DRI, and is currently the Creative Lead on the Inspiring Ireland project, which brings together digital material from eight of Ireland’s national cultural institutions into one interactive portal. Natalie has been involved in several successful European funding proposals in the areas of digital humanities, digital preservation and digital curation (including the first Dublin Researcher’s Night in September 2013), and was recently the local chair of the Research Data Alliance’s third plenary in Dublin. She contributes to a number of national and international projects and infrastructures, including DARIAH, ALLEA, RDA, and the collaborative Digital Arts and Humanities PhD program. Natalie’s background is in the humanities, as a theatre and film scholar.


Irish Record Linkages: national opportunities and international obstacles

This paper will present the preliminary findings of Irish Record Linkage, an Irish Research Council-funded interdisciplinary project that aims to create ‘Big Data’ from historic vital registration (VR) data (1864-1914). It will provide an example of how legacy data can be reinvigorated by applying linked data technologies to our primary data and, thereafter, to external datasets. Over the past two decades advances in information technologies have radically changed the way in which historians conduct research. The digitisation of primary sources has increased accessibility and brought awareness of preservation and conservation issues into the mainstream. The historic data revolution has offered several opportunities to rethink research methodologies and questions.

Our project focuses on biopower and its impact on infant and maternal mortality in Dublin from 1864 to 1914. The application of linked data technologies to VR data provides an opportunity to reconstruct families and to examine questions surrounding infant and maternal mortality at an unprecedented scale. Clearly these data can provide answers to complicated questions about the social determinants of health and longevity, but only if historic data is permitted to become ‘Big Data’. The paper will conclude with a discussion of how transnational interdisciplinary research opportunities in burgeoning fields, such as epigenetic change, are curtailed by international regulation, focusing particularly on historic hospital records in North America.

Dr. Ciara Breathnach

Lecturer, Department of History, University of Limerick

Dr Ciara Breathnach is a lecturer in history at the University of Limerick. She has published on Irish socio-economic, cultural and health histories. Author of The Congested Districts Board of Ireland, 1891-1923, poverty and development in the West of Ireland (Four Courts Press, Dublin and Portland, 2005) and editor/co-editor of six conference proceedings, she has published articles in Medical Humanities, Medical History, Irish Historical Studies, Immigrants and Minorities, History of Family: an International Quarterly, the Journal of Imperial and Commonwealth History, Historical Research: the Bulletin of the Institute for Historical Research and has articles/chapters forthcoming in 2014. Her current monograph project focuses on female Irish patient experiences in Boston and New York. With Co-PIs Dr Sandra Collins and Professor Stefan Decker, she is Principal Investigator of a two-year Irish Research Council-funded Research Project Grant, entitled Irish Record Linkage, 1864-1913.


Seshat, the global history databank

The goal of the Seshat project is to build a historical database that will enable us and others to test theories about the processes responsible for the rise of large-scale societies in human history. The database will bring together, in a systematic form, what is currently known about the sociopolitical organization of human societies, and how it has evolved over time. It will be used in analyses to determine how characteristics of large-scale socioeconomic organization vary with culture, institutions, world region and historical period, and whether there are any universal features that all complex societies share. This presentation will describe the Seshat project, the data that has been collected to date and the ongoing efforts to extend and improve its coverage.

Professor Peter Turchin

Professor in the Department of Ecology and Evolutionary Biology, adjunct Professor in the departments of Anthropology and Mathematics, University of Connecticut.

Prof. Turchin was trained as a theoretical biologist, but his research interests have migrated to the fields of Cultural Evolution and Historical Social Dynamics; he is the founder and editor-in-chief of Cliodynamics: The Journal of Theoretical and Mathematical History. He works at the interface between biological, mathematical, and social sciences, and his research currently focuses on two broad questions. He is Vice-President and a Founding Member of the Evolution Institute. Prof. Turchin is reported to be one of the top-cited authors in the field of Ecology/Environment. He has published 12 articles in Nature, Science & PNAS, the most influential scientific publications in the world.

12 noon

Building technology platforms for data curation – DaCura and Linked Data

This presentation will provide a high-level, conceptual introduction to RDF and Linked Data and demonstrate how they can be used to build historical datasets through a case study based upon our work with Professor Turchin on historical political violence datasets. The full process, including building a schema and selecting vocabularies, will be described. The presentation will include a brief demonstration of the DaCura platform, which contains a fully integrated set of tools covering everything from harvesting data from the web, through collection and management, to publication of attractive visualisations. Finally, the plans for deploying the DaCura platform to curate the Seshat databank will be presented.

Dr. Kevin Feeney

Research Fellow, Knowledge and Data Engineering Group, School of Computer Science & Statistics, Trinity College Dublin

Dr. Kevin Feeney is a computer scientist in Trinity College Dublin. His research interests cover decentralised web-based management, semantic models and data curation. He is the creator and technical architect of the DaCura semantic data curation platform which is being used to support the collection and curation of historical datasets in a range of projects.

1.30 - 5pm

Session 2: Calculating the unknown

Once data is made available in a format that can be processed by computers it can be mathematically analysed. Modern mathematics provides a wealth of tools which allow such data to be analysed in order to distinguish signals from noise and identify patterns in the signals. These patterns can then be extrapolated into the gaps, providing predictions based on historical trajectories for periods where evidence is missing. These patterns can even be extrapolated into the future, providing predictions about what has yet to happen.

This session will bring together some of the researchers working at the forefront of mathematical modelling of historical data. This work uses sophisticated statistical techniques to deal with missing and noisy data and complex dynamical systems approaches to capture circular causation, and is driven by theories of social and cultural evolution.


Mathematical models of complex historical processes - an introduction

Professor Sergey Gavrilets

Distinguished Professor, Arts and Sciences, Excellence Professor, Department of Ecology and Evolutionary Biology, Department of Mathematics; Associate Director for Scientific Activities, National Institute for Mathematical and Biological Synthesis (NIMBioS), University of Tennessee.

Professor Gavrilets focuses on transdisciplinary research at the interface of biology, social sciences, mathematics, and computational science. He uses mathematical models to study complex evolutionary processes. Over the last several years, his research interests have mostly concentrated on the following areas: Human origins and the evolution of social complexity; Major evolutionary transitions; Speciation and adaptive radiation; Sexual conflict; Holey fitness landscapes.


Modeling population dynamics in Old World agrarian empires

James Bennett

Research Scientist, University of Washington, School of Oceanography

James Bennett is a Research Scientist at the University of Washington School of Oceanography, working on autonomous underwater gliders. From 2005 to 2008, he was Vice President for Recommendation Systems at Netflix and, in 2006, designed and implemented the Netflix Prize as well as the 'quantum theory' tagging system for movies. He has been cofounder and principal engineer at several Bay Area software companies, including Pure Software where he designed the Quantify tool and worked on Purify. He has a Masters of Computer Science degree from Stanford University.


A network-based approach for studying technological evolution and revealing patterns of combinatorial change

In this talk, Dr Psorakis will describe how his team made use of a large historical dataset of patenting activity, spanning from 1790 to the present, in order to develop empirically-grounded theories and quantitative models of technological evolution. By focusing on the combinatorial nature of innovation, they built networks of technological capabilities, based on their co-appearance in patents, and studied their topological, statistical and temporal properties. They made use of the mesoscopic (or cluster) organisation of such networks in order to reveal groups of closely connected technologies, and by detecting temporal changepoints in their structure, they found evidence of technological "epochs": historical eras of stable technological evolution. He will conclude the presentation by describing other interesting topological properties of such networks, along with the team's preliminary work on building statistical models for explaining their observed structure.

Dr. Ioannis Psorakis

Institute for New Economic Thinking at the Oxford Martin School, University of Oxford

Dr. Ioannis Psorakis is a research fellow at the University of Oxford, holding a joint appointment at the Oxford Mathematical Institute and the Institute for New Economic Thinking. He works with Prof Doyne Farmer on mining large data sets of patent associations and economic performances towards the design of models for technological innovation and optimal tech investment portfolios. He recently concluded his doctoral research as a Microsoft Research PhD Scholar at the Machine Learning Group of Oxford University's Engineering Science department.


Geospatial Network Modelling of the Roman Empire

Spanning one-ninth of the earth's circumference across three continents, the Roman Empire ruled a quarter of humanity through complex networks of political power, military domination and economic exchange. These extensive connections were sustained by premodern transportation and communication technologies that relied on energy generated by human and animal bodies, winds, and currents. In this presentation Professor Scheidel will describe how geospatial network models can be used to analyse the flow of people, goods and information through this empire.

Professor Walter Scheidel

Chair, Department of Classics, Stanford University, Dickason Professor in the Humanities, Professor of Classics and History, Catherine R. Kennedy and Daniel L. Grossman Fellow in Human Biology.

Prof. Scheidel's research focuses on ancient social and economic history, with particular emphasis on historical demography, labor, and state formation. More generally, he is interested in comparative and transdisciplinary approaches to the study of the pre-modern world, and is trying to build bridges between the humanities, the social sciences, and the life sciences. The most frequently cited active-duty Roman historian in the Western Hemisphere adjusted for age, Scheidel is the author or (co-)editor of 15 books, has published around 200 articles, chapters, and reviews, and has lectured in 23 countries.


Global-scale mathematical models of social evolution: investigating the development of the modern world

The scale and complexity of human social organization have increased dramatically over the last 10,000 years. Understanding how and why this has occurred is a long-standing interest of anthropologists and social scientists, but has received relatively little attention from evolutionary theorists. Tackling these fundamental questions requires an interdisciplinary approach, synthesizing the biological, social, and mathematical sciences. Here I describe the development of statistical and mathematical models that are being used to assess a range of competing theories about why large, complex societies tended to emerge in some parts of the world but not others. Empirical predictions from theories and the outputs of mathematical models are being tested against historical data on social complexity. Through this process we can produce new insights about the processes that have shaped the world we live in today.

Dr. Tom Currie

Lecturer in Cultural Evolution, University of Exeter.

Dr. Currie’s research focuses on investigating human behaviour and cultural diversity using evolutionary theory. He uses quantitative techniques to test competing hypotheses about how cultural traits and societies change over time, and to understand what ecological and social factors drive the evolution of social and political organization. Some of this research involves global-scale analyses, while other aspects have focused on Island Southeast Asia and the Pacific and sub-Saharan Africa, which represent ideal testing grounds for comparative studies of social and cultural evolution. He is also interested in practical applications of this approach to aid in the development of social policy to help solve real-world problems.