The Herodotos Project (Linguistics faculty Brian Joseph, Marie-Catherine de Marneffe and Micha Elsner, along with Christopher Brown from Classics) was awarded a Digital Humanities Advancement Grant of $74,808 this year from the National Endowment for the Humanities for its project “Named Entity Recognition for the Classical Languages for the Building of a Catalog of Ancient Peoples.” The Project is creating a catalog of individuals and groups mentioned in ancient sources, with a focus on the historical role played by those other than the "great actors." The NEH project team will use Named Entity Recognition (NER), a computational linguistics method that identifies the names of people and places in texts and then sorts them into pre-defined categories.
This is one phase of what has been envisioned as a much larger project. The Project began with Brian Joseph in 2010 as a way to estimate the rate of language loss in the past, so that it could be compared with the rate today. The team started with Herodotos, to see which languages he mentioned and which of them survived. Unexpected challenges arose: Herodotos identifies many groups but gives little information about their languages, and searching the text manually for names proved logistically difficult. It became clear that no comprehensive listing of ancient peoples existed, and that there was no single source of general information about these groups. At that point the Project began developing a catalog and compendium of information about ancient peoples. The NEH project team will work primarily on the computational linguistics side of the problem, developing named entity recognition tools that can pick out group names from Latin and Greek texts in the original languages.
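As a rough illustration of what named entity recognition does, the Python sketch below tags names by looking them up in a small word list and assigning each a category. It is not the Project's method: the word list `GAZETTEER`, the function `tag_entities`, the category labels, and the sample Latin sentence are all made up for this example, and real NER tools typically rely on statistical models trained on annotated text rather than a fixed list.

```python
import re

# Hypothetical gazetteer: a tiny hand-made list of name forms and categories.
# A real catalog would be far larger and would not be built by hand like this.
GAZETTEER = {
    "Persae": "GROUP",
    "Scythae": "GROUP",
    "Croesus": "PERSON",
    "Sardes": "PLACE",
}

def tag_entities(text):
    """Return (token, category) pairs for every token found in the gazetteer."""
    tokens = re.findall(r"\w+", text)
    return [(tok, GAZETTEER[tok]) for tok in tokens if tok in GAZETTEER]

# Invented sample sentence, not a quotation from an ancient source.
sentence = "Croesus Sardes reliquit et Persae urbem ceperunt"
for token, category in tag_entities(sentence):
    print(token, category)
```

Exact lookup of this kind misses inflected forms: it finds "Persae" but not "Persas", because Latin and Greek names change their endings with grammatical case. This is one reason a plain word list falls short for these languages.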
The Project has already generated a number of publications, which can be found here.