Dominic OLDMAN | Cristina GIANCRISTOFARO | Jonathan MOFFETT
(British Museum, London, UK)
Keywords: GRAVITATE, Archaeology, Semantic Search
The GRAVITATE system uses search, discovery and enrichment processes based on data merged from:
* different forms of legacy data;
* computer-generated 3D data;
* enrichments extracted from free text through Natural Language Processing (NLP);
* ongoing research contributions.
GRAVITATE helps identify relations between archaeological fragments, whether physical or conceptual in nature. Problems arise both from using institutional catalogue data and from the algorithmic analysis of 3D data of ancient artefacts that are eroded and damaged. Enriching data using NLP likewise brings both benefits and problems. It is therefore crucial that enrichment also draws on human expertise, whether directed specifically at the GRAVITATE objectives or taken from other research projects that nevertheless generate contextual information and improve knowledge of the artefacts generally.
Existing forms of documentation are usually not designed with reassociation in mind: they are created in forms that are not readily transferable to a knowledge base, or they lack the richness required to make good inferences. The use of geometric data to establish physical similarities (fragments of the same object) or stylistic and decorative relationships helps the matching and association processes. The narratives that are often included within an institution’s structured data records provide additional context, but require algorithms to extract relevant data and integrate it into a combined knowledge graph. These narratives are of variable quality, and NLP can have difficulty recognising context within text and therefore identifying rich or accurate relationships. However, combining all these forms of enrichment, within an environment that encourages active participation and human-led enrichment, gives researchers the means to narrow artefacts down into groups within which physical and conceptual relations can be found. This has the added benefit that it may reduce overall effort and resources.
Relevance for the session: Semantic searching is an integral part of the GRAVITATE system.
• Low, Jyue Tyan & Doerr, Martin. 2010. “A Postcard is Not a Building: Why we Need Museum Information Curators”. CIDOC.
• Oldman, Dominic, Doerr, Martin & Gradmann, Stefan. 2016. “Zen and the Art of Linked Data: New Strategies for a Semantic Web of Humanist Knowledge” in Schreibman, et al. (eds), A New Companion to Digital Humanities. Wiley-Blackwell.