Symposium on Video Fingerprinting in Digital Humanities Research, NISV October 2017
Eggo Müller, Utrecht University, & Johan Oomen, NISV
With financial support from the DARIAH-EU 2016 theme “Public Humanities”, the Netherlands Institute for Sound and Vision (NISV) together with Utrecht University and the EUscreen Foundation initiated a trans-European network of digital humanities researchers, audiovisual archives, educators and software developers to investigate and test multimedia fingerprinting and -tracking technologies for use in the public domain. Whereas these technologies have so far mainly been used in the private sector to detect copyright infringements, the objective of this research-oriented initiative is to create an understanding of how these emerging technologies can help search collections, uncover previously hidden relations between disparate items of the same origin and, in particular, trace the circulation and reuse of audiovisual heritage online.
The initiative emerges from a series of projects (Video Active, EUscreen and EUscreen XL) that brought together major European audiovisual archives, universities and software developers to make Europe’s audiovisual heritage accessible and searchable online, and discoverable through Europeana, the EU digital platform for cultural heritage.
Audiovisual documents speak of the entire 20th century – its fictions, its realities, its politics, its high culture and its daily lives. Increasingly so, throughout the 20th and 21st centuries, audiovisual documents have become a major source of both entertainment and information. Video traffic on the web will be 82 percent of all consumer Internet traffic by 2021, up from 73 percent in 2016.
Today, an abundance of old films, television and radio programmes, photographs and related printed materials is accessible online for consumption and re-use. Re-use in the public sector (education, broadcasters) supports the integration of the imaginative audiovisual into our material and collective cultural memory and, in line with this, a greater understanding of notions of identity, of who we are and where we belong.
Multimedia fingerprinting and -tracking technologies have a high potential for creative educators, broadcasters and archivists, but have not yet been explored by digital humanities scholars and initiatives. The initiative therefore organized a symposium and expert meeting to explore the applicability of existing video tracking technologies in Digital Humanities research, held at the Netherlands Institute for Sound and Vision in Hilversum on 12 and 13 October 2017. The starting point, as Eggo Müller (Utrecht University) stated in his opening address, was the supposition that, by tracking and tracing the flow of audiovisual material online and the circulation of re-used audiovisual heritage, researchers, educators and stakeholders in the field can, for example:
- help to better understand the online circulation of audiovisual content on the web;
- initiate research into the appropriation of audiovisual material by users in diverse cultural and ideological contexts and the different ways that footage from the past is re-used on video sharing sites in particular;
- investigate in what contexts the same material is re-used and how its meaning varies in the diverse contexts;
- help to better understand and conceptualize digitized audiovisual material as public data;
- gain a better understanding of the socio-cultural ‘life of audiovisual data’ in general;
- and gain a better understanding of the potential of video recognition, tracing and fingerprinting technologies to contribute to the notion of public humanities in particular.
These explorations can provide historians, media scholars, ethnographers, archivists and educators with innovative insights and tools that contribute to the notion of public humanities and open up innovative research perspectives in the Digital Humanities. After the opening address, experts in audio and video fingerprinting technologies discussed the state of the art of video recognition, fingerprinting and data tracing technologies. In his presentation on “A/V segment matching, audio forensics and phylogeny”, Patrick Aichroth (Fraunhofer Institute, Ilmenau) illustrated a number of applications of fingerprinting technologies used for audio and/or video identification and comparison. Aichroth discussed four main areas of use for these technologies:
- metadata extraction and audio forensics (item-based),
- dataset analysis (segment matching, audio phylogeny),
- integrating different tools and multimodal analysis (see MICO, OSS framework), and
- privacy-aware personalization and recommendation based on a combination of content-based recommendation and collaborative filtering.
At the Fraunhofer Institute, his team has built and tested tools to carry out AV segment matching, audio forensics and audio phylogeny to identify the roots of material. All these applications so far work on the basis of datasets collected for tool testing, but have not been applied in ‘real life’ contexts.
Maarten Zeinstra (Kennisland, Amsterdam) presented Kennisland’s Videorooter as a proof of concept for open source video fingerprinting. No license for the tool is needed, and it can be used to search a set of 100k openly licensed videos accessible on Wikimedia Commons, Europeana, and the Internet Archive. Zeinstra explained the general approaches to video fingerprinting in more detail, as opposed to watermarking. Two types of so-called ‘hashes’ can be used to identify an image or sound, cryptographic and perceptual; the list below contrasts them with watermarking:
- Cryptographic hashes do not survive changes in a media file; even a different resolution of a video would prevent the hash of the original from being matched with the changed file;
- Likewise, watermarking does not easily survive changes to the media file or its format (jpg -> png);
- However, perceptual hashes as used in Kennisland Videorooter’s algorithm blockhash.io are persistent and have an average accuracy for images of 94%, and 72% for video.
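The difference between the two kinds of hash can be illustrated with a minimal Python sketch. It is an illustration only and not the blockhash algorithm used by Videorooter: it assumes a tiny greyscale image given as a grid of pixel values, and uses a simple "average hash" (one bit per pixel, set when the pixel is brighter than the image mean) as a stand-in for a perceptual hash. The function names and the sample data are hypothetical.

```python
import hashlib


def sha256_hex(pixels):
    """Cryptographic hash of the raw pixel bytes."""
    data = bytes(p for row in pixels for p in row)
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels):
    """Toy perceptual hash: '1' where a pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def hamming(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 4x4 greyscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]

# Simulated re-encoding: every pixel becomes slightly brighter.
altered = [[min(255, p + 3) for p in row] for row in original]

# The cryptographic hashes no longer match after the small change...
crypto_match = sha256_hex(original) == sha256_hex(altered)

# ...while the perceptual hashes are compared by distance, not equality.
perceptual_distance = hamming(average_hash(original), average_hash(altered))

print("cryptographic hashes match:", crypto_match)
print("perceptual hash distance:", perceptual_distance)
```

In this sketch a small brightness change alters the cryptographic hash completely, whereas the perceptual hash stays the same. In practice, two perceptual hashes are treated as a match when their distance falls below a chosen threshold, which is why such methods report an accuracy rather than an exact result.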
Lars Wieneke (Centre for Contemporary & Digital History, Luxembourg) then presented histoGraph, software that helps historians to deal with big historical multimedia collections and allows, for example, for detecting and visualizing connections between events, places and people. This is particularly useful to detect some unknown and unexpected connections between documents, and thus persons, places and events. The tool supports more traditional approaches to political and contemporary history so far, but will be expanded in the future to offer richer analysis o.
During the concluding round table, Digital Humanities researchers Jasmijn van Gorp (CLARIAH-NL), Sonja de Leeuw (EUscreen Foundation) and Patrick Vonderau (Media Studies, Stockholm University) highlighted the potential that video tracking and particularly fingerprinting technologies could have for Digital Humanities research in general and for research into audiovisual heritage online in particular. Other areas could be the tracing of historical film stock used repeatedly in diverse productions, but research into early film and different copies and language versions of the ‘same’ film might also benefit from these technologies. Jasmijn van Gorp talked about her vision of developing a ‘meta tool box’ integrating the numerous and diverse applications developed for Digital Humanities research, as is under development for her own area with CLARIAH’s Media Suite. Sonja de Leeuw addressed in particular the necessity to collaborate with curators and archivists, not only to identify collections that could benefit from research based on these technologies, but also to contribute to the research agenda and define the research questions. On a more critical note, Patrick Vonderau discussed the tacit shaping of the research agenda in this field by a tool-driven approach. Tools to be developed should always reflect the research interests and questions defined by the community of researchers and curators that long for new tools for novel research.
In his concluding remarks, Johan Oomen (Head of Research and Development, NISV) identified a number of tensions for future research given the current state of development of the technologies, the institutional settings and copyright policies in the digital domain. At the current state, the development of specific applications for well-defined research projects used within specific collections might be preferable to the development of generic tools. Also, work with well-defined collections might be more promising than research that aims at, for example, identifying reused material ‘on the whole internet’. Proofs of concept are at the moment more likely to be successful than large-scale projects. Also, for copyright reasons, collaborations with archives or collections that have their material already fingerprinted seem more promising than starting from scratch. And finally, whereas most commercial players do use these technologies to protect their material, there seem to be no institutions in the public domain that have the urge to explore these technologies for their own benefit. The dialogue about these questions with representatives in the professional associations of the archival field has yet to begin.
This first exploratory symposium on the potential of video tracing and tracking technologies in Digital Humanities research will be followed up by a symposium on “The many lives of Europe’s Audiovisual Heritage Online”, to be held at Utrecht University on May 16th, 2018. Here the focus will shift from the exploration of the current technical state of the art (as discussed at the October meeting) to the circulation and appropriation of Europe’s audiovisual heritage online.
Interested archivists, curators, researchers and software developers can contact the organizer of the symposium (Eggo Müller) or of the network (Johan Oomen) via email: e.mueller[at]uu.nl; joomen[at]beeldengeluid.nl.