Finding Old Images through a New Lens: Use of Computer Vision for Searching Historic Newspaper Collections

OATES, Anna and SCHLAACK, William (2019) Finding Old Images through a New Lens: Use of Computer Vision for Searching Historic Newspaper Collections. Paper presented at: IFLA WLIC 2019 - Athens, Greece - Libraries: dialogue for change in Session 85 - News Media with Digital Humanities/Digital Scholarship.

Bookmark or cite this item: http://library.ifla.org/id/eprint/2484
Language: English (Original)
Available under licence Creative Commons Attribution.

Abstract

The capacity to find images with granularity (i.e., through image-based searching) is a prerequisite for increasing the usefulness of image-based research in the digital humanities. As is evident from the National Endowment for the Humanities’ data challenge, which invited scholars and students “to produce creative web-based projects demonstrating the potential for using the [textual] data found in Chronicling America,” digital newspaper collections are a rich source for digital humanities research. Chronicling America’s search interface and application programming interface (API), however, are restricted to text-based searches, which limits the findability of image-based content. In their paper “Library Collections as Humanities Data: The Facet Effect,” Thomas Padilla and Devin Higgins elucidate the value of images as a resource that could meet digital humanists’ interest in images as a substantive source for research. Using computer vision to query image-based digital collections enables users to find image content that, because it lacks tagging and description, might not otherwise be found through keyword searching. For example, in a digitized newspaper collection a user might submit the image of an advertisement for spectacles as a query and retrieve similar advertisements that a keyword search would not return. The authors propose a case study that applies computer vision image searching to the Farm, Field, and Fireside Collection, a collection of 22 historic agricultural newspapers published across the Midwestern United States between 1841 and 1983. The authors will investigate how well VGG Image Search Engine (VISE), open source computer vision software developed by the Visual Geometry Group (VGG) at the University of Oxford, performs when searching Farm, Field, and Fireside Collection images for similar images. Using image recognition, VISE identifies and establishes correspondences between images to create an index of image features.
Through this research, the authors will investigate the potential for integrating VISE within existing newspaper access system frameworks, such as Open-ONI, open source software for searching and browsing digitized newspaper collections. Successful implementation of this tool would give researchers an avenue to capitalize upon extensive image collections, thus expanding the capacity for inquiry in the digital humanities.
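
The idea of an index of image features that supports query-by-image can be illustrated with a minimal Python sketch. This is not VISE's code or API. It assumes each image has already been reduced to a list of integer "visual word" IDs by some feature extractor, and the function names (`build_index`, `search`), the scoring rule, and the sample data are all hypothetical, chosen only to show how shared features can rank similar images.

```python
from collections import Counter, defaultdict
import math


def build_index(image_words):
    """Map each visual-word ID to the images that contain it, with counts."""
    index = defaultdict(dict)
    for image_id, words in image_words.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count
    return index


def search(index, n_images, query_words, top_k=3):
    """Rank indexed images by idf-weighted overlap with the query's visual words."""
    scores = defaultdict(float)
    for word, q_count in Counter(query_words).items():
        postings = index.get(word, {})
        if not postings:
            continue  # no indexed image contains this visual word
        # Visual words found in fewer images are weighted more heavily.
        idf = math.log(1 + n_images / len(postings))
        for image_id, count in postings.items():
            scores[image_id] += min(q_count, count) * idf
    # Highest score first; ties are broken by image ID for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical data: visual-word IDs that a real feature extractor would produce.
pages = {
    "spectacles_ad_1902": [3, 3, 7, 12, 18],
    "spectacles_ad_1910": [3, 7, 7, 12, 40],
    "plow_ad_1895": [5, 9, 21, 40],
}
index = build_index(pages)
results = search(index, len(pages), query_words=[3, 7, 12])
```

In this toy data, a query built from a spectacles advertisement shares visual words only with the two spectacles advertisements, so the plow advertisement is not returned even though no text description of any image exists.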
