Innovative Approaches of Historical Newspapers: Data Mining, Data Visualization, Semantic Enrichment
MOREUX, Jean-Philippe (2016) Innovative Approaches of Historical Newspapers: Data Mining, Data Visualization, Semantic Enrichment. Paper presented at: IFLA WLIC 2016 – Columbus, OH – Connections. Collaboration. Community in Session S21 - Satellite Meeting: News Media. In: News, new roles & preservation advocacy: moving libraries into action, 10-12 August 2016, Lexington, KY, USA.
Bookmark or cite this item: https://library.ifla.org/id/eprint/2076
Language:
English (Original)
Available under licence Creative Commons Attribution.
Bookmark or cite this item: https://library.ifla.org/id/eprint/2076/1/S21-2016-moreux-en.pdf
Abstract
In this age of Big Data, this paper describes how digital libraries can apply innovative approaches at large scale to add value to old newspapers and offer better experiences of them. On the one hand, the state-of-the-art OLR (optical layout recognition) technique applied in one of the largest heritage press digitization projects in Europe (Europeana Newspapers, www.europeana-newspapers.eu, 2012-2015) was used in a data mining experiment. Data analysis was applied to quantitative metadata derived from an 850K-page subset of six 19th-20th century French newspaper titles from the BnF collection. The METS/ALTO XML data was analyzed with data mining and data visualization techniques, which show promising ways of producing knowledge about historical newspapers that is of great interest to library professionals (digitization program management, curation and mediation of newspaper collections) and to end users, particularly the digital humanities community. On the other hand, the Retronews web portal showcases how advanced semantic annotation techniques can improve retrieval efficiency on a digital newspaper collection, and thus the rediscovery and reappropriation of these documents by various types of users: teachers, students, researchers, and the general public.

Item Type: | Conference or Workshop Item (Paper)
---|---
Conference details: | IFLA WLIC 2016 – Columbus, OH – Connections. Collaboration. Community; Session S21 - News, new roles and preservation advocacy: moving libraries into action - Satellite Meeting: News Media
Related URLs: |
Divisions: | Division 2 Library Collections > News Media Section
Authors: | MOREUX, Jean-Philippe
Uncontrolled Keywords: | OCR/OLR; metadata; data mining; data visualisation; semantic enrichment; named entities recognition; digital mediation; digital humanities
Date Deposited: | 27 Dec 2017 16:17
Last Modified: | 27 Dec 2017 16:17
URI: | https://library.ifla.org/id/eprint/2076
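
The abstract refers to quantitative metadata derived from METS/ALTO XML. As a rough illustration of what such metadata might look like, and not the paper's actual pipeline, the Python sketch below counts text blocks, illustrations, and words on a single ALTO page. The sample document, the function names `local_name` and `page_stats`, and the choice of counters are assumptions made for this example; only the ALTO element names (`TextBlock`, `Illustration`, `String`) come from the ALTO schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, hand-written ALTO page used only for illustration.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="Le" WC="0.98"/>
            <String CONTENT="Petit" WC="0.91"/>
            <String CONTENT="Journal" WC="0.95"/>
          </TextLine>
        </TextBlock>
        <TextBlock ID="TB2">
          <TextLine>
            <String CONTENT="Paris" WC="0.80"/>
          </TextLine>
        </TextBlock>
        <Illustration ID="IL1"/>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def page_stats(alto_xml: str) -> dict:
    """Count layout elements in one ALTO document (assumed to hold one page)."""
    root = ET.fromstring(alto_xml)
    counts = Counter(local_name(el.tag) for el in root.iter())
    return {
        "text_blocks": counts["TextBlock"],
        "illustrations": counts["Illustration"],
        "words": counts["String"],  # each String element holds one OCR word
    }


if __name__ == "__main__":
    print(page_stats(SAMPLE_ALTO))
```

Collected over many pages, per-page counts of this kind could be loaded into a table and plotted by title or by date, which is the general style of analysis the abstract describes.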