Computational analysis, browsing, screening, and access for email

CHAN, Peter, SCHNEIDER, Josh and EDWARDS, Glynn (2018) Computational analysis, browsing, screening, and access for email. Paper presented at: IFLA WLIC 2018 – Kuala Lumpur, Malaysia – Transform Libraries, Transform Societies in Session 153 - Poster Session.

Bookmark or cite this item: https://library.ifla.org/id/eprint/2356
Language: English (Original)
Available under licence Creative Commons Attribution.

Abstract

ePADD is software that enables the computational analysis of email using named entity recognition and other natural language processing algorithms. It was created to aid archival repositories and other cultural memory institutions in the appraisal, processing, discovery, and delivery of historically and culturally significant email. It can also be used by journalists, private and family historians, and other individuals seeking to search, browse, analyze, screen, and share email.

Fine-Grained Named Entity Type Browsing: ePADD uses a custom named entity recognizer/classifier that recognizes categories of entities bootstrapped from DBpedia. These include persons, organizations, locations, government entities, political parties, companies, universities, diseases, and awards. ePADD learns from these categories and can also recognize likely entities it has not encountered before.

Multi-Entity Search: ePADD includes a multi-entity search to aid in comparative entity analysis between the archive and any other textual corpus. Matching entities are highlighted and link to message results.

Lexicon Search: ePADD includes tiered thematic keyword searches geared towards broad analysis of a variety of email collections, including lexicons to identify categories of sensitive correspondence. These lexicons can be edited and tuned, or the user can create entirely new lexicons to suit their research goals.

Bulk Actions and Annotation: ePADD allows the user to apply labels (including restriction periods, processing actions, or more general descriptive labels) and annotations to sets of messages meeting user-defined criteria. These criteria include all messages associated with a given correspondent, all messages from a given date range, all messages containing certain keywords or named entities in the subject or message fields, or some combination of the above.

Full Access to Messages and Attachments: ePADD's Delivery module provides full access to all messages and attachments that have been screened and approved by the creator and the institution.
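
The criteria-based bulk labelling described under Bulk Actions and Annotation can be pictured with a small sketch. This is an illustration only and not ePADD's implementation or API: the `Message` class, the `matches` and `apply_label` functions, the label name, and the sample messages are all hypothetical, and keyword matching is assumed to be a simple case-insensitive substring test.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Message:
    # Hypothetical message record; not ePADD's data model.
    sender: str
    sent: date
    subject: str
    body: str
    labels: set = field(default_factory=set)


def matches(msg, correspondent=None, start=None, end=None, keywords=()):
    """Return True if msg meets every criterion that was supplied."""
    if correspondent is not None and msg.sender.lower() != correspondent.lower():
        return False
    if start is not None and msg.sent < start:
        return False
    if end is not None and msg.sent > end:
        return False
    if keywords:
        # Assumption: a message qualifies if any keyword occurs in subject or body.
        text = f"{msg.subject} {msg.body}".lower()
        if not any(k.lower() in text for k in keywords):
            return False
    return True


def apply_label(messages, label, **criteria):
    """Add label to every matching message and return how many were labelled."""
    hits = [m for m in messages if matches(m, **criteria)]
    for m in hits:
        m.labels.add(label)
    return len(hits)


# Invented sample data for illustration.
messages = [
    Message("alice@example.org", date(2001, 5, 1), "Medical leave", "Doctor's note attached."),
    Message("alice@example.org", date(2010, 1, 15), "Lunch", "See you at noon."),
    Message("bob@example.org", date(2001, 6, 2), "Budget", "Draft budget for the medical fund."),
]

restricted = apply_label(
    messages,
    "restricted-until-2040",
    correspondent="alice@example.org",
    end=date(2005, 12, 31),
    keywords=["medical", "diagnosis"],
)
```

Combining criteria narrows the set: only a message that satisfies the correspondent, date range, and keyword conditions together receives the label, which mirrors the "some combination of the above" case in the abstract.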
