An innovative approach to scalable semantic embedding

KOOPMAN, Rob and WANG, Shenghui (2019) An innovative approach to scalable semantic embedding. Paper presented at: IFLA WLIC 2019 - Athens, Greece - Libraries: dialogue for change in Session S15 - Big Data. In: Data intelligence in libraries: the actual and artificial perspectives, 22-23 August 2019, Frankfurt, Germany.

Bookmark or cite this item: https://library.ifla.org/id/eprint/2747

PDF (367kB)

Language: English (Original)

Available under licence Creative Commons Attribution.

Bookmark or cite this item: https://library.ifla.org/id/eprint/2747/1/s15-2019-koopman-en.pdf

Abstract

Embedding words, entities and documents in compact, semantically meaningful vector spaces makes semantic similarity and relatedness computable, which could make search more intelligent and benefit other library tasks such as entity disambiguation, de-duplication, clustering, recommendation and subject prediction. Deep learning models are powerful but require substantial computing power and careful hyperparameter tuning for optimal performance. In our quest for practical solutions to support libraries in this field, we revisit global co-occurrence-based embedding methods and propose a conceptually simple and computationally lightweight approach. In our experiments, it achieves results highly competitive with a few state-of-the-art embedding methods on different tasks, including the standard STS benchmark and a subject prediction task, at a fraction of the computational cost. We will show the potential of this scalable semantic embedding method for other applications such as entity disambiguation, citation recommendation, clustering and collection exploration.
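
To illustrate the general idea of co-occurrence-based embedding mentioned in the abstract, here is a minimal Python sketch. It is not the authors' method, which this record does not describe. It assumes a toy four-document corpus and represents each word by its raw document-level co-occurrence counts, with cosine similarity as the relatedness measure. The names cooccurrence_vectors and cosine are hypothetical helpers introduced only for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def cooccurrence_vectors(docs):
    """Map each word to a sparse vector of document-level co-occurrence counts."""
    vectors = defaultdict(Counter)
    for doc in docs:
        # Count each unordered pair of distinct words once per document.
        words = sorted(set(doc.lower().split()))
        for a, b in combinations(words, 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (an assumption for illustration, not data from the paper).
docs = [
    "library catalogue search",
    "library catalogue metadata",
    "neural network training",
    "neural network tuning",
]

vectors = cooccurrence_vectors(docs)

# Words that share contexts end up with similar vectors.
sim_related = cosine(vectors["search"], vectors["metadata"])
sim_unrelated = cosine(vectors["search"], vectors["training"])
print(sim_related, sim_unrelated)
```

In this toy corpus, "search" and "metadata" never appear together but share the same neighbours, so their vectors are close, while "search" and "training" share no context at all. A practical system would additionally weight the counts and reduce the dimensionality of the vectors; those steps are omitted here.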

Item Type: Conference or Workshop Item (Paper)

Conference details: IFLA WLIC 2019 - Athens, Greece - Libraries: dialogue for change