An innovative approach to scalable semantic embedding

KOOPMAN, Rob and WANG, Shenghui (2019) An innovative approach to scalable semantic embedding. Paper presented at: IFLA WLIC 2019 - Athens, Greece - Libraries: dialogue for change in Session S15 - Big Data. In: Data intelligence in libraries: the actual and artificial perspectives, 22-23 August 2019, Frankfurt, Germany.

Language: English (Original)
Available under licence Creative Commons Attribution.



Embedding words, entities and documents in compact, semantically meaningful vector spaces makes semantic similarity and relatedness computable, which could make search more intelligent and benefit other tasks conducted in libraries, such as entity disambiguation, de-duplication, clustering, recommendation and subject prediction. Deep learning models are powerful but require high computing power and careful hyperparameter tuning for optimal performance. In our quest for practical solutions to support libraries in this field, we revisit the global co-occurrence based embedding methods and propose a conceptually simple and computationally lightweight approach. Our experiments show results that are highly competitive with a few state-of-the-art embedding methods on different tasks, including the standard STS benchmark and a subject prediction task, at a fraction of the computational cost. We will show the potential of this scalable semantic embedding method for other applications such as entity disambiguation, citation recommendation, clustering and collection exploration.
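
To illustrate what "global co-occurrence based embedding" and "computable semantic similarity" can mean in practice, here is a minimal Python sketch. It is not the method proposed in the paper, which this abstract does not specify. It assumes a generic textbook recipe: count document-level word co-occurrences, reduce the count matrix with a truncated SVD, and compare the resulting vectors by cosine similarity. The function names, the toy corpus and the `dim` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def cooccurrence_embeddings(docs, dim=2):
    """Embed words from document-level co-occurrence counts (illustrative only)."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for d in docs:
        words = set(d.split())
        for a in words:
            for b in words:
                if a != b:
                    # Two distinct words appearing in the same document co-occur once.
                    counts[index[a], index[b]] += 1
    # Truncated SVD: keep the `dim` strongest directions of the count matrix.
    u, s, _ = np.linalg.svd(counts)
    vectors = u[:, :dim] * s[:dim]
    return {w: vectors[i] for w, i in index.items()}

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Hypothetical toy corpus with two unrelated topics.
docs = [
    "library catalogue search",
    "library catalogue metadata",
    "neural network training",
    "neural network tuning",
]
emb = cooccurrence_embeddings(docs)
same_topic = cosine(emb["library"], emb["catalogue"])
cross_topic = cosine(emb["library"], emb["neural"])
print(same_topic, cross_topic)
```

In this toy corpus, words that share documents end up pointing in similar directions, while words from the unrelated topic do not. A real collection would need sparse matrices and a scalable dimensionality reduction step, which is where the computational cost discussed above comes in.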
