Classification of keywords selected from research articles on physics and development of a quantitative subject access tool

DUTTA, Bidyarthi and MAJUMDER, Krishnapada and SEN, Bimal Kanti (2013) Classification of keywords selected from research articles on physics and development of a quantitative subject access tool. Paper presented at: IFLA WLIC 2013 - Singapore - Future Libraries: Infinite Possibilities in Session 112 - Classification and Indexing.

Bookmark or cite this item: http://library.ifla.org/id/eprint/164
Language: English (Original)
Available under licence Creative Commons Attribution.
Language: Chinese (Translation)
Available under licence Creative Commons Attribution.

Abstract

All research articles begin with a title. Most include an abstract. Several include keywords. All three of these features describe an article's content in detail. The title gives an immediate reflection of the central theme of the research topic. The abstract summarizes the content. The keywords indicate the core and allied fields of concern. With the aid of keywords, researchers and indexers can quickly and easily locate particular articles within their areas of interest. Keywords are of prime importance in abstracting and indexing services and play a major role in information retrieval. This paper is based on an analysis of 14,221 keywords collected from 2,526 research articles published from 2006 to 2012 in three journals, viz. Chaos, Physics of Plasmas and Low Temperature Physics. Out of all these author-assigned keywords, the number of distinct bits obtained was 2,571. After collection, lexically close keywords are identified and grouped into clusters. Several such clusters are found, and the composition of keywords in nearly all clusters varies over the said time span. Four indicators have been defined on the basis of the fluctuating keyword composition within clusters: the stability index, the integrated visibility index, the momentary visibility index and the potency index. These indicators take different values for different clusters. Their value ranges are categorized into five groups, viz. very high, high, medium, low and very low. A new quantitative subject access tool has been proposed on the basis of these indicators, which can predict the probable new and obsolete keywords in any subject domain. The name given to this new tool is keysaurus, i.e., keyword-based thesaurus.
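The clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how lexical closeness was measured, so everything here is an assumption made for illustration only: the character-level similarity ratio from Python's difflib, the 0.8 threshold, the greedy grouping rule, the function names and the sample keywords. It is not the authors' method.

```python
from difflib import SequenceMatcher


def lexically_close(a: str, b: str, threshold: float = 0.8) -> bool:
    """Assumed closeness test: case-insensitive character similarity ratio
    at or above the (hypothetical) threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_keywords(keywords, threshold: float = 0.8):
    """Greedy grouping: a keyword joins the first existing cluster that
    contains a lexically close member; otherwise it starts a new cluster."""
    clusters = []
    for kw in keywords:
        for cluster in clusters:
            if any(lexically_close(kw, member, threshold) for member in cluster):
                cluster.append(kw)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([kw])
    return clusters


# Hypothetical sample keywords, not taken from the paper's data set.
sample = [
    "plasma wave",
    "plasma waves",
    "chaotic system",
    "chaotic systems",
    "superconductivity",
]
clusters = cluster_keywords(sample)
for cluster in clusters:
    print(cluster)
```

With this kind of grouping, singular and plural variants of a keyword fall into the same cluster, while unrelated terms remain in clusters of their own. Tracking how the membership of each cluster changes from year to year would then supply the raw material for indicators such as those named in the abstract, whose definitions are given in the full paper rather than here.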

物理学论文的关键词分类及定量主题检索工具开发

所有的论文都有一个标题,多数含有摘要,一些含有关键词。以上三个特征详细描述了一篇文章的内容:标题直接反映论文的中心主题,摘要概述了研究内容,关键词指出了文章关注的主要领域或相关领域。利用关键词,研究人员与索引人员能快速方便的查找到感兴趣的特定文献。关键词对抽取及索引服务非常重要,并且在信息检索中发挥着重要的作用。本文收集了《Chaos》、《Physics of Plasmas》以及《Low Temperature Physics》三种期刊2006-2012年发表的2526篇学术论文,对其中的14221个关键词进行分析。在这些作者定义的关键词中,不同的群组有2571个。采集之后,词法上相似近关键词形成了词簇。以此方式创建了几个词簇,发现在整个时间段内几乎所有词簇中的关键词组合都有所变化。 基于词簇中不断变化的关键词组合,本文定义了四个指标,分别是:稳定度指数、综合可见性指数、短时可见性指数以及能力指数。不同的词簇拥有不同的指标值,对其值域进行分类,可分为5组:非常高、高、中、低、非常低。在这些指标的基础上,本文提出了一个新的定量主题检索工具,它能够预测某一主题领域中可能的新关键词与废弃关键词。该工具命名为 keysaurus(keyword-based-thesaurus)。
