A decade of web archiving in the National and University Library in Zagreb

HOLUB, Karolina and RUDOMINO, Ingeborg (2015) A decade of web archiving in the National and University Library in Zagreb. Paper presented at: IFLA WLIC 2015 - Cape Town, South Africa in Session 90 - Preservation and Conservation with Information Technology.

Bookmark or cite this item: https://library.ifla.org/id/eprint/1092
Language: English (Original)
Available under licence Creative Commons Attribution.

Abstract

Due to the dynamic nature of the web, its explosive growth, short lifespan, instability and similar characteristics, archiving it has become invaluable for future generations. The National and University Library in Zagreb (Nacionalna i sveučilišna knjižnica u Zagrebu, NSK), as a memory institution responsible for collecting, cataloguing, archiving and providing access to all types of resources, recognized the significance of collecting and storing online content as part of its core activities. This has been supported by a positive legal environment since 1997, when Croatia passed the Law on Libraries, which made online publications subject to legal deposit. In 2004 NSK established the Croatian Web Archive (Hrvatski arhiv weba, HAW) in collaboration with the University Computing Centre (Srce) and developed a system for capturing and archiving Croatian web resources. From 2004 to 2010 only selective archiving of web resources was conducted, according to pre-established selection criteria. Taking into account NSK’s responsibility to preserve resources on Croatian social, scientific and cultural history, the importance of taking a snapshot of all publicly available resources under the national top-level domain (.hr) was recognized in 2011. Since then, national domain harvests have been conducted annually. In addition, in 2011 NSK started to run thematic harvests on topics of national importance. The paper will present NSK's ten years of experience in managing web resources, with an emphasis on the implementation of the system for selective and domain harvesting as well as the challenges of providing access to archived resources. The data harvested from 2004 to 2014 will also be analysed. The findings will illustrate the variability of URLs, the frequency of harvesting and the types of content. The data from the last four .hr harvests will also be presented.
