If you walk into one of LSHTM’s labs and ask the nearest person to explain digital preservation, you will probably receive one of two responses: an angry demand for you to leave, or a confused look accompanied by a shoulder shrug. Digital preservation may not be immediately recognised, but it plays an essential role in enabling health research. For a research discipline built upon the scientific principle of open exchange of knowledge, technology lock-in and digital obsolescence represent significant barriers that limit the ability to reuse data in further research and to share research findings. A 2013 study found that the likelihood of scientific data being accessible fell by 17 percent per year due to the use of old email addresses and outdated storage devices, resulting in 80 percent of scientific data being lost within two decades.
The LSHTM Archives was set up in 2002, with a remit to collect, preserve and provide access to resources associated with the history, function, and development of the London School of Hygiene & Tropical Medicine. Traditionally, these resources were provided in physical form, but it was recognised that we also needed to curate and preserve digital resources.
We began preserving digital material in 2015, with the launch of the School’s research data repository. LSHTM Data Compass provides LSHTM researchers (and their collaborators) with an institutional service that they can use to host and share research outputs in compliance with funding and publication requirements.
In the past three years, we’ve published descriptions of over 750 reusable resources produced by LSHTM researchers, covering qualitative and quantitative data, software, search strategies, R scripts and STATA Do files, research instruments, and other resources. Just under 400 of these are hosted in the repository itself; the remaining 350 or so records link to resources held in other locations.
The majority of LSHTM researchers upload data to the repository in order to obtain a DOI for a journal publication or study report. However, we promote preservation services as a value-added feature, performing the following activities:
1. Format conversion: Research data are often provided in file formats generated by the software products the researcher uses, such as STATA, MS Excel, MS Access, SPSS and SAS. These products are used because they offer features tailored to the needs of the research community. However, access often comes at a high price, in the form of a subscription or purchase cost, that is prohibitive to researchers without institutional support. For this reason, content is exported to open formats such as CSV.
2. Documentation enhancement: We spend a lot of time working with data depositors to ensure documentation is sufficient to answer core questions about the data, such as how it was collected, what question was asked to obtain the recorded response, and how the response was measured. In some cases, this simply requires clarification of understanding, while in others there’s a need to address gaps (what does a missing value represent?) or errors (the definition doesn’t match the data). Once finalised, the updated documentation is converted to HTML and made available alongside the dataset.
3. Promotion of supplementary material: We also encourage researchers to share support materials used to capture and process data, such as interview guides, survey forms, consent forms, processing scripts written in STATA and R, and other content. The value of these resources is often unrecognised by their creators, but they provide valuable insight into the research process and serve as research outputs in their own right that can be applied by students who wish to perform similar research.
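As an illustration only, the kind of documentation check described in step 2 could be sketched in Python roughly as follows. This is not the tooling used for LSHTM Data Compass: the codebook layout, the sample data, and the `check_codebook` function are all hypothetical, and a real check would be agreed with the data depositor.

```python
import csv
import io

# Hypothetical codebook: variable name -> documented meaning and codes.
CODEBOOK = {
    "participant_id": {"label": "Study identifier", "blank_means": "not permitted"},
    "sex": {"label": "Sex of participant",
            "codes": {"1": "Male", "2": "Female", "9": "Not recorded"}},
    "age_years": {"label": "Age at enrolment, in completed years"},
}

# Hypothetical CSV export of a small dataset.
DATA = """participant_id,sex,visit_date
P001,1,2018-01-05
P002,3,2018-01-06
P003,9,2018-01-07
P004,,2018-01-08
"""


def check_codebook(data_csv, codebook):
    """Compare a CSV export with its codebook and list gaps and mismatches."""
    reader = csv.DictReader(io.StringIO(data_csv))
    rows = list(reader)
    columns = reader.fieldnames or []
    problems = []

    # Gap: a column appears in the data but is not described anywhere.
    for name in columns:
        if name not in codebook:
            problems.append(f"{name}: no codebook entry")

    # Gap: a variable is described but does not appear in the data.
    for name in codebook:
        if name not in columns:
            problems.append(f"{name}: documented but not present in the data")

    # Error: the definition does not match the recorded values.
    for name in columns:
        entry = codebook.get(name)
        if entry is None:
            continue
        codes = entry.get("codes")
        # The header is line 1 of the file, so data rows start at line 2.
        for line_no, row in enumerate(rows, start=2):
            value = row[name]
            if value == "":
                if "blank_means" not in entry:
                    problems.append(
                        f"{name}: blank on line {line_no}, but the codebook "
                        "does not say what a blank means"
                    )
            elif codes is not None and value not in codes:
                problems.append(
                    f"{name}: value {value!r} on line {line_no} "
                    "is not a documented code"
                )
    return problems


if __name__ == "__main__":
    for problem in check_codebook(DATA, CODEBOOK):
        print(problem)
```

A report like this only flags questions to take back to the depositor; deciding what an undocumented code or a blank cell actually means still requires a conversation with the person who collected the data.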
LSHTM Data Compass and its sister repository, Research Online, were recently relaunched (in mid-November 2018) with a new user interface that offers improved access on mobile devices. Why not take a look?
Preserving LSHTM’s digital archives
The majority of material held by LSHTM Archives is in paper form, but we do have some born-digital material, provided on 5¼-inch and 3½-inch floppy disks, CD-ROM and DVD, in a range of digital formats.
We have recently begun a Digital Archives project, which will investigate the born-digital material held by LSHTM Archives and develop processes and procedures to ensure it can be curated and preserved in the long term.
This project will be performed with input from our first digital archives trainee, Manasseh Boyd. Manasseh joins us as part of ‘Bridging the Digital Gap’, a Heritage Lottery funded training programme led by The National Archives UK that aims to bring people with technical expertise into the archives sector. We look forward to working with Manasseh on this exciting new project!