Making data available in compliance with funder and journal requirements is a daunting task for many health researchers, but one that must be considered when performing research. On November 12th 2014, the London School held a workshop to discuss disparate expectations for data sharing and emerging best practice in public health.
Gareth Knight – Shareable by Design
The workshop was opened by Gareth Knight, Manager of LSHTM’s Research Data Management Service, who outlined the origins and motivation for sharing data in public health. He noted that data sharing was not a new practice – science is built on the notions of openness and transparency, to ensure results can be verified, replicated, and used to inform health policy. The growth of digital capture methods and changing perceptions of what constitutes data have only changed practices in how it is shared. Whereas physical publication methods limit data sharing to static tables and graphs in a journal article, the rise of digital publication methods has resulted in increased demand for data creators to provide access to data in a form that is machine-processable and accompanied by licence conditions that permit re-use.
David Carr – Enabling access to research data: a funder perspective
David Carr of the Wellcome Trust offered an insight into the Wellcome Trust’s motivations for encouraging data sharing, noting its importance in enabling health professionals to address challenges that can only be solved through cross-domain research, as well as the potential for informing policy-making in public health. The Wellcome Trust have adopted a collaborative approach to data sharing, encouraging researchers to share data in accordance with practice within their own discipline, while working with them to address common challenges and barriers. Researchers encounter a number of infrastructural, cultural, technical, professional and ethical issues, which vary across subject domains. David considered it important that funders work with researchers to address many of these issues.
Rosemary Dickin – The PLOS Data Policy
Rosemary Dickin of the Public Library of Science provided an overview of the new PLOS policy that research data which underpin research findings reported in publications should be made available in an open and accessible form. She began with an explanation of the motivations for changing their expectations in the area, noting that many researchers found it difficult to obtain datasets from data creators, or were provided with data that were difficult to access and understand. By changing their position to one where data sharing is expected, and any reason not to make data available must be justified, PLOS are seeking to improve access to high-quality research. Rosemary emphasised that PLOS do not expect authors to breach patient or participant confidentiality and that they are willing to work with authors to find an appropriate solution if they have specific concerns.
Katharine Ker – Sharing injury and emergency research data using freeBIRD
Katharine Ker, a lecturer in LSHTM’s Department of Population Health, gave a talk on the freeBIRD system, a tailor-made data repository developed to share data generated by the CRASH-2 project. Although its development was initially motivated by funding requirements, the system has become a showcase of the Clinical Trials Unit’s research and has improved access to anonymised, blinded data that can be used to improve patient care. Katharine gave a demonstration of the system in action, showing how a user can register and download datasets held as comma-separated text files.
Paul Snell – Artemisinin-based Combination Therapy
Paul Snell, Data Manager of the ACT Consortium, gave a talk on the ACT repository, which is being built to manage data generated through analysis of artemisinin-based combination therapy in malaria treatment across Africa, Afghanistan, and Cambodia.
Paul outlined several challenges they had encountered when attempting to obtain data from each study partner and ensure it was managed correctly. Before data could be obtained from the 25 partner institutions, it was necessary to develop a Data Transfer Agreement that met the needs of each country’s government and institution. The practicalities of capturing metadata and manually mapping dataset variables to form variables have also proved challenging, requiring an average of four days’ work per study.
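The talk did not describe the tooling the ACT Consortium uses, so the following is only a minimal sketch of what mapping dataset variables to form variables can involve. Every name in it (`VARIABLE_MAP`, `map_variables`, the column names and the sample row) is hypothetical and invented for illustration. It renames the columns that have an agreed mapping and lists the ones that do not, which is the part that typically needs manual attention.

```python
import csv
import io

# Hypothetical mapping from one study's dataset column names to the
# variable names used on a shared form. Invented for illustration only.
VARIABLE_MAP = {
    "pt_age": "age_years",
    "sex": "sex",
    "rdt_res": "rdt_result",
}

# Hypothetical sample data standing in for a study's CSV export.
SAMPLE = "pt_age,sex,rdt_res,site_note\n4,F,positive,clinic A\n"


def map_variables(rows, variable_map):
    """Rename mapped columns and report columns that have no mapping."""
    mapped_rows = []
    unmapped = set()
    for row in rows:
        new_row = {}
        for name, value in row.items():
            if name in variable_map:
                new_row[variable_map[name]] = value
            else:
                # Left out of the output and flagged for manual review.
                unmapped.add(name)
        mapped_rows.append(new_row)
    return mapped_rows, sorted(unmapped)


def demo():
    """Run the mapping over the in-memory sample."""
    rows = list(csv.DictReader(io.StringIO(SAMPLE)))
    return map_variables(rows, VARIABLE_MAP)
```

Calling `demo()` returns the renamed rows together with the list of columns that still need a mapping decision (here, `site_note`).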
Veerle Van den Eynden – Preparing qualitative data for sharing
Veerle Van den Eynden of the UK Data Service offered practical suggestions on the steps that researchers could take to prepare qualitative data – transcripts, coded databases, still images and other material – for sharing. Careful planning at the start of the project was essential, to ensure research participants were made aware of data preservation and sharing obligations and were able to provide their consent. This should be followed by activities to remove identifiable information and document contextual information and key decisions.
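As a small illustration of the step of removing identifiable information from a transcript, the sketch below replaces a list of known identifiers with placeholders and records how many replacements were made. It is an assumption about how such a step might be scripted, not a method presented at the workshop; the names `REPLACEMENTS` and `pseudonymise` and the example identifiers are hypothetical. Simple search-and-replace only catches identifiers that have been listed in advance, so a manual read-through for indirect identifiers would still be needed.

```python
import re

# Hypothetical replacement table agreed for one transcript.
REPLACEMENTS = {
    "Jane Smith": "[Participant 1]",
    "St Mary's Clinic": "[Clinic A]",
}


def pseudonymise(text, replacements):
    """Replace each listed identifier (case-insensitively) with its placeholder.

    Returns the edited text and a log of (placeholder, count) pairs. The log
    stores placeholders rather than the original identifiers, so it can be
    kept alongside the shared data as documentation of the changes made.
    """
    log = []
    for original, placeholder in replacements.items():
        pattern = re.compile(re.escape(original), flags=re.IGNORECASE)
        # A function is used as the replacement so the placeholder is
        # inserted literally, with no backslash processing.
        text, count = pattern.subn(lambda match, p=placeholder: p, text)
        if count:
            log.append((placeholder, count))
    return text, log
```

Keeping the log with the dataset also supports the documentation of key decisions mentioned above.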
Brian Hole – Data Journals and Data Papers
Brian Hole of Ubiquity Press explored the role of publishers in improving access to research. He introduced Data Journals and Data Papers as a new dissemination form – entities that were unfamiliar to many audience members, but which attracted considerable interest. A data paper is a short document describing a dataset that enables it to be easily discovered, understood and cited by end users. Ubiquity Press have expanded rapidly in this field, launching a number of data journals relevant to health researchers, including Open Health Data and the Open Journal of Bioresources.
Gareth Knight – Data sharing resources for LSHTM researchers
Gareth Knight of LSHTM’s Research Data Management Service closed the workshop with an overview of the resources available within the institution to help researchers wishing to make their data available. He drew the audience’s attention to the Research Data Management website, the training modules for staff and students, and the advisory service. The session concluded with a demonstration of a prototype version of LSHTM’s research data repository, which is to be launched in early 2015.