Research data are the foundation of research results. For this reason, data safety deserves the same attention as data collection, data processing, data analysis or data publication. The loss of data can destroy days, weeks, months or even years of research work, especially if the lost or destroyed data cannot be reproduced. Political actors and funding institutions have recognised this risk for quite some time.
The long-term preservation of research results and their underlying data is a challenge that researchers and research institutions need to address jointly. International as well as German research institutions and funding organisations emphasise the importance of data safety in their guidelines on research data management. You can find an overview of important guidelines under Funders’ Requirements and Guidelines. The infrastructure providers at the Göttingen Campus are building (new) storage and archiving services in accordance with these guidelines.
Data loss can affect anyone
- Personal storage media can be damaged, lost, stolen or destroyed. Data loss can also happen with institutional storage media, e.g. through accidental erasure.
- If researchers leave projects or institutions, or the project phase is completed, data and other research results may no longer be accessible or understandable.
- Some data can be reproduced only with great effort or not at all (e.g. observations in meteorology, astrophysics, earth or social sciences).
- Outdated storage media may not be supported by future software environments. The data and files stored on them will then be useless.
The main cause of data loss is hardware failure, but human error can also cause data loss during or after a research process. Effective data safety and management strategies are necessary to minimise data loss from these and other causes, such as software errors, viruses, hacker attacks, power failures or natural disasters (see e.g. http://www.zdnet.com/article/how-data-gets-lost/ and http://researchdata.library.ubc.ca/files/2015/10/RDM_DataGuide_V03.1_20151020.pdf).
Making regular backup copies and archiving copies of the original files are two aspects of preserving research data. Think about how and where to store your data as early as the planning phase of your research project, e.g. by
- determining which metadata are necessary to document and explain your data properly,
- deciding how long your data should be stored and archived,
- deciding how you will deal with data protection, copyright or sensitive data,
- creating a data management plan.
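A backup copy only protects against data loss if it is actually identical to the original. As a minimal sketch of that idea, the following Python example copies a file and compares SHA-256 checksums of the original and the copy. The function names `sha256_of` and `backup_with_check` and the file name `measurements.csv` are hypothetical and chosen for illustration only; a real backup routine would normally be provided by your institution's storage service.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_with_check(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and confirm the copy is identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also tries to preserve timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup of {source} does not match the original")
    return target


if __name__ == "__main__":
    # Demonstration with temporary, made-up data (hypothetical file name).
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "measurements.csv"
        original.write_text("station,temp_c\nA,12.5\n", encoding="utf-8")
        copy = backup_with_check(original, Path(tmp) / "backup")
        print(copy.name, sha256_of(copy) == sha256_of(original))
```

Storing the checksum alongside the archived file also makes it possible to detect silent corruption later, by recomputing the digest and comparing it with the stored value.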
When deciding how to store your data, try to use a format that
- is not proprietary but open with documented standards
- is widely used in your discipline
- uses a standard character encoding (e.g. ASCII or Unicode)
- is not compressed
Please keep in mind that proprietary or company-defined file formats such as .doc or .xls are not ideal for long-term archiving. Store an additional version in an open format for archiving (Source: http://researchdata.library.ubc.ca/files/2015/10/RDM_DataGuide_V03.1_20151020.pdf). The UK Data Archive has published a table with recommended file formats for several types of data. In addition, find a safe place to store your data: there are numerous discipline-specific repositories and databases that offer archiving of research data. Many of them can be found on www.re3data.org.
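The advice to keep an additional open-format version can be checked mechanically. The following Python sketch lists files in a proprietary format for which no open-format sibling exists yet. The mapping `OPEN_ALTERNATIVES` and the function name `missing_open_copies` are assumptions made for this illustration, not an official recommendation; consult the recommended-formats table for your data type before choosing a target format.

```python
from pathlib import Path

# Hypothetical mapping for illustration: proprietary extension -> open alternative.
OPEN_ALTERNATIVES = {
    ".doc": ".odt",
    ".xls": ".csv",
}


def missing_open_copies(filenames):
    """Return (proprietary file, expected open-format name) pairs.

    A file is reported when its extension is in OPEN_ALTERNATIVES and no
    file with the same stem and the open extension is in the list.
    """
    names = set(filenames)
    missing = []
    for name in sorted(names):
        path = Path(name)
        alternative = OPEN_ALTERNATIVES.get(path.suffix.lower())
        if alternative is None:
            continue  # not a format this sketch knows as proprietary
        open_name = str(path.with_suffix(alternative))
        if open_name not in names:
            missing.append((name, open_name))
    return missing


if __name__ == "__main__":
    # Made-up file names for demonstration.
    files = ["report.doc", "report.odt", "table.xls", "notes.txt"]
    for proprietary, wanted in missing_open_copies(files):
        print(f"{proprietary}: no open-format copy found (expected {wanted})")
```

Note that the sketch only checks whether a sibling file exists; it does not convert anything, and it cannot tell whether the open-format copy is complete.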
Information on the web
We have compiled a few helpful web pages on data storage and data safety.
- Data Management and Analysis at Johns Hopkins University Libraries (US)
- Data management best practices at University of Oregon (US)
- Backup and Storage at Research Data Oxford (UK)
- Data storage at ANDS (AUS)
A short overview video (in German) on the 3-2-1 method of data backup is also available on the YouTube channel of Göttingen University.
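As commonly stated, the 3-2-1 method means keeping at least three copies of your data, on at least two different types of storage media, with at least one copy at a different location. The following Python sketch checks a list of copies against that rule. The class `DataCopy`, the function `satisfies_3_2_1` and the example entries are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of a dataset (all fields are illustrative)."""
    label: str      # free-text description of the copy
    medium: str     # type of storage medium, e.g. "local disk"
    offsite: bool   # True if stored at a different location


def satisfies_3_2_1(copies):
    """Check the 3-2-1 rule: 3 copies, 2 media types, 1 copy off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    # Made-up backup plan for demonstration.
    plan = [
        DataCopy("working copy", "local disk", offsite=False),
        DataCopy("external drive in the office", "external disk", offsite=False),
        DataCopy("institutional archive", "tape", offsite=True),
    ]
    print("3-2-1 satisfied:", satisfies_3_2_1(plan))
```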
Further reading
- Schofield, Paul N., Tania Bubela, Thomas Weaver, Lili Portilla, Stephen D. Brown, John M. Hancock, David Einhorn, Glauco Tocchini-Valentini, Martin Hrabe de Angelis, and Nadia Rosenthal. 2009. "Post-Publication Sharing of Data and Tools". Nature 461 (7261): 171–73. doi:10.1038/461171a.
- Neuroth, Heike, Stefan Strathmann, Achim Oßwald, Regine Scheffel, Jens Klump, and Jens Ludwig, eds. 2012. Langzeitarchivierung von Forschungsdaten – Eine Bestandsaufnahme. Boizenburg: Verlag Werner Hülsbusch. (in German)