Research data are the valuable basis of research processes. For this reason, you should give data safety the same attention as data collection, data processing, data analysis or data publication. The loss of data can destroy days, weeks, months or even years of research work, especially if the lost or destroyed data cannot be reproduced. Political actors and funding institutions have recognised this problem for quite some time. The long-term preservation of research results and their underlying data is a challenge that researchers and research institutions need to address in a joint effort. International as well as German research institutions and funding organisations emphasise the importance of data safety in their guidelines on research data management. You can find an overview of important guidelines under Funders' Requirements and Guidelines. The infrastructure providers at the Göttingen Campus are building up (new) storage and archiving services in line with these guidelines.

Data loss may affect everyone

  • Personal storage media can be damaged, lost, stolen or destroyed. Data loss can also happen with institutional storage media, e.g. through accidental deletion.
  • If researchers leave projects or institutions, or once the project phase is completed, data and other research results may no longer be accessible or understandable.
  • Some data can be reproduced only with great effort or not at all (e.g. observations in meteorology, astrophysics, earth or social sciences).
  • Outdated storage media may not be supported by future software environments. The stored data and files then become unusable.

The main reason for data loss is hardware failure, but human error can also cause data loss during or after a research process. Effective data safety and management strategies are necessary to minimise data loss from these and other causes such as software errors, viruses, hacker attacks, power failures or natural disasters (see e.g. http://www.zdnet.com/article/how-data-gets-lost/ and http://researchdata.library.ubc.ca/files/2015/10/RDM_DataGuide_V03.1_20151020.pdf).

Storage concept

Making regular backup copies and archiving copies of the original files are two aspects of preserving research data. Think about how and where to store your data as early as the planning phase of your research project, e.g. by

  • determining which metadata are necessary to document and explain your data properly,
  • deciding how long your data should be stored and archived,
  • deciding how you will deal with data protection, copyright and sensitive (personal) data,
  • creating a data management plan.
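
As an illustration of the backup aspect, the following is a minimal sketch of copying a file to a backup location and checking that the copy is intact. It assumes a single local file and a reachable backup directory; the function names `sha256_of` and `backup_file` are hypothetical and not part of any campus service.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` and verify the copy by checksum.

    Hypothetical helper: the directory layout is an assumption.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also tries to preserve timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target
```

A backup only protects against hardware failure if `backup_dir` is on a different physical medium than the original, for example a network share or an external drive.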
Data formats

When deciding how to store your data, please use, if possible, a format that

  • is not proprietary but open, with documented standards,
  • is widely used in your discipline,
  • uses standards for encryption,
  • is not compressed.

Please keep in mind that proprietary or company-defined file formats such as .doc or .xls are not ideal for long-term archiving. Store an additional version in an open format for archiving (source: http://researchdata.library.ubc.ca/files/2015/10/RDM_DataGuide_V03.1_20151020.pdf). The UK Data Archive has published a table of recommended file formats for several types of data. In addition, find a safe place for your storage media: numerous discipline-specific repositories and databases offer archiving of research data. Many of them can be found on www.re3data.org.
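
To illustrate what an additional version in an open format can look like, the following minimal sketch writes tabular data as an uncompressed UTF-8 CSV file together with a small JSON file describing it. The function name `archive_table` and the metadata fields are hypothetical; the metadata your discipline or repository actually requires will differ.

```python
import csv
import json
from pathlib import Path


def archive_table(records, csv_path, description):
    """Write `records` (a non-empty list of dicts) as CSV plus JSON metadata.

    Hypothetical helper: the metadata fields are illustrative assumptions.
    """
    csv_path = Path(csv_path)
    fieldnames = list(records[0].keys())

    # Plain, uncompressed text in an open, documented format.
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)

    # Minimal documentation stored next to the data file.
    metadata = {
        "description": description,
        "columns": fieldnames,
        "rows": len(records),
    }
    meta_path = csv_path.with_suffix(".json")
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return csv_path, meta_path
```

Keeping the description in a separate plain-text file means the data remain understandable even if the software that produced them is no longer available.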

Information on the Web

We have compiled a few helpful web pages on data storage and data safety.

Further reading

  • Neuroth, Heike, Stefan Strathmann, Achim Oßwald, Regine Scheffel, Jens Klump, and Ludwig, eds. 2012. Langzeitarchivierung von Forschungsdaten - Eine Bestandsaufnahme. Boizenburg: Verlag Werner Hülsbusch. http://nestor.sub.uni-goettingen.de/bestandsaufnahme/nestor_lza_forschungsdaten_bestandsaufnahme.pdf (in German).
  • Schofield, Paul N., Tania Bubela, Thomas Weaver, Lili Portilla, Stephen D. Brown, John M. Hancock, David Einhorn, Glauco Tocchini-Valentini, Martin Hrabe de Angelis, and Nadia Rosenthal. 2009. “Post-Publication Sharing of Data and Tools.” Nature 461 (7261): 171–73. doi:10.1038/461171a.
  • The FOSTER project offers an e-learning course (in English) on “Repository compatibility with Horizon2020”.