Backup and Archiving
Adequate protection and archiving of digital research data are a central concern of all researchers. To prevent the loss of collected data, the infrastructure providers at the Göttingen Campus offer services that support you in maintaining and backing up your research data. Find out more in our Service Catalogue!
Research data are the valuable basis of any research process. Science policy actors and research funding institutions are becoming increasingly aware of this fact. The long-term preservation of the underlying data presents both researchers and research institutions with new challenges that must be mastered together. German scientific institutions and funding organisations stress the importance of safeguarding data in their published guidelines and policies on research data. The DFG, for example, has required since 1997 that research data be stored for at least 10 years (Deutsche Forschungsgemeinschaft. 2013. “Sicherung guter wissenschaftlicher Praxis - Safeguarding Good Scientific Practice”). For this purpose, it supports building new infrastructure or adapting existing infrastructure.
For all these reasons, the University of Göttingen adopted its own research data guidelines in July 2014. In addition, new storage and archiving services at the Göttingen Campus are being developed on the basis of these guidelines.
For the individual researcher, the following facts are important to keep in mind:
- Data loss can easily happen to anyone. Personal storage devices can become corrupted, be lost, stolen or destroyed. Even institutional storage can break down, be accidentally erased or destroyed.
- When scientists leave projects or institutions, or when research projects end, the results and data might not be stored as safely and accessibly as needed.
- Some data are hardly reproducible, if at all (e.g. observational information such as in meteorology, astrophysics, earth science, and social science).
- Future software and hardware might have problems reading or interpreting the data or storage devices. Data and file formats can become obsolete.
The most common cause of data loss is hardware failure. Human error also accounts for a large share of data loss. Other causes include software failures, viruses or hacking attempts, power failures, and natural disasters (see also http://www.zdnet.com/article/how-data-gets-lost and http://researchdata.library.ubc.ca/files/2015/10/RDM_DataGuide_V03.1_20151020.pdf).
As a consequence, preservation of research data is necessary!
Two aspects of preserving research data are backing up (frequently storing one or more copies of original files) and archiving (permanent long-term preservation of files).
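As a rough illustration of the backup aspect, the following Python sketch copies a file into a backup folder under a timestamped name and then compares checksums to confirm the copy is intact. The function names `sha256_of` and `backup_file`, the folder layout, and the choice of SHA-256 are assumptions made for this example; they do not describe any particular campus service.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` under a timestamped name and verify it.

    Hypothetical helper for illustration only.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    # A UTC timestamp in the name keeps earlier copies instead of overwriting them.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also tries to preserve file metadata
    # Compare checksums so that a corrupted copy is noticed immediately.
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} is corrupt: checksums differ")
    return target
```

A copy on the same disk as the original protects against accidental edits but not against hardware failure, so in practice the backup folder would sit on a separate device or an institutional storage service.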
What can you do about it?
Think about how to store your data as early as the planning phase of your research, e.g. by setting up a data management plan. Consider which metadata are required to properly document and explain your data. Decide how long your data should be kept, and how you will deal with issues such as copyright, privacy, or sensitive data.
When deciding how to save your files, try to use a format that is:
- non-proprietary and open, with documented standards
- used by your community
- encoded using a standard character encoding
Proprietary file types like .doc or .xls are usually not ideal for long-term preservation of data, so save an additional version in an open data format for archiving (source: http://researchdata.library.ubc.ca/files/2015/10/RDM_DataGuide_V03.1_20151020.pdf). The UK Data Archive has a table of recommended file formats for a variety of data types.
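For the character-encoding point, a minimal Python sketch of writing a UTF-8 copy of a text file might look as follows. It assumes that UTF-8 is the standard encoding you want and that you already know the encoding of the original file (Latin-1 in the usage note below); the function name `reencode_to_utf8` and the file names are hypothetical.

```python
from pathlib import Path


def reencode_to_utf8(source: Path, target: Path, source_encoding: str) -> None:
    """Write a UTF-8 copy of a text file whose current encoding is known.

    Hypothetical helper for illustration only. The original file is left
    untouched; newline="" keeps the line endings exactly as they are.
    """
    with open(source, "r", encoding=source_encoding, newline="") as src:
        text = src.read()
    with open(target, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

For example, `reencode_to_utf8(Path("stations.csv"), Path("stations_utf8.csv"), "latin-1")` would read a Latin-1 file and write a UTF-8 version next to it. If the source encoding is guessed wrongly, the text may be silently garbled, so check the result before archiving it.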
Find a safe place for data storage.
There are numerous discipline-specific repositories and databases that provide archiving of research data. Most of them can be found at re3data.org. In the near future, an institutional repository will be set up at Göttingen University. At the Göttingen Campus you can also use the backup and archiving services provided by GWDG: https://www.gwdg.de/storage-services.
Some helpful webpages on data storage and backup:
Some publications on long-term preservation:
- Neuroth, Heike, Stefan Strathmann, Achim Oßwald, Regine Scheffel, Jens Klump, and Jens Ludwig, eds. 2012. Langzeitarchivierung von Forschungsdaten - Eine Bestandsaufnahme. Boizenburg: Verlag Werner Hülsbusch. http://nestor.sub.uni-goettingen.de/bestandsaufnahme/nestor_lza_forschungsdaten_bestandsaufnahme.pdf. (in German only)
- Schofield, Paul N., Tania Bubela, Thomas Weaver, Lili Portilla, Stephen D. Brown, John M. Hancock, David Einhorn, Glauco Tocchini-Valentini, Martin Hrabe de Angelis, and Nadia Rosenthal. 2009. “Post-Publication Sharing of Data and Tools”. Nature 461 (7261): 171–73. doi:10.1038/461171a.
- E-learning course provided by the FOSTER project on “Repository compatibility with Horizon2020”