Recommended Original Article

Managing collaborative research data for integrated, interdisciplinary environmental research

by: Michael Finkel, Albrecht Baur, Tobias K.D. Weber, Karsten Osenbrück, Hermann Rügner, Carsten Leven, Marc Schwientek, Johanna Schlögl, Ulrich Hahn, Thilo Streck, Olaf A. Cirpka, Thomas Walter, and Peter Grathwohl

Published in: Earth Science Informatics (2020), Vol. 13[1], 641-654

Link: https://doi.org/10.1007/s12145-020-00441-0

Abstract: The consistent management of research data is crucial for the success of long-term and large-scale collaborative research. Research data management is the basis for efficiency, continuity, and quality of the research, as well as for ample impact and outreach, including the long-term publication of data and their accessibility. Both funding agencies and publishers increasingly require this long term and open access to research data. Joint environmental studies typically take place in a fragmented research landscape of diverse disciplines; researchers involved typically show a variety of attitudes towards and previous experiences with common data policies, and the extensive variety of data types in interdisciplinary research poses particular challenges for collaborative data management. In this paper, we present organizational measures, data and metadata management concepts, and technical solutions to form a flexible research data management framework that allows for efficiently sharing the full range of data and metadata among all researchers of the project, and smooth publishing of selected data and data streams to publicly accessible sites. The concept is built upon data type-specific and hierarchical metadata using a common taxonomy agreed upon by all researchers of the project. The framework’s concept has been developed along the needs and demands of the scientists involved, and aims to minimize their effort in data management, which we illustrate from the researchers’ perspective describing their typical workflow from the generation and preparation of data and metadata to the long-term preservation of data including their metadata.

(Abstract by the publication author(s), licensed under Creative Commons Attribution 4.0 License.)

Comment

This article reports on the achievements in research data management within the collaborative research project CAMPOS, which studies diffuse pollution of soils, surface waters, and groundwater by a multitude of anthropogenic contaminants and their turnover at landscape scale. Among these achievements, the concept of hierarchically and flexibly structured, data type-specific metadata (Media 1) appears well suited to cases in which a variety of data types has to be managed, i.e. when data sets differ in size, dimension, structure, format, temporal frequency, and origin, among other properties. The concept is flexible and efficient because metadata can be defined separately for each data type. Splitting metadata into pieces that are logically linked via identifiers makes it possible to accommodate and tie in with existing procedures, protocols, and documentation standards, which vary among the different activities and data types.

Media 1 (scheme of hierarchical metadata): Concept of hierarchical metadata, here for the example of field measurement data and sampling data, in which each metadata record references (i.e. links to) the corresponding metadata on higher hierarchy levels. (Icons are modified from icons made by Freepik from www.flaticon.com.) Source: Figure 3 of Finkel et al. 2020, licensed under CC-BY 4.0.
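
To make the idea of metadata pieces linked via identifiers more concrete, the following is a minimal sketch in Python. It is not the CAMPOS implementation: the class `MetadataRecord`, the functions `resolve_chain` and `merged_attributes`, the hierarchy levels (`site`, `campaign`, `sample`), and all identifiers and attribute values are hypothetical and chosen only for illustration. Each record stores the metadata specific to its own level and points to its parent by identifier, so the full description of a sample can be assembled by following the links upward.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MetadataRecord:
    """One piece of metadata; all field names are hypothetical."""
    record_id: str                     # unique identifier of this record
    level: str                         # hierarchy level, e.g. "site" or "sample"
    attributes: Dict[str, str] = field(default_factory=dict)
    parent_id: Optional[str] = None    # identifier of the record one level up


def resolve_chain(record_id: str,
                  registry: Dict[str, MetadataRecord]) -> List[MetadataRecord]:
    """Follow parent links from a record up to the top of the hierarchy."""
    chain: List[MetadataRecord] = []
    seen = set()
    current: Optional[str] = record_id
    while current is not None:
        if current in seen:
            raise ValueError(f"circular metadata reference at {current!r}")
        seen.add(current)
        record = registry[current]     # KeyError signals a dangling link
        chain.append(record)
        current = record.parent_id
    return chain


def merged_attributes(record_id: str,
                      registry: Dict[str, MetadataRecord]) -> Dict[str, str]:
    """Combine attributes from all levels; lower levels override higher ones."""
    merged: Dict[str, str] = {}
    for record in reversed(resolve_chain(record_id, registry)):
        merged.update(record.attributes)
    return merged


# Hypothetical example data: a sample taken during a campaign at a site.
_records = [
    MetadataRecord("site-01", "site",
                   {"site_name": "example catchment",
                    "coordinate_system": "WGS84"}),
    MetadataRecord("campaign-07", "campaign",
                   {"date": "2020-01-15", "operator": "field team"},
                   parent_id="site-01"),
    MetadataRecord("sample-0042", "sample",
                   {"medium": "groundwater", "operator": "sampling team"},
                   parent_id="campaign-07"),
]
REGISTRY = {record.record_id: record for record in _records}
```

Calling `merged_attributes("sample-0042", REGISTRY)` collects the site and campaign information without duplicating it in the sample record. In this sketch a lower level overrides a higher one where keys collide, which is one possible design choice and not something the article prescribes.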