Who Owns the Data?
Introduction
The collection or generation of research data is usually associated with significant financial, personnel, and time expenditure. But who actually "owns" this research data? A more accurate way to phrase this question is "Who has what rights to research data?", as there is no ownership of digital research data.
This text outlines various (legal) positions and aims to make them more accessible to researchers, especially at German universities. Understanding these positions is important, e.g., for creating documented agreements on usage rights to research data, as recommended by the Deutsche Forschungsgemeinschaft (DFG). This text pertains to the German legal framework. In addition to the legal requirements of German copyright law (German: Urheberrechtsgesetz) and the General Data Protection Regulation (GDPR), ethical frameworks such as the DFG Code of Conduct "Guidelines for Safeguarding Good Research Practice" [@dfg2022] or the CARE Principles [@carroll2020] are also relevant for researchers.
When it comes to questions such as "Am I allowed to publish the data at all?" or "Can I take data with me when moving to another institution?" (see, e.g., chapter 4 in [@wuensche2022]), legal issues often become complex and difficult for laypeople to assess. Generally, cases need to be examined individually, and different legal positions may be in tension with one another. Judgements should generally be made in the spirit of balancing the interests and regulations described below.
Highly recommended: Documented agreements on usage rights
To prevent conflicts over data from arising in the first place, it is recommended to discuss responsibilities and expectations at an early stage and to document agreements on usage rights in writing. In most German scientific institutions, this provision is even mandatory under the institutional documents on good scientific practice, since the guidelines of the DFG Code of Conduct are legally binding for almost all institutions in Germany (see also Chapter 3). In "Guideline 10: Legal and ethical frameworks, usage rights" of the Code, the authors state:
"Where possible and practicable, researchers conclude documented agreements on usage rights at the earliest possible point in a research project. Documented agreements are especially useful when multiple academic and/or non-academic institutions are involved in a research project or when it is likely that a researcher will move to a different institution and continue using the data he or she generated for his or her own research purposes."
Agreements on the usage rights of data should be concluded in the spirit of balancing interests and fairness to all parties involved.
Copyright Protection
In the text below, the term "copyright" always refers to German copyright law. Unlike, for example, books or journal articles, which are generally subject to copyright, research data are only protected by copyright under certain conditions. This distinction is relevant because the rights to publication and attribution are reserved for the copyright holders only if the data are copyright protected. Two caveats apply: the right to publication may be restricted by other legal positions, such as employment relationships, and good scientific practice requires attribution even for public domain data. Likewise, copyright holders can grant permission for use by third parties, for example with an open license, only if the data are copyright protected. Granting an open license for copyright-free (= public domain) data has no legal effect.
The distinction as to whether copyright protection applies or not to research data is simple in many cases. The next two sections provide guidance on this. In case of uncertainty, it is safer to assume existing copyright protection. Personnel and/or financial resources used for data collection are in any case irrelevant to the assessment.
Copyright-Protected Research Data
The requirements for research data to obtain copyright protection are minimal. Usually, it is sufficient that the data-producing person has performed intellectual work that is manifested in a concrete creation showing a minimum level of originality. Literary works, pictorial works (including photographic works), cinematographic works, computer programs, drawings, plans, maps, and sketches are generally protected by copyright. However, it is a prerequisite that these works are personal intellectual creations: a certain "threshold of originality" must be reached.
Images and film recordings that do not reach the threshold of originality may enjoy neighbouring rights protection (ancillary copyright; German: verwandte Schutzrechte), which has similar effects to copyright but different requirements. For example, images taken by a wildlife camera could enjoy neighbouring rights even though none of these images contains any intellectual work of a photographer, because there was no photographer. In the case of qualitative research data, such as data from participant observation in ethnology or from an interview, the necessary threshold of originality is usually reached, so that they are protected by copyright.
There are also very simply designed texts (e.g., technical instructions), sequences of sounds, or computer programs that do not enjoy copyright protection, but these are exceptions. Likewise, measurement data generally do not enjoy copyright protection (see the next section).
Research Data not Protected by Copyright
Research data from experiments or tests are not protected by copyright, nor are measurements of element concentrations in rocks or remote sensing data from satellites. Neither the cost of generating the data nor their intended use is relevant for the assessment of copyright status. Even measurement data obtained from highly complex and expensive large equipment, requiring substantial investment for collection, do not qualify for copyright protection. Likewise, data that have been further processed according to standardised professional scientific practices are not considered personal intellectual creations. An example of this is the representation of stable isotope measurements: these measurements are typically presented as δ-values, calculated according to professional scientific practices.
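As a brief illustration of such a standardised calculation, the conventional definition of a δ-value can be written as follows. The symbols are notation introduced here for illustration and do not appear in the text above: R_sample and R_standard are assumed to denote the ratio of the heavy to the light isotope in the sample and in the agreed reference standard, respectively.

```latex
\[
  \delta = \left( \frac{R_{\mathrm{sample}}}{R_{\mathrm{standard}}} - 1 \right) \times 1000
  \quad \text{(expressed in per mil, \textperthousand)}
\]
```

Because every laboratory applies the same conversion to the measured ratios, the resulting δ-values leave no room for individual creative choices, which is why this kind of processing does not turn measurement data into a personal intellectual creation.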
Tables/Databases
Tables in which data are arranged in a certain order are considered databases according to the definition of copyright law. They often enjoy copyright protection because the creative effort of a person selecting data from a larger set and arranging them in a table in a specific way is protected. Exceptions are tables where this creative personal effort is not evident (for example, if data are only ordered alphabetically or chronologically, aiming for completeness, or according to other simple criteria based on academic conventions). For example, a person creating a table of hourly temperature readings ordered chronologically for a month does not obtain a copyright in the table. However, if only selected readings of this series were displayed in a table because of a specific research question, this individual selection decision could potentially generate copyright protection for the table. The rights arising from this creative effort belong to the person who performed it.
Additionally, there is protection of the investment if the acquisition, verification or presentation of the database requires a significant investment, e.g., personnel or material costs. The holder of this protection is the one who made the investment: in the case of research institutions or universities this is usually the institution itself or a third-party funder or contractor.
It is important to distinguish between the copyright protection of the table itself and the potential copyright protection of the data in the table: if a copyrighted table contains, for example, readings, the individual readings are not protected by copyright (as long as they are not taken to such an extent that the structure of the database and thus the intellectual effort behind the database work becomes apparent). However, if a table contains, for example, responses from an interview study, the data in the database (in this case textual data) are also protected by copyright.
Regulations of Good Scientific Practice
The regulations of good scientific practice formulated by the DFG (DFG Code) are legally binding for most publicly funded universities and non-university research institutions in Germany, for example, through institutional good scientific practice regulations, and therefore apply to most researchers in Germany. However, institutional good scientific practice regulations can deviate from the DFG Code within narrow limits, so it is important to always refer to the legally binding institutional regulations as the basis for individual case assessments.
Guidelines 10 and 14 of the DFG Code in particular address questions of authorship of research data, authority over publication, and modalities of data reuse by third parties. According to Guideline 14 of the Code, the authorship of research data is subject to the same principles as the authorship of textual publications, such as articles in professional journals. Usage rights are addressed in the explanations on Guideline 10: "In particular, the researcher who collected the data is entitled to use them." This is likely to refer not to those who technically collect the data but to those who are responsible for the scientific conception of data collection [@baumann2021]. This person or group of people is also entitled to determine whether third parties are granted usage rights, but it remains open whether, in working groups or in cases of collaborative data collection, other researchers would also have usage rights. Also, the possibility of taking data along when changing institutions is not specifically defined in the Code.
To avoid possible uncertainties, individual "documented agreements on usage rights relating to data and results" are recommended in Guideline 10. This important instrument is addressed in Chapter 1, Section "Highly recommended: Documented agreements on usage rights".
Employment Relationships and Status Groups
When copyright-protected works are created by an employee, the copyright does not belong to the employer but to the employee. Usage rights, on the other hand, usually lie with the employer, but this rule is modified by the academic freedom (German: Wissenschaftsfreiheit) anchored in the German constitution. Depending on a person's status or affiliation with a university, the employer may have a right to use the work products (here: copyright-protected research data), limiting a researcher’s possible copyright. In addition, it should be noted that contractual loyalty and duty of care obligations exist between the employer and individual researchers, and must always play a role when assessing conflicts.
Whether the copyrights of researchers (which may include, for example, the right to decide whether or not to publish results) can be restricted by usage rights of the research institution depends on whether a person conducts research independently, i.e., freely and autonomously, or works subject to instructions from the employer. Professors who represent their field independently in research and teaching are considered to research independently. Scientific staff working, for example, on a doctoral or habilitation thesis that is not part of a larger research project are also considered independent researchers. In such cases, these persons generally have full usage rights to their research data.
When employees work subject to instructions, their employer is generally granted usage rights to the products of their work. This can lead to problems, for example regarding the right to publish, as both researchers and employers have legal claims. For scientific staff at universities, the distinction between independent and dependent work is often hard to establish. A clear distinction can be difficult, for example, in doctoral projects conducted on the same database as research activities in a third-party funded project. In each individual case, a balance of interests must then be sought. Even with work subject to instructions, the researcher often has room for manoeuvre to influence the research results methodically. In general, such creative participation of employees is desired in research and often even crucial for good scientific results.
In other situations, the employment contract might suggest that it is not the person who collected the data, but rather the employer or the head of the research group who has the right to decide how the research data is handled. This could be the case, for example, if the employer has certain agreements with research funders that oblige them to publish certain research data within a specified period, and the researchers involved are aware of this. In such cases, it may be in the interest of all parties involved for the employer or group leader to have sole decision-making authority over the data.
Even with research data not protected by copyright, conflicts can arise, for example when deciding whether a person has the right to continue using the research data they have collected if they leave the university. In such cases, both parties may potentially have rights to use the data, depending on the individual case. Enrolled students are generally not in employment relationships with the university, but they are members of the university. Especially for advanced students, independent choice of methods and knowledge acquisition may be assumed, such that usage rights to research data arising within the scope of qualification work can indeed be attributed to the students.
For employees at non-university research institutions, on the other hand, it tends to be assumed that research data produced by them, regardless of any copyright protection of the data, are so-called "duty works". In these cases, the usage rights lie with the employer [@baumann2021, page 43].
To prevent conflicts arising due to often difficult-to-assess legal situations, the Code of Good Scientific Practice clearly states: "Where possible and practicable, researchers conclude documented agreements on usage rights at the earliest possible point in a research project." (Guideline 10). These documented agreements should be concluded on the basis of a fair balance of interests.
Rights of Researched Individuals or Groups: Data Protection and CARE Principles
Data Protection
Data protection refers to the rights of the subjects of research, i.e., the protection of the privacy of individuals, whose personal data are subject to special protection. Therefore, when personal data are collected or processed in research projects, the provisions of data protection law must be observed (see, e.g., [@lauber-roensberg2021], or, for practical tips on how to proceed in research projects, the Data Protection Guide [@ratswd2020]). The applicable regulations are laid down in the EU General Data Protection Regulation (GDPR) as well as in federal and state data protection laws.
The processing of personal data is only permitted if there is a legal basis and in compliance with data protection regulations. Fundamental elements of data protection can be, for example, the consent of the data subject(s) to the processing of their personal data, regulations on the processing of personal data, and the rights of the researched individuals to information, access, correction of incorrect data, and deletion of data.
CARE Principles
The current movement toward open data and open science does not fully engage with Indigenous Peoples' rights and interests. Existing principles within the open data movement (e.g., FAIR: findable, accessible, interoperable, reusable [@wilkinson2016]) primarily focus on characteristics of data that will facilitate increased data sharing among entities while ignoring power differentials and historical contexts. The emphasis on greater data sharing alone creates a tension for Indigenous Peoples who are also asserting greater control over the application and use of Indigenous data and Indigenous Knowledge for collective benefit (see Global Indigenous Data Alliance).
The CARE Principles describe how data should be treated to ensure that Indigenous control over the data and its use is respected. This includes, for example, the right to create value from Indigenous data based on Indigenous worldviews and the right to use the opportunities of the knowledge economy. Helpful explanations for understanding and practically applying the CARE Principles can be found, for example, in [@carroll2020; @jennings2023].
The CARE Principles complement the existing FAIR Principles. CARE is an acronym for:
- Collective Benefit
- Authority to Control
- Responsibility
- Ethics
Further reading
- @baumann2023
- @baumann2018
- @bmbf2023
- @brettschneider2021
- @kreutzer2021
- @kuschel2018
- @kuschel2020
Publication details and contact
- Publication date: 26 April 2024
- DOI: 10.5281/zenodo.11059315
Andreas Hübner, Freie Universität Berlin, Universitätsbibliothek
Garystraße 39, 14195 Berlin, Germany
Tel: +49 30 838 71062
E-Mail: andreas.huebner@fu-berlin.de
ORCID: 0000-0001-7342-9789
License
This work is licensed under CC BY 4.0.