Research Data Management
What are Research Data?
In the research process, data appear in many different contexts. Research data are, on the one hand, the results of research. On the other hand, they include all data that arise in the course of scientific work, including intermediate products on the way to research results. Such data arise, for example, from observations, experiments, simulations, surveys, interviews, source research, recordings, digitization, and analyses, and they also include project-specific, individual compilations of existing data (cf. German Council for Scientific Information Infrastructures. (2020). The Data Quality Challenge. Recommendations for Sustainable Research in the Digital Turn, 105).
Research Data in Text+
Various research data are generated in the data domains of Text+; think of
- individual text corpora, text annotations, interviews, transcriptions, sensor data, surveys, etc. (Collections)
- translations, terminologies, word networks, word lists, etc. (Lexical Resources)
- OCR files, manuscripts and their transcriptions, text apparatus, etc. (Editions)
A concrete example is the corpus “Letters from Jean Paul” (Collections, Editions, and Lexical Resources), which comprises 5,004 letters. The letters come from a digital edition based on the print edition, which was retro-digitized using the double-keying method. They are encoded in TEI XML, enriched with metadata and authority data, and linked to a historical-critical edition. Published under an open license, the research data of the corpus can be widely reused, for example as a resource in dictionary work.
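To illustrate what reusing TEI-encoded letters can look like in practice, the following is a minimal sketch in Python. The sample document is invented and heavily simplified; the element selection (title, `correspAction`, `persName`, `date`) and the function name `summarize_letter` are assumptions for illustration and do not describe the actual encoding of the Jean Paul corpus.

```python
import xml.etree.ElementTree as ET

# Official TEI namespace; every TEI element lives in it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, simplified TEI letter (NOT taken from the real corpus).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter</title></titleStmt>
      <publicationStmt><p>Illustrative example</p></publicationStmt>
      <sourceDesc><p>Invented for this sketch</p></sourceDesc>
    </fileDesc>
    <profileDesc>
      <correspDesc>
        <correspAction type="sent">
          <persName>Sender Name</persName>
          <date when="1800-01-01"/>
        </correspAction>
      </correspDesc>
    </profileDesc>
  </teiHeader>
  <text><body><p>First paragraph.</p><p>Second paragraph.</p></body></text>
</TEI>"""


def summarize_letter(tei_xml: str) -> dict:
    """Extract basic metadata and the paragraph texts from one TEI letter."""
    root = ET.fromstring(tei_xml)

    title = root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)

    # Correspondence metadata is optional, so guard against missing elements.
    sent = root.find(".//tei:correspAction[@type='sent']", TEI_NS)
    sender = None
    date = None
    if sent is not None:
        sender = sent.findtext("tei:persName", namespaces=TEI_NS)
        date_el = sent.find("tei:date", TEI_NS)
        if date_el is not None:
            date = date_el.get("when")

    # itertext() also collects text nested inside inline markup.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.iterfind(".//tei:body//tei:p", TEI_NS)
    ]
    return {"title": title, "sender": sender, "date": date, "paragraphs": paragraphs}


if __name__ == "__main__":
    print(summarize_letter(SAMPLE_TEI))
```

Because the encoding is standardized, the same few lines could in principle be applied to every letter of a corpus, which is what makes such data attractive as a resource for follow-up work. Real editions usually use richer markup, so the paths would need to be adapted.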
Another example is the collection “Digitalized Table of Contents” (TA Collections) of the German National Library. It consists of over 2.3 million tables of contents of books published mainly in Germany since 1913 and can serve as a basis for further research.
Finally, the data on the Text+ User Stories should be mentioned. The stories are available as individual full texts. In addition, tables with metadata, annotations, and keywords, which form the data basis for the associated publication, are available for download and further processing.
What is RDM about?
Research Data Management as Part of Good Scientific Practice
Research data management is an integral part of good scientific practice. As far as possible, researchers deposit their publications together with the underlying research data in widely used, interoperable formats to ensure traceability, continuity of research, and reusability. Data are archived in archives and repositories in accordance with the FAIR principles (“Findable, Accessible, Interoperable, Reusable”). See also the guidelines for safeguarding good research practice of the German Research Foundation (DFG). The individual phases that research data pass through during collection, processing, and archiving are described by the research data lifecycle.
Why is Research Data Management Relevant for You?
In brief:
Benefits for You
Thoughtful research data management makes your work efficient from the very start of a research project, avoids extra work, prevents data loss, and ensures high data quality. It makes it easy to share data within your own research circle and beyond. It is also a prerequisite for publishing in data journals, which favors data publications and can increase your citation count and scientific reputation. Last but not least, research data management is a foundation of good scientific practice (see German Research Foundation. (2022). Guidelines for Safeguarding Good Research Practice. Code of Conduct. DOI: 10.5281/zenodo.6472827).
Benefits for Others
Research data created with transparent research data management and in recognized (meta)data formats are more interoperable and therefore easier to reuse, both within your own research circle and externally. This facilitates their further processing and contributes to further research.
Benefits for Project Funding
Regarding the handling of research data, the German Research Foundation states: “One essential component of quality-oriented, compatible research is that the data a research project is based on or generates is handled in a way that is appropriate to the subject-specific discipline. For this reason, the handling of research data and the objects on which the data is based have to be carefully planned, documented and described.”
Key Questions
- What research data do you want to collect or use for your research question?
- Do you already have a concept for structuring your data or do you already have an RDM plan?
- Are you already pursuing a sustainable data backup concept, or are you looking for ways to implement this through trusted partners?
- How do you plan to make your research data citable, so that your scientific work is documented and the data can be reused?
- Are you familiar with the licenses under which the data you are using are available?
- What requirements does your funder have regarding documentation of how you handle your data?
- If you plan to use reference data and third-party data, how will you receive the data and under what conditions can you use the data?
Answering these questions is a central part of planning research data management (RDM). Text+ supports you in managing and storing your research data (see the list of data centers in Text+) and in making it reusable in accordance with the FAIR principles. Text+ also helps you create RDM plans and is available for consultation at all times.
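The key questions can also be tracked in a simple machine-readable checklist while a plan takes shape. The following is a minimal sketch in Python under stated assumptions: the class `PlanItem`, the function `open_items`, the topic labels, and the sample answer are all hypothetical, and the structure is not the RDMO catalogue or any official Text+ format.

```python
from dataclasses import dataclass


@dataclass
class PlanItem:
    """One key question of an RDM plan together with its (possibly empty) answer."""
    topic: str
    question: str
    answer: str = ""

    @property
    def answered(self) -> bool:
        # Whitespace-only answers count as unanswered.
        return bool(self.answer.strip())


def open_items(plan: list[PlanItem]) -> list[str]:
    """Return the topics that still need an answer, in their original order."""
    return [item.topic for item in plan if not item.answered]


# Hypothetical draft plan; topics paraphrase some of the key questions.
draft_plan = [
    PlanItem("data", "What research data do you want to collect or use?",
             answer="Transcribed interviews and their annotations"),
    PlanItem("backup", "Do you have a sustainable data backup concept?"),
    PlanItem("licenses", "Under which licenses are the data you use available?"),
    PlanItem("funder", "What documentation does your funder require?", answer="   "),
]

if __name__ == "__main__":
    print("Still open:", ", ".join(open_items(draft_plan)))
```

A list like this is no substitute for a full data management plan, but it shows at a glance which topics are still open and could be raised in a consultation.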
RDM Support by Text+
The institutions that applied for and participate in Text+ support researchers with their certified data repositories, in line with their institutional mission and expertise, at every step of data management from the beginning of the research process. FAIRification of existing data is also part of the service portfolio. These institutions have many years of experience with the specific requirements of language- and text-based research data management, such as different metadata formats and copyright requirements, as well as with building and operating a distributed infrastructure.
Additional Information and Tools for RDM
Tools
There are various tools to support RDM, which highlight important RDM elements and help in creating an RDM plan based on questions:
- RDMO: comprehensive, community-developed general collection of questions for research data management
- CLARIN-D Wizard: shorter collection of questions for RDM, well-suited for beginners
- ARGOS: European data management tool by OpenAIRE
- DMPonline: data management planning tool from the Digital Curation Centre in the UK
RDMO
We, the institutions involved in Text+, offer advice to researchers from the very beginning of the research process and at every step of the systematic organisation of their research data. In doing so, we draw on the expertise of our certified data repositories. To guide you through this process and to document your research data in the form of a data management plan, Text+ offers a questionnaire that is closely based on the Max Weber Foundation’s question catalogue “MWS Initial Recording” and the standard catalogue of the Research Data Management Organiser (RDMO). It was adapted in collaboration with participants of the Text+ consortium and expanded with discipline-specific examples. It will be continuously developed and revised and is integrated into the eResearch Alliance of the SUB Göttingen as a reusable XML catalogue.
The questionnaire is closely linked to our consulting services. Researchers can answer the questions independently or as part of an accompanying consultation with the Text+ Helpdesk, working through any challenges that arise together with one of our Helpdesk Agents. The questionnaire supports your self-organisation and helps raise awareness of how to handle research data sensibly, which makes working with research data easier.
This offer is aimed explicitly at our community, which focusses on language- and text-based data. Potential use cases include project applications, preparation for the (long-term) storage of research data in suitable repositories, and the basic organisation of research data.
Literature and Links
You can find further information on RDM on the following pages. The pages listed are only a selection of the extensive information available online and in print.
- Overview handout by the Research Data Working Group of the Focus Initiative “Digital Information” of the Alliance of German Science Organizations from 2018
- forschungsdaten.info: extensive, continuously growing information platform with a wide range of curated information; searchable, for example, by scientific field (e.g., humanities), by glossary keywords, or by RDM topic
- List of RDM services of the DHd-AG data centers
Further, in part more specific, information is available from many subject-specific information services, on the pages of state initiatives for RDM, and from numerous universities, libraries, academies, and other non-university research institutions. The German Council for Scientific Information Infrastructures, the DINI/nestor-AG Research Data, the DFG, and the BMBF also provide information on their pages.
Contact our RDM team at the Text+ Helpdesk! The Text+ Helpdesk is available to assist you with any questions regarding RDM, especially those concerning the three data domains Collections, Lexical Resources, and Editions, as well as infrastructure, legal, and ethical questions.