Standardisation
For FAIR research data, alignment with established and widely used standards for object data and metadata is essential. In particular, uniform formats and standards across repositories are needed to ensure findability, interoperability and reusability in a distributed infrastructure such as Text+.
Text+ pursues two goals with regard to data standards. Firstly, the initiative aims to increase the community's data literacy regarding standards by providing advice on research data management (e.g. via the Text+ Helpdesk) and by organising workshops and seminars. Secondly, Text+ is working on the development and establishment of new standards (e.g. for derived text formats). To this end, numerous Text+ staff members are active in national and international standardisation committees (see below).
Standards for text and language data
To ensure that researchers can work well with the data provided by Text+, the Text+ centres document in detail the standards used in their resources. Data quality, and therefore the choice of standards, is also central to the data depositing process. For this reason, a detailed bilateral consultation between the data providers and staff of the responsible Text+ centre always takes place before the data transfer.
Text+ provides general information on common standards for language and text data in several publications:
- Standards and preferred data formats for corpora and text collections - both for metadata and object data and for data packages - have been compiled in this paper: https://zenodo.org/doi/10.5281/zenodo.12800179.
- Recommendation for the creation, processing and publication of FAIR research data in the data domain of editions - A Living Handbook on the how-to of evaluating and guaranteeing FAIR, high-quality edition-based research data.
- Overview of writing systems, character encoding standards and article formats used in the domain: https://doi.org/10.5281/zenodo.13867283.
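As a small illustration of why character encoding standards matter for findability, the following sketch shows how the same word can be stored as two different Unicode code point sequences, and how normalisation makes them comparable. It is not taken from the publications listed above; the helper name `normalize_nfc` and the sample strings are assumptions chosen for illustration, and only Python's standard `unicodedata` module is used.

```python
import unicodedata


def normalize_nfc(text: str) -> str:
    """Return the text in Unicode Normalization Form C (composed form)."""
    return unicodedata.normalize("NFC", text)


# Hypothetical sample data: the same word, encoded in two ways.
# "ö" as one precomposed code point (U+00F6) ...
composed = "G\u00f6ttingen"
# ... and as "o" followed by a combining diaeresis (U+0308).
decomposed = "Go\u0308ttingen"

# A naive comparison treats the two spellings as different strings.
same_before = composed == decomposed

# After normalisation to NFC, both spellings compare equal.
same_after = normalize_nfc(composed) == normalize_nfc(decomposed)

# Serialise the normalised text as UTF-8 for storage or exchange.
encoded = normalize_nfc(decomposed).encode("utf-8")
```

Applying one normalisation form consistently before indexing or comparing text is one possible way to keep search results stable across repositories; which form a given resource should use depends on the standards documented by the responsible centre.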
Application Examples
This page presents examples of successful implementations of standards in the field of text and language data:
GND Agency
The GND Agency Text+ is a service currently being set up at the SUB Göttingen as part of the NFDI programme Text+. As a partner of the GND (Gemeinsame Normdatei), the Agency acts as a central point of contact, initially for projects in the humanities and cultural sciences that are funded as part of the NFDI Text+ consortium.
Standardisation committees
Members of Text+ are actively involved in various committees and organisations that promote the further development of recognised standards.
- DIN – Deutsches Institut für Normung
- DINI – Deutsche Initiative für Netzwerkinformation e.V. (Letter of Support, PDF)
- IIIF – International Image Interoperability Framework
- ISO – International Organization for Standardization
- RDA – Research Data Alliance (Letter of Support, PDF)
- TEI – Text Encoding Initiative
- W3C – World Wide Web Consortium