Tools and Services

Text+ and the participating institutions offer a wide range of services related to language and text data. In addition to research data, services and tools are an essential part of the Text+ offerings for users.

For the provision and maintenance of this service overview, Text+ uses the SSH Open Marketplace, a discovery platform from the Social Sciences and Humanities Cluster within the EOSC.

The list of offerings mentioned below is subject to constant development and expansion. Both a search function and a filter function are implemented for convenient access. Feedback and requests can be directed to the Helpdesk with the subject Infrastructure/Operations.

Search

BAS CLARIN Repository

Repository of 50+ annotated speech corpora. Most corpora may be accessed and downloaded by members of academic research institutions, some corpora require licenses, e.g. highly sensitive data, or data for commercial use.

Further information

BAS Services

Further information

BAS Web Services

Further information

Cascaded Analysis Broker des Deutschen Textarchiv (DTA::CAB)

Web service at Zentrum Sprache of the Berlin-Brandenburg Academy of Sciences and Humanities to analyse and annotate German-language texts linguistically.

Annotating

Further information

CLARIAH-DE Tutorial Finder

The Tutorial Finder allows users to browse freely available and reusable teaching and training materials on procedures, tools, research methods, and topics in the humanities and its related disciplines.

This resource is supported by Text+.

Searching

Further information

CLARIND-UdS

CLARIND-UdS is a repository for language resources at Saarland University.

Further information

CLARIND-UdS FCS Endpoint (Saarbrücken)

The CLARIND-UdS data center is part of the Text+ infrastructure and operated by the Department of Language Science and Technology at Saarland University.

Searching

Further information

CLARIND-UdS Repository (Saarbrücken)

The CLARIND-UdS data center is part of the Text+ infrastructure and operated by the Department of Language Science and Technology at Saarland University.

Archiving Publishing

Further information

CMDI Explorer

CMDI Explorer is a tool that empowers users to easily explore the contents of complex CMDI records and to process selected parts of them with little effort.

Further information

Corpus Services

The (HZSK) Corpus Services were initially developed at the Hamburg Centre for Language Corpora (HZSK) as a quality control and publication framework for EXMARaLDA corpora. Since then, most development work has been done within the INEL project.

Analyzing

Further information

correspSearch - Search and Connect Scholarly Editions of Correspondence

With correspSearch you can search within the metadata of diverse edited letters from different scholarly editions and other scholarly publications. One can search according to the letter's sender, adressee, as well as place and date of the letter's creation.

Searching Annotating

Further information

Critical Pāli Dictionary Online

The Critical Pāli Dictionary Online (CPD) is a digital version of the "Critical Pāli Dictionary", which is a comprehensive dictionary of the Pāli language.

Searching

Further information

DARIAH-CAMPUS

Looking for learning resources?

Publishing Communicating Searching

Further information

DARIAH-DE and OPERAS-GER academic blogging with Hypotheses

Hypotheses is a non-commercial blog portal for the humanities and social sciences. The portal provides a free service that facilitates scientific blogging and ensures greater visibility as well as archiving of content.

Communicating Publishing

Further information

DARIAH-DE Data Federation Architecture (DFA)

The DARIAH-DE data federation architecture is the term for services and tools that enable research data and collection descriptions to be found from various sources (such as cultural institutions, libraries, archives, research facilities, and data centers) and to be used for analysis.

Searching Annotating

Further information

DARIAH-DE Data Modeling Environment

Environment for modeling data and their relationships.

The Data Modeling Environment (DME) is a tool for modeling and associating data. By means of the DME, data models and mappings between them are defined and provided in terms of interfaces (REST-API).

Annotating

Further information

DARIAH-DE Generic Search

Search engine that allows to search in the metadata records of the Collection Registry

The Generic Search creates a comprehensive search facility in DARIAH-DE.

Searching

Further information

DARIAH-DE Geo-Browser

The DARIAH-DE Geo-Browser (or GeoBrowser) allows a comparative visualisation of several requests and facilitates the representation of data and their visualisation in a correlation of geographic spatial relations at corresponding points of time and sequences.

Annotating Analyzing

Further information

DARIAH-DE Helpdesk (DARIAH-EU, CLARIAH-DE, Text+)

A good starting point to receive support for DH-related questions, tools and resources provided by CLARIN-D, DARIAH-EU and DARIAH-DE, CLARIAH-DE and Text+ is the helpdesk.

Communicating

Further information

DARIAH-DE Monitoring of research infrastructures and services using Icinga

Monitoring is a important factor for the operation of a digital research infrastructure. The data centers focus the hardware and the state of the basic software. Monitoring can be used to correct any faults and failures as quickly as possible.

Further information

DARIAH-DE Publikator

For whom? Researchers who want to deposit their research data safe, persistent, and referencable in a research data repository.

The DARIAH-DE Publikator offers the possibility to prepare, manage and import research data for the import into the DARIAH-DE Repository.

Publishing Archiving

Further information

DARIAH-DE Repository dhrep

The DARIAH-DE Repository is a central component of the DARIAH-DE research data federation architecture. The DFA aggregates various services and, thus, ensures a convenient use.

Archiving Publishing

Further information

Deutsches Textarchiv (DTA)

The German Text Archive is a Repository for historical, German-language text corpora at Zentrum Sprache of the Berlin-Brandenburg Academy of Sciences and Humanities.

Publishing Archiving

Further information

DH-Stipends

Our DH fellowships foster research into our freely accessible data and collections using text and data mining methods. The yearly call is mainly directed at young, tech-savvy researchers in all disciplines who already have some experience with Digital Humanities‘ methods and tools.

Further information

Digital Collection of Germany National Library

Our digital collections consist of e-books, e-papers and e-journals, online dissertations, audio books, digitally recorded music, websites and digitised works.

Searching

Further information

Digital Humanities Call

Digital Humanities Call (starting every year in March): We will gladly support you with your research project by providing metadata, digital objects and infrastructures. The provisions in section 60d UrhG apply.

Communicating

Further information

Digitales Wörterbuch der deutschen Sprache (DWDS)

The Digital Dictionary of the German Language (DWDS) is a lexical system at the Berlin-Brandenburg Academy of Sciences and Humanities that provides information about German vocabulary in the past and present.

Further information

DNBLab

The German National Library offers free access to its bibliographic data and several collections of digital objects. As the central access point for presenting, accessing and reusing digital resources, DNBLab allows users to access our data, object files and full texts.

Analyzing

Further information

DTA Base Format (DTABf)

Format standard for TEI-compliant text annotation of digital full texts of historical prints with an extension for manuscripts.

Annotating

Further information

DTA FCS-Endpoint

Endpoint for the Text+ Federated Content Search to query DTAXL, the historical corpora of the DTA at BBAW.

Searching

Further information

DWDS Corpus Search (Digitales Wörterbuch der deutschen Sprache – Korpussuche)

This is a tool for browsing DWDS corpora. The DWDS is part of the Center for Digital Lexicography of the German Language (ZDL), funded by the Federal Ministry of Education and Research.

Searching Analyzing

Further information

DWDS FCS-Endpoint

Endpoint for the Text+ Federated Content Search to query DWDSXL, a collection of contemporary text corpora at BBAW.

Searching

Further information

DWDS word history curve

Tool of the Digital Dictionary of the German Language (DWDS) at the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW) for the diachronic analysis of word usages.

Analyzing Searching

Further information

DWDS-API

An application programming interface (API) for querying DWDS-corpora at Berlin-Brandenburg Academy of Sciences and Humanities (BBAW). It's the API interface of the Digital Dictionary of the German Language (DWDS).

Searching

Further information

ediarum

With ediarum researchers can comfortably transcribe, encode and edit manuscripts in TEI-XML, as well as publish their results in an online or print edition. The solution, developed by TELOTA, is based on three software components: exist-db, Oxygen XML Author, and ConTeXt.

Further information

EmuR

The main R package for the EMU Speech Database Management System (EMU-SDMS). R package to integrate signal processing, customized graphical output and interactive phonetic segmentation into the R statistics package.

Further information

entityXML

entityXML is (so far) a concept study in version 0.6.5 (BETA), which aims to model a standardised XML-based data format for the GND Agency Text+.

Annotating

Further information

GermaNet API

GermaNet kann über eine API in jeweils Python- und Java angesprochen werden.

Further information

GermaNet FCS Endpoint

Further information

GermaNet Rover

Rover ist eine Online-Anwendung, die zur Nutzung der GermaNet-Daten oder zur Berechnung der semantischen Verwandtschaft/Ähnlichkeit zwischen zwei Synsets verwendet werden kann.

Further information

GND Agency Text+

The GND Agency Text+ is a service that is being set up at the Göttingen State and University Library (SUB Göttingen) as part of the NFDI consortium Text+.

Annotating

Further information

GND-Explorer

Informative, visual, interconnected: The GND Explorer will be the new tool for presenting and searching the Integrated German authority file (GND)! In the future, the GND explorer should provide a convenient and comprehensive access to the GND and its semantic network for all users.

Searching Analyzing

Further information

HedgeDoc - GWDG Pad

Online tool for collaborative text editing to work together on the same texts at the same time.

The HedgeDoc-pad is an open-source-based web editor that allows multiple users to work on a single text simultaneously from different locations.

Communicating

Further information

IDS Repository

The IDS Repository deals with the long-term archiving of linguistic and language resources in the field of German studies and serves as an interface to the Virtual Language Observatory, where the data can be explored via a faceted search.

Searching Archiving

Further information

Indico Event Management

The open source software Indico developed by Cern is a web application. Lectures, meetings and conferences can be created using Indico. Three different event types (lecture, meeting and conference) can be created in Indico.

Communicating

Further information

Integrated Authority File (GND)

The Integrated Authority File (GND) is a service facilitating the collaborative use and administration of authority data. These authority data represent and describe entities, i.e.

Annotating

Further information

KorAP (on DeReKo)

This is a corpus analysis platform that is suited for large, multiply annotated corpora and complex search queries independent of particular research questions.

Searching Analyzing

Further information

KorAP FCS-Endpoint

Searching

Further information

Language Archive Cologne (LAC)

The Language Archive Cologne (LAC) supports research, learning and teaching with high quality and dependable digital language resources. The LAC facilitates free and open online access to research data.

Publishing Archiving

Further information

Language Resource Switchboard

A web application that suggests language analysis tools for specific data sets.

Further information

MeineDGS - ANNIS

ANNIS Instances of the Public German Sign Language Corpus, created by at Institute of German Sign Language and Communication of the Deaf in the framework of the longterm project DGS-Corpus.

DGS-Corpus is a long-term project by the Academy of Sciences in Hamburg.

Analyzing Annotating

Further information

Metadata Service

The metadata provided by the German National Library includes current and retrospective bibliographic data for individual series in the German National Bibliography, authority data from the Integrated Authority file, metadata from the German Union Catalogue of Serials, and data on new releases.

Searching

Further information

MONAPipe

MONAPipe stands for "Modes of Narration and Attribution Pipeline". It provides natural-language-processing tools for German, implemented in Python/spaCy.

Publishing Annotating

Further information

Notebook Actions - TextGrid Import UI

Archiving

Further information

Octra Backend

Web-based management tool for organizing transcription projects: audio files, transcribers, tools and assignments are managed in a graphical user interface.

Further information

OCTRA Transcription Editor

Web-based editor for orthographic transcripts. Octra provides different views of signal and transcript, supports a flexible organization of work, and provides tools to split signal files into segment-sized chunks.

Further information

OpenProject

The Project Management Service is a collaboration self service that allows you to manage and track your projects and source code repositories. By using DARIAH-DE OpenProject, users can independently coordinate their projects, keep track of their issues and document their results.

Communicating

Further information

OWID FCS-Endpoint

Searching

Further information

Percy Web-Experiment

Further information

Persistent Identifier Service

In all aspects of research, the amount of digitally stored data is increasing continuously. Thereby the management will be more and more complex, so that the sustainable reference to data and thus their permanent censibility represent a challenge.

Publishing

Further information

perspectivia.net

Scientific publications of the Max Weber Foundation (MWS, Max Weber Stiftung) and its partners can be published free of charge on the OA publication platform perspectivia.net in the sense of Diamond Open Access.

Publishing

Further information

RocketChat

Rocket.Chat is a web-based, persistent messaging service focusing on group communication. It is a simple and intuitive platform for all users with a GWDG account/AcademicID for their communication.

Communicating

Further information

Russian Regions Acoustic Speech Database

The Russian Regions Acoustic Speech Database (RuReg) is a collection of speech recordings from various regions of Russia. Rureg aims to capture the diversity of accents, dialects, and speech characteristics across different regions of Russia.

Analyzing

Further information

SpeechRecorder

Platform independent tool for performing scripted recordings. Flexible organisation of scripts into sections and groups for sequential or ordered recordings. Each utterance is saved in a separate file.

Further information

Text+ Curated Tool Platform for Editions (CSP)

Version 1.0 of a curated software platform for scholarly editions developed in Text+/NFDI is now available. The platform improves the visibility of software (and its authors) used in the field of scholarly editions.

Searching

Further information

Text+ Federated Content Search (FCS)

The Federated Content Search (FCS) is a specification and technical infrastructure for querying and aggregating distributed research data.

Searching

Further information

Text+ GitLab

Web-based source code management with a wide range of functionalities to support development processes. Configuration of continuous integration per project. Support for the merge request workflows. Consultancy and support in setting up your projects. Connection to the GWDG user administration.

Archiving

Further information

Text+ Web Portal

Web portal for Text+ based on HugoCMS and a CI/CD deployment pipeline. This resource is supported by Text+. In case of questions you may get in touch with the Text+ helpdesk at textplus-support@gwdg.de.

Further information

Text+ Zenodo Community

The Text+ Zenodo Community gathers a growing collection of affiliated research outputs, guidelines, project deliverables and other documents affiliated with Text+, the NFDI consortium for text- and langeuage-related research data.

Publishing

Further information

TextGrid Repository & Laboratory

TextGrid is a virtual research environment for text-based humanities scholarship. It offers a variety of tools and services for collaboratively creating, analyzing, editing, and publishing texts.

Archiving Searching Annotating Communicating Publishing

Further information

tg-model - TextGrid Import Modeller

Whats the aim? This project focuses on attemps for a simple import of text corpora (encoded in XML/TEI) to TextGrid Repository by modeling the required metadata file structure.

Archiving

Further information

tgadmin - TextGrid repository administration cli tool, based on tgclients

What is the aim? A command line tool for managing your projects in the TextGrid repository without TextGridLab.

The actual data import is finally carried out by the Python tools tgadmin and tgclients, which in turn communicate with the TextGrid backend via the various TextGridRep APIs.

Archiving

Further information

tgclients - TextGrid Python clients

What is the aim?

The actual data import is finally carried out by the Python tools tgadmin and tgclients, which in turn communicate with the TextGrid backend via the various TextGridRep APIs.

Archiving

Further information

TSAKorpus Hosting

Hosting of instances of multimodal/spoken corpora ANNIS Instances Language Corpora (i. e. multimodal Corpus of German Sign Language).

Further information

Tübinger Treebank Collection

Further information

TüNDRA

TüNDRA is a web application for searching in 1004 treebanks for 173 languages such as treebanks for German and the full set of Universal Dependency treebanks. TüNDRA uses a lightweight query language based on TIGERSearch application.

Searching

Further information

TüNDRA Treebanks FCS Endpoint

FCS Endpoint which allows users to query TüNDRA treebanks from the Text+ Federated Content Search Aggregator. It does this by translating the FCS queries into the TüNDRA query language and returning the results.

Searching

Further information

WebLicht

WebLicht is an execution environment for automatic annotation of text corpora. Linguistic tools such as tokenizers, part of speech taggers, and parsers are encapsulated as web services, which can be combined by the user into custom processing chains.

Analyzing Annotating

Further information

Weblicht as a Service

WebLicht as a Service (WaaS) is a REST service that executes WebLicht chains. This allows you to run WebLicht chains from your UNIX shell, scripts, or programs.

There are several advantages of using WaaS rather than making direct requests to WebLicht web services.

Further information

WebLicht Batch

WebLicht-Batch is a web-based interface to WebLicht’s chainer back-end. WebLicht-Batch helps users to automatically feed large input data, or input data of multiple files into WebLicht.

Further information

WebLicht Const Parsing EN

WebLicht Easy Chain for Constituency Parsing (English). The pipeline makes use of WebLicht's TCF converter, the Stanford tokenizer, and the statistical BLLIP/Charniak parser.

Analyzing

Further information

WikiSpeech

Web-based managment tool for scripted speech recordings via the Internet based on SpeechRecorder scripts.

What's it all about?

WikiSpeech is a content management system for the web-based creation of speech databases for the development of spoken language technology and basic research.

Further information

Guidelines for the description of Text+ services on the SSH Open Marketplace

This short guideline documents how services affiliated with Text+ may be described in the SSH Open Marketplace.

Service definition

The collection of resources on this website includes, in addition to genuine Text+ developments, further offerings from partners contributing to Text+. How services become part of the Text+ portfolio is addressed in the Text+ Services Policy, Version 0.9, which is subject to ongoing internal project discussions and further development.

Changelog

upcoming: differentiation between genuine Text+ offerings and other offerings relevant to the community; addition of funding references in the service descriptions
October 2024: Expansion of the collection to 79 resources. Enhancement and curation of existing contributions. Update of the Text+ Services Policy to v0.9.
July 2024: linking of the guide & description guidelines for Text+ services in the SSH Open Marketplace; addition of filtering by categories and keywords
June 2024: expansion to 35 entries as well as partial revision of individual descriptions and the introductory text on this page
May 2024: initial version of a service list with 29 entries

last modified: Jul 6, 2025