Tools and Services
Text+ and the participating institutions offer a wide range of services related to language and text data. In addition to research data, services and tools are an essential part of the Text+ offerings for users.
For the provision and maintenance of this service overview, Text+ uses the SSH Open Marketplace, a discovery platform from the Social Sciences and Humanities Cluster within the EOSC.
The list of offerings mentioned below is subject to constant development and expansion. Both a search function and a filter function are implemented for convenient access. Feedback and requests can be directed to the Helpdesk with the subject Infrastructure/Operations.
BAS CLARIN Repository
Repository of more than 50 annotated speech corpora. Most corpora may be accessed and downloaded by members of academic research institutions; some corpora, e.g. highly sensitive data or data intended for commercial use, require licenses.
BAS Services
BAS Web Services
Cascaded Analysis Broker of the Deutsches Textarchiv (DTA::CAB)
Web service at Zentrum Sprache of the Berlin-Brandenburg Academy of Sciences and Humanities to analyse and annotate German-language texts linguistically.
CLARIAH-DE Tutorial Finder
The Tutorial Finder allows users to browse freely available and reusable teaching and training materials on procedures, tools, research methods, and topics in the humanities and related disciplines.
This resource is supported by Text+.
CLARIND-UdS
CLARIND-UdS Repository (Saarbrücken)
The CLARIND-UdS data center is part of the Text+ infrastructure and operated by the Department of Language Science and Technology at Saarland University.
CMDI Explorer
CMDI Explorer is a tool that empowers users to easily explore the contents of complex CMDI records and to process selected parts of them with little effort.
Corpus Services
The (HZSK) Corpus Services were initially developed at the Hamburg Centre for Language Corpora (HZSK) as a quality control and publication framework for EXMARaLDA corpora. Since then, most development work has been done within the INEL project.
Critical Pāli Dictionary Online
The Critical Pāli Dictionary Online (CPD) is a digital version of the "Critical Pāli Dictionary", which is a comprehensive dictionary of the Pāli language.
DARIAH-CAMPUS
Looking for learning resources?
DARIAH-DE and OPERAS-GER academic blogging with Hypotheses
Hypotheses is a non-commercial blog portal for the humanities and social sciences. The portal provides a free service that facilitates scientific blogging and ensures greater visibility as well as archiving of content.
DARIAH-DE Data Federation Architecture (DFA)
The DARIAH-DE data federation architecture is the term for services and tools that enable research data and collection descriptions to be found from various sources (such as cultural institutions, libraries, archives, research facilities, and data centers) and to be used for analysis.
DARIAH-DE Data Modeling Environment
Environment for modeling data and their relationships.
The Data Modeling Environment (DME) is a tool for modeling and associating data. By means of the DME, data models and mappings between them are defined and provided via interfaces (REST API).
DARIAH-DE Generic Search
Search engine for searching the metadata records of the Collection Registry.
The Generic Search creates a comprehensive search facility in DARIAH-DE.
DARIAH-DE Geo-Browser
The DARIAH-DE Geo-Browser (or GeoBrowser) allows the comparative visualisation of several queries and facilitates the representation of data in relation to geographic space and the corresponding points in time and time sequences.
DARIAH-DE Helpdesk (DARIAH-EU, CLARIAH-DE, Text+)
The helpdesk is a good starting point for support with DH-related questions and with the tools and resources provided by CLARIN-D, DARIAH-EU, DARIAH-DE, CLARIAH-DE, and Text+.
DARIAH-DE Monitoring of research infrastructures and services using Icinga
Monitoring is an important factor in the operation of a digital research infrastructure. The data centers focus on the hardware and the state of the basic software. Monitoring makes it possible to correct faults and failures as quickly as possible.
DARIAH-DE Publikator
For whom? Researchers who want to deposit their research data safely, persistently, and referenceably in a research data repository.
The DARIAH-DE Publikator offers the possibility to prepare, manage, and import research data into the DARIAH-DE Repository.
DARIAH-DE Repository dhrep
The DARIAH-DE Repository is a central component of the DARIAH-DE research data federation architecture. The DFA aggregates various services and thus ensures convenient use.
Deutsches Textarchiv (DTA)
The German Text Archive is a Repository for historical, German-language text corpora at Zentrum Sprache of the Berlin-Brandenburg Academy of Sciences and Humanities.
DH-Stipends
Our DH fellowships foster research into our freely accessible data and collections using text and data mining methods. The yearly call is mainly directed at young, tech-savvy researchers in all disciplines who already have some experience with Digital Humanities methods and tools.
Digital Collection of the German National Library
Our digital collections consist of e-books, e-papers and e-journals, online dissertations, audio books, digitally recorded music, websites and digitised works.
Digital Humanities Call
Digital Humanities Call (starting every year in March): We will gladly support you with your research project by providing metadata, digital objects and infrastructures; we can also digitise a limited volume of analogue objects for you if required. The provisions in section 60d UrhG apply.
Digitales Wörterbuch der deutschen Sprache (DWDS)
The Digital Dictionary of the German Language (DWDS) is a lexical system at the Berlin-Brandenburg Academy of Sciences and Humanities that provides information about German vocabulary in the past and present.
DNBLab
The German National Library offers free access to its bibliographic data and several collections of digital objects. As the central access point for presenting, accessing and reusing digital resources, DNBLab allows users to access our data, object files and full texts.
DTA Base Format (DTABf)
Format standard for TEI-compliant text annotation of digital full texts of historical prints with an extension for manuscripts.
DTA FCS-Endpoint
Endpoint for the Text+ Federated Content Search to query DTAXL, the historical corpora of the DTA at BBAW.
DWDS Corpus Search (Digitales Wörterbuch der deutschen Sprache – Korpussuche)
This is a tool for browsing DWDS corpora. The DWDS is part of the Center for Digital Lexicography of the German Language (ZDL), funded by the Federal Ministry of Education and Research.
DWDS FCS-Endpoint
Endpoint for the Text+ Federated Content Search to query DWDSXL, a collection of contemporary text corpora at BBAW.
DWDS word history curve
Tool of the Digital Dictionary of the German Language (DWDS) at the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW) for the diachronic analysis of word usages.
DWDS-API
An application programming interface (API) for querying DWDS corpora at the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW).
API of the Digital Dictionary of the German Language (DWDS).
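As a minimal sketch, a query against the DWDS API could look like this in Python. The base URL, endpoint path, and parameters below are assumptions for illustration; the official DWDS API documentation defines the actual interface.

```python
# Minimal sketch of querying the DWDS API from Python.
# NOTE: endpoint path and parameters are assumptions; consult the DWDS API docs.
import requests

BASE_URL = "https://www.dwds.de/api"  # assumed base URL

def lookup_snippets(word: str):
    """Fetch dictionary snippets for a word (hypothetical endpoint and parameters)."""
    response = requests.get(f"{BASE_URL}/wb/snippet", params={"q": word}, timeout=30)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(lookup_snippets("Haus"))
```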
EmuR
entityXML
entityXML is (so far) a concept study in version 0.5.2 (ALPHA), which aims to model a standardised XML-based data format for the GND Agency Text+. This resource is supported by Text+. In case of questions you may get in touch with the Text+ helpdesk at textplus-support@gwdg.de.
Federated Content Search (Saarbrücken)
The CLARIND-UdS data center is part of the Text+ infrastructure and operated by the Department of Language Science and Technology at Saarland University.
GermaNet API
GermaNet can be accessed via an API for both Python and Java.
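For the Python side, a minimal sketch using the germanetpy package might look as follows. Class and method names are assumptions based on the public GermaNet Python API and may differ in detail; a licensed local copy of the GermaNet data is required.

```python
# Rough sketch of accessing GermaNet data from Python via the germanetpy package.
# NOTE: import path, class, and method names are assumptions for illustration;
# the GermaNet data must be obtained separately under the GermaNet license.
from germanetpy.germanet import Germanet

# Path to the locally installed GermaNet XML data (hypothetical path).
germanet = Germanet("/path/to/GermaNet/GN_XML")

# Look up all synsets for a given orthographic form.
synsets = germanet.get_synsets_by_orthform("Baum")
for synset in synsets:
    print(synset.id, [lexunit.orthform for lexunit in synset.lexunits])
```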
GermaNet FCS Endpoint
GermaNet Rover
Rover is an online application that can be used to explore the GermaNet data or to compute the semantic relatedness/similarity between two synsets.
GND Agency Text+
The GND Agency Text+ is a service that is being set up at the Göttingen State and University Library (SUB Göttingen) as part of the NFDI consortium Text+.
GND-Explorer
Informative, visual, interconnected: The GND Explorer will be the new tool for presenting and searching the Integrated Authority File (GND)! In the future, the GND Explorer will provide convenient and comprehensive access to the GND and its semantic network for all users.
HedgeDoc - GWDG Pad
Online tool for collaborative text editing to work together on the same texts at the same time.
The HedgeDoc-pad is an open-source-based web editor that allows multiple users to work on a single text simultaneously from different locations.
IDS Repository
The IDS Repository deals with the long-term archiving of linguistic and language resources in the field of German studies and serves as an interface to the Virtual Language Observatory, where the data can be explored via a faceted search.
Indico Event Management
The open source software Indico, developed at CERN, is a web application for organising lectures, meetings, and conferences. Three different event types (lecture, meeting, and conference) can be created in Indico.
Integrated Authority File (GND)
The Integrated Authority File (GND) is a service facilitating the collaborative use and administration of authority data. These authority data represent and describe entities, i.e. persons, corporate bodies, conferences, geographic entities, subjects, and works.
KorAP (on DeReKo)
This is a corpus analysis platform that is suited for large, multiply annotated corpora and complex search queries independent of particular research questions.
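A rough sketch of a programmatic query against the KorAP web API is given below. The endpoint path and parameter names are assumptions for illustration; the KorAP documentation defines the authoritative interface, and access to DeReKo data may additionally require authentication.

```python
# Rough sketch of querying KorAP (on DeReKo) via its web API.
# NOTE: endpoint URL and parameters are assumptions for illustration only.
import requests

KORAP_API = "https://korap.ids-mannheim.de/api/v1.0/search"  # assumed endpoint

response = requests.get(
    KORAP_API,
    params={"q": "Baum", "ql": "poliqarp", "count": 5},  # assumed query-language parameter
    timeout=30,
)
response.raise_for_status()
for match in response.json().get("matches", []):
    print(match.get("textSigle"), match.get("snippet"))
```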
KorAP FCS-Endpoint
Language Archive Cologne
The Language Archive Cologne (LAC) supports research, learning and teaching with high quality and dependable digital language resources. The LAC facilitates free and open online access to research data.
Language Resource Switchboard
A web application that suggests language analysis tools for specific data sets.
Metadata Service
The metadata provided by the German National Library includes current and retrospective bibliographic data for individual series in the German National Bibliography, authority data from the Integrated Authority file, metadata from the German Union Catalogue of Serials, and data on new releases.
MONAPipe
MONAPipe stands for "Modes of Narration and Attribution Pipeline". It provides natural-language-processing tools for German, implemented in Python/spaCy.
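A rough sketch of how such a spaCy-based pipeline is typically used from Python follows. The commented-out package and component names are hypothetical placeholders, not the documented MONAPipe API; consult the MONAPipe documentation for the actual names.

```python
# Rough sketch of using a spaCy-based German NLP pipeline such as MONAPipe.
# NOTE: "monapipe_attribution" is a hypothetical placeholder component name;
# the standard German model "de_core_news_sm" must be installed separately.
import spacy

nlp = spacy.load("de_core_news_sm")       # standard German spaCy model
# nlp.add_pipe("monapipe_attribution")    # hypothetical MONAPipe component

doc = nlp("Er sagte, dass er morgen kommen werde.")
for token in doc:
    print(token.text, token.pos_, token.dep_)
```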
Octra Backend
Web-based management tool for organizing transcription projects: audio files, transcribers, tools and assignments are managed in a graphical user interface.
Octra Transcription Editor
Web-based editor for orthographic transcripts. Octra provides different views of signal and transcript, supports a flexible organization of work, and provides tools to split signal files into segment-sized chunks.
OpenProject
The Project Management Service is a self-service collaboration tool that allows you to manage and track your projects and source code repositories. By using DARIAH-DE OpenProject, users can independently coordinate their projects, keep track of their issues, and document their results.
OWID FCS-Endpoint
Percy
Persistent Identifier Service
In all areas of research, the amount of digitally stored data is increasing continuously. As a result, data management becomes more and more complex, so that sustainable referencing of data, and thus its permanent citability, is a challenge.
perspectivia.net
Scientific publications of the Max Weber Foundation (MWS, Max Weber Stiftung) and its partners can be published free of charge on the OA publication platform perspectivia.net in the sense of Diamond Open Access.
RocketChat
Rocket.Chat is a web-based, persistent messaging service focusing on group communication. It is a simple and intuitive platform for all users with a GWDG account/AcademicID for their communication.
Russian Regions Acoustic Speech Database
The Russian Regions Acoustic Speech Database (Rureg) is a collection of speech recordings from various regions of Russia. Rureg aims to capture the diversity of accents, dialects, and speech characteristics across different regions of Russia.
SpeechRecorder
Platform independent tool for performing scripted recordings. Flexible organisation of scripts into sections and groups for sequential or ordered recordings. Each utterance is saved in a separate file.
Text+ Curated Tool Platform for Editions (CSP)
Version 1.0 of a curated software platform for scholarly editions developed in Text+/NFDI is now available. The platform improves the visibility of software (and its authors) used in the field of scholarly editions.
Text+ Federated Content Search (FCS)
The Federated Content Search (FCS) is a specification and technical infrastructure for querying and aggregating distributed research data.
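Since the FCS builds on the SRU protocol, a single endpoint can be queried with a plain searchRetrieve request. A minimal sketch in Python is shown below; the endpoint URL is a placeholder, and the exact parameter set depends on the SRU/FCS version supported by the endpoint.

```python
# Minimal sketch of an SRU/FCS searchRetrieve request against a single FCS endpoint.
# NOTE: the endpoint URL below is a placeholder; real Text+ endpoints (e.g. for DTA,
# DWDS, or TüNDRA) publish their own base URLs, and responses are SRU XML.
import requests
import xml.etree.ElementTree as ET

ENDPOINT = "https://example.org/fcs"  # placeholder FCS/SRU endpoint

response = requests.get(
    ENDPOINT,
    params={
        "operation": "searchRetrieve",  # SRU operation
        "version": "1.2",               # SRU version supported by the endpoint
        "query": '"Baum"',              # CQL / FCS-QL query
        "maximumRecords": 10,
    },
    timeout=30,
)
response.raise_for_status()
root = ET.fromstring(response.content)
print(root.tag)  # inspect the SRU response envelope
```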
Text+ GitLab
Web-based source code management with a wide range of functionalities to support development processes. Configuration of continuous integration per project. Support for merge request workflows. Consultancy and support in setting up your projects. Connection to the GWDG user administration.
Text+ Web Portal
Web portal for Text+ based on HugoCMS and a CI/CD deployment pipeline. This resource is supported by Text+. In case of questions you may get in touch with the Text+ helpdesk at textplus-support@gwdg.de.
Text+ Zenodo Community
The Text+ Zenodo Community gathers a growing collection of research outputs, guidelines, project deliverables, and other documents affiliated with Text+, the NFDI consortium for text- and language-related research data.
TextGrid Import UI
A Jupyter Notebook-based user interface for TextGrid Import Modelling, which is a command line tool that facilitates the creation of the metadata files required for importing data into TextGrid Rep.
TextGrid Repository & Laboratory
TextGrid is a virtual research environment for text-based humanities scholarship. It offers a variety of tools and services for collaboratively creating, analyzing, editing, and publishing texts.
tg-model - TextGrid Import Modeller
What's the aim? This project aims to enable a simple import of text corpora (encoded in XML/TEI) into the TextGrid Repository by modeling the required metadata file structure.
tgadmin - TextGrid repository administration CLI tool, based on tgclients
What is the aim? A command line tool for managing your projects in the TextGrid repository without TextGridLab.
The actual data import is finally carried out by the Python tools tgadmin and tgclients, which in turn communicate with the TextGrid backend via the various TextGridRep APIs.
tgclients - TextGrid Python clients
What is the aim?
The actual data import is finally carried out by the Python tools tgadmin and tgclients, which in turn communicate with the TextGrid backend via the various TextGridRep APIs.
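As a rough sketch, a metadata search against the TextGrid Repository via tgclients might look like this in Python. Class names, constructor arguments, and attribute paths are assumptions based on the tgclients package and may differ in detail; the tgclients documentation defines the actual API.

```python
# Rough sketch of talking to the TextGrid Repository via the tgclients Python package.
# NOTE: class names, constructor arguments, and keyword parameters are assumptions
# for illustration only; consult the tgclients documentation for the actual API.
from tgclients import TextgridConfig, TextgridSearch

config = TextgridConfig()                 # assumed default: public TextGrid instance
tgsearch = TextgridSearch(config.search)  # assumed attribute holding the search URL

response = tgsearch.search(query="Goethe")  # assumed keyword argument
print(response)                             # inspect the returned metadata records
```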
TSAKorpus Hosting
Hosting of instances of multimodal/spoken corpora, e.g. ANNIS instances for language corpora (such as the multimodal Corpus of German Sign Language).
Tübinger Treebank Collection
TüNDRA
TüNDRA is a web application for searching in 1004 treebanks for 173 languages, such as treebanks for German and the full set of Universal Dependency treebanks. TüNDRA uses a lightweight query language based on that of the TIGERSearch application.
TüNDRA Treebanks FCS Endpoint
FCS Endpoint which allows users to query TüNDRA treebanks from the Text+ Federated Content Search Aggregator. It does this by translating the FCS queries into the TüNDRA query language and returning the results.
WebLicht
WebLicht is an execution environment for automatic annotation of text corpora. Linguistic tools such as tokenizers, part of speech taggers, and parsers are encapsulated as web services, which can be combined by the user into custom processing chains.
Weblicht as a Service
WebLicht as a Service (WaaS) is a REST service that executes WebLicht chains. This allows you to run WebLicht chains from your UNIX shell, scripts, or programs.
There are several advantages of using WaaS rather than making direct requests to WebLicht web services.
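A minimal sketch of calling WaaS from Python follows. The endpoint URL, form field names, and the API-key mechanism are assumptions for illustration; the WaaS documentation defines the actual interface.

```python
# Minimal sketch of executing a WebLicht chain via WebLicht as a Service (WaaS).
# NOTE: the endpoint URL, form field names, and API-key mechanism below are
# assumptions for illustration; see the WaaS documentation for the actual interface.
import requests

WAAS_URL = "https://weblicht.sfs.uni-tuebingen.de/WaaS/api/1.0/chain/process"  # assumed

with open("chain.xml", "rb") as chain, open("input.txt", "rb") as content:
    response = requests.post(
        WAAS_URL,
        files={"chains": chain, "content": content},  # assumed field names
        params={"apikey": "YOUR_API_KEY"},            # assumed authentication
        timeout=300,
    )
response.raise_for_status()
# The response body is the annotated TCF document produced by the chain.
with open("output.tcf.xml", "wb") as out:
    out.write(response.content)
```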
Weblicht Batch
WebLicht-Batch is a web-based interface to WebLicht’s chainer back-end. WebLicht-Batch helps users to automatically feed large input data, or input data of multiple files into WebLicht.
WebLicht Const Parsing EN
WebLicht Easy Chain for Constituency Parsing (English). The pipeline makes use of WebLicht's TCF converter, the Stanford tokenizer, and the statistical BLLIP/Charniak parser.
WikiSpeech
Web-based management tool for scripted speech recordings via the Internet based on SpeechRecorder scripts.
What's it all about?
WikiSpeech is a content management system for the web-based creation of speech databases for the development of spoken language technology and basic research.
Guidelines for the description of Text+ services on the SSH Open Marketplace
This short guideline documents how services affiliated with Text+ may be described in the SSH Open Marketplace.
Service definition
The collection of resources on this website includes, in addition to genuine Text+ developments, further offerings from partners contributing to Text+. How services become part of the Text+ portfolio is addressed in the Text+ Services Policy, Version 0.9, which is subject to ongoing internal project discussions and further development.
Changelog
- upcoming: Differentiation between genuine Text+ offerings and other offerings relevant to the community; addition of funding references in the service descriptions
- October 2024: Expansion of the collection to 79 resources. Enhancement and curation of existing contributions. Update of the Text+ Services Policy to v0.9.
- July 2024: Linking of the guide and description guidelines for Text+ services in the SSH Open Marketplace; addition of filtering by categories and keywords
- June 2024: Expansion to 35 entries as well as partial revision of individual descriptions and the introductory text on this page
- May 2024: Initial version of a service list with 29 entries