Leibniz Institute for the German Language (IDS)

Text+ center: IDS Text and Language Data Repository

Type of center: data center/competence center

The IDS holds the world’s largest linguistically motivated collection of electronic corpora of written German-language texts from the present and the recent past. It also maintains corpora of spoken German, comprising recordings of conversations and of linguistic varieties. In addition, it offers tools and interfaces for querying and analyzing the corpora.

The IDS Repository is dedicated to the long-term archiving of linguistic resources (text and spoken language) in the field of German studies. It is also one of the central nodes in the Text+ clusters for two domains: collections and lexical resources. The IDS is home to the most important collections of contemporary German and to dictionaries with a thematic focus on neologisms, discourse vocabulary, foreign words and loanwords, among others. The metadata, provided in Dublin Core and CMDI formats, is made searchable by the Text+ Registry via an interface.

Highlights of provided data and services

Collections:

  • German Reference Corpus (DeReKo): the world’s largest linguistically motivated collection of electronic corpora of written German-language texts from the present and recent past
  • Archive for Spoken German (AGD): more than 80 different corpora, including the most important resources for documenting spoken German in interaction and the regional variation of spoken German

Lexical resources:

  • OWID and OWIDplus: online platform for various dictionaries and experimental, data-centered, multilingual resources
  • grammis: scientific information system on German grammar, which presents current research results, explanations and background knowledge

Acceptance of third-party data

The IDS primarily accepts third-party data that matches its existing holdings. This includes large textual resources in I5 format, like the DeReKo corpus, as well as spoken language data that conforms to ISO 24624 (“Transcription of spoken language”). Other data formats can be accepted on request.

Contact

Contact for Text+: data-steward@ids-mannheim.de