Provision of processable textual data in libraries

Motivation

Academic libraries already hold a great number of digitised items. In some cases, OCR has already been applied to offer users more extensive options. Unfortunately, the resulting text is not always of a quality that allows digital processing without further pre-processing work, which burdens the already scarce resources of research projects.

Objectives

It would be desirable for Text+ and its participants to address the problem described above, so that higher-quality, more easily processable data plays a more important role in future in the services that libraries offer to support researchers working with digital methods.

Solution

Now that providing full text to users is more or less firmly anchored in library portfolios, the next step would be to raise awareness of the need for high data quality (beyond metadata). Despite many good examples of approaches and implementations, further activities are needed to change both the mindset and current practice. Text+ and its relevant participants would be ideal ambassadors for advancing the standards of data provision for digital texts.