A lexical resource for the study of polysemy and morphosyntax of deverbal nominals

Motivation

This user story presents a possible interaction with the Text+ network, in which a lexical resource that I am creating in my project could be optimized for more general use in linguistic research. It concerns the DFG research areas 104-01 (comparative linguistics) and 104-02 (individual languages, here English and Italian).

Theoretical linguistics has a long tradition of using introspection-based empirical input, dating back at least to the early 1960s. The field has witnessed great empirical and theoretical progress in individual and cross-linguistic studies over the decades. Yet the more the domain advances, the less reliable researchers perceive this one-sided empirical methodology to be. First, most of these studies draw on very limited amounts of data, leaving the great majority of the relevant empirical facts unstudied. Second, the assessment of these data is often subjective, which leads to conflicting theoretical claims. Consequently, the reliability of such theoretical hypotheses suffers heavily from the narrow coverage and the questionable status of their empirical background. The impressive volume of digital resources available at present, such as text corpora and lexical resources, offers theoretical linguists a great opportunity to employ larger amounts of naturally occurring and unbiased data in their research, which would lead to more solid and verifiable theoretical hypotheses.

My current research project (Zero-derived nouns and deverbal nominalization: An empirically-oriented perspective, funded by the DFG) lies at the interface of morphology with lexical semantics and syntax and is concerned with the study of deverbal zero-derived nouns (e.g. to climb > the climb). Zero-derived (or conversion) nouns have long been assumed to lack important semantic and morphosyntactic properties that are available in suffix-based nominals (e.g. to climb > the climb-ing), a difference that is argued to follow from the lack of an overt nominalizing suffix in zero-derived nouns. However, these claims rely on limited amounts of data and have often been challenged. In my project I investigate zero-derived nouns in relation to the semantic classes of their base verbs, which are known to influence the behavior of suffix-based nominals. The aim is to find out how the semantic and morphosyntactic properties of zero-derived nouns may depend on the semantic class of the base verb.

To address the narrow empirical scope of previous research, I have been creating an inventory of zero-derived nouns based on verbs of well-studied semantic classes. This database gathers various kinds of information relevant for the study of zero-derived nouns: dates of first attestation and frequency ranges for the noun and the base verb, the etymology and the available interpretations of the noun (all of these taken from the online Oxford English Dictionary), information on the corresponding verb classes (from the online resource VerbNet: https://verbs.colorado.edu/verb-index/vn3.3/), as well as examples from natural text corpora that illustrate the ability of zero-nouns to realize arguments of their base verbs. A smaller collection of the same kind is also being developed for Italian zero-derived nouns.
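
To make the kinds of information listed above concrete, one entry of such an inventory could be modelled roughly as follows. This is only a minimal sketch and not the actual format of the database: the class name `ZeroNounEntry`, all field names, the helper `entries_in_class` and the value `"placeholder-class"` are assumptions introduced for illustration, and fields for which no value is given are simply left empty.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional
import json


@dataclass
class ZeroNounEntry:
    """Hypothetical record for one zero-derived noun (all field names are assumptions)."""
    noun: str
    base_verb: str
    language: str = "English"
    # Dictionary-based information (in the project: taken from the online OED)
    noun_first_attested: Optional[int] = None
    verb_first_attested: Optional[int] = None
    noun_frequency_range: Optional[str] = None
    verb_frequency_range: Optional[str] = None
    etymology: Optional[str] = None
    interpretations: List[str] = field(default_factory=list)
    # Semantic classes of the base verb (in the project: taken from VerbNet)
    verbnet_classes: List[str] = field(default_factory=list)
    # Corpus examples in which the noun realizes arguments of its base verb
    corpus_examples: List[str] = field(default_factory=list)


def entries_in_class(entries: List[ZeroNounEntry], verb_class: str) -> List[ZeroNounEntry]:
    """Return the entries whose base verb is listed under the given verb class."""
    return [e for e in entries if verb_class in e.verbnet_classes]


# Illustrative entry only; the class label is a placeholder, not a real VerbNet class.
entry = ZeroNounEntry(noun="climb", base_verb="climb",
                      verbnet_classes=["placeholder-class"])

# Serialize to JSON, e.g. as one possible exchange format for a web interface.
serialized = json.dumps(asdict(entry), indent=2)
print(serialized)
```

Grouping entries by verb class, as `entries_in_class` does, mirrors the research question of how the properties of zero-derived nouns depend on the semantic class of the base verb. A plain, serializable record of this kind is one of several possible starting points for a transparent and shareable format.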

Objectives

Collaboration between theoretical linguists and researchers of the Text+ consortium is crucial for the future of theoretical linguistics, because it would meet the demand for extended data coverage in theoretical research by means of independently documented data. This would ensure the sustainability of theoretical hypotheses within and across the discipline.

For my individual project, which would belong to the Lexical Resources domain of Text+, I envisage the assistance of Text+ in organizing and presenting my collection of zero-nouns in a transparent format and with a user-friendly (possibly web-based) interface that would make it easily accessible to other researchers. Several data repositories do exist (and are also recommended by the DFG), but the network proposed by Text+, with separate areas for specific purposes, would have a good chance of reaching the research community given its targeted focus. I expect that the Text+ initiative would also make my collection of zero-nouns accessible to a larger community.