Action deverbal nouns in a historical corpus: a computational contribution to the constructional morphology of Portuguese
Abstract
In recent years, several studies grounded in constructional morphology have described, from a synchronic perspective, the mechanisms by which deverbal nouns (nouns derived from verbs) are constructed. Interest in the topic stems from the productivity of these construction mechanisms, whose impact on the Portuguese language, especially in its formal use, is not negligible. There are no studies of how these mechanisms vary in Brazilian Portuguese (PB), much less studies that take a diachronic perspective on them. We believe that the current mechanisms are very similar to those of the sixteenth, seventeenth, and eighteenth centuries, although suffix productivity and the formation process may have changed for specific words. The Historical Dictionary of Brazilian Portuguese (DHPB) project, which covers the sixteenth through eighteenth centuries and is sponsored by the Programa Institutos do Milênio, together with the corpus compiled for it, is both a challenge and an opportunity to deepen knowledge of this aspect of the Portuguese language. Analyzing this corpus lets us observe how these mechanisms evolved in Portuguese. However, there are not yet tools that automate this type of research and allow morphologists to obtain such data efficiently. The first objective of this research was to describe the different mechanisms of deverbal noun construction in PB according to the SILEX model of morphological construction (cf. Corbin 1987, 1991, 1997, Correia 1999, Rio-Torto (org.) 2004 and Rodrigues 2006). The second objective was to develop a computational system, named EXTRADEV, that gives easy access to the following data: (a) deverbal nouns (current and historical) with various morphological structures; and (b) orthographic variants of historical action nouns, to make information retrieval easier.
The methodology used to build this system is grounded in: (i) a description of deverbal nouns and the construction of computational rules for them; (ii) a pilot study of the fifty most frequent verbs in the DHPB project, extracted with UNITEX, and an analysis of the orthographic variants of these verbs; (iii) knowledge of the Python programming language and regular expressions; and (iv) the use of resources built for the DHPB project, such as SIACONF, a system for generating orthographic variants. We found 1,742,663 occurrences of action deverbal nouns, corresponding to 15,633 distinct forms with unchanged spelling. Adding the variants extracted by the second EXTRADEV module brings the total to 22,442 historical deverbal nouns (6,809 variants and 15,633 forms with unchanged spelling). The analysis followed three criteria: frequency data, analysis of the deverbal forms based on etymological and historical dictionaries, and observation of the final list of historical deverbal nouns. With this study, we aim to increase knowledge of the diachronic variation of deverbal nouns and to encourage joint contributions from linguistics and computer science, particularly from the area of natural language processing, to support future studies of the Portuguese language.
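To illustrate the kind of regular-expression rule mentioned in item (iii), the following is a minimal Python sketch, not the actual EXTRADEV implementation. The suffix list, the minimum stem length of three characters, and the name count_candidates are assumptions made for this example only. The pattern collects tokens that end in a typical action-noun suffix, including two older spellings of the -ção ending, and counts them.

```python
import re
from collections import Counter

# Hypothetical suffix list, chosen for illustration only; it is not
# EXTRADEV's rule set. "çam" and "çaõ" stand in for older spellings of "ção".
ACTION_SUFFIXES = ("ção", "çam", "çaõ", "mento", "agem", "dura")

# A candidate is a whole word with at least three characters before the
# suffix (an assumed threshold that skips very short words).
CANDIDATE = re.compile(
    r"\b\w{3,}(?:" + "|".join(ACTION_SUFFIXES) + r")\b",
    re.IGNORECASE,
)


def count_candidates(text):
    """Return a Counter of lowercased tokens ending in a listed suffix."""
    return Counter(m.group(0).lower() for m in CANDIDATE.finditer(text))


if __name__ == "__main__":
    sample = "A declaraçam do casamento e a declaração do casamento"
    for form, freq in count_candidates(sample).most_common():
        print(form, freq)
```

A suffix match only yields candidates: a word such as "elemento" also ends in "mento" without being derived from a verb, so the output of a rule like this would still need filtering, for instance against a verb list or the dictionaries mentioned above.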