Formação de gentílicos a partir de topônimos : proposta de geração automática
Antunes, Roger Alfredo de Marci Rodrigues
MetadataShow full item record
It is a common habit to use the adjective of the city name to indicate people’s origin, however the formulating rules of the adjective has been rarely discussed in the literature. The main objective of this work is to describe the gentile adjectives, which originate from the place names called toponyms. Using specific morphological rules of combination and proposing the formal representation of their regularities we can formulate the basis for a computational system, which can automatically generate the gentiles from their place names. The system proposed here is founded on the methodological principles of Dias-da-Silva (1996) - with respect to the three-phase methodology of the Natural language processing (NLP) - and the theoretical assumptions in the works of Borba (1998), Biderman (2001), Dick (2007) Jurafsky (2009) and Sandmann (1992, 1997). The corpus consists of 5,570 municipalities’ names (toponyms) and their respective gentiles, extracted in a form of a list from the database of the Instituto Brasileiro de Geografia e Estatística (IBGE). It was observed that only from a small set of recurrent unities, such as suffixes and ends of lexical entities, it is possible to extract patterns which can be subsequently used to formulate combination rules for automatic word processing. During this work, the issue of computational representation stands out and proves natural language complexity. Although natural languages can be in principle automatically processed using computers, their inherent features may deviate from the formulated rules and make the processing more intricate. Nonetheless, the results show that it is possible to automatize 52% of the generation of gentiles from the municipal toponyms. Conclusively the inherent opacity of the Portuguese does not allow direct processing of all of the language toponyms.