Enriching time series forecasting using textual information
Abstract
The ability to extract knowledge and forecast stock trends is crucial for mitigating investors' risks and uncertainties in the market. Stock trends are affected by non-linearity, complexity, noise, and especially surrounding events. External factors such as daily news have become one of investors' primary resources for deciding whether to buy or sell assets. However, this kind of information appears very quickly: numerous web sources generate thousands of news items, and analyzing them takes so long that a late decision can cost investors millions of dollars. Recent contextual language models have transformed the field of natural language processing. However, classification models that use news influencing stock values must deal with unlabeled data, class imbalance, and dissimilar texts. Recent studies show that time series prediction improves substantially when external information is considered. This work proposes a hybrid methodology with three phases: news mining, compact feature representation, and time series forecasting, which are merged to predict prices more accurately. Initially, a small corpus is built with the support of the time series. After that, we label the corpus and use semi-supervised learning to assign labels to the remaining unlabeled news. In the second phase, the mining model with a classifier is applied, and its output is concatenated with time series features so that the compact representation model extracts new features in a latent space. Finally, we predict future prices with this fused knowledge. In a case study with the Bitcoin cryptocurrency, the proposed methodology achieved a 1.62% decrease in the mean absolute percentage error.
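As a rough illustration of two ideas mentioned in the abstract, the sketch below computes the mean absolute percentage error (MAPE) using its standard definition and shows one simple way a classifier's output could be concatenated with time series features. The function names `mape` and `fuse_features`, the sample values, and the plain list concatenation are assumptions made for this example; they are not taken from the thesis, whose actual models are not described here.

```python
from typing import List, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent (standard definition)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def fuse_features(price_window: Sequence[float],
                  news_class_probs: Sequence[float]) -> List[float]:
    """Hypothetical fusion step: append classifier outputs to time series features."""
    return list(price_window) + list(news_class_probs)


# Made-up numbers, for illustration only.
fused = fuse_features([100.0, 102.0, 101.0], [0.7, 0.3])
error = mape([100.0, 200.0], [110.0, 180.0])
print(fused, error)
```

In a setup like this, the fused vector would be passed to a representation model and then to a forecaster, and MAPE would be compared with and without the news features.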
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Brazil
Related items
Showing items related by title, author, creator and subject.
- Enhancing solar flare forecasting: a multi-class and multi-label classification approach to handle imbalanced time series
  Discola Junior, Sérgio Luisir (Universidade Federal de São Carlos, UFSCar, Programa de Pós-Graduação em Ciência da Computação - PPGCC, Câmpus São Carlos, 21/06/2019)
  Solar flares are huge releases of energy from the Sun. They are categorized into five levels according to their potential damage to Earth (A, B, C, M, and X) and may have strong impacts on communication systems, threatening ...
- Mining thematic spatiotemporal association rules applied to solar flare images
  Silveira Junior, Carlos Roberto (Universidade Federal de São Carlos, UFSCar, Programa de Pós-Graduação em Ciência da Computação - PPGCC, Câmpus São Carlos, 14/09/2018)
  Introduction. Space weather analysis is a complex task that involves spatiotemporal data from satellite images added to data from daily bulletins. These data are characterized as time series of georeferenced images and ...
- Fuzzy time series analysis for forecasting and identifying dynamic behavioral patterns
  Santos, Fábio José Justo dos (Universidade Federal de São Carlos, UFSCar, Programa de Pós-Graduação em Ciência da Computação - PPGCC, Câmpus São Carlos, 30/04/2015)
  The good results obtained by fuzzy approaches applied to the analysis of time series (TS) have contributed significantly to the growth of the area. Although there are satisfactory results in TS analysis with methods ...