Concept drift learning with Optimum-Path Forest
Iwashita, Adriana Sayuri
Classification algorithms make decisions according to what they learn from a training set. For test samples to be identified correctly, they must therefore follow the same distribution as the training data. Industrial and enterprise applications now generate huge volumes of streaming data, such as sensor network readings and call records. With new internet services, data also streams from diverse domains, including online transactions and web searches. These data streams have two characteristics that traditional data mining methods must handle: high volume and susceptibility to concept drift. Concept drift is a non-stationary learning problem: because a concept may change over time, a classifier trained for a given problem can become outdated and unsuitable. For example, a reader might like news articles on "sports", but over time their preference may shift to "economy" and the previous topic becomes irrelevant, i.e., the concept of an article relevant to this reader has changed. This research studies the Optimum-Path Forest (OPF) classifier in dynamic environments, in both the supervised approach (using methods that deal with concept drift, such as data windows and decision committees) and the unsupervised approach, with experiments conducted on datasets from the literature.
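The data-window idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the method of the thesis: a plain 1-nearest-neighbour rule stands in for the OPF classifier, and the names SlidingWindowClassifier, run_stream and make_stream, the synthetic stream, and the window sizes are all assumptions made for illustration. The stream flips which side of a threshold counts as "relevant" halfway through, mimicking the reader whose preference changes.

```python
from collections import deque
import math


class SlidingWindowClassifier:
    """1-nearest-neighbour stand-in (NOT OPF) trained only on the most
    recent `window_size` labelled samples. Hypothetical helper class."""

    def __init__(self, window_size):
        # deque with maxlen silently discards the oldest sample when full
        self.window = deque(maxlen=window_size)

    def predict(self, x):
        if not self.window:
            return None  # nothing learned yet
        nearest = min(self.window, key=lambda s: math.dist(s[0], x))
        return nearest[1]

    def update(self, x, y):
        self.window.append((x, y))


def run_stream(stream, window_size):
    """Test-then-train evaluation: predict each sample first, then
    reveal its true label and add it to the window."""
    clf = SlidingWindowClassifier(window_size)
    correct = 0
    total = 0
    for x, y in stream:
        pred = clf.predict(x)
        if pred is not None:
            total += 1
            correct += pred == y
        clf.update(x, y)
    return correct / total if total else 0.0


def make_stream():
    """Synthetic stream (an assumption): the labelling rule is inverted
    at sample 100, which simulates an abrupt concept drift."""
    stream = []
    for i in range(200):
        x = (i % 10 / 10.0,)
        before_drift = i < 100
        relevant = (x[0] < 0.5) if before_drift else (x[0] >= 0.5)
        stream.append((x, "relevant" if relevant else "irrelevant"))
    return stream


if __name__ == "__main__":
    print("window of 10 samples :", run_stream(make_stream(), 10))
    print("window of 200 samples:", run_stream(make_stream(), 200))
```

On this toy stream the short window forgets the old concept soon after the drift, while the window that keeps every sample continues to answer with outdated labels. Real streams would need the window size tuned against noise, since a window that is too short also discards useful history.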