Inclusion of diversity in nearest-neighbor queries using distinct descriptors for similarity and diversity
Cardoso, Ana Claudia
One way to retrieve images from a database is through similarity queries. Using characteristics extracted from the images, such as color, shape or texture, such a query identifies the images most similar to a central query element. However, the results may be very similar to each other, which is not always what the user expects. In addition to this redundancy, there is the problem of the 'semantic gap': the divergence between the similarity computed from an image's numerical representation (low-level characteristics) and the human perception of the image (high-level characteristics). To improve the quality of the results, this work sought to reduce both the redundancy and the 'semantic gap' by using more than one descriptor in similarity queries. It explores the inclusion of diversity using one descriptor to treat similarity and another descriptor to treat diversity or, more generally, one metric space for similarity and another for diversity. The similarity query was implemented as a k-nearest-neighbor query. Because the descriptors may be distinct and one of them may have greater numerical magnitude than the other, the distances had to be normalized. Two methods were considered: normalization by the maximum distance, and normalization by the approximate maximum distance with balancing by the intrinsic dimension. An exhaustive search algorithm was used to perform the tests, and the experiments were carried out on a classified database. To evaluate the semantic quality of the results, a measure was proposed that evaluates the inclusion of diversity relative to the diversity already present in a query that considers only similarity and to the maximum diversity that can be included. The result obtained was compared with the value considered ideal, namely the value of l defined by the user.
Compared with the results of queries that use a single descriptor, the measured inclusion of diversity followed the trend of l, which indicates that normalization and balancing are necessary. Future work is intended to study new normalization methods.
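
As an illustration only, the idea of an exhaustive diversified nearest-neighbor query over two descriptors can be sketched in Python. The abstract does not state the objective function, so this sketch assumes a weighted sum of closeness to the query (in the similarity space) and average pairwise spread of the results (in the diversity space). The parameter `lam` stands in for the user-defined l and is assumed to lie in [0, 1]. Both spaces are normalized by their maximum distance; balancing by the intrinsic dimension is not shown. All function names and the toy data are hypothetical.

```python
from itertools import combinations
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_distance(points):
    """Largest pairwise distance, used as the normalization factor."""
    largest = max((euclidean(a, b) for a, b in combinations(points, 2)), default=0.0)
    return largest if largest > 0 else 1.0  # avoid dividing by zero


def diverse_knn(query, sim_feats, div_feats, k, lam):
    """Exhaustively pick k items balancing similarity and diversity.

    sim_feats[i] and div_feats[i] describe the same item i under two
    different descriptors. The weighted-sum objective is an assumption
    of this sketch, not the method of the original work.
    """
    n = len(sim_feats)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of items")
    sim_scale = max_distance(list(sim_feats) + [query])
    div_scale = max_distance(div_feats)
    best_subset, best_score = None, -math.inf
    for subset in combinations(range(n), k):  # exhaustive search
        # Mean normalized closeness to the query in the similarity space.
        closeness = sum(
            1.0 - euclidean(query, sim_feats[i]) / sim_scale for i in subset
        ) / k
        # Mean normalized pairwise distance in the diversity space.
        pairs = list(combinations(subset, 2))
        spread = (
            sum(euclidean(div_feats[i], div_feats[j]) / div_scale for i, j in pairs)
            / len(pairs)
            if pairs
            else 0.0
        )
        score = (1.0 - lam) * closeness + lam * spread
        if score > best_score:
            best_subset, best_score = subset, score
    return list(best_subset)


# Toy data: four items, each with a 1-D value under both descriptors.
query = [0.0]
sim_feats = [[0.1], [0.2], [1.0], [2.0]]
div_feats = [[0.0], [0.1], [5.0], [10.0]]

only_similarity = diverse_knn(query, sim_feats, div_feats, k=2, lam=0.0)
only_diversity = diverse_knn(query, sim_feats, div_feats, k=2, lam=1.0)
```

With `lam=0.0` the sketch reduces to a plain nearest-neighbor query in the similarity space; with `lam=1.0` it ignores the query and selects the items farthest apart in the diversity space. The search enumerates every k-subset, so it is practical only for small collections.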