Parallelization of algorithms for finding the most relevant documents on the Web using GPUs
Abstract
Search engines face performance challenges because of the large number of documents and the growing query load on the Web. The success of a search engine depends on the ability of its query processing system to find, within a short time interval, the documents that match the information needs expressed in user queries. Despite the large number of documents, users are interested in only a few results per query. As a result, only a few documents are highly relevant in most queries. Document-at-a-time (DAAT) dynamic pruning algorithms exploit this to improve the efficiency of query processing systems by avoiding time spent ranking documents that are unlikely to be relevant. To handle the scale and dynamics of user query traffic, query processing needs to make efficient use of hardware resources.
The main objective of this doctoral thesis is to investigate the use of parallel computing on the GPU architecture to identify the documents most relevant to a given query. To this end, algorithm parallelization strategies that aim to reduce the response latency of a query and to increase query throughput are proposed and evaluated on the GPU. The proposed parallelizations fall into two categories: DAAT algorithms and dynamic pruning algorithms. In the DAAT category, partitioning strategies are proposed that investigate where occurrences of the same document are located in the GPU memory hierarchy. For dynamic pruning algorithms, policies for propagating the threshold among processors are proposed, and their impact on the efficiency of the parallel algorithms is analyzed. To verify efficiency in practice, the parallel proposals were implemented and tested on the Pascal GPU architecture, achieving speedups of 4x to 40x over the baseline algorithms.