Distance Range Queries in SpatialHadoop
Identifiers
URI: http://hdl.handle.net/10835/5210
DOI: http://hdl.handle.net/11705/JISBD/2016/031
Metadata
Author/s
García García, Francisco; Corral Liria, Antonio Leopoldo; Iribarne Martínez, Luis Fernando; Vassilakopoulos, Michael
Date
2016
Abstract
Efficient processing of Distance Range Queries (DRQs) is of great importance in spatial databases because of their wide range of applications. This type of spatial query is characterized by a distance range over one or two datasets. The most representative and best-known DRQs are the ε Distance Range Query (εDRQ) and the ε Distance Range Join Query (εDRJQ). Given the increasing volume of spatial data, it is difficult to perform a DRQ efficiently on a centralized machine. Moreover, the εDRJQ is an expensive spatial operation, since it can be considered a combination of the εDRQ and the spatial join query. For this reason, this paper addresses the problem of computing DRQs on big spatial datasets in SpatialHadoop, an extension of Hadoop that supports spatial operations efficiently, and proposes new SpatialHadoop algorithms for performing efficient parallel DRQs on large-scale spatial datasets. We have evaluated the performance of the proposed algorithms in several situations with big synthetic and r...
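
For readers unfamiliar with the two query types named in the abstract, the following is a minimal single-machine sketch in Python. It is not the paper's SpatialHadoop algorithm. It assumes 2D points, Euclidean distance, and a single threshold ε, and it takes the εDRQ to return the points within ε of a query point and the εDRJQ to return the pairs from two datasets within ε of each other. The function names `eps_drq` and `eps_drjq` are hypothetical and chosen for illustration only.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def eps_drq(points, query, eps):
    """Assumed εDRQ semantics: points lying within eps of the query point."""
    return [p for p in points if dist(p, query) <= eps]


def eps_drjq(points_p, points_q, eps):
    """Assumed εDRJQ semantics: pairs (p, q) whose distance is at most eps.

    This brute-force nested loop compares every pair, which illustrates why
    the join variant is costly on large datasets.
    """
    return [(p, q) for p in points_p for q in points_q if dist(p, q) <= eps]


if __name__ == "__main__":
    P = [(0, 0), (3, 4), (10, 10)]
    Q = [(0, 1), (9, 10)]
    print(eps_drq(P, (0, 0), 5))
    print(eps_drjq(P, Q, 1.5))
```

The nested loop performs one distance computation per pair of points, so its cost grows with the product of the two dataset sizes. This is the kind of cost that motivates partitioned, parallel processing of such queries.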