SPARQL Query Optimization for Federated Linked Data

Author
Desared Osmanllari
License
Creative Commons CC BY 4.0
Abstract

The Web has evolved from a system of internet servers supporting formatted documents into a web of linked data. In recent years, the Web of Data has grown steadily and now comprises a large collection of interlinked data sets from multiple domains. To exploit the diversity of all available data, federated queries are needed. However, problems such as limited processing power, long query response times, high workload, and outdated information hinder query processing. In this paper, I explain various optimization techniques that have the potential to significantly improve the final query runtime. I start by briefly introducing recent approaches to federation and show why I focus mostly on federation over SPARQL endpoints. Specifically, I compare state-of-the-art SPARQL query federation engines and analyze their respective optimization approaches. The main federation engines I analyze in terms of query optimization are FedX, DARQ and SPLENDID. Finally, I provide concrete examples and conclude which of the engines performs best, using query execution time as the key criterion.
