Management and Processing of Discographic Data with Amazon Elastic MapReduce

Author
Pierluigi Videsott
License
Creative Commons CC BY 4.0
Abstract

The purpose of this report is to explain how, by leveraging the capabilities of Amazon Web Services, it is possible to manage and process a data set that is too large and complex for traditional data processing techniques and technologies.

The report discusses the implementation of a set of services for managing discographic information, covering the retrieval of external data, its transformation, its storage in non-relational databases, and finally parallel computation on an external cluster. The goal is to join different data sets easily and in an agile manner, and then to perform additional processing on the joined output.
