ELASTIC PERFORMANCE FOR ETL+Q PROCESSING

Pedro Martins, Maryam Abbasi, Pedro Furtado

Department of Informatics, University of Coimbra, Portugal

ABSTRACT

Most data warehouse deployments are not prepared to scale automatically, although some applications have large or growing requirements for data volume, processing time, data rate, freshness, and fast responses. The solution is to use parallel architectures and mechanisms to speed up data integration and to handle fresh data efficiently. Those parallel approaches should scale automatically. In this work, we investigate how to provide scalability and data freshness automatically, and how to manage high-rate data efficiently in very large data warehouses. The framework proposed in this work handles parallelization and scales the data warehouse when necessary. It not only scales out to increase processing capacity, but also scales in when resources are underused. In general, data freshness is also not guaranteed in those contexts, because data loading, transformation, and integration are heavy tasks that are done only periodically, instead of row by row. The framework we propose is designed to provide data freshness as well.

KEYWORDS

Scalability, ETL, freshness, high-rate, performance, parallel processing, distributed systems, database, load-balance, algorithm

Full Text: https://aircconline.com/ijdms/V8N1/8116ijdms02.pdf

Volume Link: https://airccse.org/journal/ijdms/current2016.html

 
