ELASTIC PERFORMANCE FOR ETL+Q PROCESSING
Pedro Martins, Maryam Abbasi, Pedro Furtado
Department of Informatics, University of Coimbra, Portugal
ABSTRACT
Most data warehouse deployments are not prepared to scale automatically, although some applications have large or growing requirements for data volume, processing time, data rate, freshness, and fast responses. The solution is to use parallel architectures and mechanisms to speed up data integration and to handle fresh data efficiently. Those parallel approaches should scale automatically. In this work, we investigate how to provide scalability and data freshness automatically, and how to manage high-rate data efficiently in very large data warehouses. The framework proposed in this work handles parallelization and scaling of the data warehouse when necessary. It not only scales out to increase processing capacity, but also scales in when resources are underused. In general, data freshness is also not guaranteed in those contexts, because data loading, transformation, and integration are heavy tasks that are done only periodically, instead of row by row. The framework we propose is designed to provide data freshness as well.
KEYWORDS
Scalability, ETL, freshness, high-rate, performance, parallel processing, distributed systems, database, load-balance, algorithm
Full Text: https://aircconline.com/ijdms/V8N1/8116ijdms02.pdf
Volume Link: https://airccse.org/journal/ijdms/current2016.html