DHSDJArch: An Efficient Design of Distributed Heterogeneous Stream-Disk Join Architecture
Heterogeneity is a key characteristic of the live streams produced by complex networks and smart devices. The heterogeneous stream-disk join is a significant research topic in real-time processing applications because it directly affects data analytics. Multiple issues, including stream loss, scalability, disk access cost, and data accuracy, must be considered during heterogeneous stream-disk join transformation. In this work, we address these issues by introducing a distributed heterogeneous stream-disk join architecture (DHSDJArch), which prevents stream data loss while maintaining a balance between heterogeneous distributed data sources and the accuracy of the stream-disk join. A four-phase distributed architecture is proposed for multi-objective optimization when transforming heterogeneous, incomplete streams. To prevent stream loss, a log retention configuration is proposed based on the characteristics of the distributed event streaming platform (DESP). Specifically, two transformations are proposed: the first pre-processes heterogeneous streams, and the second joins the pre-processed stream with distributed disk data by performing real-time disk access while compensating for the differences between the data sources and the streaming application. We conduct a comprehensive experimental study on real datasets to verify the performance of the proposed architecture in terms of accuracy, log retention policy, scaling, stability, and cloud data storage.
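The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of the two transformations: pre-process heterogeneous stream records, then join each one against disk-resident data by key lookup. Every name here (DISK_TABLE, preprocess, stream_disk_join, the product_id/pid/qty fields) is a hypothetical placeholder, and an in-memory dictionary stands in for the distributed disk store; none of it is taken from the article.

```python
from typing import Dict, Iterable, Iterator, Optional

# Hypothetical stand-in for disk-resident master data, keyed by the join attribute.
DISK_TABLE: Dict[str, dict] = {
    "p1": {"name": "pen", "price": 1.5},
    "p2": {"name": "ink", "price": 4.0},
}


def preprocess(record: dict) -> Optional[dict]:
    """Map differing source schemas onto one schema (assumed field names).

    Returns None when the record carries no usable join key.
    """
    key = record.get("product_id") or record.get("pid")
    if key is None:
        return None
    return {
        "product_id": str(key).strip().lower(),
        "qty": int(record.get("qty", 1)),  # assumed default for a missing quantity
    }


def stream_disk_join(stream: Iterable[dict], disk: Dict[str, dict]) -> Iterator[dict]:
    """Join each pre-processed stream record with its matching disk row."""
    for raw in stream:
        rec = preprocess(raw)
        if rec is None:
            continue  # incomplete record: no join key
        match = disk.get(rec["product_id"])  # stands in for a real-time disk access
        if match is None:
            continue  # no matching disk row
        yield {**rec, **match}


if __name__ == "__main__":
    events = [
        {"product_id": "P1", "qty": 2},  # source A schema
        {"pid": " p2 "},                 # source B schema, quantity missing
        {"qty": 5},                      # no join key
        {"pid": "p9"},                   # key absent from disk data
    ]
    for joined in stream_disk_join(events, DISK_TABLE):
        print(joined)
```

In this sketch, records without a join key or without a disk match are simply skipped; how the article's architecture handles such cases is not stated in the abstract.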
Other Information
Published in: IEEE Access
License: http://creativecommons.org/licenses/by/4.0
See article on publisher's website: https://dx.doi.org/10.1109/access.2023.3288284
Funding
Open Access funding provided by the Qatar National Library.
Language
- English
Publisher
- IEEE
Publication Year
- 2023
License statement
- This Item is licensed under the Creative Commons Attribution 4.0 International License
Institution affiliated with
- Community College of Qatar
- University of Doha for Science and Technology
- College of Computing and Information Technology - UDST