Cloud Computing
High-performance computing is key to processing spatial ocean data and extracting information from it; low data-processing efficiency is a major reason that ocean GIS is often seen as “showy but not useful.” Conventional modes of massive GIS data computing, e.g., parallel computing and grid computing, can deliver high computing capacity, but that capacity is severely strained by supermassive, variable, and rapidly updated spatiotemporal marine data, which makes it difficult to provide on-demand computing services. We therefore proposed a high-performance computing strategy with a 3D architecture that integrates the storage and computation of spatial ocean data, minimizes network consumption within the software system, and applies CPU/GPU hybrid parallel acceleration in a two-state cloud environment. This parallel acceleration method meets the demand for intensive, fast computation under highly concurrent access.
As a world first, we proposed a method for integrating high-performance storage and computing of spatial ocean information in a distributed cloud environment, and invented a method for CPU/GPU hybrid parallel acceleration in a two-state (duplex) cloud environment. Together, these methods break a core technological bottleneck in high-performance computing of large spatiotemporal marine data and enable exponential growth in the efficiency of large-scale spatial ocean data computing and processing.
Key Technology 1: Development of an integrated method for high-efficiency storage and computation of spatial ocean information in a distributed cloud environment, thereby resolving the resource scheduling bottleneck in a cloud environment.
Ocean spatial data tend to be large in volume and complex in structure. The mainstream distributed cloud computing architecture, although improved in usability, remains problematic: data integration must be completed before computation can begin, so data have to be transmitted across the network first. We proposed an “accumulation and integration” strategy under the cloud-GeoPlex environment that integrates storage and computation, and built a highly scalable, high-performance framework for processing large ocean spatiotemporal data based on a Hadoop+MPP architecture. On top of this framework, we performed multi-objective optimization of adaptive scheduling for complex ocean computing tasks. The scheduler is designed to handle data-intensive, highly concurrent overlay (superposition) computations on large ocean spatiotemporal data and to automatically match each task with a suitable computational framework: messaging, mapping protocols, memory computing, real-time computing, or streaming computing. Resources are thus managed in a unified way and configured optimally, meeting performance requirements that are difficult for any single computing resource (e.g., cluster, multi-core, GPU) to satisfy, and relieving the network congestion caused by data migration and transmission in complex marine scientific computing.
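The idea of moving computation to the data, rather than integrating the data first, can be illustrated with a minimal sketch. It is not the scheduler described above: the block names, node names, sizes, and the greedy placement rule are all hypothetical, chosen only to show how placing each task on the node that already holds most of its input reduces network transfer.

```python
from collections import defaultdict

# Hypothetical block placement: block id -> (node holding it, size in MB).
BLOCKS = {
    "sst_2020": ("node-a", 400),
    "sst_2021": ("node-a", 420),
    "wind_2020": ("node-b", 300),
    "wind_2021": ("node-c", 310),
}


def mb_moved(task_blocks, node):
    """MB that must cross the network if the task runs on `node`."""
    return sum(size for loc, size in (BLOCKS[b] for b in task_blocks) if loc != node)


def schedule(tasks, nodes, max_tasks_per_node=2):
    """Greedy placement (an assumption, not the authors' optimizer):
    each task goes to the node that minimizes data movement, subject to a
    simple per-node capacity limit. Ties are broken by load, then by name."""
    load = defaultdict(int)
    plan = {}
    for name, blocks in tasks.items():
        candidates = [n for n in nodes if load[n] < max_tasks_per_node]
        best = min(candidates, key=lambda n: (mb_moved(blocks, n), load[n], n))
        plan[name] = best
        load[best] += 1
    return plan


# Hypothetical overlay tasks, each listing the blocks it reads.
TASKS = {
    "sst_overlay": ["sst_2020", "sst_2021"],
    "wind_overlay": ["wind_2020", "wind_2021"],
    "sst_wind_2020": ["sst_2020", "wind_2020"],
}
NODES = ["node-a", "node-b", "node-c"]

plan = schedule(TASKS, NODES)
total_local = sum(mb_moved(TASKS[t], n) for t, n in plan.items())
# Baseline: gather everything on one node before computing.
total_central = sum(mb_moved(blocks, "node-b") for blocks in TASKS.values())

print(plan)
print("MB moved, locality-aware:", total_local)
print("MB moved, centralized on node-b:", total_central)
```

A real scheduler would weigh further objectives (node load, task deadlines, framework fit), which is where multi-objective optimization comes in; the sketch covers only the data-movement term.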
Key Technology 2: Development of a CPU/GPU hybrid acceleration method supported by a two-state cloud, which improves the utilization of computing resources.
Traditional distributed computing uses only the general-purpose computing power of its nodes, which wastes much of the available hardware. With the development of multi-core and many-core CPU and GPU technologies, small computing clusters and even individual computing nodes can now handle high-performance computing tasks, and the processing of large spatiotemporal data can be parallelized. Building on conventional vector spatial indexing and classic vector, raster, and network analysis algorithms, we developed parallel versions of these algorithms and ran them on multiple CPUs/GPUs in hybrid parallel mode. Computing tasks are assigned to different CPUs/GPUs according to the demands of high-performance oceanographic computing and analysis, which vary in order of magnitude and complexity. The rate at which data can be read from GPU display memory is far higher than that of a hard disk, so the computing capacity of a GPU can be fully exploited only if data reach it quickly enough. To address this reading bottleneck, we introduced a memory file system to store processed data, which improved read/write I/O to the level required by GPU display memory.
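As a rough illustration of assigning tasks to CPUs or GPUs by magnitude and complexity, the sketch below routes each task to a device queue using two simple criteria. The `Task` fields, the threshold values, and the rule itself are assumptions made for this example; the text does not specify the actual assignment policy.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    n_elements: int         # number of cells or features to process
    ops_per_element: float  # rough arithmetic intensity per element
    data_parallel: bool     # True if elements can be processed independently


# Hypothetical thresholds; real values would come from benchmarking.
GPU_MIN_ELEMENTS = 1_000_000
GPU_MIN_OPS = 10.0


def choose_device(task, gpu_available=True):
    """Send large, compute-heavy, data-parallel tasks to the GPU; everything
    else stays on the CPU, where transfer overhead does not apply."""
    if not gpu_available or not task.data_parallel:
        return "cpu"
    if task.n_elements >= GPU_MIN_ELEMENTS and task.ops_per_element >= GPU_MIN_OPS:
        return "gpu"
    return "cpu"


def dispatch(tasks, gpu_available=True):
    """Group task names into per-device queues."""
    queues = {"cpu": [], "gpu": []}
    for t in tasks:
        queues[choose_device(t, gpu_available)].append(t.name)
    return queues


# Hypothetical workload mixing raster, network, and small vector tasks.
tasks = [
    Task("raster_overlay", 50_000_000, 20.0, True),
    Task("network_shortest_path", 200_000, 50.0, False),
    Task("small_buffer", 10_000, 30.0, True),
]

queues = dispatch(tasks)
print(queues)
```

Small tasks stay on the CPU in this sketch because copying data to GPU memory has a fixed cost that a short computation cannot amortize; the same reasoning motivates keeping processed data in a memory file system rather than on disk.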