Big Data Analysis

Marine spatiotemporal data come from many sources, are large in scale, and have highly complex structures. Conventional database management methods fall short in data aggregation flexibility, storage scalability, and resource coordination, which limits the broader use of multi-source marine spatiotemporal data.

By integrating and extending the Spark in-memory computing framework and the Shark distributed SQL query engine, Zhang and Bai (2018) developed a high-performance distributed solution for the concurrent retrieval and deep mining of large-scale marine spatiotemporal data. The solution addresses the problem of fast retrieval, in cloud environments, of massive high-frequency buoy data, high-resolution remote sensing images, and other marine spatiotemporal data.

Ye et al. (2019) proposed a tile service-driven architecture in which the original marine spatiotemporal dataset is losslessly compressed into a pyramid structure to improve access efficiency and loading speed for massive datasets. Even with limited hardware resources, storing this data model in a hybrid database file system enables high-performance computing and analysis services.
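To illustrate the general idea of a tile pyramid, the following minimal sketch shows how a point or bounding box can be mapped to tile indices at a given pyramid level, so that a request only touches the tiles it needs. It is not the scheme used by Ye et al. (2019). The tiling layout (2^level by 2^level tiles over the whole globe, row 0 at the northern edge) and the function names `tile_for_point` and `tiles_for_bbox` are assumptions made for illustration only.

```python
from typing import List, Tuple


def tile_for_point(lon: float, lat: float, level: int) -> Tuple[int, int]:
    """Return (col, row) of the tile containing (lon, lat) at a pyramid level.

    Assumed layout (hypothetical, not taken from the cited paper):
    level z splits lon [-180, 180] and lat [-90, 90] into 2**z x 2**z tiles,
    with column 0 at the western edge and row 0 at the northern edge.
    """
    n = 2 ** level
    col = int((lon + 180.0) / 360.0 * n)
    row = int((90.0 - lat) / 180.0 * n)
    # Clamp so that points on the eastern or southern edge fall in the last tile.
    col = min(max(col, 0), n - 1)
    row = min(max(row, 0), n - 1)
    return col, row


def tiles_for_bbox(west: float, south: float, east: float, north: float,
                   level: int) -> List[Tuple[int, int]]:
    """List the (col, row) tiles that cover a bounding box at a pyramid level."""
    c0, r0 = tile_for_point(west, north, level)  # top-left tile
    c1, r1 = tile_for_point(east, south, level)  # bottom-right tile
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


if __name__ == "__main__":
    # A coarse level needs few tiles; finer levels need more tiles for the same area.
    for level in range(4):
        print(level, len(tiles_for_bbox(-10.0, -10.0, 10.0, 10.0, level)))
```

In a layout like this, a client viewing a large region can be served from a coarse level with few tiles, while a zoomed-in view reads only the handful of fine-level tiles that intersect the requested area.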

Representative articles:

Zhang, F., & Bai, Y. (2018). A distributed space-time data model and online analyst system for marine environmental research. Journal of Global Change Data & Discovery, 2(3), 283-296. DOI:10.3974/geodp.2018.02.03
Ye, W., Zhang, F., Bai, Y., Du, Z., & Liu, R. (2019). A tile service-driven architecture for online climate analysis with an application to estimation of ocean carbon flux. Environmental Modelling & Software, 118, 120-133.