Big Data meets GeoComputation: combining research reproducibility and processing efficiency at Yale
In recent years there has been an explosion of available geo-datasets derived from an increasing number of remote sensors on satellites, field instruments, sensor networks, and other GPS-equipped “smart” devices. Processing such “Big Data” requires flexible tools that run efficiently either on a local PC or on remote servers (e.g., High Performance Computing (HPC) clusters). Moreover, leveraging these new data streams requires new tools and increasingly complex workflows, often involving multiple software packages and/or programming languages. This is also the case for GIS and remote sensing analyses, where statistical/mathematical algorithms are implemented in complex geospatial workflows that must combine processing efficiency with research reproducibility. I will show examples of global geo-computation applications at a 1 km spatial grain, used to calculate solar radiation layers, freshwater-specific environmental variables, topographic complexity layers, urban accessibility, and land surface temperature layers, for which I combined various open-source geo-libraries for massive computation on the Yale HPC clusters.