Workshops: Geo-Computation and big data processing

Event Date: 
Thursday, April 30, 2015 - 4:30pm
Location: 
ESC 110

Dr. Giuseppe Amatulli and Steve Weston

These workshops build on the computational skills acquired during the course "Spatial and environmental data analysis in conservation and biodiversity science", with a stronger focus on massive processing of geographical datasets using high-performance computing. They assume BASIC KNOWLEDGE of BASH (e.g., a “for loop”), GDAL/PKTOOLS command line software, and Geographic Information Systems and Remote Sensing concepts (projection, spectral signature, etc.). People with this background are welcome to participate even if they did not take the course mentioned above.

The workshops are drop-in, but we strongly encourage people to pre-register by sending an email to giuseppe.amatulli@yale.edu. Pre-registered participants get full access to the material needed for the workshops and can have the Linux-like Virtual Machine (LVM) pre-installed on their personal computers. This LVM is an ad-hoc customization of the Ubuntu distribution with open source Remote Sensing, GIS and Statistics software, plus sample geo-data, scripts, and example exercises linked with the material stored at http://www.spatial-ecology.net. Installing the LVM effectively installs Linux inside Windows or MacOS. In other words, your main OS will remain Windows or MacOS, and you will boot the PC as before. There is no risk to your data or to your main OS. The LVM can be installed during Giuseppe’s “Officeless” Hours.

 

Workshop times & dates:

  • Workshop 1: Thursday, February 19th - 4:30pm-7:00pm - Location: ESC 110
  • Workshop 2: Thursday, March 26th - 4:30pm-7:00pm - Location: ESC 110
  • Workshop 3: Thursday, April 16th - 4:30pm-7:00pm - Location: ESC 110
  • Workshop 4: Thursday, April 30th - 4:30pm-7:00pm - Location: ESC 110

Workshop 1: Introduction to GRASS GIS (Giuseppe)

Description:

This workshop introduces students to the powerful Geographic Resources Analysis Support System (GRASS) GIS software for manipulating raster and vector data. We will look at the graphical user interface, but we will also use simple BASH scripts to automate many common geo-data processing tasks such as cropping and re-projecting images. You will learn how to script processes for complex geo-functions. This workshop assumes BASIC KNOWLEDGE of BASH command lines and of Geographic Information Systems and Remote Sensing concepts (projection, spectral signature, etc.). Participants will need a pre-installed Linux Virtual Machine on their own laptops to follow the workshop.

Material:

Hands on GRASS - First steps

GRASS in the YALE-HPC

Workshop 2: Parallel processing on a local PC for Geographic Information Systems and Remote Sensing analysis (Giuseppe)

Description:

In computer science, a “for loop” is a programming language statement that allows code to be executed repeatedly. During this workshop, we will see how to transform a simple “for loop” into a multicore “for loop” that performs several tasks simultaneously on your laptop. This workshop assumes PRIOR KNOWLEDGE of the BASH/GDAL/OGR/R command lines acquired in Workshop 1. Participants will need a pre-installed Linux Virtual Machine to follow the workshop.
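As a rough illustration of the idea (not the workshop's actual material), the sketch below runs the same task first in a plain “for loop” and then in parallel with xargs. The tile names and the list_tiles helper are hypothetical, and the echo stands in for a real GDAL/PKTOOLS command. The -P option is assumed to be available; it is provided by GNU and BSD xargs but is not part of POSIX.

```shell
# Hypothetical list of inputs; in practice these would be raster file names.
list_tiles() {
    printf '%s\n' tile_01 tile_02 tile_03 tile_04
}

# Sequential version: a plain for loop handles one tile at a time.
run_serial() {
    for tile in $(list_tiles); do
        echo "processed $tile"    # placeholder for a real geo-processing command
    done
}

# Multicore version: xargs reads one tile name per task (-n 1) and keeps
# up to 4 tasks running at the same time (-P 4).
run_parallel() {
    list_tiles | xargs -n 1 -P 4 sh -c 'echo "processed $1"' _
}

run_serial
# Parallel tasks can finish in any order, so sort the output before comparing.
run_parallel | sort
```

The number passed to -P would normally be matched to the number of cores available on the laptop.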

Cluster processing

Workshop 3: Parallel processing in Yale-HPC (Steve)

Description:

Cloud computing and remote server access are considered major milestones in processing large amounts of data, while multi-core computing allows several processes to run simultaneously. During this workshop, we will explore the potential of parallel processing at the Yale-HPC by understanding its architecture, how to submit jobs to the queue system, how to transfer data, and how to deal with memory limitations. This workshop assumes PRIOR KNOWLEDGE of the BASH command lines acquired in Workshops 1 and 2. Participants will need a pre-installed Linux Virtual Machine to be able to log in to the Yale-HPC through an ssh tunnel.

Workshop 4: Geo-Computation in Yale-HPC to perform Geographic Information Systems and Remote Sensing analysis (Giuseppe)

Description:

Remote sensors and geostationary stations are collecting large amounts of geographical data. During this workshop, we will use the geo-software installed in the HPC. We will learn how to set up a massive geo-computation, creating GRASS Locations/Mapsets on the fly in the RAM folder, working with the Virtual Raster (VRT) format for tiling images, etc. This workshop assumes PRIOR KNOWLEDGE of the BASH/GDAL/OGR/R command lines acquired in Workshops 1, 2 and 3. Participants will need a pre-installed Linux Virtual Machine to be able to log in to the Yale-HPC through an ssh tunnel.

Contact Giuseppe Amatulli with any questions.