CliMetLab – Machine Learning on weather and climate data

CliMetLab is a Python package that aims to simplify access to climate and meteorological datasets, allowing users to focus on science instead of technical issues such as data access and data formats.

This project aims to handle data loading and to help interpret the output of machine learning models through plots and graphs. This removes the overhead of retrieving data manually and of writing a specific data loader for each dataset.

The plugin architecture of CliMetLab is designed to make it easy to add data sources, datasets, plotting styles and data formats.

Specific goals of the project: 

1) Extend CliMetLab so that it offers users high-level Matplotlib-based plotting functions to produce graphs and plots that are relevant to weather and climate applications.

2) The Python package Intake is a lightweight set of tools for loading and sharing data in data science projects. Extend CliMetLab so that it interfaces seamlessly with Intake and allows users to access all Intake-compatible datasets.

3) Xarray can use the Zarr data format to read and write data in parallel. Convert large, already available datasets to an xarray-readable Zarr format, define an appropriate configuration (chunking, compression, etc.) according to domain use cases, develop tools to benchmark performance on a cloud platform, and compare with other formats (N5, GRIB, netCDF, GeoTIFF, etc.).
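
To illustrate why the chunking configuration in goal 3 depends on the use case, the sketch below counts how many chunks a read would touch under two chunk layouts. It is not part of CliMetLab or Zarr. The grid shape, the layout names, the chunk sizes and the helper chunks_touched are all hypothetical, chosen only for illustration, and the count is a rough proxy for I/O cost, not a benchmark.

```python
def chunks_touched(selection, chunks):
    """Count the chunks intersected by a rectangular selection.

    selection: one (start, stop) index range per dimension, stop exclusive.
    chunks: chunk length along each dimension.
    """
    count = 1
    for (start, stop), size in zip(selection, chunks):
        first = start // size        # index of the first chunk touched
        last = (stop - 1) // size    # index of the last chunk touched
        count *= last - first + 1
    return count


# Hypothetical dataset: one year of hourly fields on a 0.25 degree global grid.
n_time, n_lat, n_lon = 8760, 721, 1440

# Two hypothetical chunk layouts, ordered as (time, latitude, longitude).
layouts = {
    "map-friendly": (1, n_lat, n_lon),       # one chunk per time step
    "series-friendly": (n_time, 10, 10),     # full time axis, small tiles
}

# Two typical access patterns.
use_cases = {
    "one global map": [(0, 1), (0, n_lat), (0, n_lon)],
    "one point, full year": [(0, n_time), (100, 101), (200, 201)],
}

results = {
    (layout, case): chunks_touched(selection, chunks)
    for layout, chunks in layouts.items()
    for case, selection in use_cases.items()
}

for (layout, case), n in results.items():
    print(f"{layout:16s} {case:22s} {n:6d} chunks")
```

Under these assumptions, each layout needs a single chunk for the access pattern it was designed for and thousands of chunks for the other one. This is why the chunking has to be chosen per domain use case and then verified with real benchmarks.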

Follow the developments on GitHub

Mentors

  • Florian Pinault
  • Baudouin Raoult

Participants

Ashwin Samudre