More news about the upcoming edition of ECMWF Code for Earth 2024 will be published soon! STAY TUNED!!
About Code for Earth
Code for Earth is an innovation programme run by the European Centre for Medium-Range Weather Forecasts (ECMWF). Its aim is to drive innovation and open source developments in the Earth sciences community - supporting developments in weather and climate, Copernicus and Destination Earth (DestinE). Learn more about ECMWF, Copernicus and DestinE.
Since 2018, each summer, up to ten developer teams work together with experienced mentors from ECMWF and partner organisations on innovative projects. These projects are related to the broad scope of activities at ECMWF, including data science, weather, climate or other earth sciences, visualisation and more.
Copernicus is the Earth observation component of the European Union’s Space programme, looking at our planet and its environment to benefit all European citizens. It offers information services that draw from satellite Earth Observation and in-situ (non-space) data. ECMWF is implementing the EU-funded Copernicus Climate Change Service and the Copernicus Atmosphere Monitoring Service on behalf of the European Union.
Through CAMS, ECMWF provides consistent and quality-controlled information related to air pollution and health, solar energy, greenhouse gases and climate forcing globally. C3S offers free and open access to climate data and tools based on the best available science and supports adaptation and mitigation policy and decision-making by providing consistent and authoritative information about climate change.
Destination Earth is a European Union funded initiative launched in 2022, with the aim to build a digital replica of the Earth system by 2030. The initiative will be jointly implemented by three entrusted entities: the European Centre for Medium-Range Weather Forecasts (ECMWF) responsible for the creation of the first two "digital twins" and the "Digital Twin Engine", the European Space Agency (ESA) responsible for building the "Core Service Platform", and the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), responsible for the creation of the "Data Lake".
ECMWF is the European Centre for Medium-Range Weather Forecasts. We are both a research institute and a 24/7 operational service, producing global numerical weather predictions and other data for our Member and Co-operating States and the broader community. The Centre has one of the largest supercomputer facilities and meteorological data archives in the world. Other strategic activities include delivering advanced training and assisting the WMO in implementing its programmes.
WeatherBench is a benchmark dataset that explores the potential of Machine Learning methods for weather forecasting. It comprises ERA5 reanalysis data and covers the entire globe. Various spatial resolutions are available, and the time step is 1 hour. Authors compete to predict meteorological variables as well as possible 3 and 5 days into the future.
Diffusion Models are a recently popularised class of Machine Learning models and have proven especially effective at generating images. Particularly successful examples include Stable Diffusion and DALL-E 2. Diffusion Models can also be trained to generate output conditioned on input data such as text or other images.
We will employ Diffusion Models for weather forecasting: we plan to give the model the current state of atmospheric variables as conditioning information and train it to predict realistic future states.
Specifically we plan to:
- Explore the potential of Diffusion Models on the WeatherBench challenge, which has not been done before.
- Publish code and trained models to make it easy to replicate and build on our results.
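As a rough illustration of the conditioning idea described above, the following sketch shows the forward (noising) step of a standard denoising diffusion model and how a training example could pair a noised future state with the current state as conditioning. The linear noise schedule, the function names and the toy values are assumptions made for illustration; they are not the project's implementation.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def make_training_example(current_state, future_state, t, alpha_bar, rng):
    """Noise the future state at diffusion step t.

    A denoising network would be trained to predict `noise` from
    (noisy_target, condition, step).
    """
    noise = [rng.gauss(0.0, 1.0) for _ in future_state]
    a = alpha_bar[t]
    noisy_target = [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * e
        for x, e in zip(future_state, noise)
    ]
    return {
        "condition": list(current_state),  # current atmospheric state
        "noisy_target": noisy_target,      # noised future state
        "step": t,
        "noise": noise,                    # regression target for the network
    }


# Toy values standing in for flattened atmospheric fields (hypothetical).
rng = random.Random(0)
alpha_bar = linear_alpha_bar(1000)
example = make_training_example(
    current_state=[0.1, -0.3, 0.5],
    future_state=[0.2, -0.1, 0.4],
    t=500,
    alpha_bar=alpha_bar,
    rng=rng,
)
```

At sampling time the same conditioning would be supplied at every denoising step, so that the generated future state stays consistent with the observed current state.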
Atmospheric Composition Dataset Explorer
The goal of this project is to create an API and an interactive application that generate atmospheric composition diagnostics plots according to user specifications. The data source is the CAMS Atmosphere Data Store; specifically, the GUI application shall deal with the CAMS Greenhouse Gas Fluxes and CAMS Global Reanalysis EAC4 datasets.
The application shall automate the creation of frequently used time series, Hovmöller and geospatial plots for user-selected parameters such as spatial and temporal domains, time resolution and atmospheric variables.
The ideal outcome would be to provide generic enough APIs which can be used for data retrieval, data homogenization, data slicing and sub-setting, aggregation and visualization of different CAMS datasets; also, it shall be generic enough that adding new plot types won't be too difficult. We also plan to provide a GUI for easy generation of a report based on parameters selected by the user.
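To make the aggregation step behind one of these plot types concrete, here is a minimal sketch of how the data for a Hovmöller diagram (time against longitude) could be derived by averaging a field over a latitude band. The function name, the nested-list data layout and the cosine-latitude weighting are assumptions for illustration, not the project's API; a real implementation would more likely build on xarray.

```python
import math


def hovmoller(field, lats, lat_min, lat_max):
    """Average field[time][lat][lon] over the band lat_min..lat_max.

    Returns a time x longitude table. Each latitude row is weighted by
    cos(latitude) to account for the shrinking grid-cell area towards the poles.
    """
    rows = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    if not rows:
        raise ValueError("no latitudes inside the requested band")
    weights = [math.cos(math.radians(lats[i])) for i in rows]
    total = sum(weights)
    table = []
    for time_slice in field:
        n_lon = len(time_slice[0])
        table.append([
            sum(w * time_slice[i][j] for w, i in zip(weights, rows)) / total
            for j in range(n_lon)
        ])
    return table


# Toy field: 2 time steps, 4 latitudes, 3 longitudes (hypothetical values).
lats = [-10.0, 0.0, 10.0, 40.0]
field = [[[t * 10 + j for j in range(3)] for _ in lats] for t in range(2)]
table = hovmoller(field, lats, lat_min=-10.0, lat_max=10.0)
```

The resulting table can be passed directly to a 2-D plotting routine, with time on one axis and longitude on the other.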
The aim of the project is to create a Machine learning (ML) model that can generate high-resolution regional reanalysis data (similar to the one produced by CERRA) by downscaling global reanalysis data from ERA5.
This will be accomplished by using state-of-the-art Deep Learning (DL) techniques like U-Net, conditional GAN, and diffusion models (among others). Additionally, an ingestion module will be implemented to assess the possible benefit of using CERRA pseudo-observations as extra predictors. Once the model is designed and trained, a detailed validation framework is applied.
It combines classical deterministic error metrics with in-depth validations, including time series, maps, spatio-temporal correlations, and computer vision metrics, disaggregated by months, seasons, and geographical regions, to evaluate the effectiveness of the model in reducing errors and representing physical processes. This level of granularity allows for a more comprehensive and accurate assessment, which is critical for ensuring that the model is effective in practice.
Moreover, tools for interpretability of DL models can be used to understand the inner workings and decision-making processes of these complex structures by analyzing the activations of different neurons and the importance of different features in the input data.
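As an illustration of what a disaggregated error metric looks like in practice, the sketch below computes the root-mean-square error separately for each meteorological season. The function name and the sample values are hypothetical, and the project's validation framework covers far more than this single metric.

```python
import math
from collections import defaultdict

# Meteorological seasons by calendar month.
SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def rmse_by_season(months, predictions, targets):
    """Return {season: RMSE} for paired predictions and reference values."""
    squared_errors = defaultdict(list)
    for month, pred, target in zip(months, predictions, targets):
        squared_errors[SEASON_OF_MONTH[month]].append((pred - target) ** 2)
    return {
        season: math.sqrt(sum(errs) / len(errs))
        for season, errs in squared_errors.items()
    }


# Hypothetical samples: two winter and two summer values.
scores = rmse_by_season(
    months=[1, 1, 7, 7],
    predictions=[1.0, 3.0, 10.0, 10.0],
    targets=[0.0, 0.0, 10.0, 12.0],
)
```

The same grouping pattern extends to months or geographical regions by swapping the key used to bucket the squared errors.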
Compression of Geospatial Data with Varying Information Density
Geospatial data can vary in its information density from one part of the world to another. A dataset containing streets will be very dense in cities but contain little information in remote places like the Alps or even the ocean. The same is true for datasets about the ocean or the atmosphere. The variability of sea surface temperatures and currents is much larger in the vicinity of the Gulf Stream than in the middle of the Atlantic basin. This variability might also change in time. A hurricane, for example, has a lot of variability in winds, temperature and rain rates, and in addition travels across entire ocean basins.
The challenge of this project is to improve `xbitinfo` to preserve the natural variability of these features but not to save random noise where the real information density is rather low. This means in particular that the number of bits that need to be preserved during compression changes with location. A hurricane has a different information density than a same-sized area in the steadily blowing trade-wind regimes. Compressibility of climate data therefore can change drastically in time and space, which we want to exploit.
Currently in the bitinformation framework, to preserve all real information, the maximum information content calculated by `xbitinfo` needs to be used for the entire dataset. However, bitinformation can also be calculated on subsets, so that the ‘boring’ parts can be compressed more efficiently.
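The idea of keeping a different number of bits in different regions can be sketched as follows. The code rounds the mantissa of a float32 value to a chosen number of bits (round to nearest, ties to even) and applies a region-specific bit count. It deliberately does not use the `xbitinfo` API: the function names, the region labels and the bit counts are assumptions for illustration only, and the sketch handles finite values only.

```python
import struct


def round_mantissa(value, keepbits):
    """Round a finite float32 value to `keepbits` mantissa bits (0..23)."""
    if not 0 <= keepbits <= 23:
        raise ValueError("keepbits must be between 0 and 23 for float32")
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    drop = 23 - keepbits
    if drop == 0:
        return struct.unpack("<f", struct.pack("<I", bits))[0]
    half = 1 << (drop - 1)
    lsb = (bits >> drop) & 1           # last kept bit, used for ties-to-even
    bits = bits + half - 1 + lsb       # add rounding offset
    bits &= ~((1 << drop) - 1)         # zero the dropped mantissa bits
    bits &= 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def round_by_region(values, regions, keepbits_by_region):
    """Round each value with the bit count assigned to its region."""
    return [
        round_mantissa(v, keepbits_by_region[r])
        for v, r in zip(values, regions)
    ]


# Hypothetical bit counts: more bits where variability carries real information.
keepbits_by_region = {"gulf_stream": 12, "open_ocean": 2}
rounded = round_by_region(
    values=[1.3, 1.3],
    regions=["gulf_stream", "open_ocean"],
    keepbits_by_region=keepbits_by_region,
)
```

Rounded values end in runs of zero bits, which a subsequent lossless compressor can store more compactly; regions with fewer kept bits therefore compress better.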
Sketchbook Earth is a project aiming to democratize the production of climate intelligence reports, traditionally restricted due to the reliance on internal ECMWF tools. We propose developing a series of Jupyter notebooks that will illustrate our planet's climate stories in an accessible and engaging manner.
Leveraging the new cads-toolbox Python package, these notebooks will retrieve and process data from the Copernicus climate data store (CDS), transforming raw information into expressive visual narratives. We will focus on downloading and preprocessing Essential Climate Variables (ECVs), calculating climate anomalies, and generating visualizations that echo the vibrant storytelling found in a sketchbook.
The resulting Jupyter notebooks will not only provide meaningful climate insights but also serve as a comprehensive training resource. Through Sketchbook Earth, we aim to offer a more visual, comprehensible, and reproducible approach to climate intelligence.
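The anomaly calculation mentioned above can be illustrated with a small, dependency-free sketch: a monthly climatology is computed over a reference period and then subtracted from each value. The function names, the record layout and the sample values are assumptions for illustration and do not reflect the cads-toolbox API.

```python
from collections import defaultdict


def monthly_climatology(records, start_year, end_year):
    """Mean value per calendar month over the reference period.

    `records` is an iterable of (year, month, value) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, value in records:
        if start_year <= year <= end_year:
            sums[month] += value
            counts[month] += 1
    return {month: sums[month] / counts[month] for month in sums}


def anomalies(records, climatology):
    """Subtract the matching monthly climatology from every record."""
    return [
        (year, month, value - climatology[month])
        for year, month, value in records
    ]


# Hypothetical January temperatures; the reference period is a parameter.
records = [(1991, 1, 2.0), (1992, 1, 4.0), (2023, 1, 5.0)]
climatology = monthly_climatology(records, start_year=1991, end_year=1992)
result = anomalies(records, climatology)
```

Using a per-month climatology removes the seasonal cycle, so the remaining anomaly shows how unusual a given month was relative to the reference period.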
We will develop a framework to forecast wildfires in Europe with machine learning from GFAS fire data and meteorological forecasts. Our team will evaluate different machine learning tools for forecasting and aim to integrate the tools into the operational pipeline of the ECMWF.
Global reanalysis data sets such as ERA5 constitute an important backbone for a wide range of topics, most notably including applications related to renewable energy and agriculture, as well as driving fields for climate control simulations. However, due to its resolution of 31 km and even lower resolved uncertainty information, ERA5 lacks details and applicability for regions with heterogeneous terrain or renewable energy applications.
To increase the ERA5 spatial resolution in a step-wise manner for the whole globe without the need for large computational resources, parameter-wise downscaling with statistics and/or machine learning using a higher resolved reanalysis data set as target/proxy is a promising approach. For such a purpose, the CERRA data covering Europe with a spatial resolution of 5.5 km (and 11 km for the ensemble) is ideal.
Here, we aim at implementing a model output (baseline) approach, and two deep learning approaches for post-processing and downscaling using residuals.
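One common way to set up downscaling with residuals is sketched below in one dimension: the coarse field is first interpolated to the fine grid, the model is trained on the difference between the high-resolution target and that interpolated baseline, and the prediction is the baseline plus the predicted residual. The linear interpolation, the function names and the toy values are assumptions for illustration; the project's baseline and deep learning models may differ.

```python
def upsample_linear(coarse, factor):
    """Linearly interpolate a 1-D coarse profile onto a grid `factor` times finer."""
    fine = []
    for i in range(len(coarse) - 1):
        for k in range(factor):
            w = k / factor
            fine.append((1.0 - w) * coarse[i] + w * coarse[i + 1])
    fine.append(coarse[-1])
    return fine


def residual_target(coarse, fine_truth, factor):
    """Training target: high-resolution truth minus the interpolated baseline."""
    baseline = upsample_linear(coarse, factor)
    return [t - b for t, b in zip(fine_truth, baseline)]


def reconstruct(coarse, predicted_residual, factor):
    """Final prediction: interpolated baseline plus the predicted residual."""
    baseline = upsample_linear(coarse, factor)
    return [b + r for b, r in zip(baseline, predicted_residual)]


# Toy profile: coarse values stand in for ERA5, fine values for CERRA (hypothetical).
coarse = [0.0, 2.0]
fine_truth = [0.0, 1.5, 2.0]
target = residual_target(coarse, fine_truth, factor=2)
restored = reconstruct(coarse, target, factor=2)
```

Learning the residual rather than the full field lets the model concentrate on the fine-scale detail that interpolation cannot provide.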
Validation of soil moisture and soil temperature is crucial for Numerical Weather Prediction (NWP), as they control surface heat fluxes that directly affect near-surface weather. This can be done with LANDVER, which is a validation package for land surface variables, currently consisting of soil moisture and soil temperature.
The tool provides an independent validation of soil moisture and soil temperature data using in situ observations from the International Soil Moisture Network. What is currently missing in this software package is the capability to validate latent and sensible surface heat fluxes against Eddy-Covariance measurements, which can provide useful information about how well ECMWF’s Land-Surface Modelling Component ECLand is able to translate soil moisture stress into surface heat fluxes.
Implementing an additional routine into the already existing software package paves the way for a standardized land-surface benchmarking tool for the ECMWF.
ECMWF has an extensive amount of real-time and historical weather data, as well as a comprehensive documentation knowledge base. Currently, the data can be accessed via three different APIs (the chart discovery API, the dataset API, and the dataset DOIs), all of which require some level of coding as well as very precise queries. This constitutes a high barrier to entry for third parties who want to make use of ECMWF's very large amount of information.
Recent developments in the field of natural language processing, such as Transformer technologies and large language models fine-tuned to interact conversationally (for example ChatGPT), allow a search engine to reply to queries formulated in natural language. The large language model maps the request to an API query and seamlessly provides the required information.
The aim of this challenge is to develop a search engine, chatECMWF, that can reply to a number of queries related to ECMWF datasets, charts and documentation, from general enquiries (such as "What is the license of ECMWF open data" or "Where can I find ozone data?") to very specific requests (think of "What air quality data is available in CAMS for Europe for the period from September to October 2014?").