ChatECMWF
ECMWF holds an extensive archive of real-time and historical weather data, as well as a comprehensive documentation knowledge base. Currently, the data can be accessed via three different APIs (the chart discovery API, the dataset API, and the dataset DOIs), all of which require some level of coding as well as very precise queries. This constitutes a high barrier to entry for third parties who want to make use of ECMWF’s very large amount of information.
Recent developments in natural language processing, such as Transformer architectures and large language models fine-tuned to interact conversationally (ChatGPT, for example), allow a search engine to reply to queries formulated in natural language. The large language model maps the request to an API query and seamlessly provides the required information.
The aim of this challenge was to develop a search engine, ChatECMWF, that can reply to a range of queries related to ECMWF datasets, charts and documentation, from general enquiries, such as “What is the license of ECMWF open data?” or “Where can I find ozone data?”, to very specific requests, such as “What air quality data is available in CAMS for Europe for the period from September to October 2014?”.
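To make the idea of mapping a natural-language request to a structured query concrete, here is a minimal Python sketch. It is not the ChatECMWF implementation: a small keyword and regular-expression parser stands in for the large language model, and the `DatasetQuery` fields and the `parse_request` function are hypothetical names that do not correspond to any ECMWF API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Month names mapped to their numbers, used to normalise date ranges.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}


@dataclass
class DatasetQuery:
    """Hypothetical structured query; field names are illustrative only."""
    service: Optional[str]
    topic: Optional[str]
    region: Optional[str]
    start: Optional[str]  # "YYYY-MM"
    end: Optional[str]    # "YYYY-MM"


def parse_request(text: str) -> DatasetQuery:
    """Toy stand-in for the language model: extract fields by keyword."""
    lowered = text.lower()
    service = "CAMS" if "cams" in lowered else None
    topic = next((t for t in ("air quality", "ozone") if t in lowered), None)
    region = "Europe" if "europe" in lowered else None

    # Recognise ranges phrased as "from <month> to <month> <year>".
    start = end = None
    match = re.search(r"from (\w+) to (\w+) (\d{4})", lowered)
    if match and match.group(1) in MONTHS and match.group(2) in MONTHS:
        year = match.group(3)
        start = f"{year}-{MONTHS[match.group(1)]:02d}"
        end = f"{year}-{MONTHS[match.group(2)]:02d}"

    return DatasetQuery(service, topic, region, start, end)


if __name__ == "__main__":
    print(parse_request(
        "What air quality data is available in CAMS for Europe "
        "for the period from September to October 2014?"
    ))
```

In a real system, the language model would replace `parse_request`, and the resulting structured query would then be translated into a call to one of the existing APIs.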
Follow the developments on GitHub
Mentors
- Baudouin Raoult
- Sylvie Lamy-Thepaut
- Helen Setchell
- Myranda Uselton Shirk