Project Gaia: Enabling climate risk analysis

Project Gaia seeks to help analysts search corporate climate-related disclosures quickly and efficiently. Together with our Eurosystem project partners (the Bank of Spain, the Deutsche Bundesbank and the European Central Bank), the project will establish an open, web-based tool to facilitate climate and environmental risk assessments based on a large corpus of publicly available corporate reports.

A variety of cutting-edge technologies will be assessed with the aim of helping analysts to extract relevant information from non-standardised PDF reports. Besides classical machine learning approaches, Project Gaia will explore how large language models (such as GPT) might be integrated into a reliable workflow for fact-finding.

Need for climate-related data

Central banks and financial regulators increasingly need to take into account the financial stability risks posed by climate change. Climate change can affect banks' portfolios, for example through collateral values or via the exposures of other financial institutions. It can thus affect overall financial stability.

High-quality data on the exposure of financial intermediaries to the financial risks posed by climate change are essential if the importance of climate-related risk is to be properly assessed. At present, however, climate-related data are scattered across a great variety of documents, including integrated, annual and ESG reports, and financial statements. Furthermore, companies disclose this information in multiple formats, for example by including sustainability data in advertising, or within images, tables or graphs in regulatory documents.

The main challenges of currently available climate-related data

International standard-setting bodies and policymakers have highlighted the need to standardise climate-related disclosures and to close existing climate data gaps. Until such standards are agreed, however, existing and potentially useful climate-risk information will remain unstructured. Accessing these data will require extensive web-based research and browsing through lengthy documents. For this reason, publicly available company reports containing sustainability-related disclosures remain a largely underused source of information.

A single platform to facilitate working with climate-related reports

Project Gaia will explore the possibilities of standardising different globally recognised ESG disclosures. It will use publicly available company reports as its core sources of information. The prototype will consist of a data-agnostic model that meets the needs of the central banking community. It will offer the required flexibility, together with open access to both the data and the underlying algorithms.

On the technical front, Gaia will explore the potential of natural language processing, including large language models, to extract and structure climate-related data. These state-of-the-art tools will make unstructured data more usable, facilitating climate risk assessments. The project will test and compare different approaches to extracting the data, with the aim of achieving the best possible model performance.
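
To illustrate what turning unstructured text into structured data can mean in practice, the following is a minimal rule-based sketch in Python. It is not the project's method: the pattern, the function name extract_scope_emissions and the sample sentence are all hypothetical, and real reports vary far more in wording and units. A simple baseline of this kind is the sort of approach that machine learning and language model techniques could be compared against.

```python
import re

# Hypothetical pattern (an assumption for illustration only): the word "Scope",
# a scope number, some wording, then a figure followed by an emissions unit.
SCOPE_PATTERN = re.compile(
    r"scope\s+(?P<scope>[123])\b[^.\d]*?"
    r"(?P<value>\d[\d,]*(?:\.\d+)?)\s*(?P<unit>tCO2e|ktCO2e|MtCO2e)",
    re.IGNORECASE,
)


def extract_scope_emissions(text):
    """Return one record per emissions figure found in the text."""
    records = []
    for match in SCOPE_PATTERN.finditer(text):
        records.append({
            "scope": int(match.group("scope")),
            # Remove thousands separators before converting to a number.
            "value": float(match.group("value").replace(",", "")),
            "unit": match.group("unit"),
        })
    return records


# Invented sample sentence, not taken from any real report.
sample = (
    "In 2022 our Scope 1 emissions were 12,500 tCO2e. "
    "Scope 2 emissions (market-based) amounted to 8,300.5 tCO2e."
)
records = extract_scope_emissions(sample)
```

Each record holds the scope number, the numeric value and the unit, so figures from different reports can be placed in one table. A rule like this breaks as soon as a company phrases the disclosure differently or places it in a table or image.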

The first project output will be a repository of textual corporate reports coupled with a full-text and semantic search engine to identify specific sustainability-related disclosures. In its second stage, the project will build a graphical user interface (GUI) to access the company-level climate-related database. The GUI will include data visualisation tools to facilitate climate and environmental risk assessments. The final output will be a report summarising the project findings and lessons learnt.
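
As an illustration of the full-text search component mentioned above, the following Python sketch ranks a handful of report snippets by a simple TF-IDF score. It is a sketch under stated assumptions rather than the project's implementation: the ReportIndex class, the tokenize helper, the document identifiers and the sample texts are all hypothetical. It covers keyword matching only. Semantic search would additionally need text embeddings, which are not shown here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ReportIndex:
    """Hypothetical in-memory index; assumes each doc_id is added only once."""

    def __init__(self):
        self.doc_terms = {}        # doc_id -> Counter of term frequencies
        self.doc_freq = Counter()  # term -> number of documents containing it

    def add(self, doc_id, text):
        counts = Counter(tokenize(text))
        self.doc_terms[doc_id] = counts
        self.doc_freq.update(counts.keys())

    def search(self, query, top_k=3):
        """Return up to top_k document ids, best match first."""
        n_docs = len(self.doc_terms)
        query_terms = tokenize(query)
        scores = {}
        for doc_id, counts in self.doc_terms.items():
            score = 0.0
            for term in query_terms:
                if term in counts:
                    # Smoothed inverse document frequency: rarer terms weigh more.
                    idf = math.log((1 + n_docs) / (1 + self.doc_freq[term])) + 1.0
                    score += counts[term] * idf
            if score > 0:
                scores[doc_id] = score
        return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Invented snippets standing in for extracted report text.
index = ReportIndex()
index.add("acme_2022", "Scope 1 greenhouse gas emissions fell as we closed coal plants.")
index.add("beta_2022", "Revenue grew strongly; the board approved a new dividend policy.")
index.add("gamma_2022", "Flood risk threatens our coastal plants; emissions targets are under review.")

results = index.search("greenhouse gas emissions")
```

With these three snippets, the query ranks acme_2022 ahead of gamma_2022, because it matches all three query terms, and leaves out beta_2022, which matches none.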