Policy Significance Statement
Climate data stewardship emerges as a key agenda item in both global and national policy discourses and will be an important area of focus at the upcoming 2024 United Nations Climate Change Conference (COP29). The potential of climate data to improve decision-making processes, enhance transparency, foster equity, and spur innovation in climate action is well established. Yet, challenges in the availability, accessibility, and reliability of climate data present considerable barriers. This paper contends that without a concerted effort to develop climate data governance frameworks and advance climate data stewardship, evidence-based policymaking for climate change is at risk, particularly imperiling the participatory engagement of vulnerable nations in global climate policy dialogues. Furthermore, the integrity of Monitoring, Reporting, and Verification (MRV) mechanisms—crucial for ensuring adherence to international policy commitments—relies fundamentally on robust climate data stewardship. This paper also argues for an enhanced focus on climate data stewardship as a means to bridge the gap between scientific communities and policy formulation, thereby enabling more effective and inclusive climate policy action.
1. Introduction
We are living, as the historian Adam Tooze has argued, in the age of the polycrisis.Footnote 1 From pandemics to rampant inequality, from global warming to the rise of illiberal populism, the world faces intertwining and overlapping problems whose complexity and intractability seem to defy conventional governance methods. Increasingly, there is a sense that we need not only new solutions but also new approaches to solutions.
Data is often held up as offering potential pathways: new frameworks for governance, new paradigms to help mitigate our most pressing problems. A growing chorus of experts argues that, at a time of increasing datafication (the exponential increase in data and in sophisticated methods for analyzing and using it), there may be fresh avenues for governanceFootnote 2 and social and economic re-ordering.Footnote 3
At the same time, the risks of data are also becoming increasingly apparent: greater inequality and new forms of exclusion; implicit (and explicit) biases; and access asymmetries that graft themselves onto and exacerbate existing socio-economic inequities. These tensions—between the challenges and opportunities of data—are central to our age, and must be navigated by policymakers and other stakeholders seeking to address mounting crises.
In this article,Footnote 4 we examine the potential of data to address one of our most severe challenges: climate change. Climate data has often been cited as one of the most promising ways to address the vast and intertwined series of risks associated with global warming.Footnote 5 Indeed, while the bulk of public attention is typically directed at the financial requirements for combating climate change, several global efforts (such as the Intergovernmental Panel on Climate Change (IPCC) or others conducted under the United Nations) equally emphasize the importance of reliable data and information transparency. The UNDP’s Global Climate Promise, for instance, states that “[t]he main challenge lies in obtaining good-quality, long-term data,” and emphasizes the importance of countries “fulfilling Enhanced Transparency Framework obligations under the Paris Agreement.” It is probably fair to say that few, if any, of the targets or goals set out by the Paris Agreement or in recent IPCC reports are attainable without more solid, data-led evidence foundations.
At the same time, much as with data in general, the use of climate dataFootnote 6 is accompanied by certain challenges and tensions. These tensions are exacerbated by what I (2024) have called an impending “data winter”Footnote 7: a period of decreased funding for and access to data, marked by restrictive data access policies by social media platforms, legislative and policy inaction, and the effective privatization of data. In what follows, we address ten tensions, some specifically associated with climate data and some more generally with the use of data to address social problems. Navigating these tensions, we argue, is essential to unlocking the potential of climate data and to developing a framework for sustainable, systematic, and responsible use of data to address global warming and its many associated challenges. Throughout the article, we discuss the vital role of data stewardshipFootnote 8 in this process, illustrating the role of trusted data intermediaries and repositories with a series of examples. In the conclusion, we draw some general observations from these examples and close with a discussion of the three key responsibilities held by data stewards in fostering a data ecology to serve the public good.
Tension #1: Diversity of Sources, Actors, Purposes, and Products
First, the diversity of (new) climate data sources and of key actors and stakeholders involved in the climate data chain, each bringing their own priorities and values to the discussion, creates difficulties and tensions in the data ecology.
Diversity of new data sources resulting from new methods and instrumentation
• Advances in sensing and monitoring technology: The Internet of Things (IoT) and new satellite technologies have significantly expanded the scope of climate data collection. IoT devices, such as sensors deployed in various environments, collect real-time data on temperature, air quality, and water levels. New satellite technologies offer high-resolution imagery and data on land use changes, deforestation, and ice melt rates.
• New data collection methods: Citizen science initiatives empower individuals to contribute to data collection, using simple tools or smartphone apps to report local weather conditions, species counts, or pollution levels. This democratization of data collection diversifies and enriches the climate data ecosystem.
• Advances in machine learning: The development of sophisticated machine learning models, including generative pre-trained transformers (GPT), has revolutionized data analysis. These models can process vast datasets to identify patterns, trends, and anomalies, making climate predictions more accurate and actionable.
Diversity of actors and stakeholders
• Institutions executing climate law: International bodies (such as the United Nations), national governments, and local authorities implement and enforce climate-related regulations, relying on data to inform their policies.
• Statistical agencies: These agencies compile and analyze environmental and climate data, producing vital statistics that inform research and policy.
• Climate researchers: Scientists and researchers at universities and research institutes analyze climate data to understand climate change’s mechanisms and impacts.
• Private sector: Companies across various sectors use climate data for risk management, product development, and sustainability initiatives.
• Citizen scientists: Individuals participating in data collection contribute valuable localized insights, enriching the global understanding of climate change.
Diversity of purposes for and users of climate data
• Science: Researchers use climate data to deepen our understanding of environmental processes and the impacts of climate change.
• Policymaking: Governments leverage climate data for informed governance, policy-making, and enforcement of environmental regulations.
• Advocacy: NGOs and activists use climate data to advocate for climate justice and raise awareness about climate change’s impacts on vulnerable populations.
• Planning and response: Data informs emergency response strategies, infrastructure planning, and environmental management to mitigate climate risks.
• Economic: In sectors like agriculture, insurance, and energy, climate data supports resource management and helps develop climate-resilient business models.
Diversity of climate data products
• Indicators: Climate indicators, such as greenhouse gas concentrations, sea level rise, and global temperature anomalies, offer concise, critical insights into the state of the climate system.
• Statistics: Statistical analyses provide a quantitative basis for understanding trends, variations, and projections in climate data.
• Visualizations: Maps, graphs, and interactive platforms transform complex climate data into accessible and understandable formats for diverse audiences.
• Applications: Software and apps translate climate data into practical tools for education, decision-making, and daily life, enabling users to access personalized climate information and advice.
Box 1 indicates some of the diversity, listing a sample of data sources, actors, purposes, and products involved in climate data. The diversity is only likely to increase with the continued expansion of the Internet of Things, accompanied by a plethora of sensors and other new data collection methods. In addition, climate data, once primarily collected and used by scientists, is now making its way into a variety of domains—policymaking, the private sector, disaster response, etc.—which leads to a potential divergence of methods and priorities among an ever-widening group of stakeholders (including statistical agencies, scientists, citizen scientists, policymakers, civil society, the private sector, and more). Advances in machine learning and AI are likely to further complicate the picture, leading to unpredictable uses of data and equally unpredictable outcomes.
The cornucopia of interests and stakeholders, marked both by plenty and by increasing divergence, calls for new approaches to governance. In particular, there is a need for greater multi-stakeholder governance that could align interests, sensitivities, and requirements at all levels of decision-making.Footnote 9 Data stewards have a natural role to play in such processes, and a number of initiatives (e.g., the Climate Data Store, Climate Montreal, and the Pacific Climate Impacts Consortium) have provided hubs to bring together stakeholders and help ensure legitimacy and trust in the climate data ecosystem. Legitimacy and trust are essential ingredients in a field that often processes personally identifiable information (PII); multi-stakeholder governance can also help protect independence and rigor for scientists and other researchers, a vital concern in the often contested field of climate data.
Tension #2: Competing concerns and lack of common principles
Among the divergences of sources and interests marking the climate data field, few are as pronounced as those separating climate change and climate justice actors on the one hand and private sector stakeholders on the other. The former category (which includes policymakers, researchers, and activists) calls for more data related to the environment and more openness (embodied by the “Right to Know” movement), and seeks to increase data justice while limiting data extraction and data colonialism. Corporations and private actors, on the other hand, are often motivated by a desire to maintain competitive advantage. In addition, the private sector is itself marked by competing interests, depending on whether companies primarily produce, process, or reuse data.
In short, the field is marked by an absence of common principles for how data should be (re)used, with different stakeholders upholding different principles to advance their respective agendas and priorities. There is an urgent need for a common normative and ethical framework that could guide the collection, processing and (re)use of climate data. In particular, we need to move from a concept of data ownership, which exacerbates asymmetriesFootnote 10, to data stewardship. Data stewards could increase accessibility and transparency while accommodating individual and collective concerns and rights, making room for a variety of stakeholders.
Tension #3: Power imbalance: who decides what to measure, what to collect, and what to share?
Decisions concerning what data to collect, analyze, and use—and how to use it—underlie another tension at the heart of climate data. The divergence of stakeholders means similar divergence in approaches to measuring, collecting, and sharing data. The potentially deleterious impact of these divergences is heightened by power asymmetries within the digital ecology, particularly as they relate to data access and agenda-setting. An agenda for data-driven collaboration can “inform the strategic allocation of resources for new research projects, indicators for regular monitoring, and the formation of cross-sectoral data-sharing collaborations.”Footnote 11 Despite clear potential benefits and widespread agreement on the general principle of using climate data, in other words, there remain important open questions about what specific forms of data are used, what types of conclusions are derived, and who benefits from data-driven decision-making.
Solutions do exist, but it is important to understand that the choice (and application) of solutions is itself not value-neutral. The notion of purpose specification, for instance, which may help limit data abuses, is an inherently political choice: defining a purpose will necessarily narrow the scope of who benefits. To help mitigate such risks, it is essential that equity and participation be considered as overarching principles throughout the data value chain and the broader ecology of climate data governance. The quality of participation can also be enhanced by more effective communication and awareness building with civil society groups and the public at large. Several climate initiatives are now advancing new frontiers in data stewardship by making data useful not only for researchers and scientists but also for broader audiences, pioneering new approaches to visualization and storytelling (e.g., the WORLDLING initiative at MIT, or the Climate Impact Lab).
Tension #4: Extraction through collection, proportionality and collective rights, and data ownership
Datafication is taking place in a world marked by long-standing hierarchies, inequalities, and socio-economic divisions. This is perhaps especially true of climate data, which is being collected, processed and used across national and cultural boundaries, raising important questions about relations between and the relative rights of communities. Concerns exist about data sovereigntyFootnote 12 for indigenous populations, feminist and anti-colonial movements, and the rights of populations subject to so-called “helicopter research”Footnote 13 and extractive data practices. Such concerns are magnified by a general distrust of data collection practices between populations with unequal rights or powerFootnote 14, and growing awareness regarding flawed data consent provisions, which call for new “social licenses” to govern how data is collected and used.
To ensure that the field of climate data mitigates rather than exacerbates existing divisions, such concerns should be acknowledged by and embedded within emerging governance frameworks. Community-based participatory research and collection can help minimize extractive data practices. Representation should also be taken into consideration when selecting data stewards, whether they be individuals or bodies and institutions (which can be composed of a variety of community voices so as to ensure various stakeholder interests are taken into account).
In addition, it is essential to uphold proportionality as a core principle, for example by tailoring data collection efforts to the minimum necessary for meaningful insights, ensuring that the benefits of data use are equitably distributed among all stakeholders, and implementing safeguards that prevent the over-surveillance of vulnerable communities. Underlying these methods and approaches is a recognition of data as a public (rather than private) good, and a commitment to upholding the principle of digital self-determinationFootnote 15 throughout the data ecology.
Tension #5: Quality, provenance, and standards
As the volume of available data grows, so do concerns over data quality and integrity. In addition, it is essential to ensure—and acknowledge—the “situatedness” of data; so-called “thick data”Footnote 16 helps ensure that data is processed, analyzed and used in a contextually relevant and sensitive way. Attention to data quality and thickness must be accompanied by—and can help encourage—the development of data standards to promote interoperability and responsible reusability. Responsible reuse is especially important for climate data, which is inherently global and cross-national in scope.
Data quality is in part a matter of ensuring that decisions involving climate technology are supported by appropriate quality-assured engineering standards and processes. But, as with all the tensions discussed in this paper, resolving this one is about far more than just technology. Governance and policy steps are also essential. These can include frameworks that help operationalize and standardize quality assurance, for instance, by establishing clear metrics for data accuracy, consistency, and completeness across different stages of data collection, analysis, and usage. In addition, data quality (and relevance) can be enhanced if institutions have policies that track the decisions they make about data throughout the data lifecycle.
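To make the idea of "clear metrics" more concrete, the following is a minimal illustrative sketch in Python of how completeness, consistency, and accuracy might be computed for a small set of temperature readings. The `Reading` record, its field names, the 0.5 °C agreement tolerance, and the notion of a trusted reference series are all hypothetical assumptions introduced for illustration; they are not drawn from any existing standard or from the initiatives discussed in this paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (station_id, ISO date); a hypothetical record key


@dataclass(frozen=True)
class Reading:
    """One hypothetical temperature observation; temp_c is None if missing."""
    station_id: str
    date: str
    temp_c: Optional[float]


def completeness(readings: List[Reading]) -> float:
    """Share of records that carry a non-missing value."""
    if not readings:
        return 0.0
    present = sum(1 for r in readings if r.temp_c is not None)
    return present / len(readings)


def consistency(readings: List[Reading], tolerance: float = 0.5) -> float:
    """Share of (station, date) keys whose reported values agree within tolerance."""
    by_key: Dict[Key, List[float]] = defaultdict(list)
    for r in readings:
        if r.temp_c is not None:
            by_key[(r.station_id, r.date)].append(r.temp_c)
    if not by_key:
        return 0.0
    agreeing = sum(1 for vals in by_key.values() if max(vals) - min(vals) <= tolerance)
    return agreeing / len(by_key)


def mean_absolute_error(readings: List[Reading], reference: Dict[Key, float]) -> Optional[float]:
    """Accuracy proxy: mean absolute gap to a trusted reference, where one exists."""
    errors = [
        abs(r.temp_c - reference[(r.station_id, r.date)])
        for r in readings
        if r.temp_c is not None and (r.station_id, r.date) in reference
    ]
    return sum(errors) / len(errors) if errors else None


if __name__ == "__main__":
    sample = [
        Reading("A", "2024-01-01", 10.0),
        Reading("A", "2024-01-01", 10.2),  # duplicate report that agrees
        Reading("A", "2024-01-02", None),  # missing value
        Reading("B", "2024-01-01", 5.0),
        Reading("B", "2024-01-01", 9.0),   # duplicate report that conflicts
    ]
    reference = {("A", "2024-01-01"): 10.0, ("B", "2024-01-01"): 5.0}
    print(completeness(sample), consistency(sample), mean_absolute_error(sample, reference))
```

Which metrics matter, and what thresholds count as acceptable, remain governance choices rather than technical ones; a sketch like this only shows that such choices can be written down and applied uniformly across the stages of the data lifecycle.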
Data stewards have a valuable role to play in this process, for instance by leveraging their ability to bring together and coordinate multiple stakeholders. A good example can be found in the Datzilla error reporting and tracking system, which offers a web-based portal to identify and correct data discrepancies in climate-related data sets from the National Oceanic and Atmospheric Administration (NOAA). One key to the tool’s effectiveness is its multistakeholder nature: hosted by Texas A&M University, it allows academic researchers, government bodies, and civil society organizations to coordinate to enhance the quality of climate data.
Tension #6: Timeliness, continuity, and sustainability
Ensuring the timely collection and release of climate data is critical for effective risk management and the development of innovative approaches to advance climate mitigation and climate justice. However, persistent concerns regarding financial (and other forms of) sustainability raise questions about timeliness, continuity, and the long-term viability of climate data and efforts at climate justice. Such concerns are heightened by the political context of the global climate debate, which poses challenges to policy continuity and financial sustainability. In particular, a risk exists that bigger stakeholders could walk away from the conversation, thus imperiling the functioning of the larger ecosystem dedicated to climate mitigation.
As a result of these tensions, it is clear that sustainability needs to be at the core of an effective governance framework for climate data. This will require fit-for-purpose incentives for investment and institutionalization so that all stakeholders are aligned. In addition, governance will need to safeguard the timeliness of climate data so that it remains relevant and effective. Some mechanisms that can help achieve these objectives include public-private data collaborativesFootnote 17, Footnote 18 (e.g., the Net Zero Public Data UtilityFootnote 19), standards for data quality and timeliness, and international agreements on climate data exchange and reporting. In addition, fostering community-based monitoring programs and leveraging technological innovations, such as artificial intelligence for data analysis, can further enhance the governance framework’s effectiveness.
Tension #7: Access, openness, and transparency
Our era’s challenges necessitate nuanced solutions beyond merely “opening” data—a narrow approach that may inadvertently serve corporate agendas. Mere access to raw data does not guarantee transparency, leaving room for manipulation. To navigate these complexities, fostering data collaboratives, empowering data stewards, and enhancing data-sharing practices are essential. However, for these strategies to be genuinely effective, it is crucial to establish and clarify incentives for private sector data holders (“the business case” for data collaborationFootnote 20) and to improve the harmonization of cross-border data sharing efforts.
As a broad approach, embedding FAIR data principlesFootnote 21 into the climate data conversation may provide an overarching framework to reduce access asymmetries and achieve meaningful openness. These principles (which embody findability, accessibility, interoperability, and reusability) were first established in the context of academic data. In addition, data collaboratives have proven particularly useful vehicles for bridging gaps between the private and public sectors, and more generally for promoting greater data sharing. Finally, the field of climate data could benefit from a commitment by relevant stakeholders to share data in response to specific crises, or to specific types or scales of crises. This can at least ensure that the right data will be available to serve the public good when it is most needed.
Tension #8: Bias, capture, and whitewashing
Tension #3, above, highlights the risks of power imbalances within the data ecology. There are various ways these imbalances can manifest. In addition to the problems posed by unequal access, bias in the underlying data and algorithms is emerging as a serious concern—particularly given the growing prominence of artificial intelligence and large language models (LLMs). These risks are accompanied by concerns over “whitewashing” (in which stakeholders may cover up or obfuscate data collection or disclosure), and concerns regarding academic captureFootnote 22 by the private (and public) sectors. All of these affect the rigor and credibility of data and data efforts; more generally, they undermine trust in the ecosystem.
An adequate framework to address these concerns must encompass both technological solutions and policy interventions. Technological solutions might involve the development of automated systems capable of auditing and tracing data flows within the lifecycle, thereby ensuring accountability through alerts on potential discrepancies or misuse. However, these technological measures need to be underpinned by a robust governance framework that not only operationalizes ethical principles but also extends the notion of responsibility in the use of climate data.
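As one illustration of what an automated tracing mechanism could look like, the sketch below implements a simple hash-chained audit log in Python: each recorded event stores the hash of the previous one, so any later alteration of an entry can be detected and flagged. This is a generic technique offered under stated assumptions, not a description of any system used by the initiatives discussed here; the function names, the event fields (`actor`, `action`, `dataset`), and the example values in the usage note are hypothetical. Timestamps are deliberately omitted to keep the sketch short.

```python
import hashlib
import json
from typing import Dict, List, Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: Dict[str, str]) -> str:
    """SHA-256 over a canonical JSON encoding of an entry's content."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_event(log: List[Dict[str, str]], actor: str, action: str, dataset: str) -> None:
    """Append an event that is cryptographically linked to the previous one."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "dataset": dataset, "prev_hash": prev_hash}
    log.append({**body, "hash": _digest(body)})


def first_broken_link(log: List[Dict[str, str]]) -> Optional[int]:
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return i  # this is where an alert on discrepancy or misuse would fire
        prev_hash = entry["hash"]
    return None
```

A steward might call `append_event(log, "agency-x", "share", "emissions-2023")` each time a dataset is collected, transformed, or shared, and run `first_broken_link(log)` periodically. Such a log only shows that a record was altered after the fact; deciding who may write to it, who reviews alerts, and what follows from them is precisely the governance question raised above.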
This broader governance framework should include comprehensive approaches to protect individual and community rights. Some existing approaches to data stewardship already include tools to increase local participation—for example, the Protected Areas Database of the United States (PAD-US) initiative, a multi-stakeholder project that facilitates local review of protected lands and areas in the United States. But beyond formal participation mechanisms and tools, it is essential more broadly to establish a “social license to operateFootnote 23,” which entails gaining and maintaining public trust by demonstrating that data collection and use are conducted in ways that align with societal values and expectations. This concept goes beyond legal compliance to include ethical considerations, transparency, and community engagement, ensuring that data practices are perceived as legitimate and beneficial by the wider community.
Tension #9: Local vs global: subsidiarity and cultural difference
Climate change is a global problem, and climate data is therefore also global in scope. But within this broad context, there exists little consensus on what types of questionsFootnote 24, Footnote 25 (or solutions) should be devolved to the local level, and how to account for cultural and structural differences across jurisdictions. As a general principle, governance at the local level is more conducive to citizen participation and to rapid feedback loops for iterating and improving upon data initiatives. In addition, local governance allows for more efficient management of climate resources and challenges. At the same time, it is important to recognize that many localities may lack the type of data expertise available at national and international levels. An overly local focus may likewise lack the global perspective required to mitigate broader challenges.
Several pathways are available to resolve these tensions. To begin, it is essential to embed notions of subsidiarity within data governance design; these can help establish the principle that, where possible and advantageous, climate data should be governed at the local level. Highlighting subsidiarity also upholds the importance of local participation and cultural and context sensitivity. Alongside subsidiarity, however, it is equally important to identify issues and challenges that are better addressed at the global (or supra-local) level—for example, data related to the management of common resources or the upholding of shared values and rights. Ultimately, an effective governance framework will coordinate the local and the global, and harmonize the need for contextual sensitivity and community participation with the wider scope of the problem and of the data itself.
Tension #10: Disputes, accountability and use
While it is increasingly clear that data can help mitigate the climate crisis, there exists little agreement on how to implement it, and what processes or mechanisms exist to harmonize values and priorities, avoid misuse and harm, and ensure accountability. Disputes are inevitable and require agile and independent processes to be resolved in a productive manner. In addition, as with all uses of data, accountability is essential. Climate justice requires not just judicious use of data but also clear lines of responsibility and accountability.
Existing processes from other data verticals may provide guidance. For example, there now exist well-established procedures and mechanisms (technological and otherwise) to establish decision provenance (i.e., transparency about who is responsible and accountable for the use of climate data). Auditing tools and frameworks can also be repurposed, and designed to ensure independence and agility within dispute resolution processes. While repurposing such established steps, it is also important to keep in mind specific needs and variations that may be necessary in the climate context, e.g., the tensions between the global and the local, the urgent need for real-time data to inform immediate climate action, the necessity of integrating indigenous and traditional knowledge systems, and the importance of addressing data sensitivity and security concerns related to vulnerable ecosystems and communities.
Conclusion: Toward Climate Data Stewardship
The preceding has outlined 10 tensions that currently characterize the climate data ecology. Our ability to use data for the public good, and more generally to mitigate the impending climate crisis, depends significantly on the extent to which we are able to navigate these tensions productively, maximizing the benefits of climate data while limiting its potential harms. At the same time, even as we seek to navigate the specific tensions, we must also ensure a broader commitment to data access, and nurture a data ecology that fosters the responsible reuse of private data for the public good.
What’s required, in effect, is an International Decade of DataFootnote 26. This decade would be marked by new activity on the legislative front as well as a general cultural shift in how society views data and the ability to reuse private data for the public good. This requires awareness raising and capacity building, and it requires stakeholders from various sectors and from around the world—this is both a global and local problem—to come together to limit data hoarding and instead foster responsible sharing and reuse.
Key in that endeavor must be the advancement of climate data stewardship. The various examples mentioned in the preceding offer clear evidence of the role that data stewards can play in helping to foster a more responsible climate data ecology. In particular, data stewards play a key role in fostering data collaboratives and greater data access and reuse.
Further analysis of these examples, and several more collected at our data stewards repository, helps elucidate that data stewards have three key responsibilities:
• Collaborate: Data stewards have a responsibility to identify, nurture and manage data and data collaboratives when there is an opportunity to unlock data in the public interest. As part of this responsibility, data stewards can help break down data stored in private silos.
• Protect: Data stewards play a key role in managing data ethically and preventing harm and misuse. In this role, data stewards also help protect the integrity and quality of data.
• Act: Finally, data stewards have a responsibility to proactively help unlock data and, critically, to ensure that data insights are acted upon responsibly and in the public interest.
Table 1 breaks down these broad categories, showing some specific steps that data stewards can take to address the 10 tensions discussed in this paper. Considered together, the items in the table offer an action list for the climate field, akin to a scope of work (SOW) for prospective data stewards working for climate justice and mitigating climate change through more responsible use of data.
Today, the world stands at the precipice of a major crisis that has huge — even existential — implications. Data is of course not a silver bullet; technology cannot solve the climate crisis on its own. But data is an essential component of any solution. We owe it to future generations (and to ourselves) to seize the moment by recommitting ourselves to the ethical use and reuse of data, and to creating a more just and equitable data ecology. The road to climate justice runs, at least in part, through data justice.
Data availability statement
Data availability is not applicable to this article as no new data were created or analyzed in this study.
Author contribution
Writing – original draft: S.V.; Writing – review & editing: S.V.
Funding statement
This work received no specific grant from any funding agency, commercial, or not-for-profit sectors.
Competing interest statement
Stefaan Verhulst is an editor-in-chief of Data & Policy. This article was reviewed through an independent peer review process.