Outcomes of Open Data Policies
This is the second in a series of three articles on Open Data that has been prepared as a contribution to the 2021 NASA CEOS Chair theme “Space-based Earth Observation Data for Open Science and Decision Support”.
- Open Data for Impact, looking at the origins of data sharing principles, and at satellites and open data.
- Outcomes of Open Data Policies (this article), discussing some projects that have resulted from the availability of open data.
- Looking Forward: The Future of Open Data, examining the future direction of open data for Earth observation, including CEOS Agency missions.
Utilising Open Data
The early 2000s coincided with the exponential growth of the Internet as a new global commons, as an enabling infrastructure, and ultimately as a marketplace of ideas, goods, and services. This growth has driven the marginal cost of reproducing digital goods (e.g. images) to near zero, and created new and unique channels for satellite data providers to engage end users. The emergence of the Internet also had a significant impact on the open source community, accelerating and growing its scope massively, and resulting in high quality “pay to build, free to licence” software, operating systems (e.g. Linux), frameworks, and toolsets. Additionally, the late 2000s saw a significant evolution in mobile devices (e.g. iPhone and Android-powered devices), greatly increasing their compute and storage capacity and enabling rich features that provided a new channel for geospatial information. By opening up their datasets, space data providers have been able to participate in and contribute to this period of significant progress, and to take advantage of the massive growth in capabilities and reach.
One of the first platforms to demonstrate the power of the confluence of the Internet and satellite Earth observations was Google Earth Engine (GEE, earthengine.google.com), launched in 2010. This was one of the first open online tools designed for the scientific analysis and visualisation of geospatial datasets, and is backed by a rich catalogue of open satellite observations. GEE is provided freely and openly for research and education, and has become very popular amongst a broad community of users (estimated at more than 100,000). It has been used for a wide range of applications, including environmental, coastal and ocean studies, atmospheric analysis, and climate assessment. It is used to power key applications such as the European Commission’s Global Surface Water Explorer (global-surface-water.appspot.com) and the FAO’s Sepal (sepal.io) platform (a central element of the Global Forest Observations Initiative’s capacity building and country engagement program). Without open data, developing this type of platform would almost certainly have been cost prohibitive, and the significant potential of these datasets would have remained locked in closed file shares and databases.
In 2017, based on several years of R&D work by CSIRO and Geoscience Australia, the Open Data Cube (ODC) framework was released. The open source ODC framework has been used to power continental scale instances, referred to as ‘Digital Earth’, for Australia and Africa, with the Americas currently under consideration. In addition, it has been used to enable large scale deployments for Switzerland (Swiss Cube), Taiwan, India, and a number of others around the globe – at last count more than 100 known local, country or regional instances. The investment by the open source community in the ODC project has led to significant uptake of satellite data, enabled by large scale datasets provided by CEOS agencies. This has culminated in a collaboration between WGISS, the CEOS Systems Engineering Office, several CEOS initiatives, and CSIRO on the creation of the CEOS Earth Analytics Interoperability Laboratory (EAIL) based on the Open Data Cube.
Over the past 20 years, open data has driven the uptake and application of satellite EO in new and exciting ways that would have been difficult to anticipate. It has allowed data providers to participate in and contribute to the energy and excitement generated by the Internet. Perhaps most excitingly, however, it has laid the foundation for satellite EO to participate in the next 20 years of growth. Whether this is as an enabling contribution to the systematic monitoring of the Earth’s climate, helping to realise the Sustainable Development Goals, unlocking more value through data fusion and Machine Learning, or more likely in entirely new ways we have yet to foresee – the ‘full and open exchange’ of data has been central to driving results and impact.
CEOS Activities Underpinned by Open Data
Driven by the increasing availability of open data, CEOS has undertaken various projects and activities to further increase the impact of satellite EO. As more data becomes openly available, CEOS and its working teams continue to create new initiatives to help make the data more accessible for a wider range of users. Examples of CEOS work underpinned by open data include the Working Group on Disasters’ Geohazard Supersites and Natural Laboratories initiative (GSNL), CEOS Analysis Ready Data, and support for the United Nations Sustainable Development Goals.
As extreme weather events and natural hazards become more common, it is becoming evident that more investment is needed in the short term to prevent major losses in the future. The concept for the GSNL was conceived in 2007, and established as an initiative within the Group on Earth Observations (GEO) in 2010. The GSNL initiative is organised as a voluntary international partnership which aims to improve geophysical scientific research and geohazard assessment, promoting rapid and effective uptake of scientific results for enhanced societal benefits in Disaster Risk Reduction (DRR). The initiative focuses on areas with important scientific problems and high risk levels, known as the Supersites and the Natural Laboratories. CEOS Agencies support the work by providing satellite imagery over the Supersite areas at no cost. GSNL objectives include:
- to provide the global scientific community with open, full and easy access to a variety of space- and ground-based data over the Supersites and the Natural Laboratories;
- to promote advancements in geohazard science over the selected sites;
- to promote rapid uptake of scientific results by DRR stakeholders and decision makers;
- to innovate technologies, processes, and communication models, enhancing data sharing, global scientific collaboration, and capacity building in geohazard science.
Even once data is openly available, it often still requires pre-processing before it is in a useful form from which information can be derived. This process is complicated and time consuming, often preventing non-experts from using satellite EO data for their benefit. To help lower this barrier, CEOS has recently begun developing Product Family Specifications (PFS), a set of guidelines to help data providers pre-process their data to make it more accessible to users. Once a data provider has completed a self-assessment, the Working Group on Calibration and Validation (WGCV) provides a thorough peer-assessment. If the data meets all the requirements, the data provider can call their data ‘CEOS Analysis Ready Data’ (CARD), and can use the CARD logo. ‘Analysis Ready Data’ (ARD) is a term used broadly across the industry with many varying definitions, hence CARD aims to give data users a less ambiguous definition that is internationally recognised. ARD was a focus of the 2020-21 CEOS SIT Co-Chairs, the Commonwealth Scientific and Industrial Research Organisation (CSIRO) and Geoscience Australia (GA), and has been used in international initiatives such as Digital Earth Africa.
The United Nations Sustainable Development Goals (SDGs) are the 17 goals that form the core of the 2030 Agenda for Sustainable Development. They recognise that ending poverty and other deprivations must go hand-in-hand with strategies that improve health and education, reduce inequality, and spur economic growth – all while tackling climate change and working to preserve our oceans and forests. In 2016, the CEOS-SDG ad hoc team was formed to support the work of GEO and other stakeholders progressing the Agenda using satellite imagery. The team was extended several times, and at the 2021 CEOS Plenary it was disbanded to form a small permanent SDG coordination group, led by the Systems Engineering Office. In collaboration with GEO, the team published the report ‘Earth Observations in support of the 2030 Agenda for Sustainable Development’. The report highlights the need for free and open data, noting that not all nations are able to develop and launch their own Earth observation satellites, and hence that the availability of data from these missions, for all nations, is of fundamental importance to their uptake and global impact. CEOS, through its new SDG coordination group, will continue to investigate better ways to access this data and provide critical tools for global users focused on SDG objectives.