CEOS Development Environment

Looking Forward: The Future of Open Data

This is the final article in a series of three on Open Data, prepared as a contribution to the 2021 NASA CEOS Chair theme “Space-based Earth Observation Data for Open Science and Decision Support”.

  • Open Data for Impact: looks at the origins of data sharing principles, and at satellites and open data.
  • Outcomes of Open Data Policies: discusses some projects that have resulted from the availability of open data.
  • Looking Forward: The Future of Open Data (this article): examines the future direction of open data for Earth observation, including CEOS Agency missions.

The Future of Open Data

GEO has recently begun reviewing the GEO Statement on Open Science, and plans to reformulate it to focus on ‘Open Knowledge’. This concept, while inclusive of Open Science, better supports decision making using EO. GEO aims to reiterate the critical role that full, secure and open sharing of Earth observation data and knowledge will play in the deeper integration of Earth observation technologies into the digital economy.

Components of the GEO Statement on Open Knowledge

Open Knowledge has played a pivotal role in tackling global challenges such as the current COVID-19 pandemic (see for example ESA’s RACE project, NASA’s COVID-19 Dashboard, and JAXA for Earth on COVID-19). Open Knowledge is also essential to achieving the United Nations Sustainable Development Goals, the Paris Agreement objectives, the Sendai Framework for Disaster Risk Reduction, and a reduction in the knowledge divide among countries.

Openness in knowledge generation can help address the local, regional, and global needs of society by creating an inclusive and interconnected environment. An open and participatory environment gives everyone an equal opportunity to gain academic literacy and to benefit from new knowledge and innovation.

Open Knowledge is the next step beyond Open Data. As more and more open data becomes available, it is clear that all the areas contained within Open Knowledge need to work together to deliver the greatest benefit to society. As GEO continues its work to implement the Statement on Open Knowledge, we are also seeing more derived datasets, non-traditional datasets and in situ datasets become openly available.

Derived datasets focus on a specific measurement type, and often contain the raw data from multiple sensors. For example, the Global Human Settlement (GHS) framework, available on Sentinel Hub, produces global maps of built-up areas, population density and settlements to monitor human presence on Earth over time. The Global Human Settlement Layer GHS-BUILT-S2 is a global map of built-up areas (expressed as probabilities) at 10 m spatial resolution. It was derived from a Sentinel-2 global image composite for the reference year 2018 using Convolutional Neural Networks.
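As a rough illustration of how a probability layer of this kind might be used, the sketch below turns a small grid of built-up probabilities into an estimate of built-up area. The function name `built_up_area_km2`, the 0.5 threshold and the tile values are all hypothetical choices made for this example; they are not part of the GHS-BUILT-S2 product or its documentation. Only the 10 m pixel size comes from the description above.

```python
def built_up_area_km2(prob_grid, threshold=0.5, pixel_size_m=10.0):
    """Estimate built-up area from a grid of built-up probabilities.

    prob_grid    : list of rows, each a list of probabilities in [0, 1]
    threshold    : assumed cut-off above which a pixel counts as built-up
    pixel_size_m : pixel edge length in metres (10 m for this layer)
    """
    # Area of one pixel, converted from square metres to square kilometres.
    pixel_area_km2 = (pixel_size_m * pixel_size_m) / 1_000_000.0
    # Count the pixels whose probability meets or exceeds the threshold.
    built_pixels = sum(1 for row in prob_grid for p in row if p >= threshold)
    return built_pixels * pixel_area_km2


# Synthetic 3 x 3 tile of probabilities (illustrative values, not real data).
tile = [
    [0.05, 0.20, 0.90],
    [0.10, 0.65, 0.95],
    [0.00, 0.40, 0.55],
]

area = built_up_area_km2(tile)
print(f"Estimated built-up area: {area * 1_000_000:.0f} square metres")
```

In this synthetic tile four pixels meet the threshold, so the estimate is four pixels of 100 square metres each. The choice of threshold is a judgment call that depends on the application, and a real analysis would read the probabilities from the published raster rather than a hand-written list.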

Non-traditional datasets include drone data and data from the Internet of Things. Drones allow us to observe our Earth in a much more detailed and focused way. GeoNadir is a community-driven project founded to establish an open repository of drone data, with a focus on Findable, Accessible, Interoperable, and Reusable (FAIR) drone data. At the time of writing, the GeoNadir team is beta testing the platform and will go live very soon.

In situ datasets are also becoming more openly available. John Deere, an international agricultural solutions company, collects terabytes of in situ data from its machinery, and is now promoting products that encourage the sharing of this data among its customers. In situ data can provide many additional societal benefits, so we hope that this type of data can be shared freely and openly in future.

CEOS Agency Planned Open Data Missions 

As we look forward to the future of Open Data and EO, we note some of the many planned satellite missions whose agencies have agreed to openly share the data collected, in some cases continuing a long history of EO data.

The NASA-ISRO SAR (NISAR) Mission (CEOS DB) is a collaboration between the US and Indian space agencies. It will measure Earth’s changing ecosystems, dynamic surfaces, and ice masses, providing information about biomass, natural hazards, sea level rise, and groundwater, alongside many other applications. Launch is currently targeted for 2023, and all NISAR science data will be freely available and open to the public, consistent with the long-standing NASA Earth Science open data policy.

Timelapse of Doha, Qatar using Landsat & Copernicus images. Credit: Google Earth Timelapse (Google, Landsat, Copernicus) 

The Landsat image archive, which hosts images from the launch of Landsat-5 in 1984 to the present day, provides a remarkable continuous record of the Earth spanning almost 40 years. Now that Landsat data is freely and openly available, it is an invaluable resource for climate research that must be continued for many years into the future. Following the launch of Landsat-9 in September 2021, the teams at USGS and NASA are now planning Landsat-10 (aka Landsat Next) to continue the data archive. The Landsat Next mission is currently in its early phase, with launch targeted for 2029.

Another long-running mission series is ESA’s Copernicus missions, whose teams are working on the development of the Next Generation Copernicus missions, which are intended to be the backbone of the Copernicus infrastructure after 2030. These will provide continuity to both the current Sentinel constellation and the forthcoming Sentinel Expansion missions, planned for launch before 2030. Many applications supported by the Copernicus missions in the environmental and climate change domains rely on systematic and uninterrupted data provision, and the key principle of the Copernicus data policy is that data be “free, full and open”.

The ‘Fourth V’ of Big Data: Veracity

As the concept of big data grew, many businesses and not-for-profits struggled to keep up with what big data is and how to produce it effectively. Data scientists began paraphrasing the five W’s of journalism to tackle this problem, coming up with the five V’s of big data: Volume, Velocity, Variety, Veracity, and Value.

For satellite data, veracity is fast becoming a significant issue. It refers to the quality and accuracy of data, and to the level of trust users can have in the collected data. The CEOS Working Group on Calibration and Validation (WGCV) works hard to ensure the veracity of all CEOS products, addressing the need to standardise the ways that different data sources are combined so as to ensure interoperability among existing and future Earth observing systems. Its current activities focus mainly on the requirements identified by GEO and GEO’s goal of achieving a Global Earth Observation System of Systems (GEOSS).

To help improve the veracity of satellite data, CEOS Agencies are evaluating and developing missions aimed at producing high quality data that can be used for the calibration and validation of other satellite data. ESA, along with the UK’s National Physical Laboratory, is developing the TRUTHS mission, which is set to provide measurements of incoming solar radiation and of radiation reflected from Earth back out into space that are traceable to the International System of Units (SI). These measurements will be used to calibrate data from other satellites. The launch of TRUTHS is currently targeted for 2029, the mission having moved from its feasibility phase into its preliminary design phase following COP26 in November 2021. Similar concepts are being developed by other CEOS Agencies, such as the Satellite Cross-Calibration Radiometer (SCR) mission being considered by Australia.

The Global Human Settlement Layer GHS-BUILT-S2, also showing the degree of urbanisation SMod-2015. Image: EC-JRC