Dataset Description
The European Space Agency (ESA) WorldCover product provides global land cover maps for 2020 and 2021 at 10 m resolution, based on Copernicus Sentinel-1 and Sentinel-2 data. The product comprises 11 land cover classes and was generated within the ESA WorldCover project, part of the 5th Earth Observation Envelope Programme (EOEP-5) of the European Space Agency. A first version of the product, containing the 2020 map, was released in October 2021. The 2021 map, produced with an improved algorithm, was released in October 2022.
Usage
The WorldCover map is a global-scale dataset, generated with a single methodology applied over all regions. As such, the accuracy of the map may vary between locations and with scale. That said, a crucial aspect of WorldCover was the involvement of several end users active in different domains, who provided primary input for all engineering aspects and followed the whole project workflow from design through validation and uptake. Consequently, WorldCover intends to provide a substantial benefit to various user communities, expanding the established base of global land cover users and supporting the development of novel services.
The WorldCover 2020 and 2021 maps were generated with different algorithm versions. Changes between the maps should therefore be treated with caution, as they contain both real land cover changes and changes caused by the differing algorithms. An updated version of the 2020 map using the same algorithm as the 2021 map is planned for release in the near future. The WorldCover 2021 dataset has been downloaded more than 20,000 times.
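To make the caution above concrete, the following is a minimal sketch of how apparent class transitions between the two maps could be tabulated. It assumes both maps have already been read and flattened into equally long sequences of integer class codes; the function names and the codes in the example are hypothetical placeholders, not the product's legend. The tabulation cannot separate real land cover change from change introduced by the algorithm update, so the resulting fraction is only an upper bound on real change.

```python
from collections import Counter


def transition_counts(map_a, map_b):
    """Count (class in map_a, class in map_b) pairs for co-located pixels."""
    if len(map_a) != len(map_b):
        raise ValueError("both maps must cover the same pixels")
    return Counter(zip(map_a, map_b))


def changed_fraction(map_a, map_b):
    """Fraction of pixels whose class differs between the two maps.

    This mixes real change with algorithm-induced change.
    """
    counts = transition_counts(map_a, map_b)
    total = sum(counts.values())
    changed = sum(n for (a, b), n in counts.items() if a != b)
    return changed / total


# Toy example with placeholder class codes (not real WorldCover data).
pixels_2020 = [1, 1, 2, 2, 3]
pixels_2021 = [1, 2, 2, 2, 3]
print(transition_counts(pixels_2020, pixels_2021))
print(changed_fraction(pixels_2020, pixels_2021))
```

Inspecting which transitions dominate the table (rather than only the total) can help flag implausible changes that are more likely to stem from the algorithm difference.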
Methodology
The ESA WorldCover global land cover product builds on ESA's experience with the GlobCover and CCI Land Cover products. The algorithm used to generate the WorldCover product is based on the algorithm used to generate the dynamic yearly Copernicus Global Land Service Land Cover (CGLS-LC) map at 100 m resolution (Buchhorn et al., 2020). The CGLS-LC workflow takes 100 m, 5-day Proba-V data as input, re-processed to the Sentinel-2 UTM grid, together with training data obtained at 10 m resolution. For the WorldCover map, however, Sentinel-2 multispectral imagery and Sentinel-1 C-band Synthetic Aperture Radar (SAR) data are used instead of Proba-V data.
The following methodological steps were included in the production of the WorldCover maps:
- Level 2A (L2A) products for Sentinel-2 (S2) and Ground Range Detected (GRD) products for Sentinel-1 (S1) are selected. The Sentinel-2 products are filtered for cloud cover, and the Sentinel-1 products are pre-processed to Gamma0 backscatter time series.
- Clouds and cloud shadows are removed from the Sentinel-2 reflectance bands. Ten-day median composites are computed from the cleaned band time series, and additional vegetation indices (VI) are calculated for each time step. For the S1 bands, an additional multitemporal speckle filter is applied before compositing the time series.
- Starting from those cleaned time series, temporal descriptive statistics such as the 10th, 50th and 90th percentiles and the interquartile range are calculated. These are used as features in the classification, together with some additional features extracted from auxiliary layers (e.g. the Copernicus Global Digital Elevation Model).
- Next, different models (scenarios) are trained with a gradient boosting decision tree algorithm (CatBoost) using a manually labelled set of training data at 10 m resolution available from the Copernicus Global Land Service Land Cover, complemented with training data obtained from OpenStreetMap, Global Surface Water Explorer and Global Mangrove Watch.
- Finally, the different scenarios are combined into a final land cover map through the application of different expert rules and subsequently tiled into 3 x 3 degree tiles in geographic projection (EPSG:4326). Some of these expert rules use auxiliary datasets (e.g. OpenStreetMap, Global Surface Water Explorer, Global Mangrove Watch) in order to modulate the probabilities provided by the classifier to produce a better final prediction.
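The compositing and feature-extraction steps above can be sketched for a single pixel and a single band as follows. This is an illustrative simplification using only the Python standard library, not the production code: the function names, the (day of year, value) input layout, the use of None for cloud-masked observations, and the choice of the "inclusive" percentile interpolation are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median, quantiles


def ten_day_median_composites(observations, window_days=10):
    """Group (day_of_year, value) pairs into fixed windows and take the median.

    Observations whose value is None (assumed here to mark cloud or
    cloud-shadow masking) are skipped. Returns (window_index, median)
    pairs sorted by window index; empty windows are omitted.
    """
    buckets = defaultdict(list)
    for day, value in observations:
        if value is None:
            continue
        buckets[(day - 1) // window_days].append(value)
    return [(window, median(values)) for window, values in sorted(buckets.items())]


def temporal_features(values):
    """Return the 10th, 50th and 90th percentiles and the interquartile range.

    Needs at least two values, as statistics.quantiles interpolates
    between data points.
    """
    deciles = quantiles(values, n=10, method="inclusive")
    quartiles = quantiles(values, n=4, method="inclusive")
    return {
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
        "iqr": quartiles[2] - quartiles[0],
    }


# Toy time series for one pixel: (day of year, reflectance or index value).
series = [(1, 0.2), (5, 0.4), (9, None), (12, 0.6), (18, 0.8), (25, 0.5)]
composites = ten_day_median_composites(series)
features = temporal_features([value for _, value in composites])
print(composites)
print(features)
```

In a real workflow the same statistics would be computed per pixel over whole raster stacks and for every band and index; the resulting feature vector, extended with auxiliary features, is what a classifier such as CatBoost would be trained on.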
Uncertainty and Accuracy
The WorldCover data are assessed using an independent statistical accuracy assessment, map comparisons, spatial accuracy assessment and end-user assessments. The statistical accuracy assessment follows the Committee on Earth Observation Satellites (CEOS) Working Group on Calibration and Validation (WGCV) Land Product Validation (LPV) requirements. A global stratification, independent of any land cover map and using the Sentinel-2 Universal Transverse Mercator (UTM) grid as a geographic base, has been applied to provide more than 21,000 primary sampling units (PSUs), each containing one hundred 10 m x 10 m reference pixels for the years 2020 and 2021. This supports robust accuracy assessment at global and continental levels (minimum of 3000 PSUs per continent). The WorldCover 2020 and 2021 products reach an overall accuracy of approximately 75% and 77%, respectively. More details can be found in the Product Validation Report.
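As an illustration of what an overall accuracy figure summarises, the sketch below derives overall, user's and producer's accuracy from a confusion matrix of map class versus reference class. It is a simplified, unweighted calculation on made-up counts; the function name and matrix layout are assumptions, and a stratified design such as the one described above would normally weight samples by stratum, which is not done here.

```python
def accuracy_metrics(confusion):
    """Compute accuracy metrics from a square confusion matrix.

    confusion[i][j] is the number of reference samples mapped as class i
    whose reference label is class j. Every class is assumed to have at
    least one mapped and one reference sample.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    overall = correct / total
    # User's accuracy: share of pixels mapped as class i that are correct.
    users = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    # Producer's accuracy: share of reference class i that was mapped correctly.
    producers = [
        confusion[i][i] / sum(confusion[r][i] for r in range(n_classes))
        for i in range(n_classes)
    ]
    return overall, users, producers


# Made-up two-class example (not WorldCover validation data).
toy_matrix = [
    [40, 10],
    [5, 45],
]
overall, users, producers = accuracy_metrics(toy_matrix)
print(overall, users, producers)
```

Per-class user's and producer's accuracies are worth checking alongside the overall figure, since a single global number can hide large differences between classes and regions.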
Dataset Sustainment
The WorldCover dataset is a demonstration product from the European Space Agency and is expected to migrate to the operational Copernicus global land cover monitoring service in the coming years.