Skip to content

Extracting reflectance from Sentinel-2 data

This step aims at extracting reflectance from Sentinel-2 data using a vector file containing points and exporting the data to a csv file. If new acquisitions are added to the Sentinel-2 directory, new data is extracted and added to the existing csv file.


The input parameters are :

  • obs_path : The path to a vector file containing observation points, must have an ID column corresponding to name_column parameter, an 'area_name' column with the name of the Sentinel-2 tile from which to extract reflectance, and a 'espg' column containing the espg integer corresponding to the CRS of the Sentinel-2 tile.
  • sentinel_source : Can be either 'Planetary', in which case data is downloaded from Microsoft Planetary Computer stac catalogs, or the path of the directory containing Sentinel-2 data.
  • export_path : The path to write csv file with extracted reflectance.
  • name_column (optional) : The name of the ID column. The default is "id".
  • cloudiness_path (optional) : Path of a csv with the columns 'area_name','Date' and 'cloudiness', can be calculated by the extract_cloudiness function. Can be ignored if sentinel_source is 'Planetary'
  • lim_perc_cloud (optional) : Maximum cloudiness at the tile scale, used to filter used SENTINEL dates. Between 0 and 1.
  • bands_to_extract (optional) : List of bands to extract (ex : ["B1","B8A","Mask"]) The "Mask" band corresponds to "CLM" for THEIA data, and "SCL" for Planetary.
  • tile_selection (optional) : List of tiles from which to extract reflectance (ex : ["T31UFQ", "T31UGQ"]). If None, all tiles are extracted.
  • start_date (optional) : First date of the period from which to extract reflectance
  • end_date (optional) : Last date of the period from which to extract reflectance


The output is a csv file at export_path. It contains the following columns :

  • epsg : The CRS of the Sentinel-2 tile from which data was extracted
  • area_name : The name of the Sentinel-2 tile from which data was extracted
  • an ID column corresponding to the name_column parameter
  • id_pixel : The ID of the pixel whose centroid is at the point location in 10m Sentinel data in the corresponding epsg. id_pixel goes from 1 to the number of pixels in the observation.
  • Date : Date of the Sentinel-2 acquisition
  • One column for each band in the bands_to_extract list containg the value extracted from Sentinel-2 data.

How to use

From a script

from fordead.validation.extract_reflectance import extract_reflectance

extract_reflectance(obs_path = <obs_path>, export_path = <export_path>, name_column = <name_column>)

From the command line

fordead extract_reflectance [OPTIONS]

See detailed documentation on the site

How does it work ?

Importing the vector file containing points used as extraction locations

The vector file at obs_path is imported using the geopandas package.

Importing already extracted data if it exists

  • If a file already exists at export_path, it is imported.
  • If changes in last extraction's number of observations are detected, a message is printed to the console and the user is asked if they want to continue. This can prevent issues in case the user uses the wrong obs_path or export_path by mistake.

    Function used: get_already_extracted()

Extracting reflectance from Sentinel-2 data

Exporting extracted reflectance

Extracted reflectance is written at export_path. If the file already exists, new data is added to the already existing file.