Module fordead.reflectance_extraction

Functions

attribute_id_to_obs

def attribute_id_to_obs(
    obs,
    name_column
)
Adds an ID column if it doesn't already exist. If no column named after the name_column parameter exists in the geodataframe, one is added with integers from 1 to the number of observations.

Parameters
----------
obs : geopandas GeoDataFrame
    Observation points or polygons
name_column : str
    Name of the ID column.

Returns
-------
obs : geopandas GeoDataFrame
    Observation points or polygons with added column named using parameter name_column if it doesn't already exist.
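
As a rough illustration of the rule described above, the sketch below applies it to a plain list of dictionaries rather than a GeoDataFrame. The helper name `add_id_column` and the record layout are assumptions made for this example; they are not part of fordead.

```python
def add_id_column(records, name_column):
    """Add a 1-based integer ID under `name_column` unless it already exists.

    `records` stands in for the rows of the observation table
    (hypothetical layout, one dict per observation).
    """
    # The column is considered present if the first record already has it.
    if records and name_column in records[0]:
        return records
    # Otherwise number the observations from 1 to len(records).
    return [{**rec, name_column: i} for i, rec in enumerate(records, start=1)]


obs = [{"species": "spruce"}, {"species": "fir"}]
print(add_id_column(obs, "id"))
```

An existing ID column is left untouched, so calling the helper twice does not renumber the observations.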

buffer_obs

def buffer_obs(
    obs,
    buffer,
    name_column
)
Applies a buffer to dilate or erode observations. The names of observations that are discarded because they are too small for an eroding buffer are printed.

Parameters
----------
obs : geopandas GeoDataFrame
    Observation points or polygons with a column named name_column used to identify observations.
buffer : int
    Length in meters of the buffer used to dilate (positive integer) or erode (negative integer) the observations. Some observations may disappear completely if a negative buffer is applied.
name_column : str
    Name of the column used to identify observations.

Returns
-------
obs : geopandas GeoDataFrame
    Observation polygons with the buffer applied.

extract_points

def extract_points(
    x,
    df,
    **kwargs
)
Extracts the values of x at the point coordinates given in df.

Parameters
----------
x : xarray.DataArray or xarray.Dataset
df : pandas.DataFrame
    Coordinates of the points

Returns
-------
xarray.DataArray or xarray.Dataset
    The points values.

Examples
--------
>>> import numpy as np
>>> import pandas as pd
>>> import xarray as xr
>>> import dask.array
>>> from fordead.reflectance_extraction import extract_points
>>> # Lazy 100 x 200 raster, split into 10 x 10 dask chunks
>>> da = xr.DataArray(
...     dask.array.random.random((100, 200), chunks=10),
...     coords=[('x', range(100)), ('y', range(200))],
... )
>>> # 100 points with randomly offset coordinates
>>> df = pd.DataFrame(
...     dict(
...         x=np.random.permutation(range(100))[:100] + np.random.random(100),
...         y=np.random.permutation(range(100))[:100] + np.random.random(100),
...         z=range(100),
...     )
... )
>>> extract_points(da, df, method="nearest", tolerance=.5)

extract_raster_values

def extract_raster_values(
    tile_coll,
    points,
    bands_to_extract=None,
    extracted_reflectance=None,
    export_path=None,
    chunksize=512,
    by_chunk=True,
    dropna=True,
    dtype=int
)
Samples raster values at each XY point.

Parameters
----------
tile_coll : pystac.ItemCollection | xarray.DataArray
    Item_collection of a unique MGRS Tile or the corresponding xarray DataArray
points : geopandas.GeoDataFrame
    Observation points
bands_to_extract : list of strings
    List of bands to extract. If None, all bands are extracted.
extracted_reflectance : pandas.DataFrame
    Table of already sampled raster values, which are not extracted again.
    Expected columns are "area_name", "Date" and an ID column used to merge with the points dataframe.
export_path : str, optional
    Path to export the result
by_chunk : bool, optional
    If True, the extraction is split into chunks and saved after each chunk. The default is True.
    This is especially useful for long time series extraction.
    Saving each chunk avoids losing already extracted values in case of interruption,
    e.g. if the server fails to respond to the request, which occurs from time to time with Planetary Computer.
chunksize : int, optional
    Size of the chunk {x,y}. The default is 512, the same as the default chunksize for COG.
    This is the size of the chunk that will be (down)loaded to extract points, see stackstac.stack.
    For reference, one chunk of a Sentinel-2 band (int16) is 512 x 512 x 2 bytes, i.e. about 524 kB.
dropna : bool, optional
    If True, drop rows with NaN values. The default is True.
dtype : type, optional
    Type of the extracted values. The default is int.
    The conversion from float is done after the extraction, so that
    NaNs are dropped before converting to another type.

Returns
-------
pandas DataFrame or None
    The result is written to the file `export_path`.
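
The interplay between `dropna` and `dtype` described above can be sketched in plain Python: a NaN cannot be converted to an integer, so rows containing NaN have to be dropped first. The helper `drop_nan_then_cast` and the band names are hypothetical and only illustrate the order of operations, not the function's actual code.

```python
import math


def drop_nan_then_cast(rows, dtype=int):
    """Drop rows that contain a NaN, then cast the remaining values.

    `rows` is a list of {band_name: float_value} dicts (illustrative layout).
    """
    # Step 1: keep only the rows in which every band value is a real number.
    kept = [row for row in rows if not any(math.isnan(v) for v in row.values())]
    # Step 2: the cast is now safe; int(float("nan")) would raise ValueError.
    return [{band: dtype(v) for band, v in row.items()} for row in kept]


rows = [
    {"B4": 512.0, "B8": 2048.0},
    {"B4": float("nan"), "B8": 1999.0},  # e.g. a pixel without data
]
print(drop_nan_then_cast(rows))
```

Casting before dropping would fail on the second row, which is why the conversion is applied last.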

get_already_extracted

def get_already_extracted(
    export_path,
    name_column,
    bands_to_extract
)
Returns already extracted acquisition dates for each observation and each tile.

Parameters
----------
export_path : str
    Path to reflectance.csv
name_column : str
    Name of the ID column in obs
bands_to_extract : list
    List of bands to extract

Returns
-------
pandas DataFrame
    Already extracted acquisition dates for each observation and each tile
    Columns are "area_name", name_column, "Date".
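
A minimal sketch of the idea, using only the standard library: read the exported CSV and keep each distinct combination of tile, observation ID and date. The helper `already_extracted`, the sample tile name and the sample values are assumptions for illustration; the sketch also ignores `bands_to_extract`, whose exact role is not described here.

```python
import csv
import io


def already_extracted(csv_file, name_column):
    """Return the unique (area_name, ID, Date) triples found in a CSV file."""
    seen = set()
    triples = []
    for rec in csv.DictReader(csv_file):
        key = (rec["area_name"], rec[name_column], rec["Date"])
        if key not in seen:  # several pixels can share the same triple
            seen.add(key)
            triples.append(key)
    return triples


# Illustrative content standing in for reflectance.csv
sample = io.StringIO(
    "area_name,id,Date,B4\n"
    "T31UFQ,1,2020-05-01,512\n"
    "T31UFQ,1,2020-05-01,520\n"
    "T31UFQ,2,2020-05-01,498\n"
)
print(already_extracted(sample, "id"))
```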

get_bounds

def get_bounds(
    obs
)
Gets the bounds around a polygon so that they match the limits of Sentinel-2 pixels.

Parameters
----------
obs : geopandas GeoDataFrame
    A polygon

Returns
-------
bounds : 1D array
    Bounds around the polygon
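
One way to picture this alignment is to round the lower bounds down and the upper bounds up to the nearest multiple of the pixel size. The sketch below assumes 10 m pixels whose edges fall on multiples of 10 in the tile's projected CRS; the helper `snap_bounds` is hypothetical and may differ from what get_bounds does internally.

```python
import math


def snap_bounds(minx, miny, maxx, maxy, pixel_size=10):
    """Expand a bounding box outwards so its edges fall on pixel limits."""
    return (
        math.floor(minx / pixel_size) * pixel_size,  # push left edge down
        math.floor(miny / pixel_size) * pixel_size,  # push bottom edge down
        math.ceil(maxx / pixel_size) * pixel_size,   # push right edge up
        math.ceil(maxy / pixel_size) * pixel_size,   # push top edge up
    )


print(snap_bounds(503.2, 1207.9, 528.1, 1231.0))
```

Rounding outwards guarantees that the snapped box still contains the whole polygon.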

get_grid_points

def get_grid_points(
    obs_polygons,
    sen_polygons,
    name_column
)
Generates points in a grid corresponding to the centroids of Sentinel-2 pixels inside the polygons.

Parameters
----------
obs_polygons : geopandas GeoDataFrame
    Observation polygons with a column named name_column used to identify observations.
sen_polygons : geopandas GeoDataFrame
    Polygons of Sentinel-2 tiles extent with 'area_name' and 'epsg' columns corresponding to the name of the tile, and the projection system respectively.
name_column : str
    Name of the column used to identify observations.

Returns
-------
grid_points : geopandas GeoDataFrame
    Points corresponding to the centroids of the Sentinel-2 pixels of each Sentinel-2 tile intersecting with the polygons.

get_polygons_from_sentinel_dirs

def get_polygons_from_sentinel_dirs(
    sentinel_dir,
    list_tiles
)
Creates a Sentinel-2 tiles extent vector from existing Sentinel-2 data

Parameters
----------
sentinel_dir : str
    Path of directory containing Sentinel-2 data.
list_tiles : list
    A list of names of Sentinel-2 directories. If this parameter is used, extraction is limited to those directories.

Returns
-------
concat_areas : geopandas GeoDataFrame
    Vector containing the extent of Sentinel-2 tiles contained in sentinel_dir directory, with 'epsg' and 'area_name' columns corresponding to the projection system and the name of the tile derived from the name of the directory containing its data.

get_sen_obs_intersection

def get_sen_obs_intersection(
    obs_polygons,
    sen_polygons,
    name_column
)
Observation polygons are intersected with the Sentinel-2 tiles extent vector. Intersections where the observation polygon does not fit entirely within the Sentinel-2 tile are removed.
Observation polygons which intersect no Sentinel-2 tile are removed and their IDs are printed.

Parameters
----------
obs_polygons : geopandas GeoDataFrame
    Polygons of observations.
sen_polygons : geopandas GeoDataFrame
    Polygons of Sentinel-2 tiles extent with 'area_name' and 'epsg' columns corresponding to the name of the tile, and the projection system respectively.
name_column : str
    Name of the ID column.

Returns
-------
geopandas GeoDataFrame
    Intersection of obs_polygons and sen_polygons, with incomplete intersections removed.

polygons_to_grid_points

def polygons_to_grid_points(
    polygons,
    name_column
)
Converts polygons to points corresponding to the centroids of the Sentinel-2 pixels of each Sentinel-2 tile intersecting with the polygons.
Prints the polygons that contain no pixel centroid.

Parameters
----------
polygons : geopandas GeoDataFrame
    Observation polygons with 'area_name' and 'epsg' columns, corresponding to the name of a Sentinel-2 tile and its CRS respectively.
name_column : str
    Name of the ID column.

Returns
-------
grid_points : geopandas GeoDataFrame
    Points corresponding to the centroids of the Sentinel-2 pixels of each Sentinel-2 tile intersecting with the polygons.
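
The conversion can be sketched without any geospatial library: list the centres of the pixels covering the polygon's bounding box and keep those that fall inside the polygon. The sketch assumes 10 m pixels aligned on multiples of 10 and a simple polygon given as a list of vertices; both helper functions are hypothetical and use a basic ray-casting test rather than fordead's own implementation.

```python
import math


def point_in_polygon(x, y, ring):
    """Ray-casting test; `ring` is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def pixel_centroids_in_polygon(ring, pixel_size=10):
    """Return the centres of the grid pixels that lie inside the polygon."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    cols = range(math.floor(min(xs) / pixel_size), math.ceil(max(xs) / pixel_size))
    rows = range(math.floor(min(ys) / pixel_size), math.ceil(max(ys) / pixel_size))
    points = []
    for col in cols:
        for row in rows:
            cx = (col + 0.5) * pixel_size
            cy = (row + 0.5) * pixel_size
            if point_in_polygon(cx, cy, ring):
                points.append((cx, cy))
    return points


square = [(0, 0), (20, 0), (20, 20), (0, 20)]
print(pixel_centroids_in_polygon(square))

small_triangle = [(1, 1), (3, 1), (2, 3)]  # too small to contain a centroid
print(pixel_centroids_in_polygon(small_triangle))
```

The second call returns an empty list, which corresponds to the case where a polygon is reported as containing no pixel centroid.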

process_points

def process_points(
    points,
    sen_polygons,
    name_column
)
Intersects observation points with Sentinel-2 tiles extent vector.
Adds an 'id_pixel' column filled with 0 so that the resulting vector can be used in the export_reflectance function.
Points outside of available Sentinel-2 tiles are detected and their IDs are printed.

Parameters
----------
points : geopandas GeoDataFrame
    Observation points used for intersection
sen_polygons : geopandas GeoDataFrame
    Polygons of Sentinel-2 tiles extent with 'area_name' and 'epsg' columns corresponding to the name of the tile, and the projection system respectively.
name_column : str
    Name of the ID column in points

Returns
-------
obs_intersection : geopandas GeoDataFrame
    Intersection of points and sen_polygons, with an added 'id_pixel' column.
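
The steps above can be pictured with a simplified sketch in which each tile extent is reduced to a bounding box and all coordinates share one CRS. The helper `assign_points_to_tiles`, the tile name and the data layout are assumptions for illustration only; the real function works on GeoDataFrames and on the actual tile geometries.

```python
def assign_points_to_tiles(points, tiles, name_column):
    """Attach each point to every tile whose bounding box contains it.

    points : list of dicts with "x", "y" and an ID under `name_column`
    tiles  : {area_name: (minx, miny, maxx, maxy)}
    """
    matched, outside = [], []
    for pt in points:
        hits = [
            name
            for name, (minx, miny, maxx, maxy) in tiles.items()
            if minx <= pt["x"] <= maxx and miny <= pt["y"] <= maxy
        ]
        if not hits:
            outside.append(pt[name_column])
        for name in hits:
            # A point is a single pixel, hence the constant id_pixel of 0.
            matched.append({**pt, "area_name": name, "id_pixel": 0})
    if outside:
        print("Points outside available tiles:", outside)
    return matched


tiles = {"T31UFQ": (0, 0, 100, 100)}
points = [{"id": 1, "x": 10, "y": 20}, {"id": 2, "x": 500, "y": 20}]
print(assign_points_to_tiles(points, tiles, "id"))
```

A point lying in the overlap of two tiles would appear once per tile in the result.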