3. Dieback detection
STEP 3: Detection of dieback by comparing the model-predicted vegetation index with the actual vegetation index
This step detects dieback. For each SENTINEL date not used for training, the actual vegetation index is compared with the vegetation index predicted by the model calculated in the previous step. If the difference exceeds a threshold, an anomaly is detected. If three successive anomalies are detected, the pixel is considered to be suffering from dieback. If a pixel detected as suffering from dieback then has three successive dates without anomalies, it is no longer considered to be suffering from dieback. The periods between detection and return to normal can be saved along with an associated stress index. This stress index is either the mean of the difference between the vegetation index and its prediction, or a weighted mean in which the weight of each date is its rank (1, 2, 3, ...) counted from the first anomaly.
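Written out, the two variants of the stress index can be expressed as follows. The notation is introduced here for illustration only: n is the number of unmasked dates in a stress period, and d_i is the difference between the vegetation index and its prediction at the i-th of those dates.

```latex
\[
\text{stress index}_{\text{mean}} = \frac{1}{n}\sum_{i=1}^{n} d_i,
\qquad
\text{stress index}_{\text{weighted mean}} = \frac{\sum_{i=1}^{n} i\, d_i}{\sum_{i=1}^{n} i}
\]
```

With the weighted mean, later dates of a stress period count more than the first ones.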
The input parameters are:
- data_directory: The path of the output folder where the detection results will be written.
- threshold_anomaly: Threshold above which the difference between the actual and predicted vegetation index is considered an anomaly.
- max_nb_stress_periods: Maximum number of stress periods. Pixels with a higher number of stress periods are masked in exports. Unused if stress_index_mode is None.
- stress_index_mode: Chosen stress index. If 'mean', the index is the mean of the difference between the vegetation index and the predicted vegetation index for all unmasked dates after the first anomaly that was subsequently confirmed. If 'weighted_mean', the index is a weighted mean in which the weight of each date is its rank (1, 2, 3, etc.) counted from the first anomaly. If None, stress periods are not detected and no stress information is saved.
- vi: Vegetation index used, can be ignored if the compute_masked_vegetationindex step has been used.
- path_dict_vi: Path to a text file used to add usable vegetation indices. If not provided, only the indices included in the package are usable (CRSWIR, NDVI, NDWI). The file examples/ex_dict_vi.txt gives an example of how to format this file. Each entry must give the index's name, its formula, and "+" or "-" depending on whether the index's value increases or decreases in case of dieback. Can be ignored if it was already provided in the compute_masked_vegetationindex step.
The outputs of this step, in the data_directory folder, are:
- In the DataDieback folder, four rasters:
- count_dieback : the number of successive dates with anomalies
- first_date_unconfirmed_dieback: The date index of the latest potential state change of the pixel, not necessarily confirmed. It is the first anomaly if the pixel is not detected as dieback, or the first non-anomaly if the pixel is detected as dieback.
- first_date_dieback: The index of the first date with an anomaly in the last series of anomalies
- state_dieback: A binary raster whose value is 1 if the pixel is detected as suffering from dieback (at least three successive anomalies)
- In the DataStress folder, if stress_index_mode is not None, five rasters:
- dates_stress: A raster with max_nb_stress_periods*2+1 bands, containing the date indices of the first anomaly and of the return to normal for each stress period.
- nb_periods_stress: A raster containing the total number of stress periods for each pixel
- cum_diff_stress: A raster with max_nb_stress_periods+1 bands containing, for each stress period, the sum of the difference between the vegetation index and its prediction, multiplied by the weight if stress_index_mode is "weighted_mean".
- nb_dates_stress: A raster with max_nb_stress_periods+1 bands containing the number of unmasked dates of each stress period.
- stress_index: A raster with max_nb_stress_periods+1 bands containing the stress index of each stress period. It is the mean or weighted mean of the difference between the vegetation index and its prediction, depending on stress_index_mode, and is obtained from cum_diff_stress and nb_dates_stress.
The number of bands of these rasters is meant to account for each potential stress period, plus one for a potential final dieback detection.
- In the DataAnomalies folder, a raster for each date Anomalies_YYYY-MM-DD.tif whose value is 1 where anomalies are detected.
- If stress_index_mode is provided, in the TimelessMasks folder, the binary raster too_many_stress_periods_mask.tif, which is 1 for pixels where the number of stress periods is less than or equal to max_nb_stress_periods, and 0 otherwise.
How to use
From a script
from fordead.steps.step3_dieback_detection import dieback_detection
dieback_detection(data_directory = <data_directory>)
From the command line
fordead dieback_detection [OPTIONS]
See detailed documentation on the site
How it works
Importing information on previous processes and deleting obsolete results if they exist
Information about the previous processes is imported (parameters, data paths, used dates...). If the parameters used have been modified, all results from this step onwards are deleted. Thus, unless the parameters have been modified or this is the first time this step is performed, the dieback detection is updated using only the new SENTINEL dates.
Importing the results of the previous steps
The coefficients of the vegetation index prediction model are imported, as well as the array containing the index of the first date used for the detection. The arrays containing the information related to the detection of dieback (state of the pixels, number of successive anomalies, index of the date of the first anomaly) are initialized if the step is used for the first time, or imported if the detection is being updated.
For each date not already used for dieback detection:
Import of the calculated vegetation index and the mask
Functions used: import_masked_vi()
(OPTIONAL - if correct_vi is True in the previous model calculation step) Correction of the vegetation index using the median vegetation index of the unmasked pixels of interest across the entire area
- Masking of the pixels not belonging to the area of interest, or masked
- Calculation of the median vegetation index on the remaining pixels of the whole area
- Calculation of a correction term, as the difference between the calculated median and the value predicted for this date by the model fitted in the previous step to the medians calculated for all dates
- Application of the correction term by adding it to the value of the vegetation index of every pixel
Functions used: correct_vi_date()
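The correction steps above can be sketched for a flat list of pixel values. This is a minimal illustration, not the code of correct_vi_date(): the function name, the list-based inputs and the sign convention (predicted median minus observed median, so that adding the term brings the scene median back to its expected value) are assumptions.

```python
from statistics import median

def correct_vi_sketch(vi_values, valid_mask, predicted_median):
    """Shift every pixel by the gap between the predicted and observed median.

    vi_values: vegetation index of each pixel at the current date.
    valid_mask: True for unmasked pixels inside the area of interest.
    predicted_median: median expected at this date (hypothetical input,
        standing in for the prediction of the median model).
    """
    # Keep only unmasked pixels of interest to compute the observed median
    valid = [v for v, ok in zip(vi_values, valid_mask) if ok]
    observed_median = median(valid)
    # Assumed sign convention: the term is added to every pixel
    correction = predicted_median - observed_median
    corrected = [v + correction for v in vi_values]
    return corrected, correction

corrected, correction = correct_vi_sketch(
    [0.2, 0.4, 0.6, 9.0], [True, True, True, False], predicted_median=0.5
)
```

Here the masked outlier (9.0) does not influence the median, but it still receives the same correction term as every other pixel.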
Prediction of the vegetation index at the given date.
The vegetation index is predicted from the model coefficients.
Functions used: prediction_vegetation_index()
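As an illustration of predicting a value from model coefficients, the sketch below assumes a harmonic model with an annual period and five coefficients. The model form, the coefficient order, the period of 365.25 days and the function name are all assumptions made for this example; this section does not state them, and the actual prediction_vegetation_index() may differ.

```python
import math

def predict_vi_sketch(coefficients, day, period=365.25):
    """Evaluate an assumed harmonic model at a given day number.

    coefficients: (a1, b1, b2, b3, b4), a hypothetical ordering.
    day: number of days since an arbitrary reference date.
    """
    a1, b1, b2, b3, b4 = coefficients
    t = 2 * math.pi * day / period
    return (
        a1
        + b1 * math.sin(t)
        + b2 * math.cos(t)
        + b3 * math.sin(2 * t)
        + b4 * math.cos(2 * t)
    )

# A constant model predicts the same value at every date
flat = predict_vi_sketch((0.5, 0.0, 0.0, 0.0, 0.0), day=100)
```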
Anomalies are detected by comparing the vegetation index with its prediction. Knowing whether the vegetation index is expected to increase or decrease in case of dieback, anomalies are detected where the difference between the index and its prediction is greater than threshold_anomaly in the direction of expected change in case of dieback.
Functions used: detection_anomalies()
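The directional comparison can be sketched for a single pixel as follows. The function name and the dieback_direction argument are hypothetical; the "+" and "-" values mirror the convention described for path_dict_vi, and masked pixels are not handled here.

```python
def is_anomaly_sketch(vi, predicted_vi, threshold_anomaly, dieback_direction="+"):
    """Return True if the index deviates from its prediction by more than
    the threshold, in the direction expected in case of dieback.

    dieback_direction: "+" if the index increases with dieback,
        "-" if it decreases (hypothetical argument).
    """
    diff = vi - predicted_vi
    if dieback_direction == "-":
        # For indices that drop with dieback, look at the negative deviation
        diff = -diff
    return diff > threshold_anomaly

# Index rising well above its prediction, for an index that increases with dieback
flagged = is_anomaly_sketch(0.9, 0.6, threshold_anomaly=0.16, dieback_direction="+")
```

A deviation in the opposite direction, however large, is not flagged as an anomaly.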
Detection of dieback
The successive anomalies are counted, and the pixel is considered to be suffering from dieback if there are three successive anomalies. Once the pixel is considered to be suffering from dieback, the successive dates without anomalies are counted instead, and the pixel is no longer considered to be suffering from dieback if there are three successive dates without anomalies.
Functions used: detection_dieback()
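The counting rule above behaves like a small state machine, sketched here for a single pixel. The function names are hypothetical, and resetting the counter to zero after a confirmed change of state is an assumption; detection_dieback() works on whole arrays and may store its counters differently.

```python
def update_state_sketch(state, count, is_anomaly, confirmations=3):
    """Update the dieback state of one pixel with one new unmasked date.

    state: True if the pixel is currently considered as suffering from dieback.
    count: number of successive dates contradicting the current state.
    """
    contradicts = is_anomaly != state
    if contradicts:
        count += 1
        if count == confirmations:
            # Three successive contradicting dates: the state changes
            state = not state
            count = 0  # assumed reset after a confirmed change
    else:
        # The series of contradicting dates is broken
        count = 0
    return state, count

def run_sketch(anomalies):
    """Return the dieback state after each date of a series of anomaly flags."""
    state, count = False, 0
    history = []
    for is_anomaly in anomalies:
        state, count = update_state_sketch(state, count, is_anomaly)
        history.append(state)
    return history

states = run_sketch([True, True, True, False, False, True, False, False, False])
```

In this series the pixel is flagged at the third anomaly, stays flagged when only two normal dates occur in a row, and returns to normal after three successive dates without anomalies.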
Saving stress periods information (OPTIONAL - if stress_index_mode is not None)
The rasters containing stress period information are updated. The number of stress periods is updated when pixels return to normal. When changes of state are confirmed, the first date of anomaly or of return to normal is saved. For each date, the number of dates within stress periods is updated if the pixel is unmasked and in a stress period. The difference between the vegetation index and its prediction is added to the cum_diff_stress raster, after being multiplied by the number of the date if stress_index_mode is "weighted_mean".
Functions used: save_stress()
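The accumulation for one pixel and one unmasked date within a stress period can be sketched as below. The function name is hypothetical, and the sketch leaves out how save_stress() handles masked dates and the dates preceding confirmation of a state change.

```python
def accumulate_stress_sketch(cum_diff, nb_dates, diff, stress_index_mode="mean"):
    """Add one unmasked date to the running totals of a stress period.

    cum_diff: running (weighted) sum of differences for the period.
    nb_dates: number of unmasked dates already counted in the period.
    diff: difference between the vegetation index and its prediction.
    """
    nb_dates += 1
    # The weight is the rank of the date within the period (1, 2, 3, ...)
    weight = nb_dates if stress_index_mode == "weighted_mean" else 1
    cum_diff += diff * weight
    return cum_diff, nb_dates

cum_diff, nb_dates = 0.0, 0
for diff in [0.2, 0.3, 0.4]:
    cum_diff, nb_dates = accumulate_stress_sketch(
        cum_diff, nb_dates, diff, stress_index_mode="weighted_mean"
    )
# cum_diff is now 0.2*1 + 0.3*2 + 0.4*3, up to floating-point rounding
```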
Creating a mask for pixels which went through too many stress periods
The number of stress periods for each pixel is compared to the max_nb_stress_periods parameter, resulting in the mask too_many_stress_periods_mask. This mask is used to invalidate data when exporting results, creating timelapses, and so on.
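A minimal sketch of this comparison on a small grid, using nested lists in place of a raster. The function name is hypothetical; the convention (1 where the number of stress periods is less than or equal to max_nb_stress_periods, 0 otherwise) follows the description of too_many_stress_periods_mask.tif in the outputs.

```python
def stress_periods_mask_sketch(nb_periods_stress, max_nb_stress_periods):
    """Return 1 for pixels within the allowed number of stress periods, else 0."""
    return [
        [int(n <= max_nb_stress_periods) for n in row]
        for row in nb_periods_stress
    ]

mask = stress_periods_mask_sketch([[0, 3], [6, 5]], max_nb_stress_periods=5)
```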
The stress index is calculated
If stress_index_mode is "mean", the stress index raster is the cum_diff_stress raster divided by the nb_dates_stress raster.
If stress_index_mode is "weighted_mean", the stress index raster is the cum_diff_stress raster divided by the sum of the weights (1+2+3+...+ nb_dates_stress).
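The two divisions can be sketched for one pixel and one stress period as follows. The function name is hypothetical, and returning None for a period without any unmasked date is an assumption made to avoid a division by zero.

```python
def stress_index_sketch(cum_diff, nb_dates, stress_index_mode="mean"):
    """Compute the stress index of one stress period from its running totals."""
    if nb_dates == 0:
        return None  # assumed handling of empty periods
    if stress_index_mode == "mean":
        return cum_diff / nb_dates
    if stress_index_mode == "weighted_mean":
        # Sum of the weights 1 + 2 + ... + nb_dates
        sum_weights = nb_dates * (nb_dates + 1) / 2
        return cum_diff / sum_weights
    raise ValueError("stress_index_mode must be 'mean' or 'weighted_mean'")

mean_index = stress_index_sketch(0.9, 3, "mean")
weighted_index = stress_index_sketch(2.0, 3, "weighted_mean")
```

For three dates the weights sum to 6, so a weighted cumulative difference of 2.0 gives an index of one third.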
Writing the results
The information related to the detection of dieback, stress data, stress indices are written as well as the too_many_stress_periods_mask. All parameters, data paths and dates used are also saved.
Functions used: write_tif()