- class eocrops.climatools.resampling.TempResampling(range_dates=('2017-01-01', '2017-12-31'), stop='2017-12-31', smooth=False, id_column='key', varname_gdd='sum_Growing Degree days daily max/min', period_nas_rate=0.8, drop_nas_rate=0.2, bands=None, subset_id_fields=None)
Bases: object
Resample time series (e.g. satellite image time series and daily weather data) over accumulated GDU periods (thermal time).
- Parameters:
- range_dates : tuple
Range of dates covered by the time series data. Satellite data should already be resampled over fixed periods from the start_date (e.g. 16-day periods from 1 January).
- stop : str
Stopping date for temporal resampling. Useful when training an in-season model, since only information prior to this date is kept.
- smooth : bool
Apply smoothing to the time series when resampling into daily data in the pipeline.
- id_column : str
Column of the weather data file that holds the identifier of each observation.
- varname_gdd : str, optional
Name of the column in the weather dataset that holds the accumulated GDUs.
- period_nas_rate : float
Keep a period only if it is at least this complete, expressed as a fraction (0.8, i.e. 80 %, by default).
- drop_nas_rate : float
NA-rate threshold used for dropping data (0.2 by default).
- bands : tuple, optional
Bands to use (None by default).
- subset_id_fields : tuple, optional
Subset of field identifiers to use (None by default).
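To make "accumulated GDU periods" and the role of the stop date concrete, here is a minimal pandas sketch. It is not the eocrops implementation: the helper `accumulate_gdd`, the base temperature of 10 °C, the temperature values, and the in-season stop date are all assumptions chosen for illustration.

```python
import pandas as pd

def accumulate_gdd(tmin, tmax, t_base=10.0):
    """Daily GDD = max(mean temperature - base, 0), accumulated over time.

    t_base is an assumed base temperature, not a value taken from eocrops.
    """
    daily_gdd = ((tmin + tmax) / 2.0 - t_base).clip(lower=0.0)
    return daily_gdd.cumsum()

# Hypothetical daily temperatures (degrees Celsius)
dates = pd.date_range("2017-05-31", periods=6, freq="D")
tmin = pd.Series([4.0, 8.0, 10.0, 12.0, 9.0, 11.0], index=dates)
tmax = pd.Series([12.0, 16.0, 20.0, 24.0, 15.0, 21.0], index=dates)

acc = accumulate_gdd(tmin, tmax)

# Keep only observations up to a (hypothetical) in-season stop date
stop = "2017-06-03"
in_season = acc.loc[:stop]
```

Truncating at the stop date before any aggregation ensures that an in-season model never sees observations that would not be available at prediction time.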
Methods
fill_missing_columns(output[, na_value]): Fill NA values, since some fields do not match the GDU intervals (especially for high intervals)
get_gdd_value_peak(features_data[, fname, ...]): Get the GDD and vegetation index values at the point where the daily vegetation index time series is at its maximum
Get last accumulated GDD observed for each observation
get_sat_features(feature_data, fname): Retrieve the vegetation index time series from the 3D array for a given feature name
get_weather_feature(fname): Get the subset of columns from the Meteoblue data that corresponds to a given feature
load_meta_data(feature_vector, filepath): Load the metadata file that contains information about the fields
load_sat_data(filepath): Load the 3D arrays that contain satellite data
load_weather_data(filepath): Load weather data that was reformatted and saved to a csv file (filepath)
resample_s2(features_data, fname[, stat, ...]): Resample satellite data over periods (thermal or calendar) from the planting date
resample_weather(fname, stat[, increment, ...]): Resample weather data over periods (thermal or calendar) from the planting date
- get_weather_feature(fname)
Get the subset of columns from the Meteoblue data that corresponds to a given feature
- load_meta_data(feature_vector, filepath)
Load the metadata file that contains information about the fields
- load_weather_data(filepath)
Load weather data that was reformatted and saved to a csv file (filepath). The file corresponds to the output of (1) Meteoblue_client and (2) format_data (predfin.data_extraction.ce_hub). See the example in workflow_yield_prediction to understand how to build those files.
- Parameters:
- filepath : str
Path where the csv file is saved
- Returns:
- weather_data merged with meta_data
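The "weather_data merged with meta_data" return value can be pictured as a join on the identifier column. The sketch below uses plain pandas rather than eocrops itself; `key` and the GDD column name come from the class defaults, while the `timestamp` and `planting_date` columns and all values are hypothetical.

```python
import io
import pandas as pd

GDD_COL = "sum_Growing Degree days daily max/min"  # default varname_gdd

# Hypothetical weather csv: one row per field (key) and day
weather_csv = io.StringIO(
    "key,timestamp,sum_Growing Degree days daily max/min\n"
    "a,2017-04-01,5.0\n"
    "a,2017-04-02,11.0\n"
    "b,2017-04-01,4.0\n"
)
weather = pd.read_csv(weather_csv, parse_dates=["timestamp"])

# Hypothetical metadata: one row per field
meta = pd.DataFrame({
    "key": ["a", "b"],
    "planting_date": pd.to_datetime(["2017-03-20", "2017-03-25"]),
})

# A left join keeps every weather row and attaches the field metadata to it
merged = weather.merge(meta, on="key", how="left")
```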
- get_sat_features(feature_data, fname)
Retrieve the vegetation index time series from the 3D array for a given feature name
- get_gdd_value_peak(features_data, fname='Cab', ub_gdd=900, lb_gdd=400, days_range=8)
Get the GDD and vegetation index values at the point where the daily vegetation index time series is at its maximum
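One way to picture the peak lookup is shown below. This is a sketch under assumptions, not the library's code: it treats `lb_gdd` and `ub_gdd` as bounds on the accumulated GDD within which the maximum is searched, and the helper `gdd_at_peak`, the `acc_gdd` column, and the synthetic curve are all hypothetical.

```python
import numpy as np
import pandas as pd

def gdd_at_peak(daily, value_col, gdd_col, lb_gdd=400, ub_gdd=900):
    """Return (accumulated GDD, index value) at the maximum of value_col.

    Assumption: the search is restricted to lb_gdd <= GDD <= ub_gdd.
    """
    window = daily[(daily[gdd_col] >= lb_gdd) & (daily[gdd_col] <= ub_gdd)]
    peak = window.loc[window[value_col].idxmax()]
    return peak[gdd_col], peak[value_col]

# Synthetic daily series: a parabola that peaks at 600 accumulated GDD
gdd = np.arange(0, 1200, 10, dtype=float)
daily = pd.DataFrame({
    "acc_gdd": gdd,
    "Cab": 1.0 - ((gdd - 600.0) / 600.0) ** 2,
})

peak_gdd, peak_value = gdd_at_peak(daily, "Cab", "acc_gdd")

# With a window that excludes the true maximum, the best value inside it wins
late_gdd, late_value = gdd_at_peak(daily, "Cab", "acc_gdd", lb_gdd=700)
```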
- fill_missing_columns(output, na_value=None)
Fill NA values, since some fields do not match the GDU intervals (especially for high intervals)
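Fields that never reach the higher GDU intervals end up with missing columns or missing cells after resampling. The sketch below shows one plausible way to align such a table with pandas. The helper `fill_missing`, the `expected_columns` argument, and the column names are assumptions for illustration and may differ from what `fill_missing_columns` does internally.

```python
import numpy as np
import pandas as pd

def fill_missing(output, expected_columns, na_value=None):
    """Align output to expected_columns and optionally replace NaN cells."""
    filled = output.reindex(columns=expected_columns)  # absent columns become NaN
    if na_value is not None:
        filled = filled.fillna(na_value)
    return filled

# Hypothetical resampled output: field "b" never reached the 240 interval,
# and no field reached the 360 interval
output = pd.DataFrame(
    {"ndvi_120": [0.3, 0.4], "ndvi_240": [0.5, np.nan]},
    index=["a", "b"],
)
expected = ["ndvi_120", "ndvi_240", "ndvi_360"]

filled = fill_missing(output, expected, na_value=0.0)
```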
- resample_s2(features_data, fname, stat='mean', increment=120, thermal_time=True, period_length=8, cumsum=False, remove_outliers=True)
Resample satellite data over periods (thermal or calendar) from the planting date
- Parameters:
- features_data : list
List of S2 features (S2 data loader)
- fname : str
Name of the S2 feature to resample over GDU intervals
- stat : str
Aggregation function applied over periods
- increment : int
Number of units between each period (accumulated GDD or days)
- thermal_time : bool
Resample over thermal time (True) or calendar time (False)
- period_length : int
S2 resampling resolution (8 days by default) of the original file loaded in load_sat_data
- cumsum : bool
Compute the cumulative sum of the feature rather than the mean over GDU intervals
- remove_outliers : bool
Remove outliers (attributed to clouds) from the time series using quantiles (0.02)
- Returns:
- pd.DataFrame: dataset with resampled vegetation indices
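The `remove_outliers` option can be illustrated with a quantile mask. The sketch below assumes that cloud contamination shows up as abnormally low values, that values under the 0.02 quantile are discarded, and that the gaps are filled by linear interpolation. The helper `mask_low_outliers` and these choices are assumptions, not a description of the eocrops code.

```python
import numpy as np
import pandas as pd

def mask_low_outliers(series, q=0.02):
    """Replace values below the q-quantile by linear interpolation."""
    threshold = series.quantile(q)
    kept = series.where(series >= threshold)  # low values become NaN
    return kept.interpolate(limit_direction="both")

# Synthetic vegetation index with one cloud-like drop at position 20
clean = np.linspace(0.3, 0.8, 50)
noisy = clean.copy()
noisy[20] = 0.01

cleaned = mask_low_outliers(pd.Series(noisy))
```

Note that a quantile threshold generally masks the lowest values even in a clean series, so this kind of filter works best when followed by interpolation, as above.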
- resample_weather(fname, stat, increment=120, thermal_time=True)
Resample weather data over periods (thermal or calendar) from the planting date
- Parameters:
- fname : str
Name of the weather feature to resample over GDU intervals
- stat : str
Aggregation function applied over periods (e.g. mean)
- increment : int
Number of units between each period (accumulated GDD or days)
- thermal_time : bool
Resample over thermal time (True) or calendar time (False)
- Returns:
- pd.DataFrame: dataset with resampled weather data
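The difference between thermal and calendar resampling comes down to which quantity is cut into intervals of `increment` units. The following pandas sketch illustrates the idea. The helper `resample_from_planting`, the column names, and the data are hypothetical, and the real method may handle period boundaries differently.

```python
import numpy as np
import pandas as pd

def resample_from_planting(daily, value_col, planting_date, stat="mean",
                           increment=120, thermal_time=True, gdd_col="acc_gdd"):
    """Aggregate value_col over periods of `increment` units since planting."""
    if thermal_time:
        units = daily[gdd_col]  # accumulated GDD since planting
    else:
        units = (daily["date"] - pd.Timestamp(planting_date)).dt.days
    # Period 0 covers [0, increment), period 1 covers [increment, 2 * increment), ...
    period = (units // increment).astype(int).rename("period")
    return daily.groupby(period)[value_col].agg(stat)

# Hypothetical daily weather: 12 GDD gained and 1 mm of rain per day
dates = pd.date_range("2017-04-01", periods=30, freq="D")
daily = pd.DataFrame({
    "date": dates,
    "acc_gdd": np.arange(1, 31) * 12.0,
    "precipitation": np.ones(30),
})

thermal = resample_from_planting(daily, "precipitation", "2017-04-01", stat="sum")
calendar = resample_from_planting(daily, "precipitation", "2017-04-01",
                                  stat="sum", increment=10, thermal_time=False)
```

In thermal time the number of days per period varies with temperature, whereas calendar periods always span the same number of days.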