class eocrops.climatools.resampling.TempResampling(range_dates=('2017-01-01', '2017-12-31'), stop='2017-12-31', smooth=False, id_column='key', varname_gdd='sum_Growing Degree days daily max/min', period_nas_rate=0.8, drop_nas_rate=0.2, bands=None, subset_id_fields=None)

Bases: object

Resample time series (e.g. satellite image time series and daily weather data) over periods of accumulated growing degree units (GDU), i.e. thermal time. GDU and GDD (growing degree days) are used interchangeably below.
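A minimal sketch of resampling over thermal time, not the library's implementation. It assumes daily GDD is the mean of the daily maximum and minimum temperature minus a base temperature (10 °C here, a hypothetical value), floored at zero. All function names are illustrative.

```python
from itertools import accumulate


def daily_gdd(tmax, tmin, t_base=10.0):
    """Daily GDD with the simple averaging method (assumed formula)."""
    return max((tmax + tmin) / 2.0 - t_base, 0.0)


def thermal_periods(tmax_series, tmin_series, increment=120, t_base=10.0):
    """Assign each day to a period of `increment` accumulated GDD."""
    gdd = [daily_gdd(hi, lo, t_base) for hi, lo in zip(tmax_series, tmin_series)]
    cumulated = list(accumulate(gdd))
    return [int(total // increment) for total in cumulated]


def mean_by_period(values, periods):
    """Average `values` within each period index."""
    groups = {}
    for value, period in zip(values, periods):
        groups.setdefault(period, []).append(value)
    return {period: sum(vals) / len(vals) for period, vals in sorted(groups.items())}


# Ten warm days: each day contributes (30 + 20) / 2 - 10 = 15 GDD.
tmax = [30.0] * 10
tmin = [20.0] * 10
ndvi = [0.1 * (i + 1) for i in range(10)]

periods = thermal_periods(tmax, tmin, increment=60)
resampled = mean_by_period(ndvi, periods)
```

With these inputs the days fall into three thermal periods of unequal length in days, which is the point of thermal time: a period ends when enough heat has accumulated, not after a fixed number of days.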

Parameters:
range_dates : tuple

Range of dates covered by the time series data. Satellite data should already be resampled using fixed periods from the start date (e.g. 16-day periods from 1st January).

stop : str

Stopping date for temporal resampling. Useful when training an in-season model, to keep only the information available prior to this date.

smooth : bool

Apply smoothing to the time series when resampling to daily data in the pipeline.

id_column : str

Column of the weather data file that holds the identifier of the observation.

varname_gdd : str, optional

Name of the column in the weather dataset that holds the accumulated GDUs.

period_nas_rate : float

Keep a period only if it is at least x % complete, expressed as a fraction (0.8 by default).

drop_nas_rate : float

Rate of missing values (NAs) used as a drop threshold, between 0 and 1 (0.2 by default).

bands : tuple, optional

Names of the bands (features) to use; None by default.

subset_id_fields : tuple, optional

Subset of field identifiers to keep; None by default.
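The `period_nas_rate` rule can be sketched as follows. This is an illustration only: it assumes "completed at x %" means the share of non-missing values in the period, and that the threshold is inclusive. The helper `keep_period` is hypothetical and not part of the class.

```python
def keep_period(values, min_complete_rate=0.8):
    """Return True if at least `min_complete_rate` of the values are present."""
    if not values:
        return False
    present = sum(v is not None for v in values)
    return present / len(values) >= min_complete_rate


# Three periods of five daily values each; None marks a missing value.
periods = {
    0: [0.2, 0.3, 0.4, 0.5, 0.6],    # 100 % complete
    1: [0.5, None, 0.6, 0.7, 0.8],   # 80 % complete
    2: [0.7, None, None, 0.6, 0.5],  # 60 % complete
}

kept = {key: vals for key, vals in periods.items() if keep_period(vals)}
```

Under these assumptions, periods 0 and 1 are kept and period 2 is discarded.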

Methods

fill_missing_columns(output[, na_value])

Fill NA values for fields that do not cover every GDU interval (especially the high intervals)

get_gdd_value_peak(features_data[, fname, ...])

Get GDD and vegetation index values when the daily time series of the vegetation index is maximum

get_last_gdd()

Get last accumulated GDD observed for each observation

get_sat_features(feature_data, fname)

Retrieve the vegetation index time series from the 3D array for a given feature name

get_weather_feature(fname)

Get the subset of columns from the Meteoblue data corresponding to a given feature

load_meta_data(feature_vector, filepath)

Load the metadata file that contains information about the fields

load_sat_data(filepath)

Load the 3D arrays that contain satellite data

load_weather_data(filepath)

Load weather data reformatted and saved as a csv file (filepath).

resample_s2(features_data, fname[, stat, ...])

Resample satellite data over periods (thermal or calendar) from the planting date

resample_weather(fname, stat[, increment, ...])

Resample weather data over periods (thermal or calendar) from the planting date

get_weather_feature(fname)

Get the subset of columns from the Meteoblue data corresponding to a given feature

load_meta_data(feature_vector, filepath)

Load the metadata file that contains information about the fields

load_sat_data(filepath)

Load the 3D arrays that contain satellite data

load_weather_data(filepath)

Load weather data reformatted and saved as a csv file (filepath). The file corresponds to the output of (1) Meteoblue_client and (2) format_data (predfin.data_extraction.ce_hub). See the example in workflow_yield_prediction to understand how to build those files.

Parameters:
filepath : str

path where the csv file is saved

Returns:
weather_data merged with meta_data
get_sat_features(feature_data, fname)

Retrieve the vegetation index time series from the 3D array for a given feature name

get_gdd_value_peak(features_data, fname='Cab', ub_gdd=900, lb_gdd=400, days_range=8)

Get GDD and vegetation index values when the daily time series of the vegetation index is maximum
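A rough sketch of the idea, under the assumption that `lb_gdd` and `ub_gdd` bound the accumulated GDD window in which the peak is searched (the source does not document them). The function `value_at_peak` and the sample values are hypothetical.

```python
def value_at_peak(gdd, index_values, lb_gdd=400, ub_gdd=900):
    """Return (gdd, value) at the maximum of `index_values` within [lb_gdd, ub_gdd]."""
    candidates = [
        (g, v) for g, v in zip(gdd, index_values) if lb_gdd <= g <= ub_gdd
    ]
    if not candidates:
        return None  # no observation falls inside the GDD window
    # max() keeps the first candidate in case of ties on the index value.
    return max(candidates, key=lambda pair: pair[1])


# Accumulated GDD and a vegetation index (e.g. Cab) for one field.
gdd = [100, 300, 500, 700, 900, 1100]
cab = [10.0, 25.0, 40.0, 55.0, 45.0, 60.0]

peak = value_at_peak(gdd, cab)
```

The highest value overall (60.0 at 1100 GDD) lies outside the window, so the sketch reports the in-window maximum instead.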

get_last_gdd()

Get last accumulated GDD observed for each observation

fill_missing_columns(output, na_value=None)

Fill NA values for fields that do not cover every GDU interval (especially the high intervals)
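To illustrate why this step is needed: a field with a short or cool season may never reach the highest GDU intervals, so its row lacks those columns. The sketch below pads such rows with `na_value`. It works on plain dictionaries rather than the library's data structures, and `pad_period_columns` is a hypothetical helper.

```python
def pad_period_columns(rows, na_value=None):
    """Give every row the same period columns, filling absent ones with `na_value`."""
    all_columns = sorted({col for row in rows for col in row})
    return [{col: row.get(col, na_value) for col in all_columns} for row in rows]


# Keys are the upper bound of each GDU interval; the second field stops at 240.
rows = [
    {120: 0.3, 240: 0.5, 360: 0.7},
    {120: 0.2, 240: 0.4},
]

padded = pad_period_columns(rows, na_value=0.0)
```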

resample_s2(features_data, fname, stat='mean', increment=120, thermal_time=True, period_length=8, cumsum=False, remove_outliers=True)

Resample satellite data over periods (thermal or calendar) from the planting date

Parameters:
features_data : list

List of S2 features (S2 data loader)

fname : str

Name of the S2 feature to resample over GDU intervals

stat : str

Aggregation function over periods

increment : int

Number of units between each period (accumulated GDD or days)

thermal_time : bool

Resample over thermal time or calendar time

period_length : int

S2 resampling resolution (8 days by default) of the original file loaded in load_sat_data

cumsum : bool

Compute the cumulative sum of the feature rather than the mean over GDU intervals

remove_outliers : bool

Remove outliers from the time series using quantiles (0.02), e.g. values affected by clouds

Returns:
pd.DataFrame

Dataset with resampled vegetation indices
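The quantile-based outlier removal behind `remove_outliers` could look roughly like the sketch below. It is an assumption, not the library's code: it masks only the lower tail (values under the 0.02 quantile, the typical signature of a cloudy acquisition) and uses a linearly interpolated quantile. Both helper names are hypothetical.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def mask_low_outliers(values, q=0.02):
    """Replace values below the q-th quantile with None."""
    threshold = quantile(sorted(values), q)
    return [v if v >= threshold else None for v in values]


# A vegetation index series with one cloud-contaminated drop (0.05).
series = [0.40, 0.45, 0.05, 0.55, 0.60, 0.65, 0.70, 0.72, 0.68, 0.60]

cleaned = mask_low_outliers(series)
```

Masked values can then be skipped by the aggregation, so a single cloudy date does not drag down the period statistic.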
resample_weather(fname, stat, increment=120, thermal_time=True)

Resample weather data over periods (thermal or calendar) from the planting date

Parameters:
fname : str

Name of the weather feature to resample over GDU intervals

stat : str

Aggregation over periods (e.g. mean)

increment : int

Number of units between each period (accumulated GDD or days)

thermal_time : bool

Resample over thermal time or calendar time

Returns:
pd.DataFrame

Dataset with resampled weather data
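For the calendar case (`thermal_time=False`), resampling from the planting date can be sketched as below. This is illustrative only: the helper `resample_calendar`, the set of supported statistics, and the choice to ignore observations before planting are all assumptions.

```python
from datetime import date
from statistics import mean

# Aggregation functions selectable by name (assumed set).
STATS = {"mean": mean, "sum": sum, "min": min, "max": max}


def resample_calendar(dates, values, planting_date, increment=8, stat="mean"):
    """Aggregate daily values over `increment`-day periods counted from planting."""
    groups = {}
    for day, value in zip(dates, values):
        offset = (day - planting_date).days
        if offset < 0:
            continue  # ignore observations before planting (assumption)
        groups.setdefault(offset // increment, []).append(value)
    return {period: STATS[stat](vals) for period, vals in sorted(groups.items())}


planting = date(2017, 4, 1)
days = [date(2017, 4, d) for d in range(1, 17)]  # 1 to 16 April
rain = [1.0] * 8 + [2.0] * 8                     # daily precipitation

totals = resample_calendar(days, rain, planting, increment=8, stat="sum")
```

Here `increment` counts days; with thermal time the same grouping would use accumulated GDD instead of the day offset.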