leaf_engine.etl.concat

Concatenates multiple datasets into a single shipments DataFrame and applies post-concatenation deduplication logic (either custom logic implemented in a shipper’s custom_drop_duplicates.py, or the default logic).

Functions

_custom_drop_duplicates(→ pandas.DataFrame)

_default_drop_lane_duplicates(→ pandas.DataFrame)

_default_drop_shipment_duplicates(→ pandas.DataFrame)

_describe_lane_datasets(df)

_describe_shipment_datasets(df)

_drop_duplicates(→ pandas.DataFrame)

concat_pipeline(→ pandas.DataFrame)

Module Contents

leaf_engine.etl.concat._custom_drop_duplicates(df, module_name) → pandas.DataFrame
Return type:

pandas.DataFrame
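
The signature takes a module_name, which suggests the shipper's custom logic is loaded by name at run time. The following is a minimal sketch of that pattern, not the actual implementation: the hook name drop_duplicates, the in-memory stand-in module, and the shipment_id and cost columns are all assumptions made for illustration.

```python
import importlib
import sys
import types

import pandas as pd


def custom_drop_duplicates_sketch(df: pd.DataFrame, module_name: str) -> pd.DataFrame:
    """Load a shipper-specific module by name and apply its dedup hook."""
    module = importlib.import_module(module_name)
    # Assumption: the module exposes a callable named `drop_duplicates`.
    return module.drop_duplicates(df)


# Hypothetical stand-in for a shipper's custom_drop_duplicates.py, registered
# in memory so the sketch runs without any files on disk.
_demo = types.ModuleType("demo_shipper_custom_drop_duplicates")
_demo.drop_duplicates = lambda df: df.drop_duplicates(
    subset=["shipment_id"], keep="last"
)
sys.modules[_demo.__name__] = _demo

# Hypothetical data: shipment A1 appears twice with different costs.
shipments = pd.DataFrame(
    {"shipment_id": ["A1", "A1", "B2"], "cost": [100.0, 120.0, 80.0]}
)
deduped = custom_drop_duplicates_sketch(
    shipments, "demo_shipper_custom_drop_duplicates"
)
```

Keeping the shipper rule behind a named module means the default path stays untouched when one shipper needs a different definition of "duplicate".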

leaf_engine.etl.concat._default_drop_lane_duplicates(df) → pandas.DataFrame
Return type:

pandas.DataFrame

leaf_engine.etl.concat._default_drop_shipment_duplicates(df) → pandas.DataFrame
Return type:

pandas.DataFrame

leaf_engine.etl.concat._describe_lane_datasets(df)
leaf_engine.etl.concat._describe_shipment_datasets(df)
leaf_engine.etl.concat._drop_duplicates(df: pandas.DataFrame, lane_level) → pandas.DataFrame
Parameters:

df (pandas.DataFrame)

Return type:

pandas.DataFrame
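
The lane_level flag presumably selects between lane-level and shipment-level deduplication. The sketch below shows one way such a dispatch could look. It is an illustration under stated assumptions, not the module's code: the key columns (origin, destination, shipment_id) are hypothetical, and the custom-logic path is left out.

```python
import pandas as pd

# Hypothetical key columns; the real module's keys are not documented here.
LANE_KEYS = ["origin", "destination"]
SHIPMENT_KEYS = ["shipment_id"]


def drop_duplicates_sketch(df: pd.DataFrame, lane_level: bool = False) -> pd.DataFrame:
    """Drop duplicate rows using lane keys or shipment keys, per the flag."""
    keys = LANE_KEYS if lane_level else SHIPMENT_KEYS
    # Keep the first occurrence of each key and renumber the rows.
    return df.drop_duplicates(subset=keys, keep="first").reset_index(drop=True)


# Hypothetical data: three rows, two shipments, one lane.
df = pd.DataFrame(
    {
        "shipment_id": ["A1", "A1", "B2"],
        "origin": ["CHI", "CHI", "CHI"],
        "destination": ["ATL", "ATL", "ATL"],
    }
)
by_shipment = drop_duplicates_sketch(df)
by_lane = drop_duplicates_sketch(df, lane_level=True)
```

With this data, the same input collapses to two rows at shipment level and one row at lane level, which is why the flag has to be chosen to match the granularity of the dataset.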

leaf_engine.etl.concat.concat_pipeline(input_dfs: List[pandas.DataFrame], lane_level=False) → pandas.DataFrame
Parameters:

input_dfs (List[pandas.DataFrame])

Return type:

pandas.DataFrame
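
As a rough illustration of the concatenate-then-deduplicate flow described at the top of this page, the sketch below stacks a list of DataFrames and removes exact duplicate rows. It uses plain pandas and does not call leaf_engine. The function name concat_sketch and the column names are hypothetical, and the real pipeline may apply different deduplication rules.

```python
from typing import List

import pandas as pd


def concat_sketch(input_dfs: List[pd.DataFrame]) -> pd.DataFrame:
    """Stack the input frames, then drop rows that are exact duplicates."""
    # ignore_index=True gives the combined frame a fresh 0..n-1 index.
    combined = pd.concat(input_dfs, ignore_index=True)
    return combined.drop_duplicates().reset_index(drop=True)


# Hypothetical inputs: two extracts that overlap on shipment B2.
first = pd.DataFrame({"shipment_id": ["A1", "B2"], "cost": [100.0, 80.0]})
second = pd.DataFrame({"shipment_id": ["B2", "C3"], "cost": [80.0, 95.0]})

shipments = concat_sketch([first, second])
```

Resetting the index after deduplication keeps row labels contiguous, which avoids surprises in any later positional lookups.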