leaf_engine.adapt.adapt_utilities
Functions
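- add_multilevel_columns
- cluster_by_city_state
- ends
- flatten_multilevel_columns
- format_datetime_columns
- get_centroids
- get_rate_types: Given a dataframe of individual shipments, estimates whether each shipment’s rate type is contract or spot.
- get_window_size: Given a Series of dates, calculates the number of days in the date range and rounds it down to the nearest multiple of 30.
- group: Wrapper for groupby() to standardize and streamline group agg syntax.
- load_data: Standard pipeline for loading .csv files.
- mode: Returns the most common value in a pd.Series or other iterable.
- ordinals_to_days
- pipe: Passes an object through a sequence of functions, where the output of one is the input of the next.
- remove_index: Removes default index column if it exists.
- snakecase_column_names
- upcase_string_values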
Module Contents
- leaf_engine.adapt.adapt_utilities.add_multilevel_columns(df: pandas.DataFrame, top_level: str) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame) –
top_level (str) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.cluster_by_city_state(df: pandas.DataFrame, od: str) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame) –
od (str) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.ends(df: pandas.DataFrame, n: int = 1) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame) –
n (int) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.flatten_multilevel_columns(df: pandas.DataFrame, delimiter: str = '_') → pandas.DataFrame
- Parameters:
df (pandas.DataFrame) –
delimiter (str) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.format_datetime_columns(df: pandas.DataFrame, datetime_cols: list) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame) –
datetime_cols (list) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.get_centroids(geometry_col: pandas.Series)
- Parameters:
geometry_col (pandas.Series) –
- leaf_engine.adapt.adapt_utilities.get_rate_types(_ships: pandas.DataFrame) → pandas.DataFrame
Given a dataframe of individual shipments, estimates whether each shipment’s rate type is contract or spot, and returns the dataframe with a rate_type column added.
- Parameters:
_ships (pandas.DataFrame) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.get_window_size(period: pandas.Series)
Given a Series of dates, calculates the number of days in the date range and rounds that number down to the nearest multiple of 30 (up to 360 days).
- Parameters:
period (pandas.Series) –
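The rounding rule described for get_window_size can be sketched with the standard library alone. This is an illustration of the arithmetic, not the module's implementation: the name window_size_sketch and the step and cap parameters are hypothetical, and the span is assumed to be the difference between the latest and earliest date (the source does not say whether the endpoints are counted inclusively).

```python
from datetime import date


def window_size_sketch(dates, step=30, cap=360):
    """Round the day span of `dates` down to a multiple of `step`, capped at `cap`."""
    dates = list(dates)
    # Assumption: the range is measured as latest date minus earliest date.
    span_days = (max(dates) - min(dates)).days
    # Integer division drops the remainder, i.e. rounds down to a multiple of step.
    rounded = (span_days // step) * step
    return min(rounded, cap)


# 2023-01-01 to 2023-04-15 spans 104 days, which rounds down to 90.
short_window = window_size_sketch([date(2023, 1, 1), date(2023, 4, 15)])
# A two-year span exceeds the cap and is clipped to 360.
long_window = window_size_sketch([date(2020, 1, 1), date(2022, 1, 1)])
```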
- leaf_engine.adapt.adapt_utilities.group(df: pandas.DataFrame, by: list, agg: dict, reset_index: bool = True) → pandas.DataFrame
Wrapper for groupby() to standardize and streamline group agg syntax.
Index resets by default.
- Parameters:
df (pandas.DataFrame) –
by (list) –
agg (dict) –
reset_index (bool) –
- Return type:
pandas.DataFrame
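A wrapper of this shape might look like the following sketch. It is an assumption about how group could behave, built only from the documented signature; the name group_sketch and the example columns (lane, miles, cost) are hypothetical.

```python
import pandas as pd


def group_sketch(df, by, agg, reset_index=True):
    """Group `df` by the `by` columns, apply the `agg` mapping, optionally reset the index."""
    out = df.groupby(by).agg(agg)
    # With reset_index=True the group keys come back as ordinary columns.
    return out.reset_index() if reset_index else out


# Hypothetical shipment data for illustration only.
ships = pd.DataFrame(
    {
        "lane": ["A", "A", "B"],
        "miles": [100, 300, 50],
        "cost": [10.0, 30.0, 5.0],
    }
)
summary = group_sketch(ships, by=["lane"], agg={"miles": "sum", "cost": "mean"})
```

Here summary has one row per lane, with lane as a regular column rather than the index.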
- leaf_engine.adapt.adapt_utilities.load_data(csv: str, converters: dict = {'d_zip': str, 'o_zip': str, 'zip': str}, datetime_cols: list = ['period']) → pandas.DataFrame
Standard pipeline for loading .csv files.
Takes the file’s relative path and returns a formatted dataframe.
- Parameters:
csv (str) –
converters (dict) –
datetime_cols (list) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.mode(iterable: pandas.Series) → object
Returns the most common value in a pd.Series or other iterable.
If there is no mode, returns the first value. If there are two or more modes, returns the first mode. Note: This method is 3x faster than native Pandas methods.
- Parameters:
iterable (pandas.Series) –
- Return type:
object
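The tie-breaking rules above can be illustrated with a small standard-library sketch. It is not the module's implementation, and mode_sketch is a hypothetical name. It assumes "first mode" means the tied value that appears earliest in the input, and it returns None for empty input, a case the source does not document.

```python
from collections import Counter


def mode_sketch(iterable):
    """Return the most common value; ties go to the value seen first."""
    values = list(iterable)
    if not values:
        return None  # Assumption: empty-input behavior is not documented.
    counts = Counter(values)
    # Counter keeps first-seen order and max() returns the first maximal key,
    # so ties (including the all-unique case) resolve to the earliest value.
    return max(counts, key=counts.get)


no_mode = mode_sketch([3, 1, 2])             # all unique: first value
two_modes = mode_sketch(["a", "b", "b", "a"])  # tie: first-seen mode
one_mode = mode_sketch([1, 2, 2, 3])
```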
- leaf_engine.adapt.adapt_utilities.ordinals_to_days(iterable: Iterable[Any])
- Parameters:
iterable (Iterable[Any]) –
- leaf_engine.adapt.adapt_utilities.pipe(obj, thru=[])
Passes an object through a sequence of functions, where the output of one is the input of the next.
Example: pipe(3, thru=[lambda x: x ** 3, str])  # => "27"
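A helper with this behavior can be sketched as a left fold over the function list. This is one possible implementation, not necessarily the module's: pipe_sketch is a hypothetical name, and it uses an immutable tuple default in place of the documented thru=[].

```python
from functools import reduce


def pipe_sketch(obj, thru=()):
    """Apply each function in `thru` to `obj` in order, feeding each output forward."""
    return reduce(lambda acc, fn: fn(acc), thru, obj)


# 3 is cubed to 27, then converted to the string "27".
result = pipe_sketch(3, thru=[lambda x: x ** 3, str])
# With no functions, the object passes through unchanged.
unchanged = pipe_sketch(5)
```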
- leaf_engine.adapt.adapt_utilities.remove_index(df: pandas.DataFrame, index_col: str = 'Unnamed:') → pandas.DataFrame
Removes default index column if it exists.
- Parameters:
df (pandas.DataFrame) –
index_col (str) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.snakecase_column_names(df: pandas.DataFrame) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame) –
- Return type:
pandas.DataFrame
- leaf_engine.adapt.adapt_utilities.upcase_string_values(df: pandas.DataFrame, excl_cols=[]) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame) –
- Return type:
pandas.DataFrame