leaf_engine
Subpackages
Submodules
Functions
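- cluster_pipeline
- concat_pipeline
- filter_pipeline
- flag_pipeline: Public flagging pipeline. Pipe shipments through this function to augment them with flagging columns.
- fsc_pipeline
- fuel_pipeline
- geocode_pipeline
- load_pipeline
- map_pipeline
- output_pipeline
- pcmiler_pipeline
- pull_lighthouse_data: Pulls lighthouse shipper data for internal run. Filters down to only latest batch.
- read_csv
- read_data: Reads all data specified in company_params.
- read_dataset
- read_drive: Reads CSV, Excel, or Google Spreadsheet file from Google Drive either by url or path.
- read_params
- resolve_geocoding
- resolve_miles
- setup
- to_drive: Write a DataFrame to a file on Google Drive, at the specified path.
- upload_to_drive: Uploads file to Google Drive.
- uuid_pipeline
- validate_pipeline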
Package Contents
- leaf_engine.cluster_pipeline(df: pandas.DataFrame) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame)
- Return type:
pandas.DataFrame
- leaf_engine.concat_pipeline(input_dfs: List[pandas.DataFrame], lane_level=False) → pandas.DataFrame
- Parameters:
input_dfs (List[pandas.DataFrame])
- Return type:
pandas.DataFrame
- leaf_engine.filter_pipeline(df)
- leaf_engine.flag_pipeline(df: pandas.DataFrame) → pandas.DataFrame
Public flagging pipeline. Pipe shipments through this function to augment them with flagging columns.
- Parameters:
df (pd.DataFrame) – Shipments DataFrame.
- Returns:
Input DataFrame with additional flagging columns.
- Return type:
pd.DataFrame
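The piping pattern described for flag_pipeline can be sketched as follows. This is a minimal illustration, not leaf_engine code: `flag_missing_miles` is a hypothetical stand-in for a flagging step, and the column names `shipment_id`, `miles`, and `flag_missing_miles` are invented for the example. It only mirrors the documented contract, in which a shipments DataFrame goes in and the same data comes back with additional flagging columns.

```python
import pandas as pd


def flag_missing_miles(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for a flagging step (not part of leaf_engine).

    Returns a copy of the input with one extra boolean flag column.
    """
    out = df.copy()
    out["flag_missing_miles"] = out["miles"].isna()
    return out


# Invented sample data, for illustration only.
shipments = pd.DataFrame(
    {"shipment_id": [1, 2, 3], "miles": [120.0, None, 450.5]}
)

# DataFrame.pipe passes the frame as the first argument, so steps chain cleanly.
flagged = shipments.pipe(flag_missing_miles)
```

With the real package, the equivalent call would presumably be `shipments.pipe(leaf_engine.flag_pipeline)`; the flag columns it adds are not listed in this reference.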
- leaf_engine.fsc_pipeline(df: pandas.DataFrame, lane_level: bool = False) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame)
lane_level (bool)
- Return type:
pandas.DataFrame
- leaf_engine.fuel_pipeline(df)
- leaf_engine.geocode_pipeline(df: pandas.DataFrame) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame)
- Return type:
pandas.DataFrame
- leaf_engine.load_pipeline(df: pandas.DataFrame, run_type: str, dry_run: bool = False, overwrite: bool = False) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame)
run_type (str)
dry_run (bool)
overwrite (bool)
- Return type:
pandas.DataFrame
- leaf_engine.map_pipeline(df, dataset_params, lane_level=False)
- leaf_engine.output_pipeline(df, lane_level=False)
- leaf_engine.pcmiler_pipeline(df)
- leaf_engine.pull_lighthouse_data(params) → List[pandas.DataFrame]
Pulls lighthouse shipper data for an internal run. Filters down to only the latest batch.
- Parameters:
params (dict) – Adapt params.
- Returns:
DataFrames containing lighthouse lanes.
- Return type:
List[pandas.DataFrame]
- leaf_engine.read_csv(input_path, **kwargs) → pandas.DataFrame
- Return type:
pandas.DataFrame
- leaf_engine.read_data(company_params: dict) → Dict[str, pandas.DataFrame]
Reads all data specified in company_params.
- leaf_engine.read_dataset(dataset_params) → pandas.DataFrame | Dict[Any, pandas.DataFrame]
- Return type:
pandas.DataFrame | Dict[Any, pandas.DataFrame]
- leaf_engine.read_drive(url: str | None = None, path: str | None = None, cache: bool = True, **kwargs) → pandas.DataFrame
Reads a CSV, Excel, or Google Spreadsheet file from Google Drive, either by URL or by path. Exactly one of url and path must be given.
- Parameters:
url (Optional[str], optional) – File URL. Can be copied from the browser address bar. Defaults to None.
path (Optional[str], optional) – File path. The first component must be the drive name. Defaults to None.
cache (bool, optional) – Whether to cache the result to disk, which makes subsequent calls to read_drive faster. Defaults to True.
**kwargs – Keyword arguments passed to pandas.read_csv or pandas.read_excel.
- Examples:
>>> df = read_drive(url="https://drive.google.com/file/d/XXYYZZ/view?usp=sharing")
>>> df = read_drive(path="Data Science/folder1/folder2/file.csv")
- Raises:
LeafGoogleDriveException – If url and path are both None or both given.
LeafGoogleDriveException – If the file is not found on Google Drive.
LeafGoogleDriveException – If the file cannot be downloaded from Google Drive.
ValueError – If neither pd.read_csv nor pd.read_excel can read the file.
- Returns:
DataFrame read from file.
- Return type:
pd.DataFrame
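Two of the read_drive rules above lend themselves to a short sketch: url and path are mutually exclusive, and the first component of path is the drive name. The code below is an assumption-laden illustration, not the library's implementation: `DriveArgumentError` stands in for LeafGoogleDriveException, and `check_source` and `split_drive_path` are hypothetical helper names.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple


class DriveArgumentError(Exception):
    """Hypothetical stand-in for LeafGoogleDriveException."""


def check_source(url: Optional[str] = None, path: Optional[str] = None) -> str:
    """Return which locator was given, requiring exactly one of url/path."""
    if (url is None) == (path is None):
        # Covers both "neither given" and "both given".
        raise DriveArgumentError("Pass exactly one of url or path.")
    return "url" if url is not None else "path"


def split_drive_path(path: str) -> Tuple[str, Tuple[str, ...], str]:
    """Split 'drive/folder/.../file' into (drive name, folders, file name)."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2:
        raise DriveArgumentError("Path needs a drive name and a file name.")
    return parts[0], parts[1:-1], parts[-1]


drive, folders, file_name = split_drive_path("Data Science/folder1/folder2/file.csv")
```

Comparing the two `is None` checks for equality catches both invalid cases in one branch, which matches the single "both None or both not None" condition in the Raises list.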
- leaf_engine.read_params(params_path: str | pathlib.Path) → dict
- Parameters:
params_path (Union[str, pathlib.Path])
- Return type:
dict
- leaf_engine.resolve_geocoding(df: pandas.DataFrame) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame)
- Return type:
pandas.DataFrame
- leaf_engine.resolve_miles(df: pandas.DataFrame) → pandas.DataFrame
- Parameters:
df (pandas.DataFrame)
- Return type:
pandas.DataFrame
- leaf_engine.setup(input_params: dict | str | pathlib.Path, log_file_name: str | None = None, enable_log_git: bool = True, enable_log_params: bool = True) → None
- leaf_engine.to_drive(df: pandas.DataFrame, path: str, overwrite: bool = False, **kwargs) → str
Write a DataFrame to a file on Google Drive, at the specified path.
- Parameters:
df (pandas.DataFrame)
path (str)
overwrite (bool)
- Examples:
>>> df.pipe(gdrive.to_drive, "drive_name/folder1/folder2/file_name.csv")
>>> df.pipe(gdrive.to_drive, "drive_name/folder1/folder2/file_name.xlsx")
- Raises:
LeafGoogleDriveException – Raised if overwrite is False and file exists.
LeafGoogleDriveException – Raised if path is invalid.
LeafGoogleDriveException – Raised if write response does not contain file ID.
- Returns:
Google Drive file URL.
- Return type:
str
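The overwrite behaviour documented for to_drive (raise when the file exists and overwrite is False) can be illustrated with an in-memory stand-in. Everything here is hypothetical: `FileExistsOnDriveError` substitutes for LeafGoogleDriveException, `write_file` and the dict-based store are invented, and the `memory://` URL scheme exists only for this sketch.

```python
from typing import Dict


class FileExistsOnDriveError(Exception):
    """Hypothetical stand-in for LeafGoogleDriveException."""


def write_file(store: Dict[str, bytes], path: str, data: bytes,
               overwrite: bool = False) -> str:
    """Write data to an in-memory 'drive', refusing to clobber by default."""
    if path in store and not overwrite:
        raise FileExistsOnDriveError(f"{path} already exists")
    store[path] = data
    return f"memory://{path}"  # invented URL scheme, for illustration only


store: Dict[str, bytes] = {}
first_url = write_file(store, "drive_name/folder1/file_name.csv", b"a,b\n1,2\n")

# A second write to the same path succeeds only when overwrite=True.
second_url = write_file(store, "drive_name/folder1/file_name.csv", b"a,b\n3,4\n",
                        overwrite=True)
```

Defaulting overwrite to False makes accidental replacement an explicit error rather than silent data loss, which is the same default shown in the to_drive and upload_to_drive signatures.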
- leaf_engine.upload_to_drive(local_path: str | pathlib.Path, drive_path: str, overwrite: bool = False) → str
Uploads file to Google Drive.
- Parameters:
local_path (str | pathlib.Path)
drive_path (str)
overwrite (bool)
- Raises:
LeafGoogleDriveException – Raised if overwrite is False and file exists.
LeafGoogleDriveException – Raised if unable to upload file.
- Returns:
Google Drive URL of uploaded file.
- Return type:
str
- leaf_engine.uuid_pipeline(df, set_shipment_uuid=True, set_lane_uuid=False)
- leaf_engine.validate_pipeline(df, params=None)