leaf_engine.io

Functions

read_csv(→ pandas.DataFrame)

read_data(→ Dict[str, pandas.DataFrame])

Reads all data specified in company_params.

read_dataset(→ pandas.DataFrame | Dict[Any, ...)

read_drive(→ pandas.DataFrame)

Reads CSV, Excel, or Google Spreadsheet file from Google Drive either by url or path.

read_excel(→ pandas.DataFrame)

read_params(→ dict)

to_drive(→ str)

Write a DataFrame to a file on Google Drive, at the specified path.

upload_to_drive(→ str)

Uploads file to Google Drive.

Package Contents

leaf_engine.io.read_csv(input_path, **kwargs) → pandas.DataFrame
Return type:

pandas.DataFrame

leaf_engine.io.read_data(company_params: dict) → Dict[str, pandas.DataFrame]

Reads all data specified in company_params.

Returns:

Dictionary mapping dataset labels to DataFrames. The dictionary can be passed directly to pd.concat to combine all datasets into a single DataFrame.

Return type:

Dict[str, pd.DataFrame]

Parameters:

company_params (dict) –
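
The pd.concat remark can be illustrated without leaf_engine itself. The following is a minimal sketch, assuming a dictionary shaped like the documented return value; the labels "orders" and "returns", the column names, and the index level names are made up for illustration.

```python
import pandas as pd

# Stand-in for the result of read_data(company_params).
# The labels and columns here are hypothetical.
frames = {
    "orders": pd.DataFrame({"sku": ["a", "b"], "qty": [2, 5]}),
    "returns": pd.DataFrame({"sku": ["a"], "qty": [1]}),
}

# Passing the mapping to pd.concat uses its keys as the outer index level.
combined = pd.concat(frames, names=["dataset", "row"])

# Move the dataset label out of the index into an ordinary column.
flat = combined.reset_index(level="dataset")
```

Keeping the label as an index level or as a column is a matter of taste; the column form is often easier to group or filter on.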

leaf_engine.io.read_dataset(dataset_params) → pandas.DataFrame | Dict[Any, pandas.DataFrame]
Return type:

pandas.DataFrame | Dict[Any, pandas.DataFrame]

leaf_engine.io.read_drive(url: str | None = None, path: str | None = None, cache: bool = True, **kwargs) → pandas.DataFrame

Reads CSV, Excel, or Google Spreadsheet file from Google Drive either by url or path.

Parameters:
  • url (Optional[str], optional) – File URL. Can be copied from browser navigation bar. Defaults to None.

  • path (Optional[str], optional) – File path. The first path component must be the drive name. Defaults to None.

  • cache (bool, optional) – Whether to cache the result to disk. Defaults to True. This makes subsequent calls to read_drive faster.

  • **kwargs – Keyword arguments passed to pandas.read_csv or pandas.read_excel.
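
How the cache is stored is not documented on this page. As a rough sketch of the general idea only, the following caches the result of a slow loader on disk, keyed by a hash of the source string. The names cached_load and slow_download and the temporary cache directory are hypothetical and not part of leaf_engine.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cached_load(source, loader, cache_dir, cache=True):
    """Return loader(source), reusing a pickled copy on disk when cache is True."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cache_file = Path(cache_dir) / f"{key}.pkl"
    if cache and cache_file.exists():
        return pickle.loads(cache_file.read_bytes())
    result = loader(source)
    if cache:
        cache_file.write_bytes(pickle.dumps(result))
    return result


calls = []


def slow_download(source):
    # Hypothetical stand-in for a Google Drive download.
    calls.append(source)
    return {"source": source, "rows": [1, 2, 3]}


with tempfile.TemporaryDirectory() as tmp:
    first = cached_load("Data Science/folder1/file.csv", slow_download, tmp)
    second = cached_load("Data Science/folder1/file.csv", slow_download, tmp)
```

In this sketch the second call is served from disk, so the loader runs only once. A cache of this kind can return stale data if the remote file changes, which is one reason to pass cache=False.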

Examples:

>>> df = read_drive(url="https://drive.google.com/file/d/XXYYZZ/view?usp=sharing")
>>> df = read_drive(path="Data Science/folder1/folder2/file.csv")

Raises:
  • LeafGoogleDriveException – If url and path are both None or both provided. Exactly one of them must be given.

  • LeafGoogleDriveException – If file is not found on Google Drive.

  • LeafGoogleDriveException – If unable to download file from Google Drive.

  • ValueError – If neither pd.read_csv nor pd.read_excel can read the file.
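
The first condition in this list amounts to an exactly-one-of check on url and path. A minimal sketch of that check follows. It uses ValueError as a stand-in because the import location of LeafGoogleDriveException is not shown on this page, and pick_source is a hypothetical helper, not a leaf_engine function.

```python
from typing import Optional, Tuple


def pick_source(url: Optional[str] = None, path: Optional[str] = None) -> Tuple[str, str]:
    """Return ("url", value) or ("path", value); reject zero or two sources."""
    # True when both are None or both are set, which are the two invalid cases.
    if (url is None) == (path is None):
        raise ValueError("Provide exactly one of url or path.")
    if url is not None:
        return ("url", url)
    return ("path", path)
```

Comparing the two `is None` results catches both invalid cases with a single condition.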

Returns:

DataFrame read from file.

Return type:

pd.DataFrame

leaf_engine.io.read_excel(input_path, **kwargs) → pandas.DataFrame
Return type:

pandas.DataFrame

leaf_engine.io.read_params(params_path: str | pathlib.Path) → dict
Parameters:

params_path (Union[str, pathlib.Path]) –

Return type:

dict

leaf_engine.io.to_drive(df: pandas.DataFrame, path: str, overwrite: bool = False, **kwargs) → str

Write a DataFrame to a file on Google Drive, at the specified path.

Parameters:
  • df (pd.DataFrame) – DataFrame to write.

  • path (str) – Path to write to. First part of path is the drive name.

  • overwrite (bool, optional) – Overwrite existing file. Defaults to False.

Examples:

>>> df.pipe(gdrive.to_drive, "drive_name/folder1/folder2/file_name.csv")
>>> df.pipe(gdrive.to_drive, "drive_name/folder1/folder2/file_name.xlsx")

Raises:
  • LeafGoogleDriveException – Raised if overwrite is False and file exists.

  • LeafGoogleDriveException – Raised if path is invalid.

  • LeafGoogleDriveException – Raised if write response does not contain file ID.

Returns:

Google Drive file URL.

Return type:

str
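
The path convention used by to_drive (drive name first, file name last) can be sketched with the standard library. split_drive_path and DrivePath are hypothetical names for illustration, and inferring the output format from the file extension is an assumption based on the .csv and .xlsx examples, not documented behavior.

```python
from pathlib import PurePosixPath
from typing import List, NamedTuple


class DrivePath(NamedTuple):
    drive: str
    folders: List[str]
    file_name: str
    extension: str


def split_drive_path(path: str) -> DrivePath:
    """Split "drive/folder/.../file.ext" into its drive, folders, and file name."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2:
        raise ValueError("Path must contain a drive name and a file name.")
    file_name = parts[-1]
    extension = PurePosixPath(file_name).suffix.lower()
    return DrivePath(parts[0], list(parts[1:-1]), file_name, extension)


parsed = split_drive_path("drive_name/folder1/folder2/file_name.xlsx")
```

Here parsed.drive is the drive name and parsed.extension is the lowercased suffix, which a caller could use to choose between CSV and Excel output.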

leaf_engine.io.upload_to_drive(local_path: str | pathlib.Path, drive_path: str, overwrite: bool = False) → str

Uploads file to Google Drive.

Parameters:
  • local_path (Union[str, Path]) – Local path of file to upload.

  • drive_path (str) – Google Drive path to upload to. First part of path is the drive name. Must include the file name, for example "drive_name/folder1/folder2/file_name.zip".

  • overwrite (bool) – Overwrite existing file. Defaults to False.

Raises:
  • LeafGoogleDriveException – Raised if overwrite is False and file exists.

  • LeafGoogleDriveException – Raised if unable to upload file.

Returns:

Google Drive URL of uploaded file.

Return type:

str
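
The overwrite flag can be pictured with an in-memory stand-in for a drive. This sketch models only the documented rule that an existing file is rejected unless overwrite is True. FakeDrive, its URL format, and the use of FileExistsError in place of LeafGoogleDriveException are all assumptions made for illustration.

```python
class FakeDrive:
    """In-memory stand-in for a drive; not part of leaf_engine."""

    def __init__(self):
        self.files = {}

    def upload(self, content: bytes, drive_path: str, overwrite: bool = False) -> str:
        # Refuse to replace an existing file unless explicitly asked to.
        if drive_path in self.files and not overwrite:
            raise FileExistsError(f"{drive_path} already exists; pass overwrite=True to replace it.")
        self.files[drive_path] = content
        return f"fake-drive://{drive_path}"


drive = FakeDrive()
target = "drive_name/folder1/folder2/file_name.zip"
url = drive.upload(b"v1", target)

try:
    drive.upload(b"v2", target)  # Second upload without overwrite is rejected.
    blocked = False
except FileExistsError:
    blocked = True

drive.upload(b"v2", target, overwrite=True)  # Explicit overwrite succeeds.
```

Defaulting overwrite to False makes an accidental replacement fail loudly instead of silently discarding the earlier file.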