API Reference
Python utilities by the Professorship of Environmental Sensing and Modeling at the Technical University of Munich.
GitHub Repository: https://github.com/tum-esm/utils
Documentation: https://tum-esm-utils.netlify.app
PyPI: https://pypi.org/project/tum-esm-utils
tum_esm_utils.code
Functions for interacting with GitHub and GitLab.
Implements: request_github_file, request_gitlab_file
request_github_file
def request_github_file(repository: str,
filepath: str,
access_token: Optional[str] = None,
branch_name: str = "main",
timeout: int = 10) -> str
Sends a request and returns the content of the response, as a string. Raises an HTTPError if the response status code is not 200.
Arguments:
repository - In the format "owner/repo".
filepath - The path to the file in the repository.
access_token - The GitHub access token. Only required if the repo is private.
branch_name - The branch name.
timeout - The request timeout in seconds.
Returns:
The content of the file as a string.
request_gitlab_file
def request_gitlab_file(repository: str,
filepath: str,
access_token: Optional[str] = None,
branch_name: str = "main",
hostname: str = "gitlab.com",
timeout: int = 10) -> str
Sends a request and returns the content of the response, as a string. Raises an HTTPError if the response status code is not 200.
Arguments:
repository - In the format "owner/repo".
filepath - The path to the file in the repository.
access_token - The GitLab access token. Only required if the repo is private.
branch_name - The branch name.
hostname - The GitLab hostname.
timeout - The request timeout in seconds.
Returns:
The content of the file as a string.
tum_esm_utils.datastructures
Datastructures not in the standard library.
Implements: RingList, merge_dicts
RingList
Objects
class RingList()
__init__
def __init__(max_size: int)
Initialize a RingList with a maximum size.
clear
def clear() -> None
Removes all elements from the list.
is_full
def is_full() -> bool
Returns True if the list is full.
append
def append(x: float) -> None
Appends an element to the list.
get
def get() -> list[float]
Returns the list of elements.
sum
def sum() -> float
Returns the sum of the elements in the list.
set_max_size
def set_max_size(new_max_size: int) -> None
Sets a new max size for the list.
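The following is a minimal sketch of a fixed-size ring list with the same method names as above. It is not the library's implementation: the class name RingListSketch is hypothetical, it is built on collections.deque, and it assumes that appending to a full list drops the oldest element.

```python
from collections import deque


class RingListSketch:
    """Hypothetical stand-in for RingList, built on collections.deque."""

    def __init__(self, max_size: int) -> None:
        self._items: deque = deque(maxlen=max_size)

    def append(self, x: float) -> None:
        # Assumption: when the list is full, the oldest element is dropped.
        self._items.append(x)

    def is_full(self) -> bool:
        return len(self._items) == self._items.maxlen

    def get(self) -> list[float]:
        # Elements are returned from oldest to newest.
        return list(self._items)

    def sum(self) -> float:
        return float(sum(self._items))

    def clear(self) -> None:
        self._items.clear()

    def set_max_size(self, new_max_size: int) -> None:
        # Rebuilding the deque keeps the most recent elements when shrinking.
        self._items = deque(self._items, maxlen=new_max_size)
```

A ring list like this is handy for rolling statistics, for example averaging the last N sensor readings via sum() divided by the number of stored elements.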
merge_dicts
def merge_dicts(old_object: Any, new_object: Any) -> Any
Recursively updates a given dict with the values from a new dict. It does not add any properties and asserts that the types remain the same (or null): null->int or int->null is possible, but not int->dict or list->int.
example:
merge_dicts(
old_object={"a": 3, "b": {"c": 50, "e": None}},
new_object={"b": {"e": 80}},
) == {"a": 3, "b": {"c": 50, "e": 80}}
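The merge rules described above can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: the name merge_dicts_sketch is hypothetical, keys that only exist in the new dict are silently skipped, and non-dict values (including lists) are replaced as a whole.

```python
from typing import Any


def merge_dicts_sketch(old_object: Any, new_object: Any) -> Any:
    """Hypothetical re-implementation of the merge rules described above."""
    if old_object is None or new_object is None:
        # null -> value and value -> null are both allowed
        return new_object
    if isinstance(old_object, dict):
        assert isinstance(new_object, dict), "type changed from dict"
        merged = dict(old_object)
        for key, value in new_object.items():
            # Assumption: keys unknown to the old dict are skipped
            if key in old_object:
                merged[key] = merge_dicts_sketch(old_object[key], value)
        return merged
    # Any other value must keep its type and is replaced as a whole
    assert type(old_object) is type(new_object), "type changed"
    return new_object
```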
tum_esm_utils.decorators
Decorators that can be used to wrap functions.
Implements: with_filelock
with_filelock
Objects
class with_filelock()
FileLock = Mark that a file is being used and other programs should not interfere. A file "*.lock" will be created, and the content of this file will make the wrapped function possibly wait until other programs are done using it.
See https://en.wikipedia.org/wiki/Semaphore_(programming).
Credits for the typing of higher level decorators go to https://github.com/python/mypy/issues/1551#issuecomment-253978622.
__init__
def __init__(lockfile_path: str, timeout: float = -1) -> None
Create a new filelock decorator.
A timeout of -1 means that the code waits forever.
Arguments:
lockfile_path - The path to the lockfile.
timeout - The time to wait for the lock in seconds.
tum_esm_utils.em27
Functions for interacting with EM27 interferograms.
Implements: detect_corrupt_opus_files, load_proffast2_result.
This requires you to install this utils library with the optional em27 dependency:
pip install "tum_esm_utils[em27]"
# or
pdm add "tum_esm_utils[em27]"
detect_corrupt_opus_files
def detect_corrupt_opus_files(
ifg_directory: str,
silent: bool = True,
fortran_compiler: Literal["gfortran", "gfortran-9"] = "gfortran",
force_recompile: bool = False) -> dict[str, list[str]]
Returns dict[filename, list[error_messages]] for all corrupt opus files in the given directory.
It will compile the Fortran code using a given compiler to perform this task. The Fortran code is derived from the preprocess source code of Proffast 2 (https://www.imk-asf.kit.edu/english/3225.php). We use it because the retrieval using Proffast 2 will fail if there are corrupt interferograms in the input.
Arguments:
ifg_directory - The directory containing the interferograms.
silent - If set to False, print additional information.
fortran_compiler - The Fortran compiler to use.
force_recompile - If set to True, the Fortran code will be recompiled.
Returns:
A dictionary containing corrupt filenames as keys and a list of error messages as values.
detect_corrupt_ifgs
@deprecated("This will be removed in the next breaking release. Please use " +
"the identical function `detect_corrupt_opus_files` instead.")
def detect_corrupt_ifgs(ifg_directory: str,
silent: bool = True,
fortran_compiler: Literal["gfortran",
"gfortran-9"] = "gfortran",
force_recompile: bool = False) -> dict[str, list[str]]
Returns dict[filename, list[error_messages]] for all corrupt opus files in the given directory.
It will compile the Fortran code using a given compiler to perform this task. The Fortran code is derived from the preprocess source code of Proffast 2 (https://www.imk-asf.kit.edu/english/3225.php). We use it because the retrieval using Proffast 2 will fail if there are corrupt interferograms in the input.
Arguments:
ifg_directory - The directory containing the interferograms.
silent - If set to False, print additional information.
fortran_compiler - The Fortran compiler to use.
force_recompile - If set to True, the Fortran code will be recompiled.
Returns:
A dictionary containing corrupt filenames as keys and a list of error messages as values.
load_proffast2_result
def load_proffast2_result(path: str) -> pl.DataFrame
Loads the output of Proffast 2 into a polars DataFrame.
Arguments:
path - The path to the Proffast 2 output file.
Returns:
A polars DataFrame containing all columns.
SERIAL_NUMBERS
The serial numbers of the EM27 devices.
COLORS
Colors recommended for plotting the EM27 data.
COLORS_LIGHT
Lighter colors recommended for plotting the EM27 data.
COLORS_DARK
Darker colors recommended for plotting the EM27 data.
PROFFAST_MULTIPLIERS
Multiplication factors for the EM27 data retrieved using Proffast to bring the data in a common unit.
PROFFAST_UNITS
Units for the EM27 data retrieved using Proffast after applying the multiplication factor.
tum_esm_utils.files
File-related utility functions.
Implements: load_file, dump_file, load_json_file, dump_json_file, get_parent_dir_path, get_dir_checksum, get_file_checksum, rel_to_abs_path, read_last_n_lines, expect_file_contents, render_directory_tree, list_directory
load_file
def load_file(path: str) -> str
Load the content of a file.
dump_file
def dump_file(path: str, content: str) -> None
Dump content to a file.
load_binary_file
def load_binary_file(path: str) -> bytes
Load binary content of a file.
dump_binary_file
def dump_binary_file(path: str, content: bytes) -> None
Dump binary content to a file.
load_json_file
def load_json_file(path: str) -> Any
Load the content of a JSON file.
dump_json_file
def dump_json_file(path: str, content: Any, indent: Optional[int] = 4) -> None
Dump content to a JSON file.
get_parent_dir_path
def get_parent_dir_path(script_path: str, current_depth: int = 1) -> str
Get the absolute path of a parent directory based on the current script path. Simply pass the __file__ variable of the current script to this function. A depth of 1 will return the direct parent directory of the current script.
get_dir_checksum
def get_dir_checksum(path: str) -> str
Get the checksum of a directory using md5deep.
get_file_checksum
def get_file_checksum(path: str) -> str
Get the checksum of a file using MD5 from hashlib. Significantly faster than get_dir_checksum since it does not spawn a new process.
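As an illustration of an in-process MD5 checksum with hashlib, here is a small sketch. The helper name file_md5_sketch and the chunk_size parameter are hypothetical, and the library may read the file differently (for example in a single call).

```python
import hashlib


def file_md5_sketch(path: str, chunk_size: int = 65536) -> str:
    """Hypothetical helper: MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops at the first empty read (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory usage constant even for large files.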
rel_to_abs_path
def rel_to_abs_path(*path: str) -> str
Convert a path relative to the caller's file to an absolute path.
Inside file /home/somedir/somepath/somefile.py, calling rel_to_abs_path("..", "config", "config.json") will return /home/somedir/config/config.json.
Credits to https://stackoverflow.com/a/59004672/8255842
read_last_n_lines
def read_last_n_lines(file_path: str,
n: int,
ignore_trailing_whitespace: bool = False) -> list[str]
Read the last n lines of a file.
The function returns fewer than n lines if the file has fewer than n lines. The last element in the list is the last line of the file.
This function uses seeking in order not to read the full file. The simple approach of reading the last 10 lines would be:
with open(path, "r") as f:
    return f.read().split("\n")[-10:]
However, this would read the full file, and if we only need to read 10 lines out of a 2GB file, this would be a big waste of resources.
The ignore_trailing_whitespace option crops off trailing whitespace, i.e. only the last n lines that are not empty or only contain whitespace are returned.
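The seeking idea can be sketched as follows. This is not the library's implementation: the name tail_lines_sketch and the block_size parameter are hypothetical, the file is assumed to be UTF-8 encoded, and the ignore_trailing_whitespace option is left out. Like the simple approach, it returns an empty string as the last element if the file ends with a newline.

```python
import os


def tail_lines_sketch(file_path: str, n: int, block_size: int = 4096) -> list[str]:
    """Hypothetical seek-based tail: return up to the last n lines of a file."""
    if n <= 0:
        return []
    with open(file_path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        data = b""
        # Read blocks backwards until more than n newlines are buffered
        # or the start of the file is reached.
        while position > 0 and data.count(b"\n") <= n:
            read_size = min(block_size, position)
            position -= read_size
            f.seek(position)
            data = f.read(read_size) + data
    # Split the bytes first so that a partially read first line is never decoded
    return [line.decode("utf-8") for line in data.split(b"\n")[-n:]]
```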
expect_file_contents
def expect_file_contents(filepath: str,
required_content_blocks: list[str] = [],
forbidden_content_blocks: list[str] = []) -> None
Assert that the given file contains all of the required content blocks, and/or none of the forbidden content blocks.
Arguments:
filepath - The path to the file.
required_content_blocks - A list of strings that must be present in the file.
forbidden_content_blocks - A list of strings that must not be present in the file.
render_directory_tree
def render_directory_tree(root: str,
ignore: list[str] = [],
max_depth: Optional[int] = None,
root_alias: Optional[str] = None,
directory_prefix: Optional[str] = "📁 ",
file_prefix: Optional[str] = "📄 ") -> Optional[str]
Render a file tree as a string.
Example:
📁 <config.general.data.results>
├─── 📁 bundle
│ ├─── 📄 __init__.py
│ ├─── 📄 load_results.py
│ └─── 📄 main.py
├─── 📁 profiles
│ ├─── 📄 __init__.py
│ ├─── 📄 cache.py
│ ├─── 📄 download_logic.py
│ ├─── 📄 generate_queries.py
│ ├─── 📄 main.py
│ ├─── 📄 std_site_logic.py
│ └─── 📄 upload_logic.py
├─── 📁 retrieval
│ ├─── 📁 algorithms
...
Arguments:
root - The root directory to render.
ignore - A list of patterns to ignore. If the basename of a directory matches any of the patterns, the directory is ignored.
max_depth - The maximum depth to render. If None, render the full tree.
root_alias - An alias for the root directory. If None, the basename of the root directory is used. In the example above, the root directory was aliased to <config.general.data.results>.
directory_prefix - The prefix to use for directories.
file_prefix - The prefix to use for files.
Returns:
The directory tree as a string. If the root directory is ignored, None.
list_directory
def list_directory(path: str,
regex: Optional[str] = None,
ignore: Optional[list[str]] = None,
include_directories: bool = True,
include_files: bool = True,
include_links: bool = True) -> list[str]
List the contents of a directory based on certain criteria. Like os.listdir with superpowers. You can filter the list by a regex or you can ignore Unix shell style patterns like *.lock.
Arguments:
path - The path to the directory.
regex - A regex pattern to match the item names against.
ignore - A list of patterns to ignore. If the basename of an item matches any of the patterns, the item is ignored.
include_directories - Whether to include directories in the output.
include_files - Whether to include files in the output.
include_links - Whether to include symbolic links in the output.
Returns:
A list of items in the directory that match the criteria.
tum_esm_utils.mathematics
Mathematical functions.
Implements: distance_between_angles
distance_between_angles
def distance_between_angles(angle_1: float, angle_2: float) -> float
Calculate the directional distance (in degrees) between two angles.
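One plausible reading of this description is the shortest way around the circle between the two angles. The sketch below assumes exactly that; the name angular_distance_sketch is hypothetical, and the library's function may treat edge cases differently.

```python
def angular_distance_sketch(angle_1: float, angle_2: float) -> float:
    """Hypothetical helper: shortest distance in degrees between two angles."""
    difference = abs(angle_1 - angle_2) % 360
    # Going the other way around the circle may be shorter
    return min(difference, 360 - difference)
```

With this definition, 350 degrees and 10 degrees are 20 degrees apart, not 340.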
tum_esm_utils.opus
Functions for interacting with OPUS files.
Implements: OpusFile, OpusHTTPInterface.
Read https://tccon-wiki.caltech.edu/Main/I2SAndOPUSHeaders for more information about the file parameters. This requires you to install this utils library with the optional opus dependency:
pip install "tum_esm_utils[opus]"
# or
pdm add "tum_esm_utils[opus]"
Credits to Friedrich Klappenbach (friedrich.klappenbach@tum.de) for decoding the OPUS file format.
tum_esm_utils.opus.file_interface
Functions for interacting with OPUS files.
OpusFile
Objects
class OpusFile(pydantic.BaseModel)
Interact with OPUS spectrum files.
Credits to Friedrich Klappenbach (friedrich.klappenbach@tum.de) for decoding the OPUS file format.
read
@staticmethod
def read(filepath: str,
measurement_timestamp_mode: Literal["start", "end"] = "start",
interferogram_mode: Literal["skip", "validate", "read"] = "read",
read_all_channels: bool = True) -> OpusFile
Read an interferogram file.
Arguments:
filepath - Path to the OPUS file.
measurement_timestamp_mode - Whether the timestamps in the interferograms indicate the start or end of the measurement.
interferogram_mode - How to handle the interferogram data. "skip" will not read the interferogram data, "validate" will read the first and last block to check for errors during writing, "read" will read the entire interferogram. "read" takes about 11-12 times longer than "skip", "validate" is about 20% slower than "skip".
read_all_channels - Whether to read all channels in the file or only the first one.
Returns:
An OpusFile object, optionally containing the interferogram data (in read mode)
tum_esm_utils.opus.http_interface
Provides an HTTP interface to OPUS.
OpusHTTPInterface
Objects
class OpusHTTPInterface()
Interface to the OPUS HTTP interface.
It uses the socket library, because the HTTP interface of OPUS does not return valid HTTP/1 or HTTP/2 headers. It opens and closes a new socket because OPUS closes the socket after the answer has been sent.
Raises:
ConnectionError - If the connection to the OPUS HTTP interface fails or if the response is invalid.
request
@staticmethod
@tenacity.retry(
retry=tenacity.retry_if_exception_type(ConnectionError),
reraise=True,
stop=tenacity.stop_after_attempt(3),
wait=tenacity.wait_fixed(5),
)
def request(request: str,
timeout: float = 10.0,
expect_ok: bool = False) -> list[str]
Send a request to the OPUS HTTP interface and return the answer.
Commands will be sent to GET http://localhost/OpusCommand.htm?<request>.
This function will retry the request up to 3 times and wait 5 seconds in between retries.
Arguments:
request - The request to send.
timeout - The time to wait for the answer.
expect_ok - Whether the first line of the answer should be "OK".
Returns:
The answer lines.
request_without_retry
@staticmethod
def request_without_retry(request: str,
timeout: float = 10.0,
expect_ok: bool = False) -> list[str]
Send a request to the OPUS HTTP interface and return the answer.
Commands will be sent to GET http://localhost/OpusCommand.htm?<request>.
Arguments:
request - The request to send.
timeout - The time to wait for the answer.
expect_ok - Whether the first line of the answer should be "OK".
Returns:
The answer lines.
get_version
@staticmethod
def get_version() -> str
Get the version number, like 20190310.
get_version_extended
@staticmethod
def get_version_extended() -> str
Get the extended version number, like 8.2 Build: 8, 2, 28 20190310.
is_working
@staticmethod
def is_working() -> bool
Check if the OPUS HTTP interface is working. Does NOT raise a ConnectionError but only returns True or False.
get_main_thread_id
@staticmethod
def get_main_thread_id() -> int
Get the process ID of the main thread of OPUS.
some_macro_is_running
@staticmethod
def some_macro_is_running() -> bool
Check if any macro is currently running.
In theory, we could also check whether the correct macro is running using READ_PARAMETER MPT and READ_PARAMETER MFN. However, these variables do not seem to be updated right away, so we cannot rely on them.
get_loaded_experiment
@staticmethod
def get_loaded_experiment() -> str
Get the path to the currently loaded experiment.
load_experiment
@staticmethod
def load_experiment(experiment_path: str) -> None
Load an experiment file.
start_macro
@staticmethod
def start_macro(macro_path: str) -> int
Start a macro. Returns the macro ID.
macro_is_running
@staticmethod
def macro_is_running(macro_id: int) -> bool
Check if the given macro is running. It runs MACRO_RESULTS <macro_id> under the hood. The OPUS documentation is ambiguous about the return value. It seems that 0 means "there is no result yet", i.e. the macro is still running.
stop_macro
@staticmethod
def stop_macro(macro_path_or_id: str | int) -> None
Stop a macro given by its path or ID.
unload_all_files
@staticmethod
def unload_all_files() -> None
Unload all files. This should be done before closing OPUS.
close_opus
@staticmethod
def close_opus() -> None
Close OPUS.
set_parameter_mode
@staticmethod
def set_parameter_mode(variant: Literal["file", "opus"]) -> None
Set the parameter mode to FILE_PARAMETERS or OPUS_PARAMETERS.
read_parameter
@staticmethod
def read_parameter(parameter: str) -> str
Read the value of a parameter.
write_parameter
@staticmethod
def write_parameter(parameter: str, value: str | int | float) -> None
Update the value of a parameter.
get_language
@staticmethod
def get_language() -> str
Get the current language.
get_username
@staticmethod
def get_username() -> str
Get the current username.
get_path
@staticmethod
def get_path(literal: Literal["opus", "base", "data", "work"]) -> str
Get the path to the given directory.
set_processing_mode
@staticmethod
def set_processing_mode(
mode: Literal["command", "execute", "request"]) -> None
Set the processing mode to COMMAND_MODE, EXECUTE_MODE, or REQUEST_MODE.
command_line
@staticmethod
def command_line(command: str) -> Optional[str]
Execute a command line command, i.e. COMMAND_LINE <command>.
tum_esm_utils.plotting
Better defaults for matplotlib plots and utilities for creating and saving figures.
Implements: apply_better_defaults, create_figure, add_subplot
This requires you to install this utils library with the optional plotting dependencies:
pip install "tum_esm_utils[plotting]"
# or
pdm add "tum_esm_utils[plotting]"
apply_better_defaults
def apply_better_defaults(font_family: Optional[str] = "Roboto") -> None
Apply better defaults to matplotlib plots.
Arguments:
font_family - The font family to use for the plots. If None, the default settings are not changed.
create_figure
@contextlib.contextmanager
def create_figure(path: str,
title: Optional[str] = None,
width: float = 10,
height: float = 10,
suptitle_y: float = 0.97,
padding: float = 2,
dpi: int = 250) -> Generator[plt.Figure, None, None]
Create a figure for plotting.
Usage:
with create_figure("path/to/figure.png", title="Title") as fig:
    ...
Arguments:
path - The path to save the figure to.
title - The title of the figure.
width - The width of the figure.
height - The height of the figure.
suptitle_y - The y-coordinate of the figure title.
padding - The padding of the figure.
dpi - The DPI of the figure.
add_subplot
def add_subplot(fig: plt.Figure,
position: tuple[int, int, int],
title: Optional[str] = None,
xlabel: Optional[str] = None,
ylabel: Optional[str] = None,
**kwargs: dict[str, Any]) -> plt.Axes
Add a subplot to a figure.
Arguments:
fig - The figure to add the subplot to.
position - The position of the subplot. The tuple should contain three integers: the number of rows, the number of columns, and the index of the subplot.
title - The title of the subplot.
xlabel - The x-axis label of the subplot.
ylabel - The y-axis label of the subplot.
**kwargs - Additional keyword arguments for the subplot.
Returns:
An axis object for the new subplot.
Raises:
ValueError - If the index of the subplot is invalid.
add_colorpatch_legend
def add_colorpatch_legend(fig: plt.Figure,
handles: list[tuple[
str,
Union[
str,
tuple[float, float, float],
tuple[float, float, float, float],
],
]],
ncols: Optional[int] = None,
location: str = "upper left") -> None
Add a color patch legend to a figure.
Arguments:
fig - The figure to add the legend to.
handles - A list of tuples containing the label and color of each patch (e.g. [("Label 1", "red"), ("Label 2", "blue")]). You can pass any color that is accepted by matplotlib.
ncols - The number of columns in the legend.
location - The location of the legend.
tum_esm_utils.processes
Functions to start and terminate background processes.
Implements: get_process_pids, start_background_process, terminate_process
get_process_pids
def get_process_pids(script_path: str) -> list[int]
Return a list of PIDs that have the given script as their entrypoint.
Arguments:
script_path - The absolute path of the python file entrypoint.
start_background_process
def start_background_process(interpreter_path: str,
script_path: str,
waiting_period: float = 0.5) -> int
Start a new background process with nohup with a given python interpreter and script path. The script path's parent directory will be used as the working directory for the process.
Arguments:
interpreter_path - The absolute path of the python interpreter.
script_path - The absolute path of the python file entrypoint.
waiting_period - The waiting period in seconds after starting the process.
Returns:
The PID of the started process.
terminate_process
def terminate_process(script_path: str,
termination_timeout: Optional[int] = None) -> list[int]
Terminate all processes that have the given script as their entrypoint. Returns the list of terminated PIDs.
If termination_timeout is not None, the processes will be terminated forcefully after the given timeout (in seconds).
Arguments:
script_path - The absolute path of the python file entrypoint.
termination_timeout - The timeout in seconds after which the processes will be terminated forcefully.
Returns:
The list of terminated PIDs.
tum_esm_utils.shell
Functions for running shell commands and for querying the host environment (hostname, current commit sha, file permissions).
Implements: run_shell_command, CommandLineException, get_hostname, get_commit_sha, change_file_permissions
CommandLineException
Objects
class CommandLineException(Exception)
Exception raised for errors in the command line.
run_shell_command
def run_shell_command(command: str,
working_directory: Optional[str] = None,
executable: str = "/bin/bash") -> str
Runs a shell command and raises a CommandLineException if the return code is not zero. Returns the stdout. Uses /bin/bash by default.
Arguments:
command - The command to run.
working_directory - The working directory for the command.
executable - The shell executable to use.
Returns:
The stdout of the command as a string.
get_hostname
def get_hostname() -> str
Returns the hostname of the device and removes the network postfix (somename.local) if present. Only works reliably when the hostname doesn't contain a dot.
get_commit_sha
def get_commit_sha(
variant: Literal["short", "long"] = "short") -> Optional[str]
Get the current commit sha of the repository. Returns None if there is no git repository in any parent directory.
Arguments:
variant - "short" or "long" to specify the length of the sha.
Returns:
The commit sha as a string, or None if there is no git repository in the parent directories.
change_file_permissions
def change_file_permissions(file_path: str, permission_string: str) -> None
Change a file's system permissions.
Example permission_strings: --x------, rwxr-xr-x, rw-r--r--.
Arguments:
file_path - The path to the file.
permission_string - The new permission string.
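To illustrate how a permission string such as rwxr-xr-x maps to a numeric file mode, here is a small sketch. The helper name permission_string_to_mode_sketch is hypothetical and this is not necessarily how the library converts the string.

```python
def permission_string_to_mode_sketch(permission_string: str) -> int:
    """Hypothetical helper: convert e.g. "rwxr-xr-x" into 0o755."""
    expected = "rwxrwxrwx"
    assert len(permission_string) == 9, "expected 9 characters"
    mode = 0
    for char, allowed in zip(permission_string, expected):
        assert char in (allowed, "-"), f"invalid character {char!r}"
        # Shift in one bit per position; the bit is set if the permission is granted
        mode = (mode << 1) | (1 if char == allowed else 0)
    return mode
```

The resulting integer can be passed to os.chmod(file_path, mode) from the standard library.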
tum_esm_utils.system
Common system status related functions.
Implements: get_cpu_usage, get_memory_usage, get_disk_space, get_system_battery, get_last_boot_time, get_utc_offset
get_cpu_usage
def get_cpu_usage() -> list[float]
Checks the CPU usage of the system.
Returns:
The CPU usage in percent for each core.
get_memory_usage
def get_memory_usage() -> float
Checks the memory usage of the system.
Returns:
The memory usage in percent.
get_disk_space
def get_disk_space(path: str = "/") -> float
Checks the disk space of a given path.
Arguments:
path - The path to check the disk space for.
Returns:
The available disk space in percent.
get_system_battery
def get_system_battery() -> Optional[int]
Checks the system battery.
Returns:
The battery state in percent if available, else None.
get_last_boot_time
def get_last_boot_time() -> datetime.datetime
Checks the last boot time of the system.
get_utc_offset
def get_utc_offset() -> float
Returns the UTC offset of the system.
Credits to https://stackoverflow.com/a/35058476/8255842
x = get_utc_offset()
local time == utc time + x
Returns:
The UTC offset in hours.
tum_esm_utils.text
Functions used for text manipulation/processing.
Implements: get_random_string, pad_string, is_date_string, is_rfc3339_datetime_string, insert_replacements, simplify_string_characters, replace_consecutive_characters, RandomLabelGenerator
get_random_string
def get_random_string(length: int, forbidden: list[str] = []) -> str
Return a random string from lowercase letters.
Arguments:
length - The length of the random string.
forbidden - A list of strings that should not be generated.
Returns:
A random string.
pad_string
def pad_string(text: str,
min_width: int,
pad_position: Literal["left", "right"] = "left",
fill_char: Literal["0", " ", "-", "_"] = " ") -> str
Pad a string with a fill character to a minimum width.
Arguments:
text - The text to pad.
min_width - The minimum width of the text.
pad_position - The position of the padding. Either "left" or "right".
fill_char - The character to use for padding.
Returns:
The padded string.
is_date_string
def is_date_string(date_string: str) -> bool
Returns True if the string is in a valid YYYYMMDD format.
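A check like this can be sketched with the standard library. The name is_date_string_sketch is hypothetical, and the sketch assumes that "valid" means exactly eight ASCII digits that form an existing calendar date.

```python
import datetime
import re


def is_date_string_sketch(date_string: str) -> bool:
    """Hypothetical check for a YYYYMMDD string that is a real calendar date."""
    # strptime alone would also accept shorter inputs such as single-digit days
    if re.fullmatch(r"[0-9]{8}", date_string) is None:
        return False
    try:
        datetime.datetime.strptime(date_string, "%Y%m%d")
        return True
    except ValueError:
        return False
```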
is_rfc3339_datetime_string
def is_rfc3339_datetime_string(rfc3339_datetime_string: str) -> bool
Returns True if the string is in a valid YYYY-MM-DDTHH:mm:ssZ (RFC3339) format. Caution: The suffix +00:00 is required for UTC!
insert_replacements
def insert_replacements(content: str, replacements: dict[str, str]) -> str
For every key in replacements, replaces %key% in the content with its value.
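The %key% substitution described above can be sketched in a few lines. The name insert_replacements_sketch is hypothetical; the sketch assumes that every occurrence of a placeholder is replaced and that unknown placeholders are left untouched.

```python
def insert_replacements_sketch(content: str, replacements: dict[str, str]) -> str:
    """Hypothetical re-implementation of the %key% substitution."""
    for key, value in replacements.items():
        # Replace every occurrence of the placeholder %key%
        content = content.replace(f"%{key}%", value)
    return content
```

This kind of helper is useful for filling simple text templates, for example configuration files with placeholders like %hostname%.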
simplify_string_characters
def simplify_string_characters(s: str,
additional_replacements: dict[str,
str] = {}) -> str
Simplify a string by replacing special characters with their ASCII counterparts and removing unwanted characters.
For example, simplify_string_characters("Héllo, wörld!") will return "hello-woerld".
Arguments:
s - The string to simplify.
additional_replacements - A dictionary of additional replacements to apply. { "ö": "oe" } will replace ö with oe.
Returns:
The simplified string.
replace_consecutive_characters
def replace_consecutive_characters(s: str,
characters: list[str] = [" ", "-"]) -> str
Replace consecutive characters in a string (e.g. "hello---world" -> "hello-world" or "hello   world" -> "hello world").
Arguments:
s - The string to process.
characters - A list of characters to replace duplicates of.
Returns:
The string with duplicate characters replaced.
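Collapsing runs of a character can be sketched with a regular expression. The name replace_consecutive_characters_sketch is hypothetical and the library may implement this differently.

```python
import re
from typing import Sequence


def replace_consecutive_characters_sketch(
    s: str, characters: Sequence[str] = (" ", "-")
) -> str:
    """Hypothetical helper: collapse runs of each given character into one."""
    for character in characters:
        # re.escape keeps characters like "." or "*" from acting as regex syntax
        pattern = f"(?:{re.escape(character)})+"
        s = re.sub(pattern, lambda _match: character, s)
    return s
```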
RandomLabelGenerator
Objects
class RandomLabelGenerator()
A class to generate random labels that follow the Docker style naming of containers, e.g. admiring-archimedes or happy-tesla.
Usage with tracking duplicates:
generator = RandomLabelGenerator()
label = generator.generate()
another_label = generator.generate() # Will not be the same as `label`
generator.free(label) # Free the label to be used again
Usage without tracking duplicates:
label = RandomLabelGenerator.generate_fully_random()
Source for the names and adjectives: https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go
__init__
def __init__(occupied_labels: set[str] | list[str] = set(),
adjectives: set[str] | list[str] = CONTAINER_ADJECTIVES,
names: set[str] | list[str] = CONTAINER_NAMES) -> None
Initialize the label generator.
generate
def generate() -> str
Generate a random label that is not already occupied.
free
def free(label: str) -> None
Free a label to be used again.
generate_fully_random
@staticmethod
def generate_fully_random(
adjectives: set[str] | list[str] = CONTAINER_ADJECTIVES,
names: set[str] | list[str] = CONTAINER_NAMES) -> str
Get a random label without tracking duplicates.
Use an instance of RandomLabelGenerator if you want to avoid duplicates by tracking occupied labels.
tum_esm_utils.timing
Functions used for timing or time calculations.
Implements: date_range, ensure_section_duration, set_alarm, clear_alarm, wait_for_condition, ExponentialBackoff
date_range
def date_range(from_date: datetime.date,
to_date: datetime.date) -> list[datetime.date]
Returns a list of dates between from_date and to_date (inclusive).
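An inclusive date range can be sketched with datetime.timedelta. The name date_range_sketch is hypothetical; the sketch assumes that an empty list is returned when to_date lies before from_date.

```python
import datetime


def date_range_sketch(
    from_date: datetime.date, to_date: datetime.date
) -> list[datetime.date]:
    """Hypothetical helper: all dates from from_date to to_date, both included."""
    number_of_days = (to_date - from_date).days
    # range() is empty for negative counts, so a reversed span yields []
    return [from_date + datetime.timedelta(days=i) for i in range(number_of_days + 1)]
```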
ensure_section_duration
@contextlib.contextmanager
def ensure_section_duration(duration: float) -> Generator[None, None, None]
Make sure that the duration of the section is at least the given duration.
Usage example - do one measurement every 6 seconds:
with ensure_section_duration(6):
    do_measurement()
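A context manager with this behavior can be sketched as follows. The name ensure_section_duration_sketch is hypothetical, and the sketch assumes that the remaining time is also waited out when the section raises an exception, which the library may handle differently.

```python
import contextlib
import time
from typing import Generator


@contextlib.contextmanager
def ensure_section_duration_sketch(duration: float) -> Generator[None, None, None]:
    """Hypothetical sketch: pad the wrapped section to at least `duration` seconds."""
    start = time.monotonic()
    try:
        yield
    finally:
        # Sleep only for the time the section did not use up itself
        remaining = duration - (time.monotonic() - start)
        if remaining > 0:
            time.sleep(remaining)
```

Using a monotonic clock avoids surprises when the system time is adjusted while the section is running.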
set_alarm
def set_alarm(timeout: int, label: str) -> None
Set an alarm that will raise a TimeoutError after timeout seconds. The message will be formatted as {label} took too long (timed out after {timeout} seconds).
clear_alarm
def clear_alarm() -> None
Clear the alarm set by set_alarm.
parse_timezone_string
def parse_timezone_string(timezone_string: str,
dt: Optional[datetime.datetime] = None) -> float
Parse a timezone string and return the offset in hours.
Why does this function exist? The strptime function cannot parse strings other than "±HHMM". This function can also parse strings in the format "±H" ("+2", "-3", "+5.5"), and "±HH:MM".
Examples:
parse_timezone_string("GMT") # returns 0
parse_timezone_string("GMT+2") # returns 2
parse_timezone_string("UTC+2.0") # returns 2
parse_timezone_string("UTC-02:00") # returns -2
You are required to pass a datetime object in case the utc offset for the passed timezone is not constant - e.g. for "Europe/Berlin".
wait_for_condition
def wait_for_condition(is_successful: Callable[[], bool],
timeout_message: str,
timeout_seconds: float = 5,
check_interval_seconds: float = 0.25) -> None
Wait for the given condition to be true, or raise a TimeoutError if the condition is not met within the given timeout. The condition is passed as a function that will be called periodically.
Arguments:
is_successful - A function that returns True if the condition is met.
timeout_message - The message to include in the TimeoutError.
timeout_seconds - The maximum time to wait for the condition to be met.
check_interval_seconds - How long to wait in between is_successful() calls.
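The polling loop behind such a function can be sketched like this. The name wait_for_condition_sketch is hypothetical, and the sketch assumes the condition is checked once immediately before any waiting happens.

```python
import time
from typing import Callable


def wait_for_condition_sketch(
    is_successful: Callable[[], bool],
    timeout_message: str,
    timeout_seconds: float = 5,
    check_interval_seconds: float = 0.25,
) -> None:
    """Hypothetical sketch: poll a condition until it holds or a timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while not is_successful():
        if time.monotonic() >= deadline:
            raise TimeoutError(timeout_message)
        time.sleep(check_interval_seconds)
```

A typical use is waiting for a file or a network service to become available, with the condition passed as a small lambda.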
parse_iso_8601_datetime
def parse_iso_8601_datetime(s: str) -> datetime.datetime
Parse a datetime string from various formats and return a datetime object.
ISO 8601 supports time zones as <time>Z, <time>±hh:mm, <time>±hhmm and <time>±hh. However, only the second format is supported by datetime.datetime.fromisoformat() (HH[:MM[:SS[.fff[fff]]]][+HH:MM[:SS[.ffffff]]]).
This function supports parsing all ISO 8601 time formats.
datetime_span_intersection
def datetime_span_intersection(
dt_span_1: tuple[datetime.datetime,
datetime.datetime], dt_span_2: tuple[datetime.datetime,
datetime.datetime]
) -> Optional[tuple[datetime.datetime, datetime.datetime]]
Check if two datetime spans overlap.
Arguments:
dt_span_1 - The first datetime span (start, end).
dt_span_2 - The second datetime span (start, end).
Returns:
The intersection of the two datetime spans or None if they do not overlap. Returns None if the intersection is a single point.
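The intersection logic can be sketched by taking the later start and the earlier end. The names Span and datetime_span_intersection_sketch are hypothetical; this is an illustration of the described behavior, not the library's code.

```python
import datetime
from typing import Optional

Span = tuple[datetime.datetime, datetime.datetime]


def datetime_span_intersection_sketch(span_1: Span, span_2: Span) -> Optional[Span]:
    """Hypothetical sketch: overlap of two (start, end) spans, or None."""
    start = max(span_1[0], span_2[0])
    end = min(span_1[1], span_2[1])
    # The strict comparison drops empty and single-point intersections
    return (start, end) if start < end else None
```

Replacing the strict comparison with `start <= end` would keep single-point intersections.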
date_span_intersection
def date_span_intersection(
d_span_1: tuple[datetime.date,
datetime.date], d_span_2: tuple[datetime.date,
datetime.date]
) -> Optional[tuple[datetime.date, datetime.date]]
Check if two date spans overlap. This function behaves differently from datetime_span_intersection in that it returns a single point as an intersection if the two date spans overlap at a single date.
Arguments:
d_span_1 - The first date span (start, end).
d_span_2 - The second date span (start, end).
Returns:
The intersection of the two date spans or None if they do not overlap.
ExponentialBackoff
Objects
class ExponentialBackoff()
Exponential backoff e.g. when errors occur. First try again in 1 minute, then 4 minutes, then 15 minutes, etc. Usage:
exponential_backoff = ExponentialBackoff(
    log_info=logger.info, buckets=[60, 240, 900, 3600, 14400]
)
while True:
    try:
        # do something that might fail
        exponential_backoff.reset()
    except Exception as e:
        logger.exception(e)
        exponential_backoff.sleep()
__init__
def __init__(log_info: Optional[Callable[[str], None]] = None,
buckets: list[int] = [60, 240, 900, 3600, 14400]) -> None
Create a new exponential backoff object.
Arguments:
log_info - The function to call when logging information.
buckets - The buckets to use for the exponential backoff.
sleep
def sleep(max_sleep_time: Optional[float] = None) -> float
Wait and increase the wait time to the next bucket.
Arguments:
max_sleep_time - The maximum time to sleep. If None, no maximum is set.
Returns:
The number of seconds waited.
reset
def reset() -> None
Reset the waiting period to the first bucket.
tum_esm_utils.validators
Implements validator utils for use with pydantic models.
Implements: StrictFilePath, StrictDirectoryPath
StrictFilePath
Objects
class StrictFilePath(pydantic.RootModel[str])
A pydantic model that validates a file path.
Example usage:
class MyModel(pydantic.BaseModel):
    path: StrictFilePath

m = MyModel(path='/path/to/file') # validates that the file exists
The validation can be ignored by setting the context variable:
m = MyModel.model_validate(
    {"path": "somenonexistingpath"},
    context={"ignore-path-existence": True},
) # does not raise an error
StrictDirectoryPath
Objects
class StrictDirectoryPath(pydantic.RootModel[str])
A pydantic model that validates a directory path.
Example usage:
class MyModel(pydantic.BaseModel):
    path: StrictDirectoryPath

m = MyModel(path='/path/to/directory') # validates that the directory exists
The validation can be ignored by setting the context variable:
m = MyModel.model_validate(
    {"path": "somenonexistingpath"},
    context={"ignore-path-existence": True},
) # does not raise an error
Version
Objects
class Version(pydantic.RootModel[str])
A version string in the format of MAJOR.MINOR.PATCH[-(alpha|beta|rc).N]
as_tag
def as_tag() -> str
Return the version string as a tag, i.e. vMAJOR.MINOR.PATCH...
as_identifier
def as_identifier() -> str
Return the version string as a number, i.e. MAJOR.MINOR.PATCH...
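The version format MAJOR.MINOR.PATCH[-(alpha|beta|rc).N] can be checked with a regular expression. The sketch below is an assumption about what such a check could look like; VERSION_PATTERN_SKETCH, is_valid_version_sketch and as_tag_sketch are hypothetical names, not part of the library.

```python
import re

# MAJOR.MINOR.PATCH with an optional -alpha.N, -beta.N or -rc.N suffix
VERSION_PATTERN_SKETCH = re.compile(r"\d+\.\d+\.\d+(-(alpha|beta|rc)\.\d+)?")


def is_valid_version_sketch(version: str) -> bool:
    """Hypothetical check for the version format described above."""
    # fullmatch requires the whole string to match, not just a prefix
    return VERSION_PATTERN_SKETCH.fullmatch(version) is not None


def as_tag_sketch(version: str) -> str:
    """Hypothetical helper: prefix a valid version with "v"."""
    assert is_valid_version_sketch(version), f"invalid version {version!r}"
    return f"v{version}"
```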
StricterBaseModel
Objects
class StricterBaseModel(pydantic.BaseModel)
The same as pydantic.BaseModel, but with stricter rules. It does not allow extra fields and validates assignments after initialization.
StrictIPv4Adress
Objects
class StrictIPv4Adress(pydantic.RootModel[str])
A pydantic model that validates an IPv4 address.
Example usage:
class MyModel(pydantic.BaseModel):
    ip: StrictIPv4Adress

m = MyModel(ip='192.186.2.1')
m = MyModel(ip='192.186.2.1:22')