ipsl_common - IPSL-common documentation

Modules

cli

Functions and definitions useful when working with ArgumentParser.

Attributes:

ArgumentCmdParser (TypeAlias) –

generic type of a subparser of the ArgumentParser class.

Attributes

ArgumentCmdParser `module-attribute`

ArgumentCmdParser: TypeAlias = _SubParsersAction

Functions:

existing_dir

existing_dir(path: str | Path)

Check if path points to an existing directory.

Example:

from argparse import ArgumentParser
from ipsl_common.cli import existing_dir
parser = ArgumentParser()
parser.add_argument("directory", type=existing_dir)

Parameters:

path
(str | Path) –

path to a directory

Source code in ipsl_common/cli.py

def existing_dir(path: str | Path):
    """Check if path points to an existing directory.

    Example:

        from argparse import ArgumentParser
        from ipsl_common.cli import existing_dir
        parser = ArgumentParser()
        parser.add_argument("directory", type=existing_dir)

    Args:
        path: path to a directory
    """
    path = Path(path)
    if not path.exists():
        raise ArgumentTypeError(f'Path "{str(path)}" doesn\'t exist')
    if not path.is_dir():
        raise ArgumentTypeError(f'Path "{str(path)}" is not a directory')
    return path
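
To illustrate how argparse reacts to such a validator, the sketch below copies the function body shown above into a standalone script, so that it runs without ipsl_common installed, and parses one valid and one invalid path. The argument name `directory`, the helper `parse_directory`, and the temporary directory are illustrative assumptions.

```python
import tempfile
from argparse import ArgumentParser, ArgumentTypeError
from pathlib import Path


def existing_dir(path):
    """Local copy of the validator shown above, for a self-contained demo."""
    path = Path(path)
    if not path.exists():
        raise ArgumentTypeError(f'Path "{path}" doesn\'t exist')
    if not path.is_dir():
        raise ArgumentTypeError(f'Path "{path}" is not a directory')
    return path


def parse_directory(argv):
    """Parse argv and return the validated directory, or None on rejection."""
    parser = ArgumentParser(prog="demo")
    parser.add_argument("directory", type=existing_dir)
    try:
        return parser.parse_args(argv).directory
    except SystemExit:
        # argparse reports an ArgumentTypeError as a usage error and exits
        return None


with tempfile.TemporaryDirectory() as tmp:
    accepted = parse_directory([tmp])
    rejected = parse_directory([str(Path(tmp) / "missing")])
```

Because the validator returns a `Path`, the parsed attribute can be used directly with pathlib methods, without a second conversion.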

existing_file

existing_file(path: str | Path)

Check if path points to an existing file.

Example:

from argparse import ArgumentParser
from ipsl_common.cli import existing_file
parser = ArgumentParser()
parser.add_argument("file", type=existing_file)

Parameters:

path
(str | Path) –

path to a file

Source code in ipsl_common/cli.py

def existing_file(path: str | Path):
    """Check if path points to an existing file.

    Example:

        from argparse import ArgumentParser
        from ipsl_common.cli import existing_file
        parser = ArgumentParser()
        parser.add_argument("file", type=existing_file)

    Args:
        path: path to a file
    """
    path = Path(path)
    if not path.exists():
        raise ArgumentTypeError(f'Path "{str(path)}" doesn\'t exist')
    if not path.is_file():
        raise ArgumentTypeError(f'Path "{str(path)}" is not a file')
    return path

environment

Environment utility functions.

Functions:

has_command

has_command(cmd: str) -> bool

Test if given command exists in the environment.

This function is mostly a convenience alias.

Parameters:

cmd
(str) –

command to test

Returns:

bool –

True if the command exists, False otherwise

Source code in ipsl_common/environment.py

def has_command(cmd: str) -> bool:
    """Test if given command exists in the environment.

    This function is mostly a convenience alias.

    Args:
        cmd: command to test

    Returns:
        True if the command exists, False otherwise
    """
    return which(cmd) is not None
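
A typical use is choosing between alternative tools at runtime. The sketch below assumes that `which` in the source above is `shutil.which` (the import is not shown), and `first_available` is a hypothetical helper that is not part of ipsl_common.

```python
from shutil import which


def has_command(cmd: str) -> bool:
    """Local copy of the function shown above (assuming shutil.which)."""
    return which(cmd) is not None


def first_available(*candidates: str):
    """Return the first candidate found on PATH, or None (hypothetical helper)."""
    for cmd in candidates:
        if has_command(cmd):
            return cmd
    return None


# A deliberately implausible command name, used to show the negative case
missing = has_command("surely-not-a-real-command-123")
```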

envmodules

Python interface to EnvModules.

Better Python interface to EnvModules which doesn't pollute the global environment with the module() function.

Examples:

em = EnvModules()
em.load("gcc", "cdo")
print(em.get_loaded())

Classes

EnvModules

EnvModules(modulehome: str | Path | None = None)

Modern interface to EnvModules.

This class encapsulates the module() function of EnvModules inside a local environment.

Attributes:

envmodule –

Reference to the module() function of EnvModules.

Initialize EnvModules found on specified path (or default).

Parameters:

modulehome
(str or Path, default: None ) –

Path to the EnvModules installation directory.

Source code in ipsl_common/envmodules.py

def __init__(self, modulehome: str | Path | None = None) -> None:
    """Initialize EnvModules found on specified path (or default).

    Args:
        modulehome (str or Path, optional): Path to the EnvModules installation directory.
    """
    self.locals: dict = {}
    if modulehome is None:
        modulehome = environ["MODULESHOME"]
    # Don't pollute `globals()` environment and use local env instead
    exec(open(f"{modulehome}/init/python.py").read(), self.locals)
    self.envmodule = self.locals["module"]
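
The namespace trick used by the constructor can be tried in isolation. In the sketch below, `INIT_SOURCE` is a made-up stand-in for the contents of the EnvModules `init/python.py` file; it only demonstrates that names defined by `exec` end up in the dictionary passed as the second argument.

```python
# Stand-in for the contents of "$MODULESHOME/init/python.py" (illustrative only)
INIT_SOURCE = '''
def module(*args):
    return list(args)
'''

namespace: dict = {}
# Names defined by the executed source are stored in `namespace`,
# not in the globals of the calling module
exec(INIT_SOURCE, namespace)

envmodule = namespace["module"]
result = envmodule("load", "gcc")
```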

Attributes

envmodule instance-attribute

envmodule = locals['module']

locals instance-attribute

locals: dict = {}

Methods:

get_available

get_available(
    flatten=True,
) -> list[str] | dict[str, list[str]]

Return available EnvModules.

This method depends on the /usr/bin/modulecmd binary being present. The classical module() function won't work here, because the output of the module avail command, which is normally written directly to the stderr stream, must be captured and then parsed.

Parameters:

flatten (bool, default: True ) –

If true (default), the result will be a flat list of modules. Otherwise, a dictionary is returned with keys representing a named-group of modules.

Returns:

list[str] | dict[str, list[str]] –

list[str] or dict: List or dictionary of available modules.

Source code in ipsl_common/envmodules.py

def get_available(self, flatten=True) -> list[str] | dict[str, list[str]]:
    """Return available EnvModules.

    This method depends on the `/usr/bin/modulecmd` binary being present.
    The classical `module()` function won't work here, because we need to
    capture and then parse the output of the `module avail` function
    which normally is directly passed onto stderr stream.

    Args:
        flatten (bool, optional):
            If true (default), the result will be a flat list of modules.
            Otherwise, a dictionary is returned with keys representing
            a named-group of modules.

    Returns:
        list[str] or dict:
            List or dictionary of available modules.
    """
    output = subprocess.run(
        ["/usr/bin/modulecmd", "python", "avail"],
        stderr=subprocess.STDOUT,
        stdout=subprocess.PIPE,
    )
    avail_dict = self.__parse_list_available(output.stdout.decode("utf-8"))
    avail_list: list[str] = []
    if flatten:
        for modules in avail_dict.values():
            avail_list.extend(modules)
        return sorted(avail_list)
    else:
        return avail_dict
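
The capture technique used above (redirecting stderr into a captured stdout) can be reproduced without modulecmd. In this sketch a small Python child process stands in for `modulecmd python avail`; the module name it prints is made up.

```python
import subprocess
import sys

# Stand-in child process that writes to stderr, as `module avail` does
child = [sys.executable, "-c", "import sys; sys.stderr.write('gcc/11.2.0\\n')"]

completed = subprocess.run(
    child,
    stderr=subprocess.STDOUT,  # redirect stderr into stdout...
    stdout=subprocess.PIPE,    # ...and capture stdout in memory
)
captured = completed.stdout.decode("utf-8")
```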

get_loaded

get_loaded() -> list[str]

Return loaded EnvModules.

Returns:

list[str] –

list[str]: List of loaded EnvModules.

Source code in ipsl_common/envmodules.py

def get_loaded(self) -> list[str]:
    """Return loaded EnvModules.

    Returns:
        list[str]: List of loaded EnvModules.
    """
    if modules := environ.get("LOADEDMODULES", None):
        return sorted(modules.split(":"))
    return []
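
The parsing step can be checked against a plain dictionary instead of the real environment. The function below mirrors the logic shown above; the `env` parameter and the module names are assumptions added for the demonstration.

```python
import os


def loaded_modules(env=os.environ):
    """Sorted module names from LOADEDMODULES, or [] when unset or empty."""
    if modules := env.get("LOADEDMODULES"):
        return sorted(modules.split(":"))
    return []


example = loaded_modules({"LOADEDMODULES": "netcdf/4.9:gcc/11.2"})
```

Note that an empty `LOADEDMODULES` value is treated like a missing one, so the result is an empty list rather than `[""]`.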

load

load(*modules) -> bool

Load passed EnvModules.

Returns:

bool ( bool ) –

True if loading was successful.

Source code in ipsl_common/envmodules.py

def load(self, *modules) -> bool:
    """Load passed EnvModules.

    Returns:
        bool: True if loading was successful.
    """
    return self.envmodule("load", *modules)

purge

purge() -> bool

Purge all loaded EnvModules.

Returns:

bool ( bool ) –

True if purging was successful.

Source code in ipsl_common/envmodules.py

def purge(self) -> bool:
    """Purge all loaded EnvModules.

    Returns:
        bool: True if purging was successful.
    """
    return self.envmodule("purge")

machine

Functions:

is_espri_spirit

is_espri_spirit() -> bool

Check if this is ESPRI/Spirit machine (1 or 2)

Source code in ipsl_common/machine.py

def is_espri_spirit() -> bool:
    """Check if this is ESPRI/Spirit machine (1 or 2)"""
    return _get_hostname() in ["spirit1", "spirit2"]

is_espri_spiritx

is_espri_spiritx() -> bool

Check if this is ESPRI/SpiritX machine (1 or 2)

Source code in ipsl_common/machine.py

def is_espri_spiritx() -> bool:
    """Check if this is ESPRI/SpiritX machine (1 or 2)"""
    return _get_hostname().startswith("spiritx")

is_idris_jean_zay

is_idris_jean_zay() -> bool

Check if this is IDRIS/JeanZay machine (also pp, visu)

Source code in ipsl_common/machine.py

def is_idris_jean_zay() -> bool:
    """Check if this is IDRIS/JeanZay machine (also pp, visu)"""
    return _get_hostname().startswith("jean-zay")

is_idris_jean_zay_pp

is_idris_jean_zay_pp() -> bool

Check if this is IDRIS/JeanZay post-processing machine

Source code in ipsl_common/machine.py

def is_idris_jean_zay_pp() -> bool:
    """Check if this is IDRIS/JeanZay post-processing machine"""
    return _get_hostname().startswith("jean-zay-pp")

is_tgcc_irene

is_tgcc_irene() -> bool

Check if this is TGCC/Irene machine

Source code in ipsl_common/machine.py

def is_tgcc_irene() -> bool:
    """Check if this is TGCC/Irene machine"""
    return _get_hostname().startswith("irene")
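
The checks above overlap: a Jean Zay post-processing host satisfies both is_idris_jean_zay() and is_idris_jean_zay_pp(). The sketch below applies the same prefix rules to a hostname passed as an argument, so it can be tested anywhere. `classify_host`, its labels, and the sample hostnames are hypothetical.

```python
def classify_host(hostname: str) -> str:
    """Return a site label for a hostname, using the prefix rules shown above."""
    if hostname in ("spirit1", "spirit2"):
        return "espri_spirit"
    if hostname.startswith("spiritx"):
        return "espri_spiritx"
    # Test the more specific prefix first:
    # "jean-zay-pp..." also starts with "jean-zay"
    if hostname.startswith("jean-zay-pp"):
        return "idris_jean_zay_pp"
    if hostname.startswith("jean-zay"):
        return "idris_jean_zay"
    if hostname.startswith("irene"):
        return "tgcc_irene"
    return "unknown"
```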

modipsl

Modules

card_file

Classes

CardFileDecoder

Bases: Transformer

Decoder performs translation from *.card file to a dictionary.

The translation rules are:

`*.card`	Python	Comment
section	`dict[str, dict]`	Top-level keys are sections
key-value	`dict`	Each item is a single key-value pair
list	`list`	Empty or with elements
nested list	`list[list]`	List of lists
range	`tuple[int]`	Integer range in the form: START:STEP:END
string	`str`	Unquoted (with chars: `-./${}*`) and double quoted
integer number	`int`	---
real number	`float`	Including scientific notation
`true`/`y`	`True`	Case insensitive
`false`/`n`	`False`	Case insensitive

Methods:

decode

decode(text: str) -> dict

Decode *.card text into dictionary.

Parameters:

text (str) –

content of the *.card file

Returns:

dict ( dict ) –

Decoded *.card file

Source code in ipsl_common/modipsl/card_file.py

def decode(self, text: str) -> dict:
    """Decode `*.card` text into dictionary.

    Args:
        text: content of the `*.card` file

    Returns:
        dict: Decoded `*.card` file
    """
    parse_tree = self._parser.parse(text)
    return self.transform(parse_tree)

CardFileEncoder

Functions:

flatten_dict_with_sections

flatten_dict_with_sections(position_map: dict) -> dict

Source code in ipsl_common/modipsl/card_file.py

def flatten_dict_with_sections(position_map: dict) -> dict:
    new_position_map = {}
    for section, values in position_map.items():
        for key, position in values.items():
            new_position_map[f"{section}.{key}"] = position
    return new_position_map
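
The function has no docstring. Judging from its source, it turns a two-level dictionary (section, then key) into a flat one with `section.key` keys. A minimal demonstration with a local copy of the function follows; the section names, key names, and values are made up.

```python
def flatten_dict_with_sections(position_map: dict) -> dict:
    """Local copy of the function shown above."""
    new_position_map = {}
    for section, values in position_map.items():
        for key, position in values.items():
            new_position_map[f"{section}.{key}"] = position
    return new_position_map


# Hypothetical two-section input
nested = {
    "UserChoices": {"JobName": 1, "DateBegin": 2},
    "Restarts": {"OverRule": 3},
}
flat = flatten_dict_with_sections(nested)
```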

load

load(buffer: TextIOBase)

Load *.card text/file buffer into dictionary.

Parameters:

buffer (TextIOBase) –

text or file buffer with the *.card file

Returns:

dict –

Loaded *.card file

Source code in ipsl_common/modipsl/card_file.py

def load(buffer: TextIOBase):
    """Load `*.card` text/file buffer into dictionary.

    Args:
        buffer: text or file buffer with the `*.card` file

    Returns:
        dict: Loaded `*.card` file
    """
    if not buffer.readable():
        raise ValueError("Text buffer (TextIOBase) must be readable")
    return loads(buffer.read())

loads

loads(text: str)

Load *.card file string into dictionary.

Parameters:

text (str) –

content of the *.card file

Returns:

dict –

Loaded *.card file

Source code in ipsl_common/modipsl/card_file.py

def loads(text: str):
    """Load `*.card` file string into dictionary.

    Args:
        text: content of the `*.card` file

    Returns:
        dict: Loaded `*.card` file
    """
    return CardFileDecoder().decode(text)

def_file

Read, write, and modify *.def file from the IPSL/modipsl project.

The *.def file contains model parameters in the key-value format. The format is extremely simple in comparison to similar formats, like *.ini or *.toml, as it provides neither sections nor standardized datatypes. Usually, model configuration files contain dozens or hundreds of parameters with scalars (int, float, str), arrays, or special _AUTO_/_AUTOBLOCKER_ values.

The special "auto" values are used by the IPSL/modipsl/libIGCM projects to mark places where an external script, called a component driver, should inject configuration values.

Example usage:

# Follows Python convention of the JSON module with load(), dump(), etc. functions
from ipsl_common.modipsl.def_file import load, dump
with open("run_dynamico.def", "r") as f:
    parameters = load(f)
    # Modify loaded parameters
    parameters["start_file_name"] = "start2024.nc"
    with open("new_run_dynamico.def", "w") as g:
        dump(parameters, g)

Loading an example *.def file:

INCLUDEDEF=run_lmdz.def
INCLUDEDEF=run_dynamico.def
use_forcing=y
g=_AUTO_: DEFAULT=9.8
start_file_name=start2023
physics="always"

Gives the following dictionary:

{
    'INCLUDEDEF': ['run_lmdz.def', 'run_dynamico.def'],
    'g': ('_AUTO_', 9.8),
    'physics': '"always"',
    'start_file_name': 'start2023',
    'use_forcing': True
}

Loaded configuration can be altered and subsequently dumped onto a file or into a string. The configuration is easy to view and modify, because it is directly decoded into a Python dictionary. The decoding and encoding process is managed internally by DefFileDecoder and DefFileEncoder classes with decode() and encode() methods.
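
Since the decoded result is an ordinary dictionary, post-processing needs no special API. As an illustration, the hypothetical helper `resolve_auto` below (not part of ipsl_common) replaces every two-element "auto" tuple by its default value. It assumes that a missing default is represented by None in the second position, which is how the encoder source treats it.

```python
def resolve_auto(parameters: dict) -> dict:
    """Replace ("_AUTO_", default) tuples by their default (hypothetical helper)."""
    resolved = {}
    for key, value in parameters.items():
        if isinstance(value, tuple) and len(value) == 2:
            _marker, default = value
            # Keep the tuple untouched when no default was given
            resolved[key] = value if default is None else default
        else:
            resolved[key] = value
    return resolved


# The dictionary from the example above
parameters = {
    "INCLUDEDEF": ["run_lmdz.def", "run_dynamico.def"],
    "g": ("_AUTO_", 9.8),
    "physics": '"always"',
    "start_file_name": "start2023",
    "use_forcing": True,
}
resolved = resolve_auto(parameters)
```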

Classes

DefFileDecoder

DefFileDecoder(include_positions: bool = False)

Bases: Transformer

Decoder performs translation from *.def file to a dictionary.

The translation rules are:

`*.def`	Python	Comment
key-value	`dict`	Including many key-value pairs and `INCLUDEDEF`
array	`list`	Collection of at least 2 elements
string	`str`	Unquoted and quoted (single or double) strings
integer number	`int`	Standard integer values
real number	`float`	Including scientific notation
`true`/`y`	`True`	Case insensitive
`false`/`n`	`False`	Case insensitive
`_AUTO_`	`tuple`	With optional default value
`_AUTOBLOCKER_`	`tuple`	With optional default value

The first step of the decoder is parsing of the *.def file. For this task the Lark Earley parser is used with a simple grammar expressed in EBNF notation. The *.def grammar doesn't parse well with an LALR(1) parser. The parsing produces a parse tree.

The second step transforms the parse tree into a Python dictionary using the translation rules from the table above. This transformation is based on an automated visitor pattern called Transformer, which produces the dictionary in a bottom-up manner.

Examples:

from ipsl_common.modipsl.def_file import DefFileDecoder
dictionary = DefFileDecoder().decode(text)

If text contains this *.def file:

radius=6.371229E6
g=9.80665
omega=_AUTO_: DEFAULT=7.292E-5

Then, Python dictionary would look as follows:

{
    "radius": 6.371229e6,
    "g": 9.80665,
    "omega": ("_AUTO_", 7.292e-05),
}

Tip

By default, the result dictionary contains no information about textual layout of the file. However, by using the argument include_positions=True, it is possible to refine the dictionary with exact start/end positions of each value as follows:

{
    "radius": {"value": 6371229.0, "start_pos": 50, "end_pos": 60},
    "g": {"value": 9.80665, "start_pos": 99, "end_pos": 106},
    "omega": {"value": ("_AUTO_", 7.292e-05), "start_pos": 158, "end_pos": 182},
}

Initialize DefFileDecoder.

Parameters:

include_positions (bool, default: False ) –

include textual positions of values (offset from the start)

Source code in ipsl_common/modipsl/def_file.py

def __init__(self, include_positions: bool = False) -> None:
    """Initialize DefFileDecoder.

    Args:
        include_positions: include textual positions of values (offset from the start)
    """
    self._include_positions = include_positions

Methods:

decode

decode(content: str) -> dict

Decode *.def content into dictionary.

Parameters:

content (str) –

content of the *.def file

Returns:

dict ( dict ) –

Decoded *.def file

Source code in ipsl_common/modipsl/def_file.py

def decode(self, content: str) -> dict:
    """Decode `*.def` content into dictionary.

    Args:
        content (str): content of the `*.def` file

    Returns:
        dict: Decoded `*.def` file
    """
    parse_tree = self._parser.parse(content)
    return self.transform(parse_tree)

DefFileEncoder

DefFileEncoder(
    truthy_value: str = "true", falsey_value: str = "false"
)

Encoder performs translation from a dictionary to *.def file.

The translation rules are:

Python	`*.def`	Comment
`dict`	key-value	Each `INCLUDEDEF` value translates to a single key-value
`list`	array	Of at least 2 elements
`str`	string	Quoted strings will contain explicit quote characters
`int`	integer number	---
`float`	real number	Including scientific notation
`bool`	`true`/`false`	Can be specified with encode arguments
`tuple`	`_AUTO_`/`_AUTOBLOCKER_`	With optional default value at second position in tuple

The translation is straightforward: a specific conversion is performed based on the Python type of the value. Neither a grammar nor a parse tree is used during this step.

Example:

from ipsl_common.modipsl.def_file import DefFileEncoder
text = DefFileEncoder().encode(dictionary)

If dictionary contains:

{
    "radius": 6.371229e6,
    "g": 9.80665,
    "omega": ("_AUTO_", 7.292e-05),
}

Then, the encoded *.def file would look as follows:

radius = 6371229.0
g = 9.80665
omega = _AUTO_: DEFAULT=7.292e-05

Tip

Python representation of the *.def file doesn't contain any textual position of particular elements (keys, values, comments, whitespaces, etc.), thus, re-encoding of the exact input *.def file is impossible. In order to recreate the original file, or modify a file while keeping the original comments, whitespaces, and order of elements, use the designated modify functions.

Initialize DefFileEncoder.

Parameters:

truthy_value (str, default: 'true' ) –

label used to encode True
falsey_value (str, default: 'false' ) –

label used to encode False

Source code in ipsl_common/modipsl/def_file.py

def __init__(
    self,
    truthy_value: str = "true",
    falsey_value: str = "false",
):
    """Initialize DefFileEncoder.

    Args:
        truthy_value: label used to encode True
        falsey_value: label used to encode False
    """
    self._truthy_value = truthy_value
    self._falsey_value = falsey_value

Methods:

encode

encode(obj: object) -> str

Encode dictionary or other Python object into *.def file.

This function works not only on fully decoded *.def files, but also on individual Python objects such as a list or a tuple. In that case, it takes the given object and applies one of the encoding rules mentioned above.

Parameters:

obj (object) –

dictionary or Python object

Returns:

str –

Encoded text of a *.def file

Source code in ipsl_common/modipsl/def_file.py

def encode(self, obj: object) -> str:
    """Encode dictionary or other Python object into `*.def` file.

    This function works not only on fully decoded `*.def` files,
    but also on individual Python objects such as a list or a tuple.
    In that case, it takes the given object and applies
    one of the encoding rules mentioned above.

    Args:
        obj: dictionary or Python object

    Returns:
        Encoded text of a `*.def` file
    """
    # bool must be tested before int, because it is a subclass of int class.
    if isinstance(obj, bool):
        return self._truthy_value if obj else self._falsey_value
    elif isinstance(obj, int | float):
        return str(obj)
    elif isinstance(obj, tuple):
        if len(obj) != 2:
            raise ValueError(
                f"Only tuples with len=2 are encoded. Invalid tuple: {obj}"
            )
        auto, default = obj
        if auto not in _AUTO_VALUES:
            raise ValueError(
                f"First element of a tuple must be one of {_AUTO_VALUES}. Invalid tuple: {obj}"
            )
        if not self._is_scalar(default):
            raise TypeError(
                f"Second element of the tuple must be a scalar: bool, int, float, str, or None. Invalid tuple: {obj}"
            )
        if default is not None:
            return f"{auto}: DEFAULT={self.encode(default)}"
        else:
            return str(auto)
    elif isinstance(obj, dict) and len(obj) == 1:
        # Using tuple assignment
        ((k, v),) = obj.items()
        if k == _INCLUDEDEF:
            return self._encode_include(v)
        else:
            return self._encode_entry(k, v)
    elif isinstance(obj, dict):
        return "\n".join([self.encode({k: v}) for k, v in obj.items()])
    elif isinstance(obj, list):
        if len(obj) < 2:
            raise ValueError("List must have at least two elements")
        if not all(map(self._is_scalar, obj)):
            raise TypeError(
                f"All array values must be simple scalars: bool, int, float, str, or None. Instead got: {obj}"
            )
        return ", ".join(map(self.encode, obj))
    else:
        return str(obj)
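
The comment about testing bool before int deserves a short demonstration. The two functions below are simplified illustrations, not the library code: they differ only in the order of the isinstance checks, and the second one never reaches its bool branch.

```python
def encode_scalar(obj, truthy="true", falsey="false") -> str:
    """Simplified scalar encoder (illustrative, not the library code)."""
    if isinstance(obj, bool):  # must come first: True is also an int
        return truthy if obj else falsey
    if isinstance(obj, (int, float)):
        return str(obj)
    return str(obj)


def encode_scalar_wrong_order(obj) -> str:
    """Same checks in the wrong order: booleans are caught by the int branch."""
    if isinstance(obj, (int, float)):
        return str(obj)
    if isinstance(obj, bool):
        return "true" if obj else "false"
    return str(obj)
```

With the wrong order, `True` would be written as `True` instead of the configured label such as `true` or `y`.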

Functions:

dump

dump(obj: dict, buffer: TextIOBase) -> None

Dump dictionary into *.def text/file buffer.

Parameters:

obj (dict) –

dictionary to dump to a file
buffer (TextIOBase) –

text or file buffer for storing *.def file

Source code in ipsl_common/modipsl/def_file.py

def dump(obj: dict, buffer: TextIOBase) -> None:
    """Dump dictionary into *.def text/file buffer.

    Args:
        obj: dictionary to dump to a file
        buffer: text or file buffer for storing *.def file
    """
    if not buffer.writable():
        raise ValueError("Text buffer (TextIOBase) must be writable")
    buffer.write(dumps(obj))
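
The writable() guard above relies on standard io behaviour. The sketch below uses only the standard library to show which buffers pass that check; the temporary file name is arbitrary.

```python
import io
import os
import tempfile

# An in-memory text buffer is both readable and writable
memory_buffer = io.StringIO()
memory_ok = memory_buffer.writable()

# A file opened in read-only mode reports writable() == False,
# so passing it to dump() would raise ValueError
fd, name = tempfile.mkstemp(suffix=".def")
os.close(fd)
with open(name, "r") as read_only:
    file_ok = read_only.writable()
os.remove(name)
```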

dumps

dumps(obj: dict) -> str

Dump dictionary into *.def string.

Parameters:

obj (dict) –

dictionary to dump to a string

Source code in ipsl_common/modipsl/def_file.py

def dumps(obj: dict) -> str:
    """Dump dictionary into *.def string.

    Args:
        obj: dictionary to dump to a string
    """
    return DefFileEncoder().encode(obj)

load

load(
    buffer: TextIOBase,
    include_positions: bool = False,
    return_text: bool = False,
) -> dict | tuple[dict, str]

Load *.def text/file buffer into dictionary.

Parameters:

buffer (TextIOBase) –

text or file buffer with the *.def file
include_positions (bool, default: False ) –

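include textual positions of values
return_text (bool, default: False ) –

also return the raw text read from the buffer, as the second element of a tuple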

Returns:

dict ( dict | tuple[dict, str] ) –

Loaded *.def file

Source code in ipsl_common/modipsl/def_file.py

def load(
    buffer: TextIOBase, include_positions: bool = False, return_text: bool = False
) -> dict | tuple[dict, str]:
    """Load `*.def` text/file buffer into dictionary.

    Args:
        buffer: text or file buffer with the `*.def` file
        include_positions: include textual positions of values

    Returns:
        dict: Loaded `*.def` file
    """
    if not buffer.readable():
        raise ValueError("Text buffer (TextIOBase) must be readable")
    text = buffer.read()
    if return_text:
        return loads(text, include_positions), text
    else:
        return loads(text, include_positions)

loads

loads(text: str, include_positions: bool = False) -> dict

Load *.def file string into dictionary.

Parameters:

text (str) –

content of the *.def file
include_positions (bool, default: False ) –

include textual positions of values

Returns:

dict ( dict ) –

Loaded *.def file

Source code in ipsl_common/modipsl/def_file.py

def loads(text: str, include_positions: bool = False) -> dict:
    """Load `*.def` file string into dictionary.

    Args:
        text: content of the `*.def` file
        include_positions: include textual positions of values

    Returns:
        dict: Loaded `*.def` file
    """
    return DefFileDecoder(include_positions).decode(text)

modify

modify(
    buffer: TextIOBase,
    new_obj: dict,
    buffer_out: TextIOBase | None = None,
    insert_header: str = "",
) -> None

Modify *.def text/file buffer with minimal amount of changes.

The output text/file buffer follows the changes made to the new_obj representation of the *.def file. The modifications are performed so that the minimal amount of changes is applied. As a result, the diff between the old and new file content is minimal and no comments or whitespace are lost beyond what is necessary. The new_obj can remove, modify, and/or add key-value pairs.

Parameters:

buffer (TextIOBase) –

text or file buffer for reading the file (optionally to write to the file)
new_obj (dict) –

modified representation of the *.def file content
buffer_out (TextIOBase | None, default: None ) –

optional output buffer for writing the modified file content
insert_header (str, default: '' ) –

optional header inserted before appended key-value pairs

Source code in ipsl_common/modipsl/def_file.py

def modify(
    buffer: TextIOBase,
    new_obj: dict,
    buffer_out: TextIOBase | None = None,
    insert_header: str = "",
) -> None:
    """Modify `*.def` text/file buffer with minimal amount of changes.

    The output text/file buffer follows the changes made to the `new_obj` representation of the `*.def` file.
    The modifications are performed so that the minimal amount of changes is applied.
    As a result, the diff between the old and new file content is minimal and no comments
    or whitespace are lost beyond what is necessary. The `new_obj` can remove, modify,
    and/or add key-value pairs.

    Args:
        buffer(TextIOBase): text or file buffer for reading the file (optionally to write to the file)
        new_obj(dict): modified representation of the *.def file content
        buffer_out(TextIOBase | None): optional output buffer for writing the modified file content
        insert_header(str): optional header inserted before appended key-value pairs
    """
    # Buffer must always be readable
    if not buffer.readable():
        raise ValueError("Text buffer (TextIOBase) must be readable")
    # Buffer must be writeable if buffer_out is not passed
    if buffer_out is None and not buffer.writable():
        raise ValueError("Text buffer (TextIOBase) must be writeable")
    # Otherwise, buffer_out must be writeable
    elif buffer_out is not None and not buffer_out.writable():
        raise ValueError("Text buffer_out (TextIOBase) must be writeable")

    target_buffer = buffer_out if buffer_out is not None else buffer
    target_buffer.write(modifys(buffer.read(), new_obj, insert_header=insert_header))

modifys

modifys(
    text: str, new_obj: dict, insert_header: str = ""
) -> str

Modify *.def file string with minimal amount of changes.

The output text follows the changes made to the new_obj representation of the *.def file. The modifications are performed so that the minimal amount of changes is applied. As a result, the diff between the old and new file content is minimal and no comments or whitespace are lost beyond what is necessary. The new_obj can remove, modify, and/or add key-value pairs.

Parameters:

text (str) –

input text with *.def content to modify
new_obj (dict) –

modified representation of the *.def file content
insert_header (str, default: '' ) –

optional header inserted before appended key-value pairs

Returns:

str ( str ) –

Modified text

Source code in ipsl_common/modipsl/def_file.py

def modifys(text: str, new_obj: dict, insert_header: str = "") -> str:
    """Modify `*.def` file string with minimal amount of changes.

    The output text follows the changes made to the `new_obj` representation of the `*.def` file.
    The modifications are performed so that the minimal amount of changes is applied.
    As a result, the diff between the old and new file content is minimal and no comments
    or whitespace are lost beyond what is necessary. The `new_obj` can remove, modify,
    and/or add key-value pairs.

    Args:
        text(str): input text with *.def content to modify
        new_obj(dict): modified representation of the *.def file content
        insert_header(str): optional header inserted before appended key-value pairs

    Returns:
        str: Modified text
    """
    old_obj = loads(text, include_positions=True)
    encoder = DefFileEncoder()
    replacements = []

    def _eat_newline(text: str, position: int) -> int:
        """Return new position with the newline eaten"""
        return position + 1 if text[position] == "\n" else position

    # Replace removed key-value pairs by empty strings
    deleted_keys = old_obj.keys() - new_obj.keys()
    for k in deleted_keys:
        v = old_obj[k]
        replacements.append((v["key_start_pos"], _eat_newline(text, v["end_pos"]), ""))

    # TODO: refactor this part in the future, now I don't have enough strength to do so
    include_insert_pos = 0
    old_includes_with_pos = old_obj.pop("INCLUDEDEF", [])
    old_includes = [m["value"] for m in old_includes_with_pos]
    to_remove = old_includes.copy()
    for m in new_obj.pop("INCLUDEDEF", []):
        if m in old_includes:
            to_remove.remove(m)
            include_insert_pos = (
                old_includes_with_pos[old_includes.index(m)]["end_pos"] + 1
            )
        else:
            replacements.append(
                (include_insert_pos, include_insert_pos, f"INCLUDEDEF={m}\n")
            )
    # Whatever is left must be removed
    for m in to_remove:
        d = old_includes_with_pos[old_includes.index(m)]
        replacements.append((d["key_start_pos"], _eat_newline(text, d["end_pos"]), ""))

    # Apply modification to existing keys
    for k, v in old_obj.items():
        if k in new_obj and v["value"] != new_obj[k]:
            replacements.append(
                (v["start_pos"], v["end_pos"], encoder.encode(new_obj[k]))
            )
    new_text = replace_text(text, replacements)
    new_text += insert_header
    # Then, add new keys at the end
    # INFO: We cannot use .keys() to find the new keys, because this method
    # returns a set which doesn't keep keys order. Instead, we can iterate
    # over items in dictionaries which is guaranteed to preserve the
    # insertion order since Python 3.7
    # (https://docs.python.org/3/library/stdtypes.html#typesmapping)
    new_keys = []
    for k in new_obj:
        if k not in old_obj:
            new_keys.append(k)

    if append_keys := encoder.encode({k: new_obj[k] for k in new_keys}):
        new_text += append_keys
        new_text += "\n"
    return new_text
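
The function collects `(start, end, new_text)` triples and hands them to `replace_text`, whose source is not shown here. One plausible way to apply such triples, sketched below under the assumption that `end` is exclusive and that the spans do not overlap, is to work from the end of the text towards the beginning so that earlier offsets stay valid. `apply_replacements` is a hypothetical stand-in, not the library helper.

```python
def apply_replacements(text: str, replacements: list) -> str:
    """Apply (start, end, new_text) edits, with `end` exclusive.

    Hypothetical stand-in for the `replace_text` helper.
    """
    # Work from the end of the text so earlier offsets stay valid
    for start, end, new in sorted(replacements, key=lambda r: r[0], reverse=True):
        text = text[:start] + new + text[end:]
    return text


original = "# gravity\ng=9.8\nomega=1\n"
edits = [
    (12, 15, "9.80665"),  # replace the value "9.8"
    (16, 24, ""),         # delete the whole line "omega=1\n"
]
updated = apply_replacements(original, edits)
```

The comment line is untouched by both edits, which is the property that keeps the diff against the original file small.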

mod_file

Read mod.def file from the IPSL/modipsl project.

The mod.def file contains definitions of coupled model configurations. A single configuration consists of one or more components (e.g. atmospheric model, dynamics, I/O system, experiments).

Examples:

from ipsl_common.modipsl.mod_file import load
with open("mod.def", "r") as f:
    configs = load(f)

The loaded dictionary has a fixed schema. Using the following input:

#-H- GRISLI GRISLI stand-alone for Antarctica icesheets (prototype)
#-C- GRISLI trunk/libIGCM HEAD 10 libIGCM .
#-C- GRISLI branches/xios HEAD 26 GRISLI modeles
...
#-S- 7 git https://gitlab.in2p3.fr/ipsl/projets/nemogcm/nemo.git
#-S- 8 svn --username icmc_users https://forge.ipsl.fr/igcmg/svn

the output looks as follows:

{
    'configuration': {
        'GRISLI': {
            'description': ['GRISLI stand-alone for Antarctica icesheets (prototype)'],
            'components': [
                {
                    'modipsl_dir': '.',
                    'name': 'libIGCM',
                    'repository': 10,
                    'revision': 'HEAD',
                    'variant': 'trunk/libIGCM'
                },
                {
                    'modipsl_dir': 'modeles',
                    'name': 'GRISLI',
                    'repository': 26,
                    'revision': 'HEAD',
                    'variant': 'branches/xios'
                },
                (...)
            ],
        },
        (...)
    },
    'repository': {
        7: {
            'clone_url': 'https://gitlab.in2p3.fr/ipsl/projets/nemogcm/nemo.git',
            'type': 'git'
        },
        8: {
            'clone_url': '--username icmc_users https://forge.ipsl.fr/igcmg/svn',
            'type': 'svn'
        },
        (...)
    }
}

Warning

Because this module matches patterns iteratively (with re.finditer), it cannot tell when the mod.def file is malformed or invalid. It will silently skip non-matching lines and move on!
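
The effect described in this warning is easy to reproduce. The pattern below is a hypothetical regular expression for `#-S-` repository lines, written for this illustration only; it is not the module's actual pattern. The second input line is deliberately malformed and leaves no trace in the result.

```python
import re

# Hypothetical pattern for "#-S-" repository lines; not the module's actual regex
REPOSITORY_LINE = re.compile(
    r"^#-S-[ \t]+(\d+)[ \t]+(\w+)[ \t]+(.+?)[ \t]*$", re.MULTILINE
)


def parse_repositories(content: str) -> dict:
    """Collect repository definitions, ignoring every line that does not match."""
    return {
        int(m.group(1)): {"type": m.group(2), "clone_url": m.group(3)}
        for m in REPOSITORY_LINE.finditer(content)
    }


content = """\
#-S- 7 git https://gitlab.in2p3.fr/ipsl/projets/nemogcm/nemo.git
#-S- eight svn https://example.invalid/broken-line
#-S- 8 svn --username icmc_users https://forge.ipsl.fr/igcmg/svn
"""
repositories = parse_repositories(content)
```

Callers who need strict validation could compare the number of `#-S-` lines in the input with the number of parsed entries.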

Functions:

load

load(buffer: TextIOBase) -> dict

Load mod.def file from a text/file buffer.

Parameters:

buffer (TextIOBase) –

text or file buffer with the mod.def file

Returns:

dict ( dict ) –

Loaded mod.def file

Source code in ipsl_common/modipsl/mod_file.py

def load(buffer: io.TextIOBase) -> dict:
    """Load mod.def file from a text/file buffer.

    Args:
        buffer (io.TextIOBase): text or file buffer with the mod.def file

    Returns:
        dict: Loaded mod.def file
    """
    return loads(buffer.read())

loads

loads(content: str) -> dict

Load mod.def file from a string.

Parameters:

content (str) –

content of the mod.def file

Returns:

dict ( dict ) –

Loaded mod.def file

Source code in ipsl_common/modipsl/mod_file.py

def loads(content: str) -> dict:
    """Load mod.def file from a string.

    Args:
        content (str): content of the mod.def file

    Returns:
        dict: Loaded mod.def file
    """
    repositories = __parse_repositories(content)
    return {
        "repository": repositories,
        "configuration": __parse_configurations(content, repositories),
    }

path

Utility path/pathlib functions.

Functions:

try_joinpath

try_joinpath(
    path: Path, *others: str | Path
) -> Path | None

Test and return joined path if it exists.

This function is especially useful when used with walrus operator (:=).

Example:

if p := try_joinpath(some_dir, "subdir", "file.txt"):
    with open(p, "r") as f:
        ...

Parameters:

path
(Path) –

path to test
others
(str | Path, default: () ) –

other paths to join

Returns:

Path | None –

full joined path if it exists or None

Source code in ipsl_common/path.py

def try_joinpath(path: Path, *others: str | Path) -> Path | None:
    """Test and return joined path if it exists.

    This function is especially useful when used with walrus operator (:=).

    Example:

        if p := try_joinpath(some_dir, "subdir", "file.txt"):
            with open(p, "r") as f:
                ...

    Args:
        path: path to test
        others: other paths to join

    Returns:
        full joined path if it exists or None
    """
    new_path = path.joinpath(*others)
    return new_path if new_path.exists() else None

try_path

try_path(path: Path) -> Path | None

Test and return path if it exists.

This function is especially useful when used with walrus operator (:=).

Example:

if p := try_path(some_file):
    with open(p, "r") as f:
        ...

Parameters:

path
(Path) –

path to test

Returns:

Path | None –

path if it exists or None

Source code in ipsl_common/path.py

def try_path(path: Path) -> Path | None:
    """Test and return path if it exists.

    This function is especially useful when used with walrus operator (:=).

    Example:

        if p := try_path(some_file):
            with open(p, "r") as f:
                ...

    Args:
        path: path to test

    Returns:
        path if it exists or None
    """
    return path if path.exists() else None

python

Python-related functions, such as checking the Python version.

Functions:

check_minimal_version

check_minimal_version(
    major: int,
    minor: int,
    reason: str = "",
    exit_on_fail: bool = False,
) -> None

Check if minimal Python version is present. If not, print warning (default) or exit program entirely. The passed major.minor version is not validated and it can have any value, e.g. even non-existing Python versions like 4.24.
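
The check relies on tuple comparison of sys.version_info. The sketch below uses a hypothetical helper, meets_minimal_version, which is not part of ipsl_common, to show why an unvalidated version such as 4.24 is accepted without complaint.

```python
import sys


def meets_minimal_version(major: int, minor: int) -> bool:
    """Hypothetical helper: True if the interpreter is at least major.minor."""
    # Tuples compare element by element, so (3, 9) < (3, 10) < (4, 0).
    return sys.version_info[0:2] >= (major, minor)


# The requested version is not validated: a non-existing release such as
# 4.24 is accepted and simply compares as newer than any Python 3.x.
print(meets_minimal_version(3, 0))
print(meets_minimal_version(4, 24))
```

Comparing tuples rather than strings avoids the classic mistake of "3.10" sorting before "3.9".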

Source code in ipsl_common/python.py

def check_minimal_version(
    major: int, minor: int, reason: str = "", exit_on_fail: bool = False
) -> None:
    """
    Check if minimal Python version is present. If not, print warning (default) or
    exit program entirely.
    The passed major.minor version is not validated and it can have
    any value, e.g. even non-existing Python versions like 4.24.
    """
    if sys.version_info[0:2] < (major, minor):
        if reason:
            reason = f" {reason}\n"
        message = f"Python {major}.{minor} or later is required.{reason}"
        if exit_on_fail:
            sys.exit(message)
        else:
            log.warning(message)

scripts

Modules

slurm_logs

Process slurm logs from the command-line.

Examples:

ipsl_slurm_logs slurm.log --output-dir output_test/ --remove-trailing-whitespaces

The above command separates slurm.log containing model execution using 4 MPI processes into distinct files inside the output_test/ directory:

output_test/
├── output_0.log
├── output_10.log
├── output_11.log
└── output_12.log

JSON example:

ipsl_slurm_logs slurm.log --json > slurm.json

This separates slurm.log and prints the result in JSON format, which is redirected here to slurm.json.
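
One detail worth knowing when reading the JSON back: JSON object keys are always strings, so integer process IDs come back as "0", "12", and so on. The sketch below uses made-up log lines and assumes the dictionary shape returned by separate_labelled_log().

```python
import json

# Shape returned by separate_labelled_log(): integer process IDs plus the
# "slurm" key for unlabelled lines (the log lines here are made up).
separated = {0: ["GETIN area_radius1 = 3360.0"], "slurm": ["srun: error: ..."]}

# JSON object keys are always strings, so 0 becomes "0" after a round trip.
restored = json.loads(json.dumps(separated))

# Convert numeric keys back to int when reading the JSON file.
by_process = {int(k) if k.isdigit() else k: v for k, v in restored.items()}
print(list(restored), list(by_process))
```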

Functions:

main

main()

Source code in ipsl_common/scripts/slurm_logs.py

def main():
    parser = ArgumentParser()
    parser.add_argument(
        "log_file", type=existing_file, help="Labelled SLURM log file to process"
    )
    parser.add_argument(
        "--json",
        action="store_true",
        help="Print separated logs to console in JSON format",
    )
    parser.add_argument(
        "--remove-trailing-whitespaces",
        action="store_true",
        help=(
            "Remove trailing whitespaces and empty lines in separated logs. "
            "The leading whitespaces are kept because indentation might provide additional information in logs."
        ),
    )
    parser.add_argument("--output-name", type=str, default="output_{process_id}.log")
    parser.add_argument("--output-dir", default=Path("."))

    args = parser.parse_args()

    with open(args.log_file, "r") as fp:
        if args.json:
            print(
                json.dumps(
                    separate_labelled_log(
                        fp, remove_trailing_whitespaces=args.remove_trailing_whitespaces
                    )
                )
            )
        else:
            separate_labelled_log_to_files(
                fp,
                args.output_dir,
                args.output_name,
                remove_trailing_whitespaces=args.remove_trailing_whitespaces,
            )

slurm

Improved interaction with the SLURM ecosystem.

This module includes: functions to separate labelled SLURM logs.

Functions:

separate_labelled_log

separate_labelled_log(
    fp: TextIO, remove_trailing_whitespaces: bool = False
) -> dict[int | str, list[str]]

Separate SLURM labelled log by process IDs.

Almost every line in the SLURM labelled log starts with the process number followed by a colon:

22:  USING DEFAULTS : area_radius1 =   3360.00000000000
26:  USING DEFAULTS : area_radius1 =   3360.00000000000
 0:  USING DEFAULTS : area_radius1 =   3360.00000000000
 0:  GETIN area_radius1 =    3360.00000000000
12:  USING DEFAULTS : area_rotation_pre =  0.000000000000000E+000
12:  USING DEFAULTS : area_rotation =  0.000000000000000E+000

This function reads such a log file and groups lines coming from the same process. If a line doesn't have the label, which can happen when it is an error coming from srun or sbatch, it is placed under the slurm key.

This function might take a lot of memory for big logs, because it loads all the processed lines into one dictionary. Consider using separate_labelled_log_to_files() to separate logs into files on-the-fly.
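
The grouping described above can be sketched in a self-contained way. The label pattern and the function name group_by_process are assumptions made for illustration; the module's private _LABELLED_LOG_LINE pattern is not shown here and may treat whitespace differently.

```python
import io
import re
from collections import defaultdict

# Hypothetical label pattern (optional padding, digits, colon); the module's
# private _LABELLED_LOG_LINE pattern may differ.
LABEL = re.compile(r"^\s*(\d+):(.*)$")


def group_by_process(fp):
    """Group labelled lines by process ID; unlabelled lines go under 'slurm'."""
    grouped = defaultdict(list)
    for line in fp:
        line = line.rstrip("\n")
        if m := LABEL.match(line):
            grouped[int(m.group(1))].append(m.group(2))
        else:
            grouped["slurm"].append(line)
    return dict(grouped)


log = io.StringIO(
    "22:  USING DEFAULTS : area_radius1 =   3360.00000000000\n"
    " 0:  USING DEFAULTS : area_radius1 =   3360.00000000000\n"
    " 0:  GETIN area_radius1 =    3360.00000000000\n"
    "srun: error: unlabelled line\n"
)
result = group_by_process(log)
print(sorted(result, key=str))
```

Because every message is held in the returned dictionary, memory use grows with the size of the log, which is the trade-off mentioned above.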

Parameters:

fp
(TextIO) –

text I/O stream with the log's content
remove_trailing_whitespaces
(bool, default: False ) –

remove trailing whitespaces and empty lines

Returns:

dict[int | str, list[str]] –

labelled log lines grouped by processes.

Source code in ipsl_common/slurm.py

def separate_labelled_log(
    fp: TextIO, remove_trailing_whitespaces: bool = False
) -> dict[int | str, list[str]]:
    """
    Separate SLURM labelled log by process IDs.

    Almost every line in the SLURM labelled log starts with the process number
    followed by a colon:

    ```
    22:  USING DEFAULTS : area_radius1 =   3360.00000000000
    26:  USING DEFAULTS : area_radius1 =   3360.00000000000
     0:  USING DEFAULTS : area_radius1 =   3360.00000000000
     0:  GETIN area_radius1 =    3360.00000000000
    12:  USING DEFAULTS : area_rotation_pre =  0.000000000000000E+000
    12:  USING DEFAULTS : area_rotation =  0.000000000000000E+000
    ```

    This function reads such a log file and groups lines coming from the same process.
    If a line doesn't have the label, which can happen when it is an error coming
    from `srun` or `sbatch`, it is placed under the `slurm` key.

    This function might take a lot of memory for big logs, because it loads all
    the processed lines into one dictionary. Consider using
    `separate_labelled_log_to_files()` to separate logs into files on-the-fly.

    Args:
        fp: text I/O stream with the log's content
        remove_trailing_whitespaces: remove trailing whitespaces and empty lines

    Returns:
        dict: labelled log lines grouped by processes.
    """
    processed_log = {}

    for line in fp:
        # Group only process-labelled lines
        if m := _LABELLED_LOG_LINE.search(line):
            process_id = int(m.group(1))
            process_msg = m.group(2)
            if process_id not in processed_log:
                processed_log[process_id] = []
            if remove_trailing_whitespaces:
                process_msg = process_msg.rstrip()
                if not process_msg:
                    continue
            processed_log[process_id].append(process_msg)
        # Other, append under the `slurm` key
        else:
            if "slurm" not in processed_log:
                processed_log["slurm"] = []
            if remove_trailing_whitespaces:
                line = line.rstrip()
                if not line:
                    continue
            processed_log["slurm"].append(line.rstrip("\n"))
    return processed_log

separate_labelled_log_to_files

separate_labelled_log_to_files(
    fp: TextIO,
    output_dir: Path,
    output_name: str,
    remove_trailing_whitespaces: bool = False,
) -> None

Separate SLURM labelled log by process IDs, on-the-fly, into dedicated log files.

This function will not load the log content into memory like separate_labelled_log() does. Instead, it will process it line-by-line and write to dedicated log files. Use this function for big logs.
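
The streaming approach can be sketched as follows. This is not the library's implementation: the label pattern and the name split_to_files are hypothetical, and contextlib.ExitStack is used here as one possible way to make sure every lazily opened file is closed, even if an error occurs midway.

```python
import io
import re
import tempfile
from contextlib import ExitStack
from pathlib import Path

# Hypothetical label pattern; the module's private pattern may differ.
LABEL = re.compile(r"^\s*(\d+):(.*)$")


def split_to_files(fp, output_dir: Path, output_name: str) -> None:
    """Stream labelled lines into one file per process ID."""
    with ExitStack() as stack:  # closes every opened file, even on error
        files = {}
        for line in fp:
            line = line.rstrip("\n")
            m = LABEL.match(line)
            key = int(m.group(1)) if m else "slurm"
            msg = m.group(2) if m else line
            if key not in files:
                path = output_dir / output_name.format(process_id=key)
                files[key] = stack.enter_context(open(path, "w"))
            files[key].write(f"{msg}\n")


with tempfile.TemporaryDirectory() as tmp:
    log = io.StringIO(" 0: first\n12: second\n 0: third\n")
    split_to_files(log, Path(tmp), "output_{process_id}.log")
    created = sorted(p.name for p in Path(tmp).iterdir())
    first = (Path(tmp) / "output_0.log").read_text()
print(created)
```

Only one line is held in memory at a time; the cost is one open file handle per process ID seen in the log.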

Parameters:

fp
(TextIO) –

text I/O stream with the log's content
output_dir
(Path) –

directory where dedicated logs will be placed (must exist)
output_name
(str) –

the name of the output file (must contain {process_id} as a placeholder of the process ID)
remove_trailing_whitespaces
(bool, default: False ) –

remove trailing whitespaces and empty lines

Source code in ipsl_common/slurm.py

def separate_labelled_log_to_files(
    fp: TextIO,
    output_dir: Path,
    output_name: str,
    remove_trailing_whitespaces: bool = False,
) -> None:
    """
    Separate SLURM labelled log by process IDs, on-the-fly, into dedicated log files.

    This function will not load the log content into memory like `separate_labelled_log()` does.
    Instead, it will process it line-by-line and write to dedicated log files.
    Use this function for big logs.

    Args:
        fp: text I/O stream with the log's content
        output_dir: directory where dedicated logs will be placed (must exist)
        output_name: the name of the output file (must contain `{process_id}` as a placeholder of the process ID)
        remove_trailing_whitespaces: remove trailing whitespaces and empty lines
    """
    output_dir = Path(output_dir)
    if not output_dir.exists():
        raise ValueError("The `output_dir` must exist")
    if not output_dir.is_dir():
        raise ValueError("The `output_dir` must be a directory")
    if "{process_id}" not in output_name:
        raise ValueError(
            "The `output_name` must contain `{process_id}` placeholder for the process ID."
        )
    output_files = {}
    for line in fp:
        # Write lines to per-process output files
        if m := _LABELLED_LOG_LINE.search(line):
            process_id = int(m.group(1))
            process_msg = m.group(2)
            if process_id not in output_files:
                output_files[process_id] = open(
                    output_dir / output_name.format(process_id=process_id), "w"
                )
            if remove_trailing_whitespaces:
                process_msg = process_msg.rstrip()
                if not process_msg:
                    continue
            output_files[process_id].write(f"{process_msg}\n")
        # Other, write to the `slurm` file
        else:
            if "slurm" not in output_files:
                output_files["slurm"] = open(
                    output_dir / output_name.format(process_id="slurm"), "w"
                )
            if remove_trailing_whitespaces:
                line = line.rstrip()
                if not line:
                    continue
            # Re-add the newline that rstrip() may have removed
            output_files["slurm"].write(line.rstrip("\n") + "\n")

    for f in output_files.values():
        f.close()

str

Collection of string-related functions.

Functions:

contains

contains(text: str, *args) -> bool

Check if text contains all given substrings.

Parameters:

text
(str) –

Text to analyse.
args
(list[str], default: () ) –

List of substrings to match.

Returns:

bool ( bool ) –

True if all substrings are found in the text.

Source code in ipsl_common/str.py

def contains(text: str, *args) -> bool:
    """Check if text contains all given substrings.

    Args:
        text (str): Text to analyse.
        args (list[str]): List of substrings to match.

    Returns:
        bool: True if all substrings are found in the text.
    """
    for arg in args:
        # Each substring must be looked up in the text, not the other way round
        if arg not in text:
            return False
    return True

contains_any

contains_any(text: str, *args) -> bool

Check if text contains any of the given substrings.
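
The documented behaviour of contains_any, and of its sibling contains, can be sketched with the built-ins any() and all(). The names contains_some and contains_all below are hypothetical stand-ins, not the library functions.

```python
def contains_all(text: str, *substrings: str) -> bool:
    """True if every substring occurs in text (documented behaviour of contains)."""
    return all(s in text for s in substrings)


def contains_some(text: str, *substrings: str) -> bool:
    """True if at least one substring occurs in text (as contains_any documents)."""
    return any(s in text for s in substrings)


line = "USING DEFAULTS : area_radius1 = 3360.0"
print(contains_all(line, "USING", "area_radius1"))
print(contains_all(line, "USING", "GETIN"))
print(contains_some(line, "USING", "GETIN"))
# With no substrings, all() is vacuously True and any() is False.
print(contains_all(line), contains_some(line))
```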

Parameters:

text
(str) –

Text to analyse.
args
(list[str], default: () ) –

List of substrings to match.

Returns:

bool ( bool ) –

True if any of the substrings is found in the text.

Source code in ipsl_common/str.py

def contains_any(text: str, *args) -> bool:
    """Check if text contains any of the given substrings.

    Args:
        text (str): Text to analyse.
        args (list[str]): List of substrings to match.

    Returns:
        bool: True if any of the substrings is found in the text.
    """
    for arg in args:
        # Each substring must be looked up in the text, not the other way round
        if arg in text:
            return True
    return False

is_float

is_float(text: str) -> bool

Check if text represents a floating-point value.

Parameters:

text
(str) –

Text to verify

Returns:

bool ( bool ) –

True if text represents a floating value.

Source code in ipsl_common/str.py

def is_float(text: str) -> bool:
    """Check if text represents a floating-point value.

    Args:
        text(str): Text to verify

    Returns:
        bool: True if text represents a floating value.
    """
    try:
        float(text)
    except ValueError:
        return False
    else:
        return True

is_int

is_int(text: str) -> bool

Check if text represents an integer value.
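
Because is_int and is_float delegate to the built-in int() and float(), they inherit the parsing rules of those built-ins. The sketch below repeats both functions in condensed form so that it is self-contained, and probes a few edge cases.

```python
def is_int(text: str) -> bool:
    try:
        int(text)
    except ValueError:
        return False
    return True


def is_float(text: str) -> bool:
    try:
        float(text)
    except ValueError:
        return False
    return True


# Surrounding whitespace and digit-group underscores are accepted by int().
print(is_int(" 42\n"), is_int("1_000"))
# A decimal point makes the text a float, but not an int.
print(is_int("4.0"), is_float("4.0"))
# Exponents and the special values nan/inf are valid float literals.
print(is_float("1e-3"), is_float("nan"), is_float("inf"))
# Empty text is neither.
print(is_int(""), is_float(""))
```

Only ValueError is caught, so passing a non-string such as None raises TypeError instead of returning False.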

Parameters:

text
(str) –

Text to verify

Returns:

bool ( bool ) –

True if text represents an integer value.

Source code in ipsl_common/str.py

def is_int(text: str) -> bool:
    """Check if text represents an integer value.

    Args:
        text(str): Text to verify

    Returns:
        bool: True if text represents an integer value.
    """
    try:
        int(text)
    except ValueError:
        return False
    else:
        return True

replace_text

replace_text(
    text: str,
    replacements: list[tuple[int, int, str]],
    sort_replacements: bool = True,
) -> str

Replace parts of the given text using a collection of replacements. Each replacement defines start and end positions and the new text to place between them. If the start and end positions of a replacement are the same, the new text is injected at that position. Using an empty replacement text "" removes the text between the start and end positions.
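
The three kinds of replacement described above (substitution, injection, removal) can be illustrated with a minimal sketch. The function apply_replacements below is a hypothetical iterative stand-in written for this example, not the library's recursive implementation; it assumes non-overlapping spans.

```python
def apply_replacements(text: str, replacements: list[tuple[int, int, str]]) -> str:
    """Apply (start, end, new_value) replacements, working right to left."""
    # Going from the last span to the first keeps earlier positions valid,
    # because edits never shift the text that lies before them.
    for start, end, new_value in sorted(replacements, reverse=True):
        text = text[:start] + new_value + text[end:]
    return text


text = "Hello world"
result = apply_replacements(
    text,
    [
        (0, 5, "Goodbye"),  # substitute "Hello"
        (5, 5, ","),        # inject at position 5 (start == end)
        (6, 11, "moon"),    # substitute "world"
    ],
)
print(result)

# An empty new value removes the span between start and end.
removed = apply_replacements(text, [(5, 11, "")])
print(removed)
```

All positions refer to the original text, which is why the spans must be processed so that earlier offsets are not disturbed.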

Parameters:

text
(str) –

Text to replace
replacements
(list[tuple[int, int, str]]) –

Collection of replacements of the form (start position, end position, new value).
sort_replacements
(bool, default: True ) –

Sort replacements by their start/end positions.

Returns:

str –

New text with applied replacements.

Source code in ipsl_common/str.py

def replace_text(
    text: str, replacements: list[tuple[int, int, str]], sort_replacements: bool = True
) -> str:
    """
    Replace parts of the given text using a collection of replacements. Each
    replacement defines start and end positions and the new text to place
    between them. If the start and end positions of a replacement are the same,
    the new text is injected at that position. Using an empty replacement text ""
    removes the text between the start and end positions.

    Args:
        text(str): Text to replace
        replacements(list[tuple[int, int, str]]): Collection of replacements
            of the form (start position, end position, new value).
        sort_replacements(bool): Sort replacements by their start/end positions.

    Returns:
        str: New text with applied replacements.
    """
    if sort_replacements:
        replacements = sorted(replacements, key=lambda r: (r[0], r[1]))
    else:
        positions = list(chain.from_iterable((r[0], r[1]) for r in replacements))
        if not all(positions[n] <= positions[n + 1] for n in range(len(positions) - 1)):
            raise ValueError(
                "Positions in replacements must be monotonically increasing, "
                "i.e. p1 <= p2 <= p3 <= p4 <= ... for [(p1, p2, '..'), (p3, p4, '..'), ...]"
            )

    def _apply_replacements(text, replacements):
        """Recursively apply the replacements to a given text"""
        # No replacements means no changes!
        if not replacements:
            return text
        (start, end, new_value) = replacements.pop()

        return (
            _apply_replacements(text[:start], replacements)
            + new_value
            + _apply_replacements(text[end:], replacements)
        )

    # Replacements will be eaten, hence, we make a copy of them!
    return _apply_replacements(text, replacements.copy())
