Modules

cli

Functions and definitions useful when working with ArgumentParser.

Attributes:

  • ArgumentCmdParser (TypeAlias) –

    generic type of a subparser of the ArgumentParser class.

Attributes

ArgumentCmdParser module-attribute
ArgumentCmdParser: TypeAlias = _SubParsersAction

Functions:

existing_dir
existing_dir(path: str | Path)

Check if path points to an existing directory.

Example:

from argparse import ArgumentParser
from ipsl_common.cli import existing_dir
parser = ArgumentParser()
parser.add_argument("file", type=existing_dir)

Parameters:

  • path
    (str | Path) –

    path to a directory

Source code in ipsl_common/cli.py
def existing_dir(path: str | Path):
    """Check if path points to an existing directory.

    Example:

        from argparse import ArgumentParser
        from ipsl_common.cli import existing_dir
        parser = ArgumentParser()
        parser.add_argument("file", type=existing_dir)

    Args:
        path: path to a directory
    """
    path = Path(path)
    if not path.exists():
        raise ArgumentTypeError(f'Path "{str(path)}" doesn\'t exist')
    if not path.is_dir():
        raise ArgumentTypeError(f'Path "{str(path)}" is not a directory')
    return path
existing_file
existing_file(path: str | Path)

Check if path points to an existing file.

Example:

from argparse import ArgumentParser
from ipsl_common.cli import existing_file
parser = ArgumentParser()
parser.add_argument("file", type=existing_file)

Parameters:

  • path
    (str | Path) –

    path to a file

Source code in ipsl_common/cli.py
def existing_file(path: str | Path):
    """Check if path points to an existing file.

    Example:

        from argparse import ArgumentParser
        from ipsl_common.cli import existing_file
        parser = ArgumentParser()
        parser.add_argument("file", type=existing_file)

    Args:
        path: path to a file
    """
    path = Path(path)
    if not path.exists():
        raise ArgumentTypeError(f'Path "{str(path)}" doesn\'t exist')
    if not path.is_file():
        raise ArgumentTypeError(f'Path "{str(path)}" is not a file')
    return path
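
The two validators above plug into argparse as `type=` callables. The sketch below is an illustration only: it redefines `existing_dir` locally, mirroring the listed source, so that it runs without `ipsl_common` installed, and `parse_workdir` and the `demo` program name are hypothetical. It shows that a valid path comes back as a `Path`, while a failed check makes argparse report a usage error and exit.

```python
import tempfile
from argparse import ArgumentParser, ArgumentTypeError
from pathlib import Path


def existing_dir(path):
    """Local stand-in that mirrors the existing_dir source listed above."""
    path = Path(path)
    if not path.exists():
        raise ArgumentTypeError(f'Path "{path}" doesn\'t exist')
    if not path.is_dir():
        raise ArgumentTypeError(f'Path "{path}" is not a directory')
    return path


def parse_workdir(argv):
    """Hypothetical helper: parse one positional directory argument."""
    parser = ArgumentParser(prog="demo")
    parser.add_argument("workdir", type=existing_dir)
    return parser.parse_args(argv).workdir


# A valid directory is converted to a Path by the type callable
with tempfile.TemporaryDirectory() as tmp:
    workdir = parse_workdir([tmp])
    print(workdir.is_dir())

# An invalid path makes argparse print a usage error and raise SystemExit
try:
    parse_workdir(["/no/such/dir-for-demo"])
except SystemExit as exc:
    print("argparse exit code:", exc.code)
```

Raising `ArgumentTypeError` rather than a generic exception lets argparse include the message in its own error output.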

environment

Environment utility functions.

Functions:

has_command
has_command(cmd: str) -> bool

Test if given command exists in the environment.

This function is mostly a convenience alias.

Parameters:

  • cmd
    (str) –

    command to test

Returns:

  • bool

    True if the command exists, False otherwise

Source code in ipsl_common/environment.py
def has_command(cmd: str) -> bool:
    """Test if given command exists in the environment.

    This function is mostly a convenience alias.

    Args:
        cmd: command to test

    Returns:
        True if the command exists, False otherwise
    """
    return which(cmd) is not None
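
A short usage sketch, assuming the `which` in the listing is `shutil.which` (the import is not shown). `has_command` is redefined locally so the example is self-contained, and `first_available` is a hypothetical helper that picks the first command found on `PATH`.

```python
from shutil import which


def has_command(cmd: str) -> bool:
    """Local stand-in mirroring the listing above (assumes shutil.which)."""
    return which(cmd) is not None


def first_available(*candidates):
    """Hypothetical helper: return the first candidate found on PATH, or None."""
    for cmd in candidates:
        if has_command(cmd):
            return cmd
    return None


# Prefer one tool but fall back to another; prints None if neither is installed
print(first_available("cdo", "ncdump"))
```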

envmodules

Python interface to EnvModules.

Better Python interface to EnvModules which doesn't pollute the global environment with the module() function.

Examples:

from ipsl_common.envmodules import EnvModules
em = EnvModules()
em.load("gcc", "cdo")
print(em.get_loaded())

Classes

EnvModules
EnvModules(modulehome: str | Path | None = None)

Modern interface to EnvModules.

This class encapsulates the module() function of EnvModules inside a local environment.

Attributes:

  • envmodule

    Reference to the module() function of EnvModules.

Initialize EnvModules found on specified path (or default).

Parameters:

  • modulehome
    (str or Path, default: None ) –

    Path to the EnvModules installation directory.

Source code in ipsl_common/envmodules.py
def __init__(self, modulehome: str | Path | None = None) -> None:
    """Initialize EnvModules found on specified path (or default).

    Args:
        modulehome (str or Path, optional): Path to the EnvModules installation directory.
    """
    self.locals: dict = {}
    if modulehome is None:
        modulehome = environ["MODULESHOME"]
    # Don't pollute `globals()` environment and use local env instead
    exec(open(f"{modulehome}/init/python.py").read(), self.locals)
    self.envmodule = self.locals["module"]
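
The constructor relies on `exec` with an explicit namespace dictionary, so that names defined by the init script stay out of `globals()`. The sketch below illustrates that technique with a fake init script written to a temporary directory. The script content, `FAKE_INIT`, and `local_env` are made up for the demonstration and do not reflect what the real EnvModules `init/python.py` contains.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for <modulehome>/init/python.py
FAKE_INIT = '''
calls = []

def module(command, *args):
    calls.append((command, args))
    return True
'''

with tempfile.TemporaryDirectory() as tmp:
    init_dir = Path(tmp) / "init"
    init_dir.mkdir()
    script = init_dir / "python.py"
    script.write_text(FAKE_INIT)

    # Execute the script inside a private namespace instead of globals()
    local_env: dict = {}
    exec(script.read_text(), local_env)

envmodule = local_env["module"]
ok = envmodule("load", "gcc", "cdo")
print(ok, local_env["calls"])
print("module" in globals())
```

Because the function is looked up in `local_env`, the caller's own namespace never gains a `module` name.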
Attributes
envmodule instance-attribute
envmodule = locals['module']
locals instance-attribute
locals: dict = {}
Methods:
get_available
get_available(
    flatten=True,
) -> list[str] | dict[str, list[str]]

Return available EnvModules.

This method depends on the /usr/bin/modulecmd binary being present. The classical module() function won't work here, because we need to capture and then parse the output of the module avail command, which is normally written directly to the stderr stream.

Parameters:

  • flatten (bool, default: True ) –

    If true (default), the result will be a flat list of modules. Otherwise, a dictionary is returned with keys representing a named-group of modules.

Returns:

  • list[str] | dict[str, list[str]]

    list[str] or dict: List or dictionary of available modules.

Source code in ipsl_common/envmodules.py
def get_available(self, flatten=True) -> list[str] | dict[str, list[str]]:
    """Return available EnvModules.

    This method depends on the `/usr/bin/modulecmd` binary being present.
    The classical `module()` function won't work here, because we need to
    capture and then parse the output of the `module avail` function
    which normally is directly passed onto stderr stream.

    Args:
        flatten (bool, optional):
            If true (default), the result will be a flat list of modules.
            Otherwise, a dictionary is returned with keys representing
            a named-group of modules.

    Returns:
        list[str] or dict:
            List or dictionary of available modules.
    """
    output = subprocess.run(
        ["/usr/bin/modulecmd", "python", "avail"],
        stderr=subprocess.STDOUT,
        stdout=subprocess.PIPE,
    )
    avail_dict = self.__parse_list_available(output.stdout.decode("utf-8"))
    avail_list: list[str] = []
    if flatten:
        for modules in avail_dict.values():
            avail_list.extend(modules)
        return sorted(avail_list)
    else:
        return avail_dict
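
The capture step in `get_available` redirects the child's stderr into its stdout pipe. A minimal sketch of that pattern follows, using a Python child process as a stand-in for `modulecmd` (which may not be installed). The module names written by the child are invented, and the real parsing done by the private `__parse_list_available` helper is not reproduced here.

```python
import subprocess
import sys

# Stand-in child process that, like `module avail`, writes only to stderr
child = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('gcc/11.2.0\\ncdo/2.0.6\\n')",
]

# stderr=STDOUT merges the child's stderr into the captured stdout pipe
result = subprocess.run(child, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
lines = result.stdout.decode("utf-8").split()
print(lines)
```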
get_loaded
get_loaded() -> list[str]

Return loaded EnvModules.

Returns:

  • list[str]

    list[str]: List of loaded EnvModules.

Source code in ipsl_common/envmodules.py
def get_loaded(self) -> list[str]:
    """Return loaded EnvModules.

    Returns:
        list[str]: List of loaded EnvModules.
    """
    if modules := environ.get("LOADEDMODULES", None):
        return sorted(modules.split(":"))
    return []
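
`get_loaded` only reads the colon-separated `LOADEDMODULES` environment variable. The sketch below shows the same parsing as a free function. The optional `env` parameter is an assumption added here so the behaviour can be tried without touching the real environment, and the module names are examples.

```python
import os


def get_loaded(env=None):
    """Return sorted module names from a LOADEDMODULES-style mapping."""
    env = os.environ if env is None else env
    if modules := env.get("LOADEDMODULES"):
        return sorted(modules.split(":"))
    return []


print(get_loaded({"LOADEDMODULES": "gcc/11.2.0:cdo/2.0.6"}))
print(get_loaded({}))
```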
load
load(*modules) -> bool

Load passed EnvModules.

Returns:

  • bool ( bool ) –

    True if loading was successful.

Source code in ipsl_common/envmodules.py
def load(self, *modules) -> bool:
    """Load passed EnvModules.

    Returns:
        bool: True if loading was successful.
    """
    return self.envmodule("load", *modules)
purge
purge() -> bool

Purge all loaded EnvModules.

Returns:

  • bool ( bool ) –

    True if purging was successful.

Source code in ipsl_common/envmodules.py
def purge(self) -> bool:
    """Purge all loaded EnvModules.

    Returns:
        bool: True if purging was successful.
    """
    return self.envmodule("purge")

machine

Functions:

is_espri_spirit
is_espri_spirit() -> bool

Check if this is ESPRI/Spirit machine (1 or 2)

Source code in ipsl_common/machine.py
def is_espri_spirit() -> bool:
    """Check if this is ESPRI/Spirit machine (1 or 2)"""
    return _get_hostname() in ["spirit1", "spirit2"]
is_espri_spiritx
is_espri_spiritx() -> bool

Check if this is ESPRI/SpiritX machine (1 or 2)

Source code in ipsl_common/machine.py
def is_espri_spiritx() -> bool:
    """Check if this is ESPRI/SpiritX machine (1 or 2)"""
    return _get_hostname().startswith("spiritx")
is_idris_jean_zay
is_idris_jean_zay() -> bool

Check if this is IDRIS/JeanZay machine (also pp, visu)

Source code in ipsl_common/machine.py
def is_idris_jean_zay() -> bool:
    """Check if this is IDRIS/JeanZay machine (also pp, visu)"""
    return _get_hostname().startswith("jean-zay")
is_idris_jean_zay_pp
is_idris_jean_zay_pp() -> bool

Check if this is IDRIS/JeanZay post-processing machine

Source code in ipsl_common/machine.py
def is_idris_jean_zay_pp() -> bool:
    """Check if this is IDRIS/JeanZay post-processing machine"""
    return _get_hostname().startswith("jean-zay-pp")
is_tgcc_irene
is_tgcc_irene() -> bool

Check if this is TGCC/Irene machine

Source code in ipsl_common/machine.py
def is_tgcc_irene() -> bool:
    """Check if this is TGCC/Irene machine"""
    return _get_hostname().startswith("irene")
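
The checks above are plain hostname comparisons, so more than one can hold at once: a `jean-zay-pp` host also matches the `jean-zay` prefix. The sketch below restates the rules as a single hypothetical `classify` function that takes the hostname as an argument. How `_get_hostname()` obtains the name is not shown in the listing, and the example hostnames are illustrative.

```python
def classify(hostname: str) -> list[str]:
    """Hypothetical helper: list which machine checks would match a hostname."""
    labels = []
    if hostname.startswith("jean-zay"):
        labels.append("idris_jean_zay")
    if hostname.startswith("jean-zay-pp"):
        labels.append("idris_jean_zay_pp")
    if hostname in ["spirit1", "spirit2"]:
        labels.append("espri_spirit")
    if hostname.startswith("spiritx"):
        labels.append("espri_spiritx")
    if hostname.startswith("irene"):
        labels.append("tgcc_irene")
    return labels


for name in ["jean-zay-pp1", "spirit1", "spiritx2", "irene170", "laptop"]:
    print(name, classify(name))
```

Note that the Spirit check is an exact match, while the others are prefix matches.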

modipsl

Modules

card_file
Classes
CardFileDecoder

Bases: Transformer

Decoder performs translation from *.card file to a dictionary.

The translation rules are:

*.card          Python           Comment
section         dict[str, dict]  Top-level keys are sections
key-value       dict             Each item is a single kv pair
list            list             Empty or with elements
nested list     list[list]       List of lists
range           tuple[int]       Integer range in the form: START:STEP:END
string          str              Unquoted (with chars: -./${}*) and double quoted
integer number  int              ---
real number     float            Including scientific notation
true/y          True             Case insensitive
false/n         False            Case insensitive

Methods:
decode
decode(text: str) -> dict

Decode *.card text into dictionary.

Parameters:

  • text (str) –

    content of the *.card file

Returns:

  • dict ( dict ) –

    Decoded *.card file

Source code in ipsl_common/modipsl/card_file.py
def decode(self, text: str) -> dict:
    """Decode `*.card` text into dictionary.

    Args:
        text: content of the `*.card` file

    Returns:
        dict: Decoded `*.card` file
    """
    parse_tree = self._parser.parse(text)
    return self.transform(parse_tree)
CardFileEncoder
Functions:
flatten_dict_with_sections
flatten_dict_with_sections(position_map: dict) -> dict
Source code in ipsl_common/modipsl/card_file.py
def flatten_dict_with_sections(position_map: dict) -> dict:
    new_position_map = {}
    for section, values in position_map.items():
        for key, position in values.items():
            new_position_map[f"{section}.{key}"] = position
    return new_position_map
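
`flatten_dict_with_sections` has no docstring, so here is a small usage sketch. The function body is copied from the listing above so that the example is self-contained. The section names, keys, and position values are invented for illustration.

```python
def flatten_dict_with_sections(position_map: dict) -> dict:
    """Copy of the listing above: join section and key with a dot."""
    new_position_map = {}
    for section, values in position_map.items():
        for key, position in values.items():
            new_position_map[f"{section}.{key}"] = position
    return new_position_map


# Invented two-level mapping: section -> key -> position
nested = {
    "UserChoices": {"JobName": 10, "DateBegin": 42},
    "Post": {"RebuildFrequency": 97},
}
print(flatten_dict_with_sections(nested))
```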
load
load(buffer: TextIOBase)

Load *.card text/file buffer into dictionary.

Parameters:

  • buffer (TextIOBase) –

    text or file buffer with the *.card file

Returns:

  • dict

    Loaded *.card file

Source code in ipsl_common/modipsl/card_file.py
def load(buffer: TextIOBase):
    """Load `*.card` text/file buffer into dictionary.

    Args:
        buffer: text or file buffer with the `*.card` file

    Returns:
        dict: Loaded `*.card` file
    """
    if not buffer.readable():
        raise ValueError("Text buffer (TextIOBase) must be readable")
    return loads(buffer.read())
loads
loads(text: str)

Load *.card file string into dictionary.

Parameters:

  • text (str) –

    content of the *.card file

Returns:

  • dict

    Loaded *.card file

Source code in ipsl_common/modipsl/card_file.py
def loads(text: str):
    """Load `*.card` file string into dictionary.

    Args:
        text: content of the `*.card` file

    Returns:
        dict: Loaded `*.card` file
    """
    return CardFileDecoder().decode(text)
def_file

Read, write, and modify *.def file from the IPSL/modipsl project.

The *.def file contains model parameters in the key-value format. The format is extremely simple in comparison to similar formats, like *.ini or *.toml, as it provides neither sections nor standardized datatypes. Usually, model configuration files contain dozens or hundreds of parameters with scalars (int, float, str), arrays, or special _AUTO_/_AUTOBLOCKER_ values.

The special "auto" values are used by the IPSL/modipsl/libIGCM projects to mark places where an external script, called a component driver, should inject configuration values.

Example usage:

# Follows the convention of Python's json module, with load(), dump(), etc. functions
from ipsl_common.modipsl.def_file import load, dump
with open("run_dynamico.def", "r") as f:
    parameters = load(f)
# Modify the loaded parameters
parameters["start_file_name"] = "start2024.nc"
with open("new_run_dynamico.def", "w") as g:
    dump(parameters, g)

Loading an example *.def file:

INCLUDEDEF=run_lmdz.def
INCLUDEDEF=run_dynamico.def
use_forcing=y
g=_AUTO_: DEFAULT=9.8
start_file_name=start2023
physics="always"

Gives the following dictionary:

{
    'INCLUDEDEF': ['run_lmdz.def', 'run_dynamico.def'],
    'g': ('_AUTO_', 9.8),
    'physics': '"always"',
    'start_file_name': 'start2023',
    'use_forcing': True
}

A loaded configuration can be altered and then dumped to a file or a string. The configuration is easy to view and modify, because it is decoded directly into a Python dictionary. The decoding and encoding process is managed internally by the DefFileDecoder and DefFileEncoder classes with their decode() and encode() methods.

Classes
DefFileDecoder
DefFileDecoder(include_positions: bool = False)

Bases: Transformer

Decoder performs translation from *.def file to a dictionary.

The translation rules are:

*.def           Python  Comment
key-value       dict    Including many key-value pairs and INCLUDEDEF
array           list    Collection of at least 2 elements
string          str     Unquoted and quoted (single or double) strings
integer number  int     Standard integer values
real number     float   Including scientific notation
true/y          True    Case insensitive
false/n         False   Case insensitive
_AUTO_          tuple   With optional default value
_AUTOBLOCKER_   tuple   With optional default value

The first step of the decoder is parsing the *.def file. For this task, the Lark Earley parser is used with a simple grammar expressed in EBNF notation. The *.def grammar doesn't parse well with an LALR(1) parser. The parsing produces a parse tree.

The second step transforms the parse tree into a Python dictionary using the translation rules from the table above. This transformation is based on an automated visitor pattern called Transformer, which produces the dictionary in a bottom-up manner.
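
The scalar rows of the translation table can be approximated in a few lines. This is a sketch only, not the library's Lark-based parser: `decode_scalar` is a hypothetical helper, and it ignores arrays, quoting rules, `INCLUDEDEF`, and the `_AUTO_` forms.

```python
def decode_scalar(token: str):
    """Hypothetical helper: map one unquoted *.def token to a Python value."""
    lowered = token.lower()
    # Booleans are case insensitive: true/y and false/n
    if lowered in ("true", "y"):
        return True
    if lowered in ("false", "n"):
        return False
    # Try integer first, then real number (including scientific notation)
    try:
        return int(token)
    except ValueError:
        pass
    try:
        return float(token)
    except ValueError:
        # Anything else stays a string
        return token


for token in ["6.371229E6", "42", "y", "FALSE", "start2023"]:
    print(token, "->", repr(decode_scalar(token)))
```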

Examples:

from ipsl_common.modipsl.def_file import DefFileDecoder
dictionary = DefFileDecoder().decode(text)

If text contains this *.def file:

radius=6.371229E6
g=9.80665
omega=_AUTO_: DEFAULT=7.292E-5

Then, Python dictionary would look as follows:

{
    "radius": 6.371229e6,
    "g": 9.80665,
    "omega": ("_AUTO_", 7.292e-05),
}
Tip

By default, the result dictionary contains no information about textual layout of the file. However, by using the argument include_positions=True, it is possible to refine the dictionary with exact start/end positions of each value as follows:

{
    "radius": {"value": 6371229.0, "start_pos": 50, "end_pos": 60},
    "g": {"value": 9.80665, "start_pos": 99, "end_pos": 106},
    "omega": {"value": ("_AUTO_", 7.292e-05), "start_pos": 158, "end_pos": 182},
}

Initialize DefFileDecoder.

Parameters:

  • include_positions (bool, default: False ) –

    include textual positions of values (offset from the start)

Source code in ipsl_common/modipsl/def_file.py
def __init__(self, include_positions: bool = False) -> None:
    """Initialize DefFileDecoder.

    Args:
        include_positions: include textual positions of values (offset from the start)
    """
    self._include_positions = include_positions
Methods:
decode
decode(content: str) -> dict

Decode *.def content into dictionary.

Parameters:

  • content (str) –

    content of the *.def file

Returns:

  • dict ( dict ) –

    Decoded *.def file

Source code in ipsl_common/modipsl/def_file.py
def decode(self, content: str) -> dict:
    """Decode `*.def` content into dictionary.

    Args:
        content (str): content of the `*.def` file

    Returns:
        dict: Decoded `*.def` file
    """
    parse_tree = self._parser.parse(content)
    return self.transform(parse_tree)
DefFileEncoder
DefFileEncoder(
    truthy_value: str = "true", falsey_value: str = "false"
)

Encoder performs translation from a dictionary to *.def file.

The translation rules are:

Python  *.def                 Comment
dict    key-value             Each INCLUDEDEF value translates to a single key-value
list    array                 Of at least 2 elements
str     string                Quoted strings will contain explicit quote characters
int     integer number        ---
float   real number           Including scientific notation
bool    true/false            Labels can be set with the truthy_value and falsey_value constructor arguments
tuple   _AUTO_/_AUTOBLOCKER_  With optional default value at second position in tuple

The translation is straightforward: a specific conversion is performed based on the Python type. Neither a grammar nor a parse tree is used during this step.

Example:

from ipsl_common.modipsl.def_file import DefFileEncoder
text = DefFileEncoder().encode(dictionary)

If dictionary contains:

{
    "radius": 6.371229e6,
    "g": 9.80665,
    "omega": ("_AUTO_", 7.292e-05),
}

Then, the encoded *.def file would look as follows:

radius = 6371229.0
g = 9.80665
omega = _AUTO_: DEFAULT=7.292e-05
Tip

Python representation of the *.def file doesn't contain any textual position of particular elements (keys, values, comments, whitespaces, etc.), thus, re-encoding of the exact input *.def file is impossible. In order to recreate the original file, or modify a file while keeping the original comments, whitespaces, and order of elements, use the designated modify functions.

Initialize DefFileEncoder.

Parameters:

  • truthy_value (str, default: 'true' ) –

    label used to encode True

  • falsey_value (str, default: 'false' ) –

    label used to encode False

Source code in ipsl_common/modipsl/def_file.py
def __init__(
    self,
    truthy_value: str = "true",
    falsey_value: str = "false",
):
    """Initialize DefFileEncoder.

    Args:
        truthy_value: label used to encode True
        falsey_value: label used to encode False
    """
    self._truthy_value = truthy_value
    self._falsey_value = falsey_value
Methods:
encode
encode(obj: object) -> str

Encode dictionary or other Python object into *.def file.

This function works not only on fully decoded *.def files, but also on individual Python objects such as a list or a tuple. In that case, it takes the given object and applies one of the encoding rules mentioned before.

Parameters:

  • obj (object) –

    dictionary or Python object

Returns:

  • str

    Encoded text of a *.def file

Source code in ipsl_common/modipsl/def_file.py
def encode(self, obj: object) -> str:
    """Encode dictionary or other Python object into `*.def` file.

    This function works not only on fully decoded `*.def` files,
    but also on individual Python objects such as a list
    or a tuple. In that case, it takes the given object and applies
    one of the encoding rules mentioned before.

    Args:
        obj: dictionary or Python object

    Returns:
        Encoded text of a `*.def` file
    """
    # bool must be tested before int, because it is a subclass of int class.
    if isinstance(obj, bool):
        return self._truthy_value if obj else self._falsey_value
    elif isinstance(obj, int | float):
        return str(obj)
    elif isinstance(obj, tuple):
        if len(obj) != 2:
            raise ValueError(
                f"Only tuples with len=2 are encoded. Invalid tuple: {obj}"
            )
        auto, default = obj
        if auto not in _AUTO_VALUES:
            raise ValueError(
                f"First element of a tuple must be one of {_AUTO_VALUES}. Invalid tuple: {obj}"
            )
        if not self._is_scalar(default):
            raise TypeError(
                f"Second element of the tuple must be a scalar: bool, int, float, str, or None. Invalid tuple: {obj}"
            )
        if default is not None:
            return f"{auto}: DEFAULT={self.encode(default)}"
        else:
            return str(auto)
    elif isinstance(obj, dict) and len(obj) == 1:
        # Using tuple assignment
        ((k, v),) = obj.items()
        if k == _INCLUDEDEF:
            return self._encode_include(v)
        else:
            return self._encode_entry(k, v)
    elif isinstance(obj, dict):
        return "\n".join([self.encode({k: v}) for k, v in obj.items()])
    elif isinstance(obj, list):
        if len(obj) < 2:
            raise ValueError("List must have at least two elements")
        if not all(map(self._is_scalar, obj)):
            raise TypeError(
                f"All array values must be simple scalars: bool, int, float, str, or None. Instead got: {obj}"
            )
        return ", ".join(map(self.encode, obj))
    else:
        return str(obj)
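
Two details of `encode` are easy to miss: `bool` has to be tested before `int`, and a tuple encodes to the auto marker with an optional `DEFAULT=` suffix. The sketch below isolates just those two rules. `encode_scalar` and `encode_auto` are hypothetical helpers, and the content of `AUTO_VALUES` is an assumption about the private `_AUTO_VALUES` constant, which the listing does not show.

```python
# Assumed content of the private _AUTO_VALUES constant (not shown in the listing)
AUTO_VALUES = ("_AUTO_", "_AUTOBLOCKER_")


def encode_scalar(value, truthy="true", falsey="false"):
    """Encode one scalar; bool is checked first because bool subclasses int."""
    if isinstance(value, bool):
        return truthy if value else falsey
    return str(value)


def encode_auto(pair):
    """Encode an (auto marker, default) tuple."""
    auto, default = pair
    if auto not in AUTO_VALUES:
        raise ValueError(f"First element must be one of {AUTO_VALUES}: {pair}")
    if default is None:
        return auto
    return f"{auto}: DEFAULT={encode_scalar(default)}"


print(encode_scalar(True))
print(encode_scalar(1))
print(encode_auto(("_AUTO_", 7.292e-05)))
```

Without the early `bool` test, `True` would be handled as an integer and written as `True` rather than as the configured label.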
Functions:
dump
dump(obj: dict, buffer: TextIOBase) -> None

Dump dictionary into *.def text/file buffer.

Parameters:

  • obj (dict) –

    dictionary to dump to a file

  • buffer (TextIOBase) –

    text or file buffer for storing *.def file

Source code in ipsl_common/modipsl/def_file.py
def dump(obj: dict, buffer: TextIOBase) -> None:
    """Dump dictionary into *.def text/file buffer.

    Args:
        obj: dictionary to dump to a file
        buffer: text or file buffer for storing *.def file
    """
    if not buffer.writable():
        raise ValueError("Text buffer (TextIOBase) must be writable")
    buffer.write(dumps(obj))
dumps
dumps(obj: dict) -> str

Dump dictionary into *.def string.

Parameters:

  • obj (dict) –

    dictionary to dump to a file

Source code in ipsl_common/modipsl/def_file.py
def dumps(obj: dict) -> str:
    """Dump dictionary into *.def string.

    Args:
        obj: dictionary to dump to a file
    """
    return DefFileEncoder().encode(obj)
load
load(
    buffer: TextIOBase,
    include_positions: bool = False,
    return_text: bool = False,
) -> dict | tuple[dict, str]

Load *.def text/file buffer into dictionary.

Parameters:

  • buffer (TextIOBase) –

    text or file buffer with the *.def file

  • include_positions (bool, default: False ) –

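    include textual positions of values

  • return_text (bool, default: False ) –

    also return the raw text read from the buffer, as a (dict, str) tuple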

Returns:

  • dict ( dict | tuple[dict, str] ) –

    Loaded *.def file

Source code in ipsl_common/modipsl/def_file.py
def load(
    buffer: TextIOBase, include_positions: bool = False, return_text: bool = False
) -> dict | tuple[dict, str]:
    """Load `*.def` text/file buffer into dictionary.

    Args:
        buffer: text or file buffer with the `*.def` file
        include_positions: include textual positions of values

    Returns:
        dict: Loaded `*.def` file
    """
    if not buffer.readable():
        raise ValueError("Text buffer (TextIOBase) must be readable")
    text = buffer.read()
    if return_text:
        return loads(text, include_positions), text
    else:
        return loads(text, include_positions)
loads
loads(text: str, include_positions: bool = False) -> dict

Load *.def file string into dictionary.

Parameters:

  • text (str) –

    content of the *.def file

  • include_positions (bool, default: False ) –

    include textual positions of values

Returns:

  • dict ( dict ) –

    Loaded *.def file

Source code in ipsl_common/modipsl/def_file.py
def loads(text: str, include_positions: bool = False) -> dict:
    """Load `*.def` file string into dictionary.

    Args:
        text: content of the `*.def` file
        include_positions: include textual positions of values

    Returns:
        dict: Loaded `*.def` file
    """
    return DefFileDecoder(include_positions).decode(text)
modify
modify(
    buffer: TextIOBase,
    new_obj: dict,
    buffer_out: TextIOBase | None = None,
    insert_header: str = "",
) -> None

Modify *.def text/file buffer with minimal amount of changes.

The output text/file buffer follows the changes made to the new_obj representation of the *.def file. The modifications are performed so that the minimal amount of changes is applied. As a result, the diff between the old and new file content is minimal, and no comments or whitespace are lost beyond what is necessary. The new_obj can remove, modify, and/or add key-value pairs.

Parameters:

  • buffer (TextIOBase) –

    text or file buffer for reading the file (optionally to write to the file)

  • new_obj (dict) –

    modified representation of the *.def file content

  • buffer_out (TextIOBase | None, default: None ) –

    optional output buffer for writing the modified file content

  • insert_header (str, default: '' ) –

    optional header inserted before appended key-value pairs

Source code in ipsl_common/modipsl/def_file.py
def modify(
    buffer: TextIOBase,
    new_obj: dict,
    buffer_out: TextIOBase | None = None,
    insert_header: str = "",
) -> None:
    """Modify `*.def` text/file buffer with minimal amount of changes.

    The output text/file buffer follows changes made to `new_obj` representation of `*.def` file.
    The modifications are performed so that the minimal amount of changes is applied.
    As a result, the diff between the old and new file content is minimal and no comments
    or whitespace are lost beyond what is necessary. The `new_obj` can remove, modify,
    and/or add new key-value pairs.

    Args:
        buffer(TextIOBase): text or file buffer for reading the file (optionally to write to the file)
        new_obj(dict): modified representation of the *.def file content
        buffer_out(TextIOBase | None): optional output buffer for writing the modified file content
        insert_header(str): optional header inserted before appended key-value pairs
    """
    # Buffer must always be readable
    if not buffer.readable():
        raise ValueError("Text buffer (TextIOBase) must be readable")
    # Buffer must be writeable if buffer_out is not passed
    if buffer_out is None and not buffer.writable():
        raise ValueError("Text buffer (TextIOBase) must be writeable")
    # Otherwise, buffer_out must be writeable
    elif buffer_out is not None and not buffer_out.writable():
        raise ValueError("Text buffer_out (TextIOBase) must be writeable")

    target_buffer = buffer_out if buffer_out is not None else buffer
    target_buffer.write(modifys(buffer.read(), new_obj, insert_header=insert_header))
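
When `buffer_out` is omitted, the listing above writes to the same buffer right after reading it. With a seekable text buffer, the write position is then at the end of the old content, so a caller who wants the content replaced may need to rewind and truncate first. A minimal sketch of that pattern follows. `rewrite_in_place` is a hypothetical helper, not part of `ipsl_common`, and `io.StringIO` stands in for a file opened in `r+` mode.

```python
import io


def rewrite_in_place(buffer, transform):
    """Hypothetical helper: replace a seekable buffer's content with transform(content)."""
    if not (buffer.readable() and buffer.writable()):
        raise ValueError("Buffer must be readable and writable")
    text = buffer.read()
    buffer.seek(0)      # go back to the start before writing
    buffer.truncate()   # drop the old content from the current position on
    buffer.write(transform(text))


buf = io.StringIO("g=9.8\n")
rewrite_in_place(buf, lambda text: text.replace("9.8", "9.80665"))
print(repr(buf.getvalue()))
```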
modifys
modifys(
    text: str, new_obj: dict, insert_header: str = ""
) -> str

Modify *.def file string with minimal amount of changes.

The output text follows the changes made to the new_obj representation of the *.def file. The modifications are performed so that the minimal amount of changes is applied. As a result, the diff between the old and new file content is minimal, and no comments or whitespace are lost beyond what is necessary. The new_obj can remove, modify, and/or add key-value pairs.

Parameters:

  • text (str) –

    input text with *.def content to modify

  • new_obj (dict) –

    modified representation of the *.def file content

  • insert_header (str, default: '' ) –

    optional header inserted before appended key-value pairs

Returns:

  • str ( str ) –

    Modified text

Source code in ipsl_common/modipsl/def_file.py
def modifys(text: str, new_obj: dict, insert_header: str = "") -> str:
    """Modify `*.def` file string with minimal amount of changes.

    The output text follows changes made to `new_obj` representation of `*.def` file.
    The modifications are performed so that the minimal amount of changes is applied.
    As a result, the diff between the old and new file content is minimal and no comments
    or whitespace are lost beyond what is necessary. The `new_obj` can remove, modify,
    and/or add new key-value pairs.

    Args:
        text(str): input text with *.def content to modify
        new_obj(dict): modified representation of the *.def file content
        insert_header(str): optional header inserted before appended key-value pairs

    Returns:
        str: Modified text
    """
    old_obj = loads(text, include_positions=True)
    encoder = DefFileEncoder()
    replacements = []

    def _eat_newline(text: str, position: int) -> int:
        """Return new position with the newline eaten"""
        return position + 1 if text[position] == "\n" else position

    # Replace removed key-value pairs by empty strings
    deleted_keys = old_obj.keys() - new_obj.keys()
    for k in deleted_keys:
        v = old_obj[k]
        replacements.append((v["key_start_pos"], _eat_newline(text, v["end_pos"]), ""))

    # TODO: refactor this part in the future, now I don't have enough strength to do so
    include_insert_pos = 0
    old_includes_with_pos = old_obj.pop("INCLUDEDEF", [])
    old_includes = [m["value"] for m in old_includes_with_pos]
    to_remove = old_includes.copy()
    for m in new_obj.pop("INCLUDEDEF", []):
        if m in old_includes:
            to_remove.remove(m)
            include_insert_pos = (
                old_includes_with_pos[old_includes.index(m)]["end_pos"] + 1
            )
        else:
            replacements.append(
                (include_insert_pos, include_insert_pos, f"INCLUDEDEF={m}\n")
            )
    # Whatever has left is to remove
    for m in to_remove:
        d = old_includes_with_pos[old_includes.index(m)]
        replacements.append((d["key_start_pos"], _eat_newline(text, d["end_pos"]), ""))

    # Apply modification to existing keys
    for k, v in old_obj.items():
        if k in new_obj and v["value"] != new_obj[k]:
            replacements.append(
                (v["start_pos"], v["end_pos"], encoder.encode(new_obj[k]))
            )
    new_text = replace_text(text, replacements)
    new_text += insert_header
    # Then, add new keys at the end
    # INFO: We cannot use .keys() to find the new keys, because this method
    # returns a set which doesn't keep keys order. Instead, we can iterate
    # over items in dictionaries which is guaranteed to preserve the
    # insertion order since Python 3.7
    # (https://docs.python.org/3/library/stdtypes.html#typesmapping)
    new_keys = []
    for k in new_obj:
        if k not in old_obj:
            new_keys.append(k)

    if append_keys := encoder.encode({k: new_obj[k] for k in new_keys}):
        new_text += append_keys
        new_text += "\n"
    return new_text
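
`modifys` collects `(start, end, new_text)` triples and hands them to `replace_text`, whose source is not shown here. The sketch below is one plausible way such a helper could work, under the assumptions that `end` is exclusive and that the spans do not overlap. `apply_replacements` is a hypothetical name, and the sample text and positions are invented.

```python
def apply_replacements(text: str, replacements: list) -> str:
    """Apply (start, end, new) spans, assuming exclusive ends and no overlaps."""
    # Work from the end of the text so earlier offsets stay valid
    for start, end, new in sorted(replacements, key=lambda r: (r[0], r[1]), reverse=True):
        text = text[:start] + new + text[end:]
    return text


original = "g=9.8\nradius=1\n"
edits = [
    (2, 5, "9.80665"),       # replace the value of g
    (13, 14, "6371229.0"),   # replace the value of radius
]
print(repr(apply_replacements(original, edits)))
```

Applying the spans back to front avoids recomputing offsets after each substitution.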
mod_file

Read mod.def file from the IPSL/modipsl project.

The mod.def file contains definitions of coupled model configurations. A single configuration consists of one or more components (e.g. atmospheric model, dynamics, I/O system, experiments).

Examples:

from ipsl_common.modipsl.mod_file import load
with open("mod.def", "r") as f:
    configs = load(f)

The loaded dictionary has a fixed schema. Using the following input:

#-H- GRISLI GRISLI stand-alone for Antarctica icesheets (prototype)
#-C- GRISLI trunk/libIGCM HEAD 10 libIGCM .
#-C- GRISLI branches/xios HEAD 26 GRISLI modeles
...
#-S- 7 git https://gitlab.in2p3.fr/ipsl/projets/nemogcm/nemo.git
#-S- 8 svn --username icmc_users https://forge.ipsl.fr/igcmg/svn

the output looks as follows:

{
    'configuration': {
        'GRISLI': {
            'description': ['GRISLI stand-alone for Antarctica icesheets (prototype)'],
            'components': [
                {
                    'modipsl_dir': '.',
                    'name': 'libIGCM',
                    'repository': 10,
                    'revision': 'HEAD',
                    'variant': 'trunk/libIGCM'
                },
                {
                    'modipsl_dir': 'modeles',
                    'name': 'GRISLI',
                    'repository': 26,
                    'revision': 'HEAD',
                    'variant': 'branches/xios'
                },
                (...)
            ],
        },
        (...)
    },
    'repository': {
        7: {
            'clone_url': 'https://gitlab.in2p3.fr/ipsl/projets/nemogcm/nemo.git',
            'type': 'git'
        },
        8: {
            'clone_url': '--username icmc_users https://forge.ipsl.fr/igcmg/svn',
            'type': 'svn'
        },
        (...)
    }
}
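As a rough sketch of how this schema might be consumed, the snippet below walks one configuration and looks up the repository of each component. The `configs` literal is a hand-written excerpt following the schema above (it is not produced by `load`), and `list_components` is a hypothetical helper, not part of `ipsl_common`:

```python
# Hand-written excerpt following the schema shown above. Repositories 10 and 26
# are elided in the example output, so they are deliberately absent here.
configs = {
    "configuration": {
        "GRISLI": {
            "description": ["GRISLI stand-alone for Antarctica icesheets (prototype)"],
            "components": [
                {"modipsl_dir": ".", "name": "libIGCM", "repository": 10,
                 "revision": "HEAD", "variant": "trunk/libIGCM"},
                {"modipsl_dir": "modeles", "name": "GRISLI", "repository": 26,
                 "revision": "HEAD", "variant": "branches/xios"},
            ],
        },
    },
    "repository": {
        7: {"clone_url": "https://gitlab.in2p3.fr/ipsl/projets/nemogcm/nemo.git",
            "type": "git"},
        8: {"clone_url": "--username icmc_users https://forge.ipsl.fr/igcmg/svn",
            "type": "svn"},
    },
}


def list_components(configs, config_name):
    """Return (name, variant, revision, repository type or None) per component."""
    repositories = configs["repository"]
    rows = []
    for component in configs["configuration"][config_name]["components"]:
        # .get() avoids a KeyError when a repository ID is not defined
        repo = repositories.get(component["repository"])
        rows.append(
            (
                component["name"],
                component["variant"],
                component["revision"],
                repo["type"] if repo else None,
            )
        )
    return rows


rows = list_components(configs, "GRISLI")
for row in rows:
    print(row)
```

Note that the repository keys are integers, matching the `repository` field of each component.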
Warning

Because this module matches patterns iteratively (with re.finditer), it cannot tell when the mod.def file is malformed or invalid. It silently skips lines that do not match and moves on!

Functions:
load
load(buffer: TextIOBase) -> dict

Load mod.def file from a text/file buffer.

Parameters:

  • buffer (TextIOBase) –

    text or file buffer with the mod.def file

Returns:

  • dict ( dict ) –

    Loaded mod.def file

Source code in ipsl_common/modipsl/mod_file.py
def load(buffer: io.TextIOBase) -> dict:
    """Load mod.def file from a text/file buffer.

    Args:
        buffer (io.TextIOBase): text or file buffer with the mod.def file

    Returns:
        dict: Loaded mod.def file
    """
    return loads(buffer.read())
loads
loads(content: str) -> dict

Load mod.def file from a string.

Parameters:

  • content (str) –

    content of the mod.def file

Returns:

  • dict ( dict ) –

    Loaded mod.def file

Source code in ipsl_common/modipsl/mod_file.py
def loads(content: str) -> dict:
    """Load mod.def file from a string.

    Args:
        content (str): content of the mod.def file

    Returns:
        dict: Loaded mod.def file
    """
    repositories = __parse_repositories(content)
    return {
        "repository": repositories,
        "configuration": __parse_configurations(content, repositories),
    }

path

Utility path/pathlib functions.

Functions:

try_joinpath
try_joinpath(
    path: Path, *others: str | Path
) -> Path | None

Test and return joined path if it exists.

This function is especially useful when used with walrus operator (:=).

Example:

if p := try_joinpath(some_dir, "subdir", "file.txt"):
    with open(p, "r") as f:
        ...

Parameters:

  • path
    (Path) –

    path to test

  • others
    (str | Path, default: () ) –

    other paths to join

Returns:

  • Path | None

    full joined path if it exists or None

Source code in ipsl_common/path.py
def try_joinpath(path: Path, *others: str | Path) -> Path | None:
    """Test and return joined path if it exists.

    This function is especially useful when used with walrus operator (:=).

    Example:

        if p := try_joinpath(some_dir, "subdir", "file.txt"):
            with open(p, "r") as f:
                ...

    Args:
        path: path to test
        others: other paths to join

    Returns:
        full joined path if it exists or None
    """
    new_path = path.joinpath(*others)
    return new_path if new_path.exists() else None
try_path
try_path(path: Path) -> Path | None

Test and return path if it exists.

This function is especially useful when used with walrus operator (:=).

Example:

if p := try_path(some_file):
    with open(p, "r") as f:
        ...

Parameters:

  • path
    (Path) –

    path to test

Returns:

  • Path | None

    path if it exists or None

Source code in ipsl_common/path.py
def try_path(path: Path) -> Path | None:
    """Test and return path if it exists.

    This function is especially useful when used with walrus operator (:=).

    Example:

        if p := try_path(some_file):
            with open(p, "r") as f:
                ...

    Args:
        path: path to test

    Returns:
        path if it exists or None
    """
    return path if path.exists() else None
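The walrus examples above can be tried end to end with a temporary directory. In this sketch both helpers are redefined locally, with the same bodies as in the listings, so that it runs without `ipsl_common` installed; the file names are made up:

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def try_path(path: Path) -> Path | None:
    # Local copy of the helper, same body as in the listing above
    return path if path.exists() else None


def try_joinpath(path: Path, *others: str | Path) -> Path | None:
    # Local copy of the helper, same body as in the listing above
    new_path = path.joinpath(*others)
    return new_path if new_path.exists() else None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "subdir").mkdir()
    (root / "subdir" / "file.txt").write_text("hello\n")

    # The existing file is returned, so the body of the `if` runs
    found = None
    if p := try_joinpath(root, "subdir", "file.txt"):
        found = p.read_text()

    # A missing file yields None, so an `if` on it would be skipped
    missing = try_joinpath(root, "subdir", "absent.txt")

    # try_path returns the very same path when it exists
    root_exists = try_path(root) == root

print(found, missing, root_exists)
```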

python

Python-related functions, such as checking the Python version.

Functions:

check_minimal_version
check_minimal_version(
    major: int,
    minor: int,
    reason: str = "",
    exit_on_fail: bool = False,
) -> None

Check whether the minimal Python version is present. If not, log a warning (default) or exit the program entirely. The passed major.minor version is not validated and can have any value, even a non-existent Python version like 4.24.

Source code in ipsl_common/python.py
def check_minimal_version(
    major: int, minor: int, reason: str = "", exit_on_fail: bool = False
) -> None:
    """
    Check whether the minimal Python version is present. If not, log a warning
    (default) or exit the program entirely.
    The passed major.minor version is not validated and can have
    any value, even a non-existent Python version like 4.24.
    """
    if sys.version_info[0:2] < (major, minor):
        if reason:
            reason = f" {reason}\n"
        message = f"Python {major}.{minor} or later is required.{reason}"
        if exit_on_fail:
            sys.exit(message)
        else:
            log.warning(message)
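The check above relies on tuple comparison rather than string comparison. A short sketch of why that matters, using only the standard library:

```python
import sys

# Tuples compare element by element, so 3.10 correctly sorts after 3.9
tuple_result = (3, 10) > (3, 9)

# The same versions compared as strings give the wrong answer,
# because "1" sorts before "9" character by character
string_result = "3.10" > "3.9"

# The same expression as in the function: true when the running
# interpreter is older than the requested (here non-existent) version
too_old_for_4_24 = sys.version_info[0:2] < (4, 24)

print(tuple_result, string_result, too_old_for_4_24)
```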

scripts

Modules

slurm_logs

Process slurm logs from the command-line.

Examples:

ipsl_slurm_logs slurm.log --output-dir output_test/ --remove-trailing-whitespaces

The above command separates slurm.log containing model execution using 4 MPI processes into distinct files inside the output_test/ directory:

output_test/
├── output_0.log
├── output_10.log
├── output_11.log
├── output_12.log

JSON example:

ipsl_slurm_logs slurm.log --json > slurm.json

This prints the separated log in JSON format, redirected here into slurm.json.
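One detail worth knowing when reading the JSON back: JSON object keys are always strings, so integer process IDs come back as strings such as "0" or "12". A sketch with a made-up dictionary standing in for real log content:

```python
import json

# Made-up stand-in for a separated log: integer process IDs plus the "slurm" key
separated = {0: ["first line"], 12: ["another line"], "slurm": ["unlabelled line"]}

# Round-trip through JSON, as happens when the output is written and read back
decoded = json.loads(json.dumps(separated))
keys = sorted(decoded)

print(keys)
```

Code that reads the JSON output therefore needs to look up `decoded["0"]`, not `decoded[0]`, or convert the numeric keys back with `int()`.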

Functions:
main
main()
Source code in ipsl_common/scripts/slurm_logs.py
def main():
    parser = ArgumentParser()
    parser.add_argument(
        "log_file", type=existing_file, help="Labelled SLURM log file to process"
    )
    parser.add_argument(
        "--json",
        action="store_true",
        help="Print separated logs to console in JSON format",
    )
    parser.add_argument(
        "--remove-trailing-whitespaces",
        action="store_true",
        help=(
            "Remove trailing whitespaces and empty lines in separated logs. "
            "The leading whitespaces are kept because indentation might provide additional information in logs."
        ),
    )
    parser.add_argument("--output-name", type=str, default="output_{process_id}.log")
    parser.add_argument("--output-dir", default=Path("."))

    args = parser.parse_args()

    with open(args.log_file, "r") as fp:
        if args.json:
            print(
                json.dumps(
                    separate_labelled_log(
                        fp, remove_trailing_whitespaces=args.remove_trailing_whitespaces
                    )
                )
            )
        else:
            separate_labelled_log_to_files(
                fp,
                args.output_dir,
                args.output_name,
                remove_trailing_whitespaces=args.remove_trailing_whitespaces,
            )

slurm

Improved interaction with the SLURM ecosystem.

This module includes: functions to separate labelled SLURM logs.

Functions:

separate_labelled_log
separate_labelled_log(
    fp: TextIO, remove_trailing_whitespaces: bool = False
) -> dict[int | str, list[str]]

Separate SLURM labelled log by process IDs.

Almost every line in the SLURM labelled log starts with the process number followed by a colon:

22:  USING DEFAULTS : area_radius1 =   3360.00000000000
26:  USING DEFAULTS : area_radius1 =   3360.00000000000
 0:  USING DEFAULTS : area_radius1 =   3360.00000000000
 0:  GETIN area_radius1 =    3360.00000000000
12:  USING DEFAULTS : area_rotation_pre =  0.000000000000000E+000
12:  USING DEFAULTS : area_rotation =  0.000000000000000E+000

This function reads such a log file and groups lines coming from the same process. If a line doesn't have the label, which can happen when it is an error coming from srun or sbatch, it is placed under the slurm key.

This function might take a lot of memory for big logs, because it loads all the processed lines into one dictionary. Consider using separate_labelled_log_to_files() to separate the log into files on the fly.

Parameters:

  • fp
    (TextIO) –

    text I/O stream with the log's content

  • remove_trailing_whitespaces
    (bool, default: False ) –

    remove trailing whitespaces and empty lines

Returns: dict: labelled log lines grouped by processes.
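The grouping described above can be sketched in a few lines. The module's private `_LABELLED_LOG_LINE` pattern is not shown in this documentation, so the regular expression below is an assumption, and `group_by_process` is a hypothetical name; the unlabelled line in the sample is made up:

```python
import io
import re
from collections import defaultdict

# Assumed stand-in for the module's private pattern: optional padding,
# the process number, a colon, one space, then the message
LABELLED = re.compile(r"^\s*(\d+): (.*)$")


def group_by_process(fp):
    """Group labelled lines by process ID; unlabelled lines go under "slurm"."""
    groups = defaultdict(list)
    for line in fp:
        if m := LABELLED.search(line):
            groups[int(m.group(1))].append(m.group(2))
        else:
            groups["slurm"].append(line.rstrip("\n"))
    return dict(groups)


log = io.StringIO(
    "22:  USING DEFAULTS : area_radius1 =   3360.00000000000\n"
    " 0:  USING DEFAULTS : area_radius1 =   3360.00000000000\n"
    "unlabelled message without a process prefix\n"
    " 0:  GETIN area_radius1 =    3360.00000000000\n"
)
groups = group_by_process(log)
for key, lines in groups.items():
    print(key, len(lines))
```

Any text stream works as input, which is why `io.StringIO` can stand in for an opened log file here.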

Source code in ipsl_common/slurm.py
def separate_labelled_log(
    fp: TextIO, remove_trailing_whitespaces: bool = False
) -> dict[int | str, list[str]]:
    """
    Separate SLURM labelled log by process IDs.

    Almost every line in the SLURM labelled log starts with the process number
    followed by a colon:

    ```
    22:  USING DEFAULTS : area_radius1 =   3360.00000000000
    26:  USING DEFAULTS : area_radius1 =   3360.00000000000
     0:  USING DEFAULTS : area_radius1 =   3360.00000000000
     0:  GETIN area_radius1 =    3360.00000000000
    12:  USING DEFAULTS : area_rotation_pre =  0.000000000000000E+000
    12:  USING DEFAULTS : area_rotation =  0.000000000000000E+000
    ```

    This function reads such a log file and groups lines coming from the same process.
    If a line doesn't have the label, which can happen when it is an error coming
    from `srun` or `sbatch`, it is placed under the `slurm` key.

    This function might take a lot of memory for big logs, because it loads all
    the processed lines into one dictionary. Consider using
    `separate_labelled_log_to_files()` to separate the log into files on the fly.

    Args:
        fp: text I/O stream with the log's content
        remove_trailing_whitespaces: remove trailing whitespaces and empty lines
    Returns:
        dict: labelled log lines grouped by processes.
    """
    processed_log = {}

    for line in fp:
        # Group only process-labelled lines
        if m := _LABELLED_LOG_LINE.search(line):
            process_id = int(m.group(1))
            process_msg = m.group(2)
            if process_id not in processed_log:
                processed_log[process_id] = []
            if remove_trailing_whitespaces:
                process_msg = process_msg.rstrip()
                if not process_msg:
                    continue
            processed_log[process_id].append(process_msg)
        # Otherwise, append under the `slurm` key
        else:
            if "slurm" not in processed_log:
                processed_log["slurm"] = []
            if remove_trailing_whitespaces:
                line = line.rstrip()
                if not line:
                    continue
            processed_log["slurm"].append(line.rstrip("\n"))
    return processed_log
separate_labelled_log_to_files
separate_labelled_log_to_files(
    fp: TextIO,
    output_dir: Path,
    output_name: str,
    remove_trailing_whitespaces: bool = False,
) -> None

Separate SLURM labelled log by process IDs, on the fly, into dedicated log files.

This function will not load the log content into memory like separate_labelled_log() does. Instead, it will process it line-by-line and write to dedicated log files. Use this function for big logs.

Parameters:

  • fp
    (TextIO) –

    text I/O stream with the log's content

  • output_dir
    (Path) –

    directory where dedicated logs will be placed (must exist)

  • output_name
    (str) –

    the name of the output file (must contain {process_id} as a placeholder of the process ID)

  • remove_trailing_whitespaces
    (bool, default: False ) –

    remove trailing whitespaces and empty lines
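How `output_name` turns into file names can be seen with plain `str.format`, which is what the function uses. The directory name below is only an example and is not created by this sketch:

```python
from pathlib import Path

output_name = "output_{process_id}.log"  # default of the ipsl_slurm_logs script
output_dir = Path("output_test")         # example directory, not created here

# Labelled lines of process 12 and unlabelled lines end up in these files
labelled = output_dir / output_name.format(process_id=12)
unlabelled = output_dir / output_name.format(process_id="slurm")

# A name without the placeholder is rejected with a ValueError by the function
has_placeholder = "{process_id}" in "output.log"

print(labelled, unlabelled, has_placeholder)
```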

Source code in ipsl_common/slurm.py
def separate_labelled_log_to_files(
    fp: TextIO,
    output_dir: Path,
    output_name: str,
    remove_trailing_whitespaces: bool = False,
) -> None:
    """
    Separate SLURM labelled log by process IDs, on the fly, into dedicated log files.

    This function will not load the log content into memory like `separate_labelled_log()` does.
    Instead, it will process it line-by-line and write to dedicated log files.
    Use this function for big logs.

    Args:
        fp: text I/O stream with the log's content
        output_dir: directory where dedicated logs will be placed (must exist)
        output_name: the name of the output file (must contain `{process_id}` as a placeholder of the process ID)
        remove_trailing_whitespaces: remove trailing whitespaces and empty lines
    """
    output_dir = Path(output_dir)
    if not output_dir.exists():
        raise ValueError("The `output_dir` must exist")
    if not output_dir.is_dir():
        raise ValueError("The `output_dir` must be a directory")
    if "{process_id}" not in output_name:
        raise ValueError(
            "The `output_name` must contain `{process_id}` placeholder for the process ID."
        )
    output_files = {}
    for line in fp:
        # Write lines to per-process output files
        if m := _LABELLED_LOG_LINE.search(line):
            process_id = int(m.group(1))
            process_msg = m.group(2)
            if process_id not in output_files:
                output_files[process_id] = open(
                    output_dir / output_name.format(process_id=process_id), "w"
                )
            if remove_trailing_whitespaces:
                process_msg = process_msg.rstrip()
                if not process_msg:
                    continue
            output_files[process_id].write(f"{process_msg}\n")
        # Otherwise, write to the `slurm` file
        else:
            if "slurm" not in output_files:
                output_files["slurm"] = open(
                    output_dir / output_name.format(process_id="slurm"), "w"
                )
            if remove_trailing_whitespaces:
                line = line.rstrip()
                if not line:
                    continue
                # rstrip() also removed the newline, so restore it
                line += "\n"
            output_files["slurm"].write(line)

    for f in output_files.values():
        f.close()

str

Collection of string-related functions.

Functions:

contains
contains(text: str, *args) -> bool

Check if text contains all given substrings.

Parameters:

  • text
    (str) –

    Text to analyse.

  • args
    (list[str], default: () ) –

    List of substrings to match.

Returns:

  • bool ( bool ) –

    True if all substrings are found in the text.

Source code in ipsl_common/str.py
def contains(text: str, *args) -> bool:
    """Check if text contains all given substrings.

    Args:
        text (str): Text to analyse.
        args (list[str]): List of substrings to match.

    Returns:
        bool: True if all substrings are found in the text.
    """
    for arg in args:
        # Each substring must occur in the text
        if arg not in text:
            return False
    return True
contains_any
contains_any(text: str, *args) -> bool

Check if text contains any of given substrings.

Parameters:

  • text
    (str) –

    Text to analyse.

  • args
    (list[str], default: () ) –

    List of substrings to match.

Returns:

  • bool ( bool ) –

    True if any of substrings is found in the text.
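The documented behaviour of `contains` and `contains_any` can be sketched with the built-ins `all()` and `any()`. The names `contains_all` and `contains_some` are hypothetical and used only to keep this sketch separate from the library functions:

```python
def contains_all(text: str, *substrings: str) -> bool:
    """True if every substring occurs in text (vacuously True with none given)."""
    return all(s in text for s in substrings)


def contains_some(text: str, *substrings: str) -> bool:
    """True if at least one substring occurs in text (False with none given)."""
    return any(s in text for s in substrings)


line = "USING DEFAULTS : area_radius1"
both = contains_all(line, "USING", "DEFAULTS")
not_both = contains_all(line, "USING", "GETIN")
either = contains_some(line, "GETIN", "DEFAULTS")
neither = contains_some(line, "GETIN", "ERROR")

print(both, not_both, either, neither)
```

The direction of the membership test matters: each substring is searched for in the text, not the other way round.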

Source code in ipsl_common/str.py
def contains_any(text: str, *args) -> bool:
    """Check if text contains any of given substrings.

    Args:
        text (str): Text to analyse.
        args (list[str]): List of substrings to match.

    Returns:
        bool: True if any of substrings is found in the text.
    """
    for arg in args:
        # One substring occurring in the text is enough
        if arg in text:
            return True
    return False
is_float
is_float(text: str) -> bool

Check if text represents floating value.

Parameters:

  • text
    (str) –

    Text to verify

Returns:

  • bool ( bool ) –

    True if text represents a floating value.

Source code in ipsl_common/str.py
def is_float(text: str) -> bool:
    """Check if text represents floating value.

    Args:
        text(str): Text to verify

    Returns:
        bool: True if text represents a floating value.
    """
    try:
        float(text)
    except ValueError:
        return False
    else:
        return True
is_int
is_int(text: str) -> bool

Check if text represents integer value.

Parameters:

  • text
    (str) –

    Text to verify

Returns:

  • bool ( bool ) –

    True if text represents an integer value.
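Because `is_int` and `is_float` delegate to the built-in `int()` and `float()`, they accept everything those constructors accept. The sketch below redefines both helpers locally, following the same try/except approach, to show a few edge cases:

```python
def is_int(text: str) -> bool:
    # Local copy following the same approach as the library function
    try:
        int(text)
    except ValueError:
        return False
    return True


def is_float(text: str) -> bool:
    # Local copy following the same approach as the library function
    try:
        float(text)
    except ValueError:
        return False
    return True


checks = {
    "plain int": is_int("42"),
    "padded int": is_int(" 42 "),        # surrounding whitespace is accepted
    "underscored int": is_int("1_000"),  # digit separators are accepted
    "decimal as int": is_int("4.2"),     # not an integer literal
    "exponent float": is_float("1e3"),
    "nan float": is_float("nan"),        # special values are accepted
    "word float": is_float("abc"),
}
for name, result in checks.items():
    print(name, result)
```

Every integer string also passes `is_float`, so callers that need to tell the two apart should test `is_int` first.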

Source code in ipsl_common/str.py
def is_int(text: str) -> bool:
    """Check if text represents integer value.

    Args:
        text(str): Text to verify

    Returns:
        bool: True if text represents an integer value.
    """
    try:
        int(text)
    except ValueError:
        return False
    else:
        return True
replace_text
replace_text(
    text: str,
    replacements: list[tuple[int, int, str]],
    sort_replacements: bool = True,
) -> str

Apply a collection of replacements to the given text. Each replacement defines start and end positions and the new text to put between them. If the start and end positions of a replacement are the same, the new text is inserted at that position. An empty replacement text "" removes the text between the start and end positions.

Parameters:

  • text
    (str) –

    Text to replace

  • replacements
    (list[tuple[int, int, str]]) –

    Collection of replacements of the form (start position, end position, new value).

  • sort_replacements
    (bool, default: True ) –

    Sort replacements by their start/end positions.

Returns: str: New text with applied replacements.
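The three cases (replace, insert, delete) can be illustrated with a small iterative sketch. `apply_replacements` is a hypothetical stand-in that assumes non-overlapping replacements; it is not the library's implementation, and the sample text is made up:

```python
def apply_replacements(text, replacements):
    """Apply (start, end, new_value) replacements, assuming they do not overlap."""
    pieces = []
    cursor = 0
    for start, end, new_value in sorted(replacements, key=lambda r: (r[0], r[1])):
        pieces.append(text[cursor:start])  # untouched text before this replacement
        pieces.append(new_value)
        cursor = end                       # skip the replaced span
    pieces.append(text[cursor:])           # untouched tail
    return "".join(pieces)


text = "alpha=1\nbeta=2\n"
result = apply_replacements(
    text,
    [
        (6, 7, "42"),        # replace the value "1"
        (8, 8, "# new\n"),   # start == end: insert before "beta"
        (13, 14, ""),        # empty text: delete the value "2"
    ],
)
print(repr(result))
```

All positions refer to the original text, so earlier replacements do not shift the positions of later ones.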

Source code in ipsl_common/str.py
def replace_text(
    text: str, replacements: list[tuple[int, int, str]], sort_replacements: bool = True
) -> str:
    """
    Apply a collection of replacements to the given text. Each replacement defines
    start and end positions and the new text to put between them. If the
    start and end positions of a replacement are the same, the new text is inserted
    at that position. An empty replacement text "" removes the text between the
    start and end positions.

    Args:
        text(str): Text to replace
        replacements(list[tuple[int, int, str]]): Collection of replacements
            of the form (start position, end position, new value).
        sort_replacements(bool): Sort replacements by their start/end positions.
    Returns:
        str: New text with applied replacements.
    """
    if sort_replacements:
        replacements = sorted(replacements, key=lambda r: (r[0], r[1]))
    else:
        positions = list(chain.from_iterable((r[0], r[1]) for r in replacements))
        if not all(positions[n] <= positions[n + 1] for n in range(len(positions) - 1)):
            raise ValueError(
                "Positions in replacements must be monotonically increasing, "
                "i.e. p1 <= p2 <= p3 <= p4 <= ... for [(p1, p2, '..'), (p3, p4, '..'), ...]"
            )

    def _apply_replacements(text, replacements):
        """Recursively apply the replacements to a given text"""
        # No replacements means no changes!
        if not replacements:
            return text
        (start, end, new_value) = replacements.pop()

        return (
            _apply_replacements(text[:start], replacements)
            + new_value
            + _apply_replacements(text[end:], replacements)
        )

    # Replacements will be eaten, hence, we make a copy of them!
    return _apply_replacements(text, replacements.copy())