def_file

Read, write, and modify *.def files from the IPSL/modipsl project.

The *.def file contains model parameters in a key-value format. The format is extremely simple compared with similar formats such as *.ini or *.toml: it provides neither sections nor standardized data types. Model configuration files usually contain dozens or hundreds of parameters, whose values are scalars (int, float, str), arrays, or the special _AUTO_/_AUTOBLOCKER_ values.

The special "auto" values are used by the IPSL/modipsl/libIGCM projects to mark places where an external script, called a component driver, should inject configuration values.
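To make the role of the "auto" placeholders concrete, here is a minimal sketch of how a driver-like script could resolve them in an already decoded dictionary. The helper `resolve_auto` and the key `day_step` are hypothetical and not part of ipsl_common; the sketch only assumes the tuple representation `(marker, default)` used by this module, with `None` standing for a missing default.

```python
# Hypothetical helper, NOT part of ipsl_common: it mimics what a component
# driver does when it fills in "auto" placeholders of a decoded *.def file.
AUTO_MARKERS = ("_AUTO_", "_AUTOBLOCKER_")


def resolve_auto(parameters: dict, injected: dict) -> dict:
    """Return a copy of `parameters` with auto placeholders replaced."""
    resolved = {}
    for key, value in parameters.items():
        is_auto = (
            isinstance(value, tuple)
            and len(value) == 2
            and value[0] in AUTO_MARKERS
        )
        if not is_auto:
            resolved[key] = value  # ordinary value, keep as-is
        elif key in injected:
            resolved[key] = injected[key]  # value supplied by the driver
        elif value[1] is not None:
            resolved[key] = value[1]  # fall back to the DEFAULT
        else:
            raise KeyError(f"No value or default available for {key!r}")
    return resolved


decoded = {"g": ("_AUTO_", 9.8), "day_step": ("_AUTO_", None), "use_forcing": True}
print(resolve_auto(decoded, {"day_step": 480}))
# {'g': 9.8, 'day_step': 480, 'use_forcing': True}
```

Real component drivers may treat `_AUTO_` and `_AUTOBLOCKER_` differently; the sketch deliberately handles both the same way.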

Example usage:

# Follows the convention of Python's json module: load(), dump(), etc.
from ipsl_common.modipsl.def_file import load, dump
with open("run_dynamico.def", "r") as f:
    parameters = load(f)
    # Modify loaded parameters
    parameters["start_file_name"] = "start2024.nc"
    with open("new_run_dynamico.def", "w") as g:
        dump(parameters, g)

Loading an example *.def file:

INCLUDEDEF=run_lmdz.def
INCLUDEDEF=run_dynamico.def
use_forcing=y
g=_AUTO_: DEFAULT=9.8
start_file_name=start2023
physics="always"

Gives the following dictionary:

{
    'INCLUDEDEF': ['run_lmdz.def', 'run_dynamico.def'],
    'g': ('_AUTO_', 9.8),
    'physics': '"always"',
    'start_file_name': 'start2023',
    'use_forcing': True
}

A loaded configuration can be altered and subsequently dumped to a file or into a string. The configuration is easy to view and modify, because it is decoded directly into a Python dictionary. The decoding and encoding are handled internally by the DefFileDecoder and DefFileEncoder classes through their decode() and encode() methods.

Classes

DefFileDecoder

DefFileDecoder(include_positions: bool = False)

Bases: Transformer

Decoder performs translation from *.def file to a dictionary.

The translation rules are:

| *.def | Python | Comment |
|---|---|---|
| key-value | dict | Including many key-value pairs and INCLUDEDEF |
| array | list | Collection of at least 2 elements |
| string | str | Unquoted and quoted (single or double) strings |
| integer number | int | Standard integer values |
| real number | float | Including scientific notation |
| true/y | True | Case insensitive |
| false/n | False | Case insensitive |
| _AUTO_ | tuple | With optional default value |
| _AUTOBLOCKER_ | tuple | With optional default value |
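The scalar rules in the table can be illustrated with a small, self-contained sketch. The function `convert_scalar` below is a hypothetical stand-in written with a regular expression and plain type conversions; it is not the library's implementation (which relies on a Lark grammar), it ignores arrays, and it may differ from the real decoder in edge cases such as `inf` or `1_000`.

```python
import re

# Simplified stand-in for the decoder's scalar rules (assumption: the real
# decoder uses a Lark grammar, which this sketch does not reproduce).
_AUTO_RE = re.compile(r"^(_AUTO_|_AUTOBLOCKER_)(?::\s*DEFAULT\s*=\s*(.*))?$")


def convert_scalar(text: str):
    """Convert one *.def value into a Python object using the table's rules."""
    text = text.strip()
    auto = _AUTO_RE.match(text)
    if auto:
        marker, default = auto.groups()
        # A missing default is represented by None
        return (marker, None if default is None else convert_scalar(default))
    if text.lower() in ("true", "y"):
        return True
    if text.lower() in ("false", "n"):
        return False
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)  # also covers scientific notation
    except ValueError:
        return text  # anything else stays a string, quotes included


for raw in ["y", "42", "6.371229E6", "_AUTO_: DEFAULT=9.8", '"always"']:
    print(repr(raw), "->", repr(convert_scalar(raw)))
```

Booleans are checked before numbers, and integers before floats, so that each value lands on the most specific type listed in the table.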

The first step of the decoder is parsing the *.def file. For this task, the Lark Earley parser is used with a simple grammar expressed in EBNF notation. The *.def grammar doesn't parse well with an LALR(1) parser. Parsing produces a parse tree.

The second step transforms the parse tree into a Python dictionary using the translation rules from the table above. This transformation is based on an automated visitor pattern called Transformer, which builds the dictionary bottom-up.

Examples:

from ipsl_common.modipsl.def_file import DefFileDecoder
dictionary = DefFileDecoder().decode(text)

If text contains this *.def file:

radius=6.371229E6
g=9.80665
omega=_AUTO_: DEFAULT=7.292E-5

Then, Python dictionary would look as follows:

{
    "radius": 6.371229e6,
    "g": 9.80665,
    "omega": ("_AUTO_", 7.292e-05),
}

Tip

By default, the resulting dictionary contains no information about the textual layout of the file. However, with the argument include_positions=True, each value in the dictionary is annotated with its exact start/end positions as follows:

{
    "radius": {"value": 6371229.0, "start_pos": 50, "end_pos": 60},
    "g": {"value": 9.80665, "start_pos": 99, "end_pos": 106},
    "omega": {"value": ("_AUTO_", 7.292e-05), "start_pos": 158, "end_pos": 182},
}
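The offsets make it possible to change a value in place without touching the rest of the text. The following sketch builds a position-annotated entry by hand (the sample text and the replacement value are made up for illustration) and splices a new value into the string. It assumes that end_pos is exclusive, which is what the lengths in the example above suggest (60 - 50 = 10 characters for `6.371229E6`).

```python
# Hypothetical input; the entry is built by hand to mirror the shape of the
# position-annotated dictionary shown above.
text = "# Earth radius in metres\nradius=6.371229E6\ng=9.80665\n"
start = text.index("6.371229E6")
entry = {
    "value": 6371229.0,
    "start_pos": start,
    "end_pos": start + len("6.371229E6"),  # assumption: exclusive end offset
}

# Splice the new value between the two offsets; everything else is untouched,
# including the comment line.
new_text = text[: entry["start_pos"]] + "6.4E6" + text[entry["end_pos"] :]
print(new_text)
```

Only the characters between the two offsets change, so comments and surrounding whitespace survive the edit.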

Initialize DefFileDecoder.

Parameters:

  • include_positions

    (bool, default: False ) –

    include textual positions of values (offset from the start)

Source code in ipsl_common/modipsl/def_file.py
def __init__(self, include_positions: bool = False) -> None:
    """Initialize DefFileDecoder.

    Args:
        include_positions: include textual positions of values (offset from the start)
    """
    self._include_positions = include_positions

Methods:

decode
decode(content: str) -> dict

Decode *.def content into dictionary.

Parameters:

  • content
    (str) –

    content of the *.def file

Returns:

  • dict ( dict ) –

    Decoded *.def file

Source code in ipsl_common/modipsl/def_file.py
def decode(self, content: str) -> dict:
    """Decode `*.def` content into dictionary.

    Args:
        content (str): content of the `*.def` file

    Returns:
        dict: Decoded `*.def` file
    """
    parse_tree = self._parser.parse(content)
    return self.transform(parse_tree)

DefFileEncoder

DefFileEncoder(
    truthy_value: str = "true", falsey_value: str = "false"
)

Encoder performs translation from a dictionary to *.def file.

The translation rules are:

Python *.def Comment
dict key-value Each INCLUDEDEF value translates to a single key-value
list array Of at least 2 elements
str string Quoted strings will contain explicit quote characters
int integer number ---
float real number Including scientific notation
bool true/false Can be specified with encode arguments
tuple _AUTO_/_AUTOBLOCKER_ With optional default value at second position in tuple

The translation is straightforward: a specific conversion is applied depending on the Python type of the value. Neither a grammar nor a parse tree is used during this step.

Example:

from ipsl_common.modipsl.def_file import DefFileEncoder
text = DefFileEncoder().encode(dictionary)

If dictionary contains:

{
    "radius": 6.371229e6,
    "g": 9.80665,
    "omega": ("_AUTO_", 7.292e-05),
}

Then, the encoded *.def file would look as follows:

radius = 6371229.0
g = 9.80665
omega = _AUTO_: DEFAULT=7.292e-05

Tip

The Python representation of a *.def file doesn't record the textual position of individual elements (keys, values, comments, whitespace, etc.), so re-encoding the exact input *.def file is impossible. To recreate the original file, or to modify a file while keeping the original comments, whitespace, and order of elements, use the dedicated modify functions.

Initialize DefFileEncoder.

Parameters:

  • truthy_value

    (str, default: 'true' ) –

    label used to encode True

  • falsey_value

    (str, default: 'false' ) –

    label used to encode False

Source code in ipsl_common/modipsl/def_file.py
def __init__(
    self,
    truthy_value: str = "true",
    falsey_value: str = "false",
):
    """Initialize DefFileEncoder.

    Args:
        truthy_value: label used to encode True
        falsey_value: label used to encode False
    """
    self._truthy_value = truthy_value
    self._falsey_value = falsey_value

Methods:

encode
encode(obj: object) -> str

Encode dictionary or other Python object into *.def file.

This function works not only on fully decoded *.def files, but also on individual Python objects such as a list or a tuple. In that case, it takes the given object and applies one of the encoding rules listed above.

Parameters:

  • obj
    (object) –

    dictionary or Python object

Returns:

  • str

    Encoded text of a *.def file

Source code in ipsl_common/modipsl/def_file.py
def encode(self, obj: object) -> str:
    """Encode dictionary or other Python object into `*.def` file.

    This function works not only on fully decoded `*.def` files,
    but also on individual Python objects such as a list
    or a tuple. In that case, it takes the given object and applies
    one of the encoding rules mentioned before.

    Args:
        obj: dictionary or Python object

    Returns:
        Encoded text of a `*.def` file
    """
    # bool must be tested before int, because it is a subclass of int class.
    if isinstance(obj, bool):
        return self._truthy_value if obj else self._falsey_value
    elif isinstance(obj, int | float):
        return str(obj)
    elif isinstance(obj, tuple):
        if len(obj) != 2:
            raise ValueError(
                f"Only tuples with len=2 are encoded. Invalid tuple: {obj}"
            )
        auto, default = obj
        if auto not in _AUTO_VALUES:
            raise ValueError(
                f"First element of a tuple must be one of {_AUTO_VALUES}. Invalid tuple: {obj}"
            )
        if not self._is_scalar(default):
            raise TypeError(
                f"Second element of the tuple must be a scalar: bool, int, float, str, or None. Invalid tuple: {obj}"
            )
        if default is not None:
            return f"{auto}: DEFAULT={self.encode(default)}"
        else:
            return str(auto)
    elif isinstance(obj, dict) and len(obj) == 1:
        # Using tuple assignment
        ((k, v),) = obj.items()
        if k == _INCLUDEDEF:
            return self._encode_include(v)
        else:
            return self._encode_entry(k, v)
    elif isinstance(obj, dict):
        return "\n".join([self.encode({k: v}) for k, v in obj.items()])
    elif isinstance(obj, list):
        if len(obj) < 2:
            raise ValueError("List must have at least two elements")
        if not all(map(self._is_scalar, obj)):
            raise TypeError(
                f"All array values must be simple scalars: bool, int, float, str, or None. Instead got: {obj}"
            )
        return ", ".join(map(self.encode, obj))
    else:
        return str(obj)
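The comment about testing bool before int deserves a short demonstration. The two helper functions below are hypothetical and only isolate the ordering issue; the labels "true" and "false" stand in for the encoder's configurable truthy/falsey values.

```python
def encode_scalar_wrong(obj) -> str:
    # int is tested first, so True is caught here and the bool branch
    # below is never reached.
    if isinstance(obj, int):
        return str(obj)
    if isinstance(obj, bool):
        return "true" if obj else "false"
    return str(obj)


def encode_scalar_right(obj) -> str:
    # bool is tested first, as in the encode() method above.
    if isinstance(obj, bool):
        return "true" if obj else "false"
    if isinstance(obj, int):
        return str(obj)
    return str(obj)


print(isinstance(True, int))      # True: bool is a subclass of int
print(encode_scalar_wrong(True))  # True  (configured label is ignored)
print(encode_scalar_right(True))  # true
```

With the wrong order, a custom truthy_value or falsey_value would silently have no effect.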

Functions:

dump

dump(obj: dict, buffer: TextIOBase) -> None

Dump dictionary into *.def text/file buffer.

Parameters:

  • obj

    (dict) –

    dictionary to dump to a file

  • buffer

    (TextIOBase) –

    text or file buffer for storing *.def file

Source code in ipsl_common/modipsl/def_file.py
def dump(obj: dict, buffer: TextIOBase) -> None:
    """Dump dictionary into *.def text/file buffer.

    Args:
        obj: dictionary to dump to a file
        buffer: text or file buffer for storing *.def file
    """
    if not buffer.writable():
        raise ValueError("Text buffer (TextIOBase) must be writable")
    buffer.write(dumps(obj))
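The writable() check above relies only on the standard io.TextIOBase interface, so any text buffer that reports itself as writable should pass it. The sketch below uses only the standard library (it does not call dump itself) to show how an in-memory io.StringIO and a file opened read-only answer these checks; the file name run.def and its content are made up.

```python
import io
import os
import tempfile

# In-memory text buffer: readable and writable, no file on disk needed.
memory = io.StringIO()
memory.write("use_forcing = true\n")
print(isinstance(memory, io.TextIOBase), memory.readable(), memory.writable())

# A file opened with mode "r" is readable but not writable, so a check
# like the one in dump() would reject it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "run.def")
    with open(path, "w") as f:
        f.write(memory.getvalue())
    with open(path, "r") as f:
        read_only = (f.readable(), f.writable())
print(read_only)  # (True, False)
```

An io.StringIO buffer is convenient in tests, because the produced text can be inspected with getvalue() without touching the file system.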

dumps

dumps(obj: dict) -> str

Dump dictionary into *.def string.

Parameters:

  • obj

    (dict) –

    dictionary to dump to a file

Source code in ipsl_common/modipsl/def_file.py
def dumps(obj: dict) -> str:
    """Dump dictionary into *.def string.

    Args:
        obj: dictionary to dump to a file
    """
    return DefFileEncoder().encode(obj)

load

load(
    buffer: TextIOBase,
    include_positions: bool = False,
    return_text: bool = False,
) -> dict | tuple[dict, str]

Load *.def text/file buffer into dictionary.

Parameters:

  • buffer

    (TextIOBase) –

    text or file buffer with the *.def file

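  • include_positions

    (bool, default: False ) –

    include textual positions of values

  • return_text

    (bool, default: False ) –

    also return the raw text read from the buffer

Returns:

  • dict | tuple[dict, str] –

    Loaded *.def file, or a (dictionary, text) tuple when return_text=True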

Source code in ipsl_common/modipsl/def_file.py
def load(
    buffer: TextIOBase, include_positions: bool = False, return_text: bool = False
) -> dict | tuple[dict, str]:
    """Load `*.def` text/file buffer into dictionary.

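    Args:
        buffer: text or file buffer with the `*.def` file
        include_positions: include textual positions of values
        return_text: also return the raw text read from the buffer

    Returns:
        dict | tuple[dict, str]: Loaded `*.def` file, or a (dictionary, text)
            tuple when `return_text` is True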
    """
    if not buffer.readable():
        raise ValueError("Text buffer (TextIOBase) must be readable")
    text = buffer.read()
    if return_text:
        return loads(text, include_positions), text
    else:
        return loads(text, include_positions)

loads

loads(text: str, include_positions: bool = False) -> dict

Load *.def file string into dictionary.

Parameters:

  • text

    (str) –

    content of the *.def file

  • include_positions

    (bool, default: False ) –

    include textual positions of values

Returns:

  • dict ( dict ) –

    Loaded *.def file

Source code in ipsl_common/modipsl/def_file.py
def loads(text: str, include_positions: bool = False) -> dict:
    """Load `*.def` file string into dictionary.

    Args:
        text: content of the `*.def` file
        include_positions: include textual positions of values

    Returns:
        dict: Loaded `*.def` file
    """
    return DefFileDecoder(include_positions).decode(text)

modify

modify(
    buffer: TextIOBase,
    new_obj: dict,
    buffer_out: TextIOBase | None = None,
    insert_header: str = "",
) -> None

Modify *.def text/file buffer with minimal amount of changes.

The output text/file buffer follows the changes made to the new_obj representation of the *.def file. The modifications are applied so that the amount of change is minimal. As a result, the diff between the old and new file content is minimal, and no comments or whitespace are lost beyond what is necessary. The new_obj can remove, modify, and/or add key-value pairs.

Parameters:

  • buffer

    (TextIOBase) –

    text or file buffer for reading the file (optionally to write to the file)

  • new_obj

    (dict) –

    modified representation of the *.def file content

  • buffer_out

    (TextIOBase | None, default: None ) –

    optional output buffer for writing the modified file content

  • insert_header

    (str, default: '' ) –

    optional header inserted before appended key-value pairs

Source code in ipsl_common/modipsl/def_file.py
def modify(
    buffer: TextIOBase,
    new_obj: dict,
    buffer_out: TextIOBase | None = None,
    insert_header: str = "",
) -> None:
    """Modify `*.def` text/file buffer with minimal amount of changes.

    The output text/file buffer follows the changes made to the `new_obj` representation of the `*.def` file.
    The modifications are applied so that the amount of change is minimal.
    As a result, the diff between the old and new file content is minimal, and no comments
    or whitespace are lost beyond what is necessary. The `new_obj` can remove, modify,
    and/or add key-value pairs.

    Args:
        buffer(TextIOBase): text or file buffer for reading the file (optionally to write to the file)
        new_obj(dict): modified representation of the *.def file content
        buffer_out(TextIOBase | None): optional output buffer for writing the modified file content
        insert_header(str): optional header inserted before appended key-value pairs
    """
    # Buffer must always be readable
    if not buffer.readable():
        raise ValueError("Text buffer (TextIOBase) must be readable")
    # Buffer must be writeable if buffer_out is not passed
    if buffer_out is None and not buffer.writable():
        raise ValueError("Text buffer (TextIOBase) must be writeable")
    # Otherwise, buffer_out must be writeable
    elif buffer_out is not None and not buffer_out.writable():
        raise ValueError("Text buffer_out (TextIOBase) must be writeable")

    target_buffer = buffer_out if buffer_out is not None else buffer
    target_buffer.write(modifys(buffer.read(), new_obj, insert_header=insert_header))

modifys

modifys(
    text: str, new_obj: dict, insert_header: str = ""
) -> str

Modify *.def file string with minimal amount of changes.

The output text follows the changes made to the new_obj representation of the *.def file. The modifications are applied so that the amount of change is minimal. As a result, the diff between the old and new file content is minimal, and no comments or whitespace are lost beyond what is necessary. The new_obj can remove, modify, and/or add key-value pairs.
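The three kinds of change (remove, modify, add) can be derived by comparing the old and new dictionaries. The sketch below is a simplified illustration with a hypothetical helper `classify_keys` and made-up parameter values; it works on plain dictionaries and ignores INCLUDEDEF handling and textual positions. It iterates over the dictionaries instead of using set operations, so the order of keys is preserved.

```python
def classify_keys(old: dict, new: dict) -> tuple[list, list, list]:
    """Split keys into removed, modified, and added, keeping dict order."""
    removed = [k for k in old if k not in new]
    modified = [k for k in old if k in new and old[k] != new[k]]
    added = [k for k in new if k not in old]
    return removed, modified, added


old = {"use_forcing": True, "g": 9.8, "physics": '"always"'}
new = {
    "g": 9.80665,
    "physics": '"always"',
    "nday": 10,
    "start_file_name": "start2024.nc",
}
print(classify_keys(old, new))
# (['use_forcing'], ['g'], ['nday', 'start_file_name'])
```

Unchanged keys such as physics appear in none of the three lists, which is what keeps the resulting diff small.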

Parameters:

  • text

    (str) –

    input text with *.def content to modify

  • new_obj

    (dict) –

    modified representation of the *.def file content

  • insert_header

    (str, default: '' ) –

    optional header inserted before appended key-value pairs

Returns:

  • str ( str ) –

    Modified text

Source code in ipsl_common/modipsl/def_file.py
def modifys(text: str, new_obj: dict, insert_header: str = "") -> str:
    """Modify `*.def` file string with minimal amount of changes.

    The output text follows the changes made to the `new_obj` representation of the `*.def` file.
    The modifications are applied so that the amount of change is minimal.
    As a result, the diff between the old and new file content is minimal, and no comments
    or whitespace are lost beyond what is necessary. The `new_obj` can remove, modify,
    and/or add key-value pairs.

    Args:
        text(str): input text with *.def content to modify
        new_obj(dict): modified representation of the *.def file content
        insert_header(str): optional header inserted before appended key-value pairs

    Returns:
        str: Modified text
    """
    old_obj = loads(text, include_positions=True)
    encoder = DefFileEncoder()
    replacements = []

    def _eat_newline(text: str, position: int) -> int:
        """Return new position with the newline eaten"""
        return position + 1 if text[position] == "\n" else position

    # Replace removed key-value pairs by empty strings
    deleted_keys = old_obj.keys() - new_obj.keys()
    for k in deleted_keys:
        v = old_obj[k]
        replacements.append((v["key_start_pos"], _eat_newline(text, v["end_pos"]), ""))

    # TODO: refactor this part in the future, now I don't have enough strength to do so
    include_insert_pos = 0
    old_includes_with_pos = old_obj.pop("INCLUDEDEF", [])
    old_includes = [m["value"] for m in old_includes_with_pos]
    to_remove = old_includes.copy()
    for m in new_obj.pop("INCLUDEDEF", []):
        if m in old_includes:
            to_remove.remove(m)
            include_insert_pos = (
                old_includes_with_pos[old_includes.index(m)]["end_pos"] + 1
            )
        else:
            replacements.append(
                (include_insert_pos, include_insert_pos, f"INCLUDEDEF={m}\n")
            )
    # Whatever is left must be removed
    for m in to_remove:
        d = old_includes_with_pos[old_includes.index(m)]
        replacements.append((d["key_start_pos"], _eat_newline(text, d["end_pos"]), ""))

    # Apply modification to existing keys
    for k, v in old_obj.items():
        if k in new_obj and v["value"] != new_obj[k]:
            replacements.append(
                (v["start_pos"], v["end_pos"], encoder.encode(new_obj[k]))
            )
    new_text = replace_text(text, replacements)
    new_text += insert_header
    # Then, add new keys at the end
    # INFO: We cannot use .keys() to find the new keys, because this method
    # returns a set which doesn't keep keys order. Instead, we can iterate
    # over items in dictionaries which is guaranteed to preserve the
    # insertion order since Python 3.7
    # (https://docs.python.org/3/library/stdtypes.html#typesmapping)
    new_keys = []
    for k in new_obj:
        if k not in old_obj:
            new_keys.append(k)

    if append_keys := encoder.encode({k: new_obj[k] for k in new_keys}):
        new_text += append_keys
        new_text += "\n"
    return new_text
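The listing above collects (start, end, new_text) triples and hands them to replace_text, whose implementation is not shown on this page. As an illustration only, the hypothetical function `apply_replacements` below shows one common way to apply such triples: process them from the end of the text towards the beginning, so that offsets of earlier spans remain valid. It assumes exclusive end offsets and non-overlapping spans, and it may differ from the real replace_text.

```python
def apply_replacements(text: str, replacements: list[tuple[int, int, str]]) -> str:
    """Apply (start, end, new_text) edits to text; `end` is exclusive.

    Hypothetical stand-in for replace_text; assumes non-overlapping spans.
    """
    # Work backwards so that edits never shift the offsets of spans
    # that are still waiting to be applied.
    ordered = sorted(replacements, key=lambda r: (r[0], r[1]), reverse=True)
    for start, end, new in ordered:
        text = text[:start] + new + text[end:]
    return text


text = "# gravity\ng=9.8\nuse_forcing=y\n"
g_start = text.index("9.8")
line_start = text.index("use_forcing")
edits = [
    (g_start, g_start + len("9.8"), "9.80665"),  # modify a value
    (line_start, len(text), ""),                 # delete a whole line
]
result = apply_replacements(text, edits)
print(result)
```

The comment line and the key name are left untouched; only the value span and the deleted line change.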