Schema adapters

class dyff.schema.adapters.Adapter(*args, **kwargs)

Bases: Protocol

Transforms streams of JSON structures.

class dyff.schema.adapters.Drop(configuration: dict)

Bases: object

Drop named top-level fields.

The configuration is a dictionary:

{
    "fields": list[str]
}

class dyff.schema.adapters.ExplodeCollections(configuration: dict)

Bases: object

Explodes one or more top-level lists of the same length into multiple records, where each record contains the corresponding value from each list. This is useful for turning nested-list representations into “relational” representations where the lists are converted to multiple rows with a unique index.

The configuration argument is a dictionary:

{
    "collections": list[str],
    "index": dict[str, str | None]
}

For example, if the input data is:

[
    {"numbers": [1, 2, 3], "squares": [1, 4, 9], "scalar": "foo"},
    {"numbers": [4, 5], "squares": [16, 25], "scalar": "bar"}
]

Then ExplodeCollections({"collections": ["numbers", "squares"]}) will yield this output data:

[
    {"numbers": 1, "squares": 1, "scalar": "foo"},
    {"numbers": 2, "squares": 4, "scalar": "foo"},
    {"numbers": 3, "squares": 9, "scalar": "foo"},
    {"numbers": 4, "squares": 16, "scalar": "bar"},
    {"numbers": 5, "squares": 25, "scalar": "bar"},
]
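The behavior above can be approximated in plain Python. This is a minimal sketch, not the library's implementation: the function name explode_collections is hypothetical, and raising ValueError on lists of unequal length is an assumption.

```python
from typing import Any, Iterable, Iterator


def explode_collections(
    records: Iterable[dict[str, Any]], collections: list[str]
) -> Iterator[dict[str, Any]]:
    """Yield one record per position in the named top-level lists."""
    for record in records:
        lists = [record[name] for name in collections]
        # The lists are expected to have the same length; failing loudly
        # on a mismatch is an assumption of this sketch.
        if len({len(values) for values in lists}) > 1:
            raise ValueError(f"collections differ in length: {collections}")
        for row in zip(*lists):
            exploded = dict(record)  # other fields are copied unchanged
            exploded.update(zip(collections, row))
            yield exploded


data = [
    {"numbers": [1, 2, 3], "squares": [1, 4, 9], "scalar": "foo"},
    {"numbers": [4, 5], "squares": [16, 25], "scalar": "bar"},
]
rows = list(explode_collections(data, ["numbers", "squares"]))
```

Here rows holds five records, one per list position, each carrying the unchanged "scalar" value of its source record.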

You can also create indexes for the exploded records. Given the following configuration:

{
    "collections": ["choices"],
    "index": {
        "collection/index": None,
        "collection/rank": "$.choices[*].meta.rank"
    }
}

then for the input:

[
    {
        "choices": [
            {"label": "foo", "meta": {"rank": 1}},
            {"label": "bar", "meta": {"rank": 0}}
        ]
    },
    ...
]

the output will be:

[
    {
        "choices": {"label": "foo", "meta": {"rank": 1}},
        "collection/index": 0,
        "collection/rank": 1
    },
    {
        "choices": {"label": "bar", "meta": {"rank": 0}},
        "collection/index": 1,
        "collection/rank": 0
    },
    ...
]

The None value for the "collection/index" index key means that the adapter should assign indices 0 through n-1 automatically. If the value is not None, it must be a JSONPath query that is executed against the pre-transformation data and returns a list. In the example, the query $.choices[*].meta.rank uses $.choices[*] to select every element of choices, so it returns one rank per exploded record.
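The index rules can be illustrated with a small self-contained sketch. It does not use a JSONPath engine: each non-None index value is a plain Python function standing in for the query, which is an assumption made only to keep the example free of dependencies. The name explode_with_index and the length check are likewise hypothetical.

```python
from typing import Any, Callable, Iterator, Optional

# Stand-in for a JSONPath query: a function from the pre-transformation
# record to a list with one entry per exploded element (an assumption).
Query = Callable[[dict[str, Any]], list]


def explode_with_index(
    record: dict[str, Any],
    collection: str,
    index: dict[str, Optional[Query]],
) -> Iterator[dict[str, Any]]:
    items = record[collection]
    # Resolve every index column against the original record first.
    columns = {
        key: list(range(len(items))) if query is None else query(record)
        for key, query in index.items()
    }
    for key, values in columns.items():
        if len(values) != len(items):
            raise ValueError(f"index {key!r} does not match the collection length")
    for i, item in enumerate(items):
        row = dict(record)
        row[collection] = item
        for key, values in columns.items():
            row[key] = values[i]
        yield row


record = {
    "choices": [
        {"label": "foo", "meta": {"rank": 1}},
        {"label": "bar", "meta": {"rank": 0}},
    ]
}
index = {
    "collection/index": None,  # assigned automatically as 0..n-1
    # Plays the role of "$.choices[*].meta.rank".
    "collection/rank": lambda r: [c["meta"]["rank"] for c in r["choices"]],
}
indexed_rows = list(explode_with_index(record, "choices", index))
```

All index columns are computed before any record is rewritten, which mirrors the requirement that queries run against the pre-transformation data.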

class dyff.schema.adapters.FlattenHierarchy(configuration=None)

Bases: object

Flatten a JSON object – or the JSON sub-objects in named fields – by creating a new object with a key for each “leaf” value in the input.

The configuration options are:

{
    "fields": list[str],
    "depth": int | None,
    "addPrefix": bool
}

If fields is missing or empty, the flattening is applied to the root object. The depth option is the maximum recursion depth. If addPrefix is True (the default), then the resulting fields will be named like "path.to.leaf" to avoid name conflicts.

For example, if the configuration is:

{
    "fields": ["choices"],
    "depth": 1,
    "addPrefix": True
}

and the input is:

{
    "choices": {"label": "foo", "metadata": {"value": 42}},
    "scores": {"top1": 0.9}
}

then the output will be:

{
    "choices.label": "foo",
    "choices.metadata": {"value": 42},
    "scores": {"top1": 0.9}
}

Note that nested lists are considered “leaf” values, even if they contain objects.
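The following sketch reproduces the example above in plain Python. It is an illustration under stated assumptions, not the library's code: the names flatten and flatten_fields are hypothetical, and depth is interpreted as the number of levels descended below each named field (or below the root when no fields are given), which is the reading that matches the documented output.

```python
from typing import Any, Optional, Sequence


def flatten(
    obj: dict[str, Any],
    prefix: str = "",
    depth: Optional[int] = None,
    add_prefix: bool = True,
) -> dict[str, Any]:
    out: dict[str, Any] = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if (add_prefix and prefix) else key
        can_descend = depth is None or depth > 0
        # Only dicts are descended into; lists count as leaf values.
        if isinstance(value, dict) and can_descend:
            remaining = None if depth is None else depth - 1
            out.update(flatten(value, name, remaining, add_prefix))
        else:
            out[name] = value
    return out


def flatten_fields(
    record: dict[str, Any],
    fields: Sequence[str] = (),
    depth: Optional[int] = None,
    add_prefix: bool = True,
) -> dict[str, Any]:
    if not fields:
        return flatten(record, depth=depth, add_prefix=add_prefix)
    out: dict[str, Any] = {}
    for key, value in record.items():
        if key in fields and isinstance(value, dict):
            out.update(flatten({key: value}, depth=depth, add_prefix=add_prefix))
        else:
            out[key] = value  # fields that are not named pass through unchanged
    return out


flat = flatten_fields(
    {"choices": {"label": "foo", "metadata": {"value": 42}}, "scores": {"top1": 0.9}},
    fields=["choices"],
    depth=1,
)
```

With depth=None the same call would also expand "choices.metadata" into "choices.metadata.value".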

class dyff.schema.adapters.HTTPData(content_type, data)

Bases: NamedTuple

content_type: str

Alias for field number 0

data: Any

Alias for field number 1

class dyff.schema.adapters.Map(configuration: dict)

Bases: object

For each input item, map another Adapter over the elements of each of the named nested collections within that item.

The configuration is a dictionary:

{
    "collections": list[str],
    "adapter": {
        "kind": <AdapterType>,
        "configuration": <AdapterConfigurationDictionary>
    }
}

class dyff.schema.adapters.Pipeline(adapters: list[Adapter])

Bases: object

Apply multiple adapters in sequence.
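As a rough illustration of sequential application, the sketch below chains stream transformations lazily. It assumes that an adapter can be modeled as a callable from a stream of JSON objects to a stream of JSON objects; the actual call signature of Adapter is not shown in this reference, and select_fields, drop_fields, and run_pipeline are hypothetical stand-ins rather than library functions.

```python
from typing import Any, Callable, Iterable, Iterator

Record = dict[str, Any]
# Assumed adapter shape: a callable that maps a stream to a stream.
StreamAdapter = Callable[[Iterable[Record]], Iterable[Record]]


def select_fields(fields: list[str]) -> StreamAdapter:
    def adapt(stream: Iterable[Record]) -> Iterator[Record]:
        for record in stream:
            yield {k: v for k, v in record.items() if k in fields}
    return adapt


def drop_fields(fields: list[str]) -> StreamAdapter:
    def adapt(stream: Iterable[Record]) -> Iterator[Record]:
        for record in stream:
            yield {k: v for k, v in record.items() if k not in fields}
    return adapt


def run_pipeline(
    adapters: list[StreamAdapter], stream: Iterable[Record]
) -> Iterator[Record]:
    for adapter in adapters:
        stream = adapter(stream)  # each stage lazily consumes the previous one
    yield from stream


steps = [select_fields(["id", "text", "debug"]), drop_fields(["debug"])]
source = [{"id": 7, "text": "hi", "debug": 1, "extra": None}]
result = list(run_pipeline(steps, source))
```

Because every stage is a generator, no intermediate list is built between stages; records flow through the whole chain one at a time.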

class dyff.schema.adapters.Rename(configuration: dict)

Bases: object

Rename top-level fields in each JSON object.

The configuration is a dictionary {old_name: new_name}.
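A minimal sketch of the renaming step on a single object follows. The helper rename_top_level is hypothetical, and leaving fields that are absent from the mapping unchanged is an assumption.

```python
from typing import Any


def rename_top_level(record: dict[str, Any], mapping: dict[str, str]) -> dict[str, Any]:
    # Fields absent from the mapping keep their names (an assumption).
    return {mapping.get(key, key): value for key, value in record.items()}


renamed = rename_top_level({"text": "hi", "id": 7}, {"text": "prompt"})
```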

class dyff.schema.adapters.Select(configuration: dict)

Bases: object

Select named top-level fields and drop the others.

The configuration is a dictionary:

{
    "fields": list[str]
}

class dyff.schema.adapters.TransformJSON(configuration: dict)

Bases: object

Transform an input JSON structure by creating a new output JSON structure where all of the “leaf” values are populated by either:

  1. A provided JSON literal value, or

  2. The result of a jsonpath query on the input structure.

For example, if the output_structure parameter is:

{
    "id": "$.object.id",
    "name": "literal",
    "children": {"left": "$.list[0]", "right": "$.list[1]"}
}

and the data is:

{
    "object": {"id": 42, "name": "spam"},
    "list": [1, 2]
}

Then applying the transformer to the data will result in the new structure:

{
    "id": 42,
    "name": "literal",
    "children": {"left": 1, "right": 2}
}

A value is interpreted as a jsonpath query if it is a string that starts with the ‘$’ character. If you need a literal string that starts with the ‘$’ character, escape it with a second ‘$’, e.g., “$$PATH” will appear as the literal string “$PATH” in the output.

All of the jsonpath queries must return exactly one value when executed against each input item. If not, a ValueError will be raised.
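The rules above (literal values, queries, "$$" escaping, and the exactly-one-value requirement) can be sketched without a JSONPath library. The resolver below is a toy that understands only ".name" and "[index]" steps, so it is a stand-in for real JSONPath, not a substitute; the names resolve and transform are hypothetical, and treating only dicts as nested output structure is an assumption.

```python
import re
from typing import Any

_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def resolve(path: str, data: Any) -> Any:
    """Toy subset of JSONPath: '$' followed by .name and [index] steps."""
    current = data
    pos = 1  # skip the leading '$'
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path: {path!r}")
        name, idx = match.groups()
        try:
            current = current[name] if name is not None else current[int(idx)]
        except (KeyError, IndexError, TypeError):
            raise ValueError(f"{path!r} did not match exactly one value") from None
        pos = match.end()
    return current


def transform(template: Any, data: Any) -> Any:
    # Only dicts are treated as nested structure in this sketch.
    if isinstance(template, dict):
        return {key: transform(value, data) for key, value in template.items()}
    if isinstance(template, str) and template.startswith("$"):
        if template.startswith("$$"):
            return template[1:]  # "$$PATH" becomes the literal "$PATH"
        return resolve(template, data)
    return template  # any other value is a JSON literal


output_structure = {
    "id": "$.object.id",
    "name": "literal",
    "children": {"left": "$.list[0]", "right": "$.list[1]"},
}
data = {"object": {"id": 42, "name": "spam"}, "list": [1, 2]}
transformed = transform(output_structure, data)
```

A query that matches nothing, such as "$.missing" against this data, raises ValueError in the sketch, in line with the exactly-one-value rule.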

dyff.schema.adapters.create_adapter(adapter_spec: SchemaAdapter | dict) → Adapter
dyff.schema.adapters.create_pipeline(adapter_specs: Iterable[SchemaAdapter | dict]) → Pipeline
dyff.schema.adapters.flatten_object(obj: dict, *, max_depth: int | None = None, add_prefix: bool = True) → dict

Flatten a JSON object by creating a new object with a key for each “leaf” value in the input. If add_prefix is True, the key will be equal to the “path” string of the leaf, i.e., “obj.field.subfield”; otherwise, it will be just “subfield”.

Nested lists are considered “leaf” values, even if they contain objects.

dyff.schema.adapters.known_adapters() → dict[str, Type[Adapter]]
dyff.schema.adapters.map_structure(fn, data)

Given a JSON data structure data, create a new data structure instance with the same shape as data by applying fn to each “leaf” value in the nested data structure.
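The idea of a shape-preserving map can be sketched as a short recursive function. This is an illustration only: map_leaves is a hypothetical name, and descending into both dicts and lists is an assumption about what counts as a non-leaf here.

```python
from typing import Any, Callable


def map_leaves(fn: Callable[[Any], Any], data: Any) -> Any:
    """Return a new structure shaped like data with fn applied to each leaf."""
    if isinstance(data, dict):
        return {key: map_leaves(fn, value) for key, value in data.items()}
    if isinstance(data, list):
        return [map_leaves(fn, value) for value in data]
    return fn(data)  # anything that is not a dict or list is a leaf


original = {"a": 1, "b": [2, {"c": 3}]}
doubled = map_leaves(lambda x: x * 2, original)
```

The input is left untouched; a new structure is built on the way back up the recursion.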