Create a safety case¶
A safety case is a document intended for human readers that compiles and
presents the evidence for the safety (or lack of safety) of a single AI system
in a particular use case and context. Safety cases are modeled by SafetyCase
resources. They are rendered as HTML documents containing text, tables, charts,
and other graphics.
Implementation: Jupyter notebook¶
Safety cases are generated using an analysis workflow that is very similar to the workflow for generating Measurements. The key difference is that Methods that generate safety cases are implemented as Jupyter notebooks. The Dyff Platform runs the Jupyter notebook, renders the output cells as HTML, and serves the generated HTML at a specific route.
Conceptually, a notebook that generates a safety case is very similar to a Python function that generates a measurement. You can think of the notebook as a function that maps input datasets to an HTML document. Unlike Python functions, however, notebooks can’t accept arguments directly, so we need a different mechanism to pass data into the notebook.
Warning
Any output generated by your notebook will be visible to anyone who has permission to view the corresponding safety cases. Be careful not to leak sensitive information such as PII or private labels in the output.
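One way to lower the risk of leaking row-level data is to reduce the inputs to aggregates before anything is displayed. The sketch below is illustrative only: the record layout, the field names, and the `summarize_by_category` helper are hypothetical and are not part of Dyff.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def summarize_by_category(
    records: Iterable[Mapping[str, object]], category_field: str = "category"
) -> Dict[str, int]:
    """Count records per category, discarding every other field.

    Hypothetical helper; the field names are assumptions for illustration.
    """
    counts = Counter(str(record[category_field]) for record in records)
    return dict(sorted(counts.items()))


# Made-up records standing in for rows of an input dataset
records = [
    {"category": "nuts", "prompt": "private text A", "label": "yes"},
    {"category": "dairy", "prompt": "private text B", "label": "no"},
    {"category": "nuts", "prompt": "private text C", "label": "no"},
]

summary = summarize_by_category(records)
# Only the per-category counts reach the output cell, not the raw rows
print(summary)
```

Displaying `summary` rather than `records` keeps prompts and labels out of the rendered HTML. Very small groups can still identify individuals, so aggregates deserve a review as well.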
The AnalysisContext¶
The first thing you must do in your notebook is to instantiate the dyff.audit.analysis.AnalysisContext class:
from dyff.audit.analysis import AnalysisContext
ctx = AnalysisContext()
The context instance allows you to access the inputs to the notebook:
import pyarrow.dataset

# Access arguments
category: str = ctx.get_argument("category")
temperature: float = float(ctx.get_argument("temperature"))
# Access input data
dataset: pyarrow.dataset.Dataset = ctx.open_input_dataset("dataset")
The available arguments and input datasets are defined in the Method specification resource associated with the notebook. When you create a safety case resource that references this method, the Dyff Platform binds the specified values and data inputs to the specified names.
You can now proceed with all of the usual Jupyter notebook activities, such as manipulating data, embedding charts, and creating formatted text.
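The `float(...)` conversion in the snippet above suggests that argument values arrive as strings, so it can help to convert and validate each one near the top of the notebook. The following is a minimal sketch under that assumption. `StubContext` is a hypothetical stand-in for `AnalysisContext` so that the example runs outside the platform, and `get_float_argument` is not a Dyff function.

```python
from typing import Dict


class StubContext:
    """Hypothetical stand-in for AnalysisContext, for local experimentation."""

    def __init__(self, arguments: Dict[str, str]) -> None:
        self._arguments = arguments

    def get_argument(self, keyword: str) -> str:
        return self._arguments[keyword]


def get_float_argument(ctx, keyword: str, minimum: float, maximum: float) -> float:
    """Read a string argument, convert it to float, and check its range."""
    raw = ctx.get_argument(keyword)
    try:
        value = float(raw)
    except ValueError as ex:
        raise ValueError(f"argument '{keyword}' is not a number: {raw!r}") from ex
    # NaN fails this comparison, so it is rejected as well
    if not (minimum <= value <= maximum):
        raise ValueError(
            f"argument '{keyword}' must be in [{minimum}, {maximum}], got {value}"
        )
    return value


ctx = StubContext({"category": "cookies", "temperature": "0.7"})
temperature = get_float_argument(ctx, "temperature", minimum=0.0, maximum=2.0)
print(temperature)
```

Failing early with a clear message is usually easier to diagnose than a conversion error in a later cell.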
Styling the notebook¶
Dyff provides a few basic display “widgets” to help you create reports that communicate your main conclusions effectively. These are implemented as methods on the AnalysisContext object that you can call in your notebooks:
ctx = AnalysisContext()
# Use this at the top of the notebook to generate a "title" section
ctx.TitleCard(
    headline="System gives inaccurate responses about cookies",
    author="Flancrest Enterprises",
    summary_phrase="Often inaccurate",
    summary_text="When answering multiple-choice questions about cookie ingredients.",
)
# Conclusions call out specific "take-away" messages
ctx.Conclusion(text="People with allergies should be careful", indicator="Hazard")
# Scores call out numeric quantities
ctx.Score(text="Error rate", quantity=20, unit="%")
The information you provide will be rendered in the notebook using display templates. The Score widget can also be used to integrate scores into the Dyff App.
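The `quantity` passed to `Score` would normally be computed earlier in the notebook rather than hard-coded. As a rough sketch, assuming a list of per-item correctness flags (made-up data here), an error rate in percent could be derived as follows. The `error_rate_percent` helper is hypothetical; only the keyword names `text`, `quantity`, and `unit` come from the example above.

```python
from typing import Sequence


def error_rate_percent(correct: Sequence[bool]) -> float:
    """Percentage of items marked incorrect, rounded to one decimal place."""
    if not correct:
        raise ValueError("cannot compute an error rate from zero items")
    errors = sum(1 for is_correct in correct if not is_correct)
    return round(100.0 * errors / len(correct), 1)


# Made-up results: one wrong answer out of five
correct = [True, True, False, True, True]
score_kwargs = {
    "text": "Error rate",
    "quantity": error_rate_percent(correct),
    "unit": "%",
}
print(score_kwargs)
# In a notebook, these would be passed on as: ctx.Score(**score_kwargs)
```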
Deploying and running the notebook¶
The process of deploying a notebook is just like the process of deploying an analysis implemented as a Python function. You need to create three resources:
1. A Module containing the notebook code (the .ipynb file).
2. A Method that describes the method and its inputs and outputs, and references the Module from step (1).
3. A SafetyCase that references the Method from step (2) and specifies the IDs of specific resources to pass as inputs.
Create a Module¶
Assuming you’ve implemented your notebook in a file called my-notebook.ipynb in the directory /home/me/dyff/my-notebook, you would create and upload the package like this:
# SPDX-FileCopyrightText: 2024 UL Research Institutes
# SPDX-License-Identifier: Apache-2.0

from __future__ import annotations

from pathlib import Path

from dyff.audit.local import DyffLocalPlatform
from dyff.schema.platform import *
from dyff.schema.requests import *

ACCOUNT: str = ...
ROOT_DIR: Path = Path("/home/me/dyff")

# Develop using the local platform
dyffapi = DyffLocalPlatform(
    storage_root=ROOT_DIR / ".dyff-local",
)
# When you're ready, switch to the remote platform:
# dyffapi = Client(...)

module_root = str(ROOT_DIR / "my-notebook")
module = dyffapi.modules.create_package(
    module_root,
    account=ACCOUNT,
    name="my-notebook",
)
dyffapi.modules.upload_package(module, module_root)
print(module.json(indent=2))
Create a Method¶
The Method resource specifies the inputs and outputs:
method_request = MethodCreateRequest(
    name="mean-word-length-notebook",
    # The notebook analyzes multiple measurements of the same system
    scope=MethodScope.InferenceService,
    description="Visualizes the mean word length in prompts and system completions.",
    # The method is implemented as a Jupyter notebook
    implementation=MethodImplementation(
        kind=MethodImplementationKind.JupyterNotebook,
        jupyterNotebook=MethodImplementationJupyterNotebook(
            notebookModule=module.id,
            # The path to the notebook file, relative to the module root directory
            notebookPath="my-notebook.ipynb",
        ),
    ),
    # The method accepts one argument called 'threshold'
    parameters=[
        MethodParameter(keyword="threshold", description="(float) A numeric threshold"),
    ],
    # The method accepts two PyArrow datasets as inputs:
    # - The one called 'easy' is a Measurement on the "easy" dataset
    # - The one called 'hard' is a Measurement on the "hard" dataset
    inputs=[
        MethodInput(kind=MethodInputKind.Measurement, keyword="easy"),
        MethodInput(kind=MethodInputKind.Measurement, keyword="hard"),
    ],
    # The method produces a SafetyCase
    output=MethodOutput(
        kind=MethodOutputKind.SafetyCase,
        safetyCase=SafetyCaseSpec(
            name="mean-word-length-safetycase",
            description="Visualizes the mean word length in prompts and system completions.",
        ),
    ),
    # The Module containing the notebook code
    modules=[module.id],
    account=ACCOUNT,
)
method = dyffapi.methods.create(method_request)
print(method.json(indent=2))
For this notebook, both of the inputs are Measurements; they could also be
Datasets or Evaluations. Here, we want to analyze two (hypothetical)
Measurements computed using the same Method but applied to two different input
datasets, called easy and hard.
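Inside the notebook, the two inputs would be opened by their keywords (easy and hard) and then compared. The sketch below shows the kind of comparison such a notebook might make. It uses plain Python lists as stand-ins for the opened datasets; the sample texts and the `mean_word_length` helper are made up for illustration and do not reflect the actual schema of the Measurements.

```python
from typing import Iterable


def mean_word_length(texts: Iterable[str]) -> float:
    """Average number of characters per whitespace-separated word."""
    lengths = [len(word) for text in texts for word in text.split()]
    if not lengths:
        raise ValueError("no words found in the input texts")
    return sum(lengths) / len(lengths)


# Stand-ins for the data behind the 'easy' and 'hard' inputs
inputs = {
    "easy": ["the cat sat", "a dog ran"],
    "hard": ["quantum entanglement persists"],
}

results = {keyword: mean_word_length(texts) for keyword, texts in inputs.items()}
for keyword, value in results.items():
    print(f"{keyword}: {value:.2f}")
```

Keying the results by the same keywords that the Method declares keeps the notebook code aligned with the Method specification.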
Create a SafetyCase¶
The SafetyCase resource represents the computational work needed to run your notebook on specific inputs. You use the same AnalysisCreateRequest class that is used when creating Measurements:
easy_measurement_id: str = ...
hard_measurement_id: str = ...
analysis_request = AnalysisCreateRequest(
    account=ACCOUNT,
    method=method.id,
    arguments=[
        AnalysisArgument(keyword="threshold", value="1.0"),
    ],
    inputs=[
        AnalysisInput(keyword="easy", entity=easy_measurement_id),
        AnalysisInput(keyword="hard", entity=hard_measurement_id),
    ],
)
safetycase = dyffapi.safetycases.create(analysis_request)
print(safetycase.json(indent=2))
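Because the request binds values to the keywords declared in the Method, a misspelled or missing keyword is an easy mistake to make. The sketch below checks the keywords locally before the request is submitted, on the assumption that every declared keyword must be bound exactly once. It works on plain strings rather than Dyff schema objects, and `check_bindings` is a hypothetical helper, not part of the Dyff client.

```python
from typing import Iterable, List


def check_bindings(declared: Iterable[str], provided: Iterable[str]) -> List[str]:
    """Return a list of problems; an empty list means the keywords match."""
    declared_set = set(declared)
    provided_list = list(provided)
    provided_set = set(provided_list)
    problems = []
    for keyword in sorted(declared_set - provided_set):
        problems.append(f"missing: {keyword}")
    for keyword in sorted(provided_set - declared_set):
        problems.append(f"unexpected: {keyword}")
    for keyword in sorted(k for k in provided_set if provided_list.count(k) > 1):
        problems.append(f"duplicate: {keyword}")
    return problems


# Keywords declared by the Method: one parameter and two inputs
method_keywords = ["threshold", "easy", "hard"]

ok = check_bindings(method_keywords, ["threshold", "easy", "hard"])
bad = check_bindings(method_keywords, ["threshold", "easy", "esay"])
print(ok)
print(bad)
```

With the schema objects from the examples above, the keyword lists could be gathered from the `keyword` fields of the parameters, inputs, and arguments.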
Full Example¶
# SPDX-FileCopyrightText: 2024 UL Research Institutes
# SPDX-License-Identifier: Apache-2.0

from __future__ import annotations

from pathlib import Path

from dyff.audit.local import DyffLocalPlatform
from dyff.schema.platform import *
from dyff.schema.requests import *

ACCOUNT: str = ...
ROOT_DIR: Path = Path("/home/me/dyff")

# Develop using the local platform
dyffapi = DyffLocalPlatform(
    storage_root=ROOT_DIR / ".dyff-local",
)
# When you're ready, switch to the remote platform:
# dyffapi = Client(...)

module_root = str(ROOT_DIR / "my-notebook")
module = dyffapi.modules.create_package(
    module_root,
    account=ACCOUNT,
    name="my-notebook",
)
dyffapi.modules.upload_package(module, module_root)
print(module.json(indent=2))

method_request = MethodCreateRequest(
    name="mean-word-length-notebook",
    # The notebook analyzes multiple measurements of the same system
    scope=MethodScope.InferenceService,
    description="Visualizes the mean word length in prompts and system completions.",
    # The method is implemented as a Jupyter notebook
    implementation=MethodImplementation(
        kind=MethodImplementationKind.JupyterNotebook,
        jupyterNotebook=MethodImplementationJupyterNotebook(
            notebookModule=module.id,
            # The path to the notebook file, relative to the module root directory
            notebookPath="my-notebook.ipynb",
        ),
    ),
    # The method accepts one argument called 'threshold'
    parameters=[
        MethodParameter(keyword="threshold", description="(float) A numeric threshold"),
    ],
    # The method accepts two PyArrow datasets as inputs:
    # - The one called 'easy' is a Measurement on the "easy" dataset
    # - The one called 'hard' is a Measurement on the "hard" dataset
    inputs=[
        MethodInput(kind=MethodInputKind.Measurement, keyword="easy"),
        MethodInput(kind=MethodInputKind.Measurement, keyword="hard"),
    ],
    # The method produces a SafetyCase
    output=MethodOutput(
        kind=MethodOutputKind.SafetyCase,
        safetyCase=SafetyCaseSpec(
            name="mean-word-length-safetycase",
            description="Visualizes the mean word length in prompts and system completions.",
        ),
    ),
    # The Module containing the notebook code
    modules=[module.id],
    account=ACCOUNT,
)
method = dyffapi.methods.create(method_request)
print(method.json(indent=2))

easy_measurement_id: str = ...
hard_measurement_id: str = ...
analysis_request = AnalysisCreateRequest(
    account=ACCOUNT,
    method=method.id,
    arguments=[
        AnalysisArgument(keyword="threshold", value="1.0"),
    ],
    inputs=[
        AnalysisInput(keyword="easy", entity=easy_measurement_id),
        AnalysisInput(keyword="hard", entity=hard_measurement_id),
    ],
)
safetycase = dyffapi.safetycases.create(analysis_request)
print(safetycase.json(indent=2))