Integrating with the Dyff Web App
The Dyff Platform includes a Web app in addition to the API and backend services. The Dyff App is designed to let non-expert AI stakeholders explore the test results hosted on the platform and gain insights into the risks of using AI systems for their use cases.
In the Dyff instance operated by DSRI, the Web app is hosted at app.dyff.io.
Publish your analysis
To make your analysis available in the Dyff App, you must publish the relevant resources:
dyffapi.methods.publish("<method ID>", "public")
dyffapi.safetycases.publish("<safetycase ID>", "public")
After doing so, anonymous users can view the safety case report at https://app.dyff.io/reports/<safetycase ID> and can view the method at https://app.dyff.io/tests/<method ID>.
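The two URL patterns above can be wrapped in small helpers, for example to print shareable links right after publishing. This is only a sketch: `report_url` and `method_url` are hypothetical names, not part of the Dyff client, and the default base URL assumes the DSRI-operated instance.

```python
# Hypothetical helpers (not part of the Dyff client) that build Dyff App links.
# The default base URL assumes the DSRI-operated instance at app.dyff.io.
APP_BASE_URL = "https://app.dyff.io"


def report_url(safetycase_id: str, base_url: str = APP_BASE_URL) -> str:
    """Return the Dyff App URL of a published safety case report."""
    return f"{base_url.rstrip('/')}/reports/{safetycase_id}"


def method_url(method_id: str, base_url: str = APP_BASE_URL) -> str:
    """Return the Dyff App URL of a published method (shown under /tests/)."""
    return f"{base_url.rstrip('/')}/tests/{method_id}"
```

Passing a different `base_url` would cover a self-hosted Dyff instance, assuming it uses the same path layout.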
Note
Resources have different names in the Dyff App to make it easier for non-expert users to understand what they’re looking at.
You can also publish in “preview” mode:
dyffapi.safetycases.publish("<safetycase ID>", "preview")
In preview mode, the resource will be visible as though it were published publicly, but only to authenticated users who already have permission to view the resource. This way, you can see how your results will look in the Dyff App before making them public. You can also un-publish a resource by setting access to "private".
Warning
Make sure your analysis doesn’t output sensitive data before publishing it. Assume that anything published publicly is immediately cached forever on the public Web. Use “preview” mode to verify that what you’re releasing is correct.
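One way to make the preview step hard to skip is to route publishing through a small wrapper. The sketch below is an assumption about workflow, not a Dyff feature: `publish_staged` and `looks_correct` are hypothetical names, and `publish` stands for any callable with the same `(resource_id, access)` shape as `dyffapi.safetycases.publish`.

```python
from typing import Callable


# Hypothetical helper (not part of the Dyff client).
def publish_staged(
    publish: Callable[[str, str], object],
    resource_id: str,
    looks_correct: Callable[[str], bool],
) -> str:
    """Publish in preview mode first; go public only if the check passes.

    Returns the access level the resource ends up with.
    """
    publish(resource_id, "preview")
    if looks_correct(resource_id):
        publish(resource_id, "public")
        return "public"
    # The check failed: un-publish so the resource is no longer shown in the app.
    publish(resource_id, "private")
    return "private"
```

In practice `looks_correct` could be as simple as a prompt asking a human to confirm after inspecting the preview. Rolling back to "private" on failure is a design choice; leaving the resource in preview while it is fixed would work as well.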
Document your analysis resources
Most Dyff core resources have associated editable documentation. The documentation is used to populate various text fields in the Dyff App. You should write documentation for all of the resources that you plan to publish. If documentation is not provided, the UI will fall back to other resource fields when it needs text to display, but these usually will not be as user-friendly.
Editable documentation for all entities follows the same format:
- pydantic model dyff.schema.platform.DocumentationBase
  - field fullPage: str | None = None
    Long-form documentation. Interpreted as Markdown. There are no length constraints, but be reasonable.
  - field summary: str | None = None
    A brief summary, suitable for display in small UI elements. Interpreted as Markdown. Excessively long summaries may be truncated in the UI, especially on small displays.
  - field title: str | None = None
    A short plain string suitable as a title or “headline”.
All of the fields are optional, but you should specify at least title and summary. The fullPage docs are interpreted as Markdown and can be as long as you want.
Use the edit_documentation() functions to add documentation to a resource:
dyffapi.datasets.edit_documentation(
    "dataset-id",
    DocumentationEditRequest(
        title="My dataset",
        summary="A dataset that I created.",
        # Setting a field to None explicitly deletes the corresponding docs
        fullPage=None,
    ),
)
Use the documentation() functions to view the current documentation:
dyffapi.datasets.documentation("dataset-id")
Warning
Currently, there is no protection from concurrent modifications to docs, and no ability to restore a previous version of docs. It is strongly recommended that you save documentation text in your version control system and then call the Dyff API with the contents of your versioned files.
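A minimal sketch of the version-control workflow recommended above might look like the following. The file layout (`title.txt`, `summary.md`, `fullPage.md` in one directory per resource) and the `load_documentation` helper are assumptions made for illustration, not a Dyff convention.

```python
from pathlib import Path

# Hypothetical file layout (an assumption, not a Dyff convention):
#   docs/<resource-id>/title.txt, summary.md, fullPage.md
FIELD_FILES = {
    "title": "title.txt",
    "summary": "summary.md",
    "fullPage": "fullPage.md",
}


def load_documentation(doc_dir: Path) -> dict[str, str]:
    """Read documentation fields from versioned files.

    Missing files are skipped rather than mapped to None, because setting a
    field to None explicitly deletes the corresponding docs.
    """
    fields: dict[str, str] = {}
    for field, filename in FIELD_FILES.items():
        path = doc_dir / filename
        if path.is_file():
            fields[field] = path.read_text(encoding="utf-8").strip()
    return fields


# Possible usage with the request type shown earlier:
#   docs = load_documentation(Path("docs/dataset-id"))
#   dyffapi.datasets.edit_documentation("dataset-id", DocumentationEditRequest(**docs))
```

Keeping each field in its own file lets ordinary diffs and code review stand in for the version history that the API does not provide.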
Outputting Scores
Scores are “named numbers” output by an analysis notebook. They are used by the Dyff App to generate summary and comparison visualizations. You get these visualizations for free simply by outputting a score.
Scores must be declared as part of the Method specification if you want them to appear in the Dyff App:
method_request = MethodCreateRequest(
    ...,
    scores=[
        ScoreSpec(
            name="error_rate",
            title="Error rate",
            summary="Percentage of inputs for which the system gave an incorrect response",
            minimum=0,
            maximum=100,
            unit="%",
            # Lower is better
            valence="negative",
        ),
        ScoreSpec(
            name="longest_response",
            title="Longest response",
            summary="The length of the system's longest response",
            minimum=0,
            # Only 1 score can be "primary"
            priority="secondary",
        ),
        ...,
    ],
    ...,
)
One score must be the "primary" score. This is the score that is displayed in contexts where there’s only room for one score. All other scores must have "secondary" priority. Use the valence property to specify whether higher or lower scores are “better”; the default is "neutral". There are also properties that control how the score is rendered as a string.
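The priority rule can be checked locally before a Method is created. The sketch below uses a stand-in dataclass, `ScoreDecl`, rather than the real `ScoreSpec`; both it and `check_score_priorities` are hypothetical, and the `"primary"` default is an assumption inferred from the example above, where the first score omits `priority`.

```python
from dataclasses import dataclass


# Stand-in for the two ScoreSpec fields used here; not the real Dyff class.
@dataclass
class ScoreDecl:
    name: str
    # Assumption: priority defaults to "primary", as the example above suggests.
    priority: str = "primary"


def check_score_priorities(scores: list[ScoreDecl]) -> list[str]:
    """Return a list of problems; an empty list means the priorities are valid."""
    problems = []
    for score in scores:
        if score.priority not in ("primary", "secondary"):
            problems.append(f"{score.name}: unknown priority {score.priority!r}")
    primary = [s.name for s in scores if s.priority == "primary"]
    if len(primary) != 1:
        problems.append(
            f"expected exactly 1 primary score, found {len(primary)}: {primary}"
        )
    return problems
```

Returning a list of messages instead of raising on the first problem makes it easier to report every issue in one pass.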
You output the score value by setting the output= argument when calling the AnalysisContext.Score() method:
ctx = AnalysisContext()
...
# output= must match the name of the score declared in the Method spec
ctx.Score(output="error_rate", text="Not great, not terrible", quantity=20)
# This is output for the Dyff App, but not displayed in the notebook
ctx.Score(
    output="longest_response", text="Pretty long, eh?", quantity=123, display=False
)
# This is displayed in the notebook (in scientific notation) but not output
# for the Dyff App; notice that this score is not declared in the spec
ctx.Score(text="Not output", quantity=42000.0, format="{quantity:.2e}")
The score appears in a display widget in the report unless display=False. The score is output to the Dyff App for use in visualizations if output=<score name> is specified. Your notebook must output a value for all declared scores; not doing so is an error.
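Because a missing score is an error, it can help to compare the declared names against the names actually passed as `output=` before submitting a notebook. The following is a minimal sketch under that assumption; `check_outputs` is a hypothetical helper, and how you collect the emitted names is left to your notebook.

```python
from typing import Iterable


# Hypothetical helper (not part of Dyff): compares declared and emitted names.
def check_outputs(
    declared: Iterable[str], emitted: Iterable[str]
) -> tuple[list[str], list[str]]:
    """Return (missing, undeclared) score names, each sorted alphabetically.

    missing:    declared in the Method spec but never output by the notebook
    undeclared: passed as output= but not declared in the Method spec
    """
    declared_set, emitted_set = set(declared), set(emitted)
    missing = sorted(declared_set - emitted_set)
    undeclared = sorted(emitted_set - declared_set)
    return missing, undeclared
```

For the Method declared earlier, calling `check_outputs(["error_rate", "longest_response"], ["error_rate"])` would report `longest_response` as missing.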