Integrating with the Dyff Web App¶
The Dyff Platform includes a Web app in addition to the API and backend services. The Dyff App is designed to let non-expert AI stakeholders explore the test results hosted on the platform and gain insights into the risks of using AI systems for their use cases.
In the Dyff instance operated by DSRI, the Web app is hosted at app.dyff.io.
Publish your analysis¶
To make your analysis available in the Dyff App, you must publish the relevant resources:
# Make the Method and the SafetyCase visible to everyone
dyffapi.methods.publish("<method ID>", "public")
dyffapi.safetycases.publish("<safetycase ID>", "public")
After doing so, anonymous users can view the safety case report at https://app.dyff.io/reports/<safetycase ID> and the method at https://app.dyff.io/tests/<method ID>.
Note
Resources have different names in the Dyff App to make it easier for non-expert users to understand what they’re looking at. As the URLs above reflect, a SafetyCase appears as a “report” and a Method appears as a “test”.
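For convenience, you can publish both resources and print the resulting App URLs in one step. A minimal sketch, assuming the resource IDs are held in plain string variables (the variable names here are illustrative):

# Substitute the IDs of your own resources
method_id = "<method ID>"
safetycase_id = "<safetycase ID>"

# Publish both resources, then print the URLs where they will appear
dyffapi.methods.publish(method_id, "public")
dyffapi.safetycases.publish(safetycase_id, "public")
print(f"https://app.dyff.io/tests/{method_id}")
print(f"https://app.dyff.io/reports/{safetycase_id}")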
You can also publish in “preview” mode:
dyffapi.safetycases.publish("<safetycase ID>", "preview")
In preview mode, the resource is visible as though it were published publicly, but only to authenticated users who already have permission to view it. This lets you see how your results will look in the Dyff App before making them public. You can also un-publish a resource by setting its access to "private".
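A typical workflow is therefore to stage the report in preview mode, check how it renders while signed in, and only then make it public (or pull it back to private). A sketch using the same publish call as above:

safetycase_id = "<safetycase ID>"

# Stage the report; only signed-in users who can already view the
# resource will see it in the App
dyffapi.safetycases.publish(safetycase_id, "preview")

# ... review https://app.dyff.io/reports/<safetycase ID> in the App ...

# Happy with it? Make it public. Found a problem? Un-publish it.
dyffapi.safetycases.publish(safetycase_id, "public")
# dyffapi.safetycases.publish(safetycase_id, "private")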
Warning
Before publishing your analysis, make sure it doesn’t output sensitive data. Assume that anything published publicly is immediately cached forever on the public Web. Use “preview” mode to verify that what you’re releasing is correct.
Outputting Scores¶
Scores are “named numbers” output by the notebook. They are used by the Dyff App to generate summary and comparison visualizations. You get these visualizations for free simply by outputting a Score.
Scores must be declared as part of the Method specification if you want them to appear in the Dyff App:
method_request = MethodCreateRequest(
    ...,
    scores=[
        ScoreSpec(
            name="error_rate",
            title="Error rate",
            summary="Percentage of inputs for which the system gave an incorrect response",
            minimum=0,
            maximum=100,
            unit="%",
            # Lower is better
            valence="negative",
        ),
        ScoreSpec(
            name="longest_response",
            title="Longest response",
            summary="The length of the system's longest response",
            minimum=0,
            # Only 1 score can be "primary"
            priority="secondary",
        ),
        ...,
    ],
    ...,
)
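Once the request is constructed, you submit it through the client to obtain the Method and its ID. The create call below is an assumption based on the client’s resource-oriented naming (mirroring dyffapi.methods.publish above); verify it against your client version:

# Assumed call; check your version of the Dyff client
method = dyffapi.methods.create(method_request)
print(method.id)  # the <method ID> used when publishing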
One score must be the "primary" score; this is the score displayed in contexts where there’s only room for one. All other scores must have "secondary" priority. Use the valence property to specify whether higher or lower scores are “better”; the default is "neutral". There are also properties that control how the score is rendered as a string.
You output the score value by setting the output= argument when calling the AnalysisContext.Score() method:
ctx = AnalysisContext()
...
# output= must match the name of the score declared in the Method spec
ctx.Score(output="error_rate", text="Not great, not terrible", quantity=20)
# This is output for the Dyff App, but not displayed in the notebook
ctx.Score(
    output="longest_response", text="Pretty long, eh?", quantity=123, display=False
)
# This is displayed in the notebook (in scientific notation) but not output
# for the Dyff App; notice that this score is not declared in the spec
ctx.Score(text="Not output", quantity=42000.0, format="{quantity:.2e}")
The score appears in a display widget in the report unless display=False. The score is output to the Dyff App for use in visualizations if output=<score name> is specified. Your notebook must output a value for all declared scores; not doing so is an error.
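Putting this together, a notebook cell might derive the declared scores from its analysis results and emit one Score per declared name. The responses data and the metric computations below are illustrative; only AnalysisContext and Score come from the API described above:

ctx = AnalysisContext()

# Illustrative stand-in for real analysis results: (is_correct, response_text)
responses = [(True, "yes"), (False, "maybe"), (True, "no")]

# Compute one value for each score declared in the Method spec
error_rate = 100 * sum(1 for ok, _ in responses if not ok) / len(responses)
longest = max(len(text) for _, text in responses)

ctx.Score(
    output="error_rate",
    text=f"{error_rate:.1f}% of responses were incorrect",
    quantity=error_rate,
)
ctx.Score(
    output="longest_response",
    text=f"Longest response: {longest} characters",
    quantity=longest,
    display=False,  # used in App visualizations, hidden in the report
)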