Description
Currently, in the MetricData class, the score field is strictly typed as an Optional[float]:
class MetricData(BaseModel):
    ...
    score: Optional[float] = None
    ...
However, we have encountered use cases where the LLM returns a score as a string, such as:
"acceptable"
"ok"
"not acceptable"
"True"
These are semantically meaningful and useful in contexts where the model output isn't strictly numerical.
Why this is worth fixing:
With the current strict typing, the evaluate() method raises a TypeError when a custom metric returns a string. As a result, non-numeric model outputs cannot be used in evaluation pipelines without a workaround.
We would like to propose updating the type of score to:
score: Optional[Union[float, str]] = None
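For illustration, here is a minimal sketch of the difference, assuming plain pydantic models. StrictMetricData, FlexibleMetricData, and strict_accepts are hypothetical stand-ins written for this example, not the library's actual classes. When a model is constructed directly, pydantic reports the rejected string as a ValidationError; the exact exception surfaced by evaluate() may differ.

```python
from typing import Optional, Union

from pydantic import BaseModel, ValidationError


class StrictMetricData(BaseModel):
    # Hypothetical stand-in for the current MetricData: numeric scores only.
    score: Optional[float] = None


class FlexibleMetricData(BaseModel):
    # Hypothetical stand-in for the proposed MetricData: float or string.
    score: Optional[Union[float, str]] = None


def strict_accepts(value) -> bool:
    """Return True if the strict model validates the given score."""
    try:
        StrictMetricData(score=value)
        return True
    except ValidationError:
        return False


# The flexible model keeps floats as floats and labels as strings.
numeric = FlexibleMetricData(score=0.8)
label = FlexibleMetricData(score="acceptable")
unset = FlexibleMetricData()
```

In this sketch the strict model rejects "acceptable", while the flexible model accepts it and leaves numeric scores and the None default unchanged.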
This change would maintain backward compatibility while allowing for more flexible and descriptive scoring outputs from LLMs.
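One caveat for compatibility: any code that aggregates scores numerically would need to tolerate string values once they are allowed. A minimal sketch of one way to do that is below; mean_numeric_score and the Score alias are hypothetical names for this example and are not part of the library.

```python
from typing import Iterable, Optional, Union

# Hypothetical alias mirroring the proposed field type.
Score = Optional[Union[float, str]]


def mean_numeric_score(scores: Iterable[Score]) -> Optional[float]:
    """Average only the numeric scores, ignoring string labels and None."""
    numeric = [
        float(s)
        for s in scores
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(s, (int, float)) and not isinstance(s, bool)
    ]
    # Return None when there is nothing numeric to average.
    return sum(numeric) / len(numeric) if numeric else None
```

Existing metrics that only ever return floats would behave as before under this approach, while string scores are skipped rather than causing an error during aggregation.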
Happy to submit a PR!