feat: add support for labeled dataset to evaluate function#617
Conversation
|
The Wikipedia article linked in the |
Codecov Report
@@ Coverage Diff @@
## main #617 +/- ##
==========================================
- Coverage 85.02% 84.99% -0.03%
==========================================
Files 133 133
Lines 6718 6745 +27
==========================================
+ Hits 5712 5733 +21
- Misses 1006 1012 +6
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
| self, | ||
| other: 'DocumentArray', | ||
| metric: Union[str, Callable[..., float]], | ||
| ground_truth: Optional['DocumentArray'] = None, |
There was a problem hiding this comment.
keep other to avoid breaking change
There was a problem hiding this comment.
It is a breaking change anyway, because we need to turn it from a mandatory into an optional attribute, but I can change it back to other if you think it makes a difference. I just find this name confusing, especially as an keyword argument.
There was a problem hiding this comment.
it is not breaking, passing from mandatory to Optional will not break other's code, it just adds more options
| self, | ||
| other: 'DocumentArray', | ||
| metric: Union[str, Callable[..., float]], | ||
| ground_truth: Optional['DocumentArray'] = None, |
There was a problem hiding this comment.
please avoid breaking the interface, ground_truth can be added as an alias, u can have a deprecation decorator that has other equivalent to groundtruth
There was a problem hiding this comment.
The problem is, that the function has atm two mandatory attributes other and metric (in this order). It does not make sense to keep other mandatory if the DocumentArray and its matches itself have labels. So I have to make it optional. If I do this I can not preserve the order since metric as a mandatory attribute has to be placed before other.
There was a problem hiding this comment.
The only way how it might work is to make metric optional as well by setting a default metric. However, then this syntax will not work:
da.evaluate('precision_at_k')you always have to do:
da.evaluate(metric='precision_at_k')There was a problem hiding this comment.
other and ground_truth need to be equivalent, this 100% sure
There was a problem hiding this comment.
this can be achieved by a decorator
There was a problem hiding this comment.
why metric is precision_at_k? shouldn't be precision, k=10?
There was a problem hiding this comment.
@bwanglzu , the function in docarray.math is called precision_at_k, setting a k is possible but not mandatory since the matches usually contain only the Top-K
There was a problem hiding this comment.
Just for documentation: we agreed on checking the type of the first argument. If it is a DocumentArray we set ground_truth to metric and raise a deprecation warning. We also handle cases where someone provides a other keyword argument and raise a deprecation warning.
| on comparing the `matches` of `documents` inside the `DocumentArray. | ||
| This implementation expects the documents and their matches to have labels | ||
| annotated inside the tag with the key specified in the `label_tag` attribute. | ||
| Alternatively, one can provide a `ground_truth` DocumentArray that is |
There was a problem hiding this comment.
This in my opinion should be the principal case, and the other the alternative
There was a problem hiding this comment.
ok, yes I can change this in the docstring
| hash_fn: Optional[Callable[['Document'], str]] = None, | ||
| metric_name: Optional[str] = None, | ||
| strict: bool = True, | ||
| label_tag='label', |
There was a problem hiding this comment.
label_tag should be an Optional, plus please provide type hints
| elif label_tag in self[0].tags: | ||
| if ground_truth: | ||
| warnings.warn( | ||
| 'An ground_truth attribute is provided but does not ' |
There was a problem hiding this comment.
| 'An ground_truth attribute is provided but does not ' | |
| 'A ground_truth attribute is provided but does not ' |
| hash_fn: Optional[Callable[['Document'], str]] = None, | ||
| metric_name: Optional[str] = None, | ||
| strict: bool = True, | ||
| label_tag='label', |
There was a problem hiding this comment.
Yes and optional is also missing.
| results = [] | ||
| caller_max_rel = kwargs.pop('max_rel', None) | ||
| for d, gd in zip(self, other): | ||
| for d, gd in zip(self, ground_truth): |
There was a problem hiding this comment.
do we check the length should be identical?
There was a problem hiding this comment.
This is done if strict=True
| self, | ||
| other: 'DocumentArray', | ||
| metric: Union[str, Callable[..., float]], | ||
| ground_truth: Optional['DocumentArray'] = None, |
There was a problem hiding this comment.
why metric is precision_at_k? shouldn't be precision, k=10?
| 1 if m.tags[label_tag] == d.tags[label_tag] else 0 for m in targets | ||
| ] | ||
| else: | ||
| raise RuntimeError(f'Unsupported groundtruth type {ground_truth_type}') |
There was a problem hiding this comment.
this error message will be very confusing, user do not know what is a ground_truth_type, it's not a paramter passed from user ,but something you interpreted from self or ground_truth.
There was a problem hiding this comment.
It should be unreachable code, but probably it is better to just remove it
There was a problem hiding this comment.
I would say better to keep it but just have a better error message that does not talk about an internal object that the user is not aware of. Smth like
| raise RuntimeError(f'Unsupported groundtruth type {ground_truth_type}') | |
| raise RuntimeError(f'Something went wrong with the groudtruth') |
There was a problem hiding this comment.
Describe what can be wrong, without talking about the internal name, but something the user can relate to
d12cb03 to
15ea32d
Compare
| for d in da1_index: | ||
| d.tags = {'label': 'A'} |
There was a problem hiding this comment.
No, you are right, I think I can remove it.
| assert isinstance(r, float) | ||
| assert r == 0.0 | ||
| for d in da1: | ||
| d: Document |
There was a problem hiding this comment.
oh I don't know, I will remove it
bwanglzu
left a comment
There was a problem hiding this comment.
minor comments, make sure da team agree ground_truth as a breaking change
| raise ValueError('It is not possible to evaluate an empty DocumentArray') | ||
| if ground_truth and len(ground_truth) > 0 and ground_truth[0].matches: | ||
| ground_truth_type = 'matches' | ||
| elif label_tag in self[0].tags: |
There was a problem hiding this comment.
what happens here if label_tag is None? is label_tag really Optional?
There was a problem hiding this comment.
ah sorry, yes it should not be optional
| metric_name: Optional[str] = None, | ||
| strict: bool = True, | ||
| label_tag: Optional[str] = 'label', | ||
| label_tag: str = 'label', |
There was a problem hiding this comment.
I'd rather check what happens with None. One should be able to put None
There was a problem hiding this comment.
Ok so do you mean, that it should be optional and default to label. If a user does not want to use labels, i.e., pass a ground_truth it should be ok to pass None. However, if it is set to None and no ground_truth is passed, an exception is raised.
| for d in da1_index: | ||
| d.tags = {'label': 'A'} |
|
📝 Docs are deployed on https://ft-feat-support-labels-in-evaluate--jina-docs.netlify.app 🎉 |
Goals:
Allow the user to add labels to the documents instead of passing an addtional ground truth
DocumentArraycheck and update documentation, if required. See guide