BUG,ENH: Relax finfo to be easier accessible for all user dtypes#31563
BUG,ENH: Relax finfo to be easier accessible for all user dtypes#31563seberg wants to merge 5 commits into
Conversation
| try: | ||
| arr = numeric.empty(1, dtype=dtype) | ||
| imag_part = arr.imag | ||
| real_part = arr.real |
There was a problem hiding this comment.
To be clear, my first plan for this function was to guard it with the AbstractDType hierarchy, but since that is stalled behind heap-types, etc. and the .imag path is good and used either way, that guard felt a bit unnecessary.
(ml_dtypes cannot inherit from the scalar hierarchy easily, I think because it would mean inheriting the scalar methods and they make wrong assumptions.)
Yes, no problem, we return 1 for all six now-mandatory float constants (and the four int ones), so quaddtype is unaffected. |
I didn't thought about it earlier but yeah that logic routes the input to default scalar type. I think we can leave |
|
Actually, pushed to use |
ngoldbaum
left a comment
There was a problem hiding this comment.
Should the new .real and .imag heuristic be documented somewhere?
I also spotted a couple issues inline. I used an AI model to help with the review.
There was a problem hiding this comment.
Shouldn't there be a new error here to match the one you added in the other branch?
There was a problem hiding this comment.
oops, good point. Yeah, should error out just the same way here also (i.e. for now require everything to be filled)
| real_part = arr.real | ||
| except Exception: | ||
| # if array creation or arr.imag fails, assume it's not a complex. | ||
| pass |
There was a problem hiding this comment.
I'm not a fan of this Except: pass construct since it can make it really hard to debug a situation where a developers triggers the exception. They would need to spelunk through the NumPy implementation to understand what is happening.
Wouldn't it be better to do something like imag_part = getattr(arr, "imag", None) and then handle the case when imag_part is None?
I would also probably want to know that I passed in a type that has dtype that isn't usable to create an array scalar rather than have the ValueError silently swallowed.
There was a problem hiding this comment.
Wouldn't it be better to do something like imag_part = getattr(arr, "imag", None) and then handle the case when imag_part is None?
Yeah, I hate except Exception: too, but wasn't sure if a TypeError is broad enough.
getattr could effectively call a ufunc. Now, when it does, it's definitely not a compatible complex!
But in Python I can't know that, so the try/except is to avoid lowering into C (which is the alternative).
I think it should be fairly graceful here (because it almost certainly just raises "dtype not compatible with finfo"), but if you dislike it enough... I guess we can make a "try to fetch the real dtype" C level function.
There was a problem hiding this comment.
OK, I moved it to C, it's not too bad.
| dtype = np.dtype(dt) | ||
| with pytest.raises(ValueError, | ||
| match=r"data type .* not compatible with finfo"): | ||
| finfo(dtype) |
There was a problem hiding this comment.
Maybe a test using the legacy rational dtype that exercises the error path only allowing new user dtypes?
There was a problem hiding this comment.
The path isn't strictly limited to new user dtypes and rational isn't special, but I added it and object and StringDType as well here (object actually adds real coverage since it isn't view compatible).
This does two small tweaks:
1. Remove the requirement for `inexact` scalar derivation (because I had
not realized that this is problematic for bfloat16, etc.).
* This makes the error message very slightly nicer for non-float
finfo inputs, I expect that is OK.
(This means we require all finfo fields to be filled in right now
which quaddtype does as the only user, we could omit the decimal
precision or so if someone asks for it.)
2. For complex dtypes rather than hard-code ours, assume that `arr.imag`
succeeds and use `arr.imag.dtype` when it is a view to find the "real"
dtype.
After that, we assume that `finfo` filling will succeed.
Together (with the "new dtype backporting") this will allow `ml_dtypes`
to support `np.finfo()` fully.
(It is very annoying to test it here, but we run `ml_dtypes` tests now
and they will test it as soon as we start using it there.)
Treat missing integer finfo constants as a compatibility error and test rational and object and StringDType in getlimits (I think both of those are more intersting than rational itself).
This adds a private `_finfo_get_realdtype` helper that directly accesses the `.imag` and `.real` DType attribute slots to decide whether a dtype is (sufficiently certain) composed of a real and complex part with the identical dtype. If it is, we assume that this new dtype is the valid dtype for finfo (if it is an inexact one itself of course). This avoids broad try/except that is otherwise necessary in Python.
5b5c080 to
a40e482
Compare
ngoldbaum
left a comment
There was a problem hiding this comment.
It'd still be nice to get a documentation mention somewhere that the .imag/.real heuristic exists, but I'm happy for that to be a followup.
|
Ooops, just forgot about the docs. Added two sentences to the constants docs about finfo with a side-note for complex (I don't think this belongs in |
This does two small tweaks:
inexactscalar derivation (because I had not realized that this is problematic for bfloat16, etc.).arr.imagsucceeds and usearr.imag.dtypewhen it is a view to find the "real" dtype. After that, we assume thatfinfofilling will succeed.Together (with the "new dtype backporting") this will allow
ml_dtypesto supportnp.finfo()fully.(It is very annoying to test it here, but we run
ml_dtypestests now and they will test it as soon as we start using it there.)(No AI use.)
CC @SwayamInSync since you were involved in the first part. I don't think requiring all "finfo" fields that we currently have in C to be filled should be a problem, right?
(If that changes, there is always the work-around to fill it with NaN in practice, probably.)
(I find the
obj2sctypeannoying and potentially fishy even for quaddtype since it is slightly parametric(?), but that part I don't dare to touch here.)FWIW, I would like to consider this a bug-fix and backport. The only change should be the error message change in practice.