bpo-32978: Fix reading huge floats in AIFC files.#5952
bpo-32978: Fix reading huge floats in AIFC files.#5952serhiy-storchaka wants to merge 3 commits into
Conversation
| data = f.read(10) | ||
| if len(data) != 10: | ||
| raise EOFError | ||
| expon = int.from_bytes(data[:2], 'big', signed=True) |
There was a problem hiding this comment.
I think it would be cleaner to interpret these two bytes as unsigned, and then mask with 0x7fff and with 0x8000 to get the exponent and the sign information. It doesn't really make sense to interpret as two's complement first. However, I appreciate that that's how the original code worked (using a struct format of ">h"), so if it's not changed it's no worse than before.
| lomant = _read_ulong(f) # 4 bytes | ||
| if expon == himant == lomant == 0: | ||
| mant = int.from_bytes(data[2:], 'big') | ||
| if not expon and not mant: |
There was a problem hiding this comment.
Is this special case necessary? I think the right thing to do here is to say: if expon == 0: expon = 1, and then continue to the ldexp call; I think this will do the right thing both on current Python, and on some future hypothetical version of Python where float does actually have the range required to represent the subnormal range of the x87 extended precision format.
There was a problem hiding this comment.
Sorry, but I don't understand the reason of changing 0 to 1 in exponent.
| f = 0.0 | ||
| elif expon == 0x7FFF: | ||
| f = _HUGE_VAL | ||
| f = sys.float_info.max |
There was a problem hiding this comment.
An exponent of 0x7fff indicates an infinity or NaN. I'd expect to see an exception at this point, but I guess we don't want to change existing behaviour. It might be worth a comment, though.
| expon = expon - 0x3FFF | ||
| try: | ||
| f = math.ldexp(mant, expon - 63) | ||
| except OverflowError: |
There was a problem hiding this comment.
As mentioned in the issue discussion, I'd suggest just letting the OverflowError propagate. If we encounter a value like this. But in that case, we should probably also raise for an infinity or NaN above.
So if we want to be internally consistent, I guess the choices are:
- (1) raise on overflow, nan and infinity, or
- (2) return
sys.float_info.maxon overflow, nan and infinity
I find it hard to see any justification for option (2) (especially, turning a nan into sys.float_info.max seems completely nonsensical) other than that that's what the current code does. Backwards compatibility seems unlikely to be a concern here.
|
@serhiy-storchaka Looks like this PR needs to be refreshed. |
|
I removed the " needs backport to 3.6" label, the 3.6 branch no longer accept bugfixes (only security fixes are accepted): https://devguide.python.org/#status-of-python-branches |
|
|
https://bugs.python.org/issue32978