[Python-ideas] PEP 479: Change StopIteration handling inside generators
Chris Angelico
rosuav at gmail.com
Thu Nov 20 03:24:02 CET 2014
On Thu, Nov 20, 2014 at 1:06 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Thu, Nov 20, 2014 at 03:24:07AM +1100, Chris Angelico wrote:
>
>> If you write __next__, you write in a "raise StopIteration" when it's
>> done. If you write __getattr__, you write in "raise AttributeError" if
>> the attribute shouldn't exist. Those are sufficiently explicit that it
>> should be reasonably clear that the exception is the key. But when you
>> write a generator, you don't explicitly raise:
>
> That's not true in practice. See my reply to Nick, there is lots of code
> out there which uses StopIteration to exit generators. Some of that code
> isn't very good code -- I've seen "raise StopIteration" immediately
> before falling out the bottom of the generator -- but until now it has
> worked and the impression some people have gotten is that it is actually
> preferred.

Yes, I thought it was rare. I stand corrected. Reword that to "you
don't *need to* explicitly raise", since you can simply return, and it
becomes true again, though.

>> def gen():
>>     yield 1
>>     yield 2
>>     yield 3
>>     return 4
>
> Until 3.2, that was a syntax error. For the majority of people who are
> still using Python 2.7, it is *still* a syntax error. To write this in a
> backwards-compatible way, you have to exit the generator with:
>
>     raise StopIteration(2)

In most cases you won't need to put a value on it, so bare "return"
will work just fine. I just put a return value onto it so it wouldn't
look trivially useless.

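To make it concrete where that return value ends up, here is a minimal sketch, assuming Python 3.3 or later (where "return <value>" is legal inside a generator). The helper name drain() is made up purely for illustration; the point is only that the value rides on the StopIteration that ends iteration, and that ordinary consumers such as list() never see it.

```python
def gen():
    yield 1
    yield 2
    yield 3
    return 4  # carried as the .value of the StopIteration that ends iteration


def drain(iterator):
    """Hypothetical helper: collect the yielded items plus the return value."""
    items = []
    while True:
        try:
            items.append(next(iterator))
        except StopIteration as stop:
            # The generator's return value is attached to the exception.
            return items, stop.value


items, result = drain(gen())

# A plain consumer ignores the return value entirely.
plain = list(gen())
```

With a bare "return" (or simply falling off the end), the value would be None and nothing else would change.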
>> The distinction in __next__ is between returning something and raising
>> something. The distinction in a generator is between "yield" and
>> "return". Why should a generator author have to be concerned about one
>> particular exception having magical meaning?
>
> I would put it another way: informally, the distinction between a
> generator and a function is that generators use yield where functions
> use return. Most people are happy with that informal definition, a full
> pedantic explanation of co-routines will just confuse them or bore them.
> The rule they will learn is:
>
> * use return in functions
> * use yield in generators
>
> That makes generators that use both surprising. Since most generators
> either run forever or fall out the bottom when they are done, I expect
> that seeing a generator with a return in it is likely to surprise a lot
> of people. I've known that return works for many years, and I still
> give a double-take whenever I see it in a generator.

But it's just as surprising to put "raise StopIteration" into it. It's
normal to put that into __next__, it's not normal to need it in a
generator. Either way, it's something unusual; so let's go with the
unusual "return" rather than the unusual "raise".

>> That's how __next__ works, only with a different exception, and I
>> think people would agree that this is NOT a good use of
>> KeyboardInterrupt.
>
> Why not? How else are you going to communicate something out of band to
> the consumer except via an exception?
>
> We can argue about whether KeyboardInterrupt is the right exception to
> use or not, but if you insist that this is a bad protocol then you're
> implicitly saying that the iterator protocol is also a bad protocol.

Well, that's exactly what I do mean. KeyboardInterrupt is not a good
way for two parts of a program to communicate with each other, largely
because it can be raised unexpectedly. Which is the point of this PEP:
raising StopIteration unexpectedly should also result in a noisy
traceback.

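A rough sketch of what "noisy" means here, assuming an interpreter where the PEP 479 semantics are active (to my knowledge the default from Python 3.7; earlier versions that support it need "from __future__ import generator_stop"). The helper below is a deterministic stand-in for a __next__-style helper and its names are made up for illustration. The StopIteration that escapes the helper inside the generator body is turned into a RuntimeError instead of silently ending the loop.

```python
def something(values):
    # Hypothetical __next__-style helper: signals exhaustion by raising.
    if values:
        return values.pop(0)
    raise StopIteration


def gen(values):
    while True:
        # A StopIteration escaping the helper propagates out of the generator body.
        yield something(values)


def run():
    seen = []
    try:
        for item in gen(["Spam!", "Spam!"]):
            seen.append(item)
    except RuntimeError as exc:
        # With PEP 479 semantics the leaked StopIteration arrives as RuntimeError.
        return seen, type(exc).__name__
    return seen, None


seen, error_name = run()
```

Without those semantics the for loop would just stop after two items, and the bug (if it was one) would go unnoticed.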
> I trust that we all expect to be able to factor out the raise into a
> helper function or method, yes? It truly would be surprising if this
> failed:
>
>
> class MyIterator:
>     def __iter__(self):
>         return self
>     def __next__(self):
>         return something()
>
>
> def something():
>     # Toy helper function.
>     if random.random() < 0.5:
>         return "Spam!"
>     raise StopIteration
>
>
>
> Now let's write this as a generator:
>
> def gen():
>     while True:
>         yield something()
>
>
> which is much nicer than:
>
> def gen():
>     while True:
>         try:
>             yield something()
>         except StopIteration:
>             return  # converted by Python into raise StopIteration

Sure. There was a suggestion that "return yield from something()"
would work, though, which - I can't confirm that this works, but
assuming it does - would be a lot tidier. But there's still a
difference. Your first helper function was specifically a __next__
helper. It was tied intrinsically to the iterator protocol. If you
want to call a __next__ helper (or actually call next(iter) on
something) inside a generator, you'll have to - if this change goes
through - cope with the fact that generator protocol says "return"
where __next__ protocol says "raise StopIteration". If you want a
generator helper, it'd look like this:

def something():
    # Toy helper function.
    if random.random() < 0.5:
        yield "Spam!"

def gen():
    yield from something()

Voila! Now it's a generator helper, following generator protocol.
Every bit as tidy as the original. Let's write a __getitem__ helper:

def something(x):
    # Toy helper function.
    if random.random() < 0.5:
        return "Spam!"
    raise KeyError(x)

class X:
    def __getitem__(self, x): return something(x)

Same thing. As soon as you get into raising these kinds of exceptions,
you're tying your helper to a specific protocol. All that's happening
with PEP 479 is that generator and iterator protocol are being
distinguished slightly.
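
For what it's worth, here is a small sketch of the two styles side by side under the proposed semantics. All names (take_pairs, spam_helper) are hypothetical and the helpers are deterministic so the result is predictable. The first generator calls next() directly and so has to translate the iterator protocol's StopIteration into the generator protocol's "return"; the second delegates to a generator helper and needs no translation at all.

```python
def take_pairs(iterable):
    """Hypothetical generator that calls next() directly on an iterator."""
    it = iter(iterable)
    while True:
        try:
            first = next(it)
            second = next(it)
        except StopIteration:
            # Iterator protocol says "raise"; generator protocol says "return".
            return
        yield first, second


def spam_helper(count):
    # Generator-protocol helper: it signals "done" simply by finishing.
    for _ in range(count):
        yield "Spam!"


def gen(count):
    yield from spam_helper(count)


pairs = list(take_pairs([1, 2, 3, 4, 5]))
spam = list(gen(2))
```

The trailing 5 is dropped because the second next() call runs out of items, and the except clause ends the generator cleanly.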
ChrisA