11

This question and my answer got me thinking about this peculiar difference between Python 2.7 and Python 3.4. Take the simple example code:

import timeit
import dis

c = 1000000
r = range(c)
def slow():
    for pos in range(c):
        r[pos:pos+3]

dis.dis(slow)

time = timeit.Timer(lambda: slow()).timeit(number=1)
print('%3.3f' % time)

In Python 2.7, I consistently get 0.165~ and for Python 3.4 I consistently get 0.554~. The only significant difference between the disassemblies is that Python 2.7 emits the SLICE+3 byte code while Python 3.4 emits BUILD_SLICE followed by BINARY_SUBSCR. Note that I've eliminated the candidates for potential slowdown from the other question, namely strings and the fact that xrange doesn't exist in Python 3.4 (which is supposed to be similar to the latter's range class anyways).

Using itertools' islice yields nearly identical timings between the two, so I highly suspect that it's the slicing that's the cause of the difference here.

Why is this happening and is there a link to an authoritative source documenting change in behavior?

EDIT: In response to the answer, I have wrapped the range objects in list, which did give a noticeable speedup. However as I increased the number of iterations in timeit I noticed that the timing differences became larger and larger. As a sanity check, I replaced the slicing with None to see what would happen.

500 iterations in timeit.

c = 1000000
r = list(range(c))
def slow():
    for pos in r:
        None

yields 10.688 and 9.915 respectively. Replacing the for loop with for pos in islice(r, 0, c, 3) yields 7.626 and 6.270 respectively. Replacing None with r[pos] yielded 20~ and 28~ respectively. r[pos:pos+3] yields 67.531 and 106.784 respectively.

As you can see, the timing differences are huge. Again, I'm still convinced the issue is not directly related to range.

4
  • Were you using range or xrange in Python 2? Commented May 5, 2016 at 17:39
  • Try again, with r = list(range(c)). Commented May 5, 2016 at 17:39
  • Python 3 range and Python 2 xrange objects are not similar in this context. Python 3 range objects support slicing, Python 2 xrange objects do not support slicing. Commented May 5, 2016 at 17:51
  • @cdarke In this case it's irrelevant since neither the linked question nor I used xrange. I only mentioned it incase somebody were to bring it up. Commented May 5, 2016 at 17:53

1 Answer 1

10

On Python 2.7, you're iterating over a list and slicing a list. On Python 3.4, you're iterating over a range and slicing a range.

When I run a test with a list on both Python versions:

from __future__ import print_function
import timeit
print(timeit.timeit('x[5:8]', setup='x = list(range(10))'))

I get 0.243554830551 seconds on Python 2.7 and 0.29082867689430714 seconds on Python 3.4, a much smaller difference.


The performance difference you see after eliminating the range object is much smaller. It comes primarily from two factors: addition is a bit slower on Python 3, and Python 3 needs to go through __getitem__ with a slice object for slicing, while Python 2 has __getslice__.

I wasn't able to replicate the timing difference you saw with r[pos]; you may have had some confounding factor in that test.

Sign up to request clarification or add additional context in comments.

Increasing the number of iterations starts to show bigger and bigger gaps in the timing differences. 1.352 and 2.171 respectively for 10 iterations. I'm not sure what I'm missing here.
@user6292850: How are you running that test?
I'm merely increasing the number parameter in timeit, and running python test.py followed by python3 test.py.
@user6292850: I ran some tests and isolated the primary contributors to the remaining performance difference. My tests with r[pos] only showed a 10% runtime difference, unlike your 40% runtime difference, though.
@user6292850 which test.py are you testing? Yours in the question or the one in this answer? Let's test the same thing. I suggest using this format: python -m timeit -s "r = list(range(1000))" "for pos in range(1000): r[pos:pos+3]". Why would you not use the default number of iterations of 1000000?

Your Answer

Draft saved
Draft discarded

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.