Issue 42126: Optimize BUILD_CONST_KEY_MAP for distinct keys

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/86292

classification

Title:	Optimize BUILD_CONST_KEY_MAP for distinct keys
Type:	performance	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.10

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:	methane	Nosy List:	Mark.Shannon, methane
Priority:	normal	Keywords:	patch

Created on 2020-10-23 08:46 by methane, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 22911	closed	methane, 2020-10-23 09:05

Messages (6)
msg379415 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-23 08:46
BUILD_CONST_KEY_MAP can be optimized based on #41835 optimization. 1. compiler checks keys tuple is distinct. 2. Add distinct flag to BUILD_CONST_KEY_MAP oparg To be considered: * Should we use new opcode, instead of flag in oparg? * Is this technique safe? Wrong co_consts can make invalid dict instance.
msg379430 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-23 11:46
$ ./python -m pyperf timeit --compare-to ~/pyenv/versions/3.10-dev/bin/python '{}' /home/inada-n/pyenv/versions/3.10-dev/bin/python: ..................... 23.5 ns +- 0.2 ns /home/inada-n/work/python/cpython/python: ..................... 22.4 ns +- 0.1 ns Mean +- std dev: [/home/inada-n/pyenv/versions/3.10-dev/bin/python] 23.5 ns +- 0.2 ns -> [/home/inada-n/work/python/cpython/python] 22.4 ns +- 0.1 ns: 1.05x f aster (-5%) $ ./python -m pyperf timeit --compare-to ~/pyenv/versions/3.10-dev/bin/python '{1:1,2:2}' /home/inada-n/pyenv/versions/3.10-dev/bin/python: ..................... 83.5 ns +- 0.3 ns /home/inada-n/work/python/cpython/python: ..................... 96.7 ns +- 0.2 ns Mean +- std dev: [/home/inada-n/pyenv/versions/3.10-dev/bin/python] 83.5 ns +- 0.3 ns -> [/home/inada-n/work/python/cpython/python] 96.7 ns +- 0.2 ns: 1.16x s lower (+16%) $ ./python -m pyperf timeit --compare-to ~/pyenv/versions/3.10-dev/bin/python '{1:1,2:2,3:3}' /home/inada-n/pyenv/versions/3.10-dev/bin/python: ..................... 108 ns +- 0 ns /home/inada-n/work/python/cpython/python: ..................... 110 ns +- 1 ns Mean +- std dev: [/home/inada-n/pyenv/versions/3.10-dev/bin/python] 108 ns +- 0 ns -> [/home/inada-n/work/python/cpython/python] 110 ns +- 1 ns: 1.01x slower (+1%) $ ./python -m pyperf timeit --compare-to ~/pyenv/versions/3.10-dev/bin/python '{1:1,2:2,3:3,4:4}' /home/inada-n/pyenv/versions/3.10-dev/bin/python: ..................... 134 ns +- 1 ns /home/inada-n/work/python/cpython/python: ..................... 122 ns +- 1 ns Mean +- std dev: [/home/inada-n/pyenv/versions/3.10-dev/bin/python] 134 ns +- 1 ns -> [/home/inada-n/work/python/cpython/python] 122 ns +- 1 ns: 1.10x faster (-9%) $ ./python -m pyperf timeit --compare-to ~/pyenv/versions/3.10-dev/bin/python '{1:1,2:2,3:3,4:4,5:5}' /home/inada-n/pyenv/versions/3.10-dev/bin/python: ..................... 168 ns +- 1 ns /home/inada-n/work/python/cpython/python: ..................... 112 ns +- 1 ns Mean +- std dev: [/home/inada-n/pyenv/versions/3.10-dev/bin/python] 168 ns +- 1 ns -> [/home/inada-n/work/python/cpython/python] 112 ns +- 1 ns: 1.50x faster (-33%) $ ./python -m pyperf timeit --compare-to ~/pyenv/versions/3.10-dev/bin/python '{1:1,2:2,3:3,4:4,5:5,6:6}' /home/inada-n/pyenv/versions/3.10-dev/bin/python: ..................... 223 ns +- 1 ns /home/inada-n/work/python/cpython/python: ..................... 150 ns +- 4 ns Mean +- std dev: [/home/inada-n/pyenv/versions/3.10-dev/bin/python] 223 ns +- 1 ns -> [/home/inada-n/work/python/cpython/python] 150 ns +- 4 ns: 1.48x faster (-33%) $ ./python -m pyperf timeit --compare-to ~/pyenv/versions/3.10-dev/bin/python '{1:1,2:2,3:3,4:4,5:5,6:6,7:7,8:8}' /home/inada-n/pyenv/versions/3.10-dev/bin/python: ..................... 273 ns +- 1 ns /home/inada-n/work/python/cpython/python: ..................... 177 ns +- 1 ns Mean +- std dev: [/home/inada-n/pyenv/versions/3.10-dev/bin/python] 273 ns +- 1 ns -> [/home/inada-n/work/python/cpython/python] 177 ns +- 1 ns: 1.54x faster (-35%)
msg379431 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2020-10-23 12:20
Is this worthwhile? Statically, BUILD_CONST_KEY_MAP represents 0.14% of all bytecode instructions in the standard library, but only 0.03% of the bytecode instructions in functions. Which suggests that dynamically only 1 in 3000 instructions executed is BUILD_CONST_KEY_MAP. The implication is that making BUILD_CONST_KEY_MAP more complex is just as likely to slow things down as speed them up. Do you have any evidence that this will make a measurable difference?
msg379434 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-23 13:53
It is difficult to estimate. Real world applications has different code-style than stdlib. And pyperformance didn't include whole real world applications too. Some application may use dict display in heavy loop. But it is difficult to say how much exactly. One example from pyperformance. It uses inner function and function annotations heavily. BUILD_CONST_KEY_MAP is used to build annotation dict, instead of dict display. https://github.com/tornadoweb/tornado/blob/5eb7bb8efb4e1be8a2cec1744c857a8dadd43af9/tornado/gen.py ``` 591 32 LOAD_CONST 1 ('Future') 34 LOAD_CONST 2 ('None') 36 LOAD_CONST 3 (('future', 'return')) 38 BUILD_CONST_KEY_MAP 5 40 LOAD_CLOSURE 3 (quiet_exceptions) 42 BUILD_TUPLE 1 44 LOAD_CONST 4 (<code object error_callback at 0x7fa7c64195b0, file "/home/inada-n/local/python-dev/lib/python3.10/site-packages/t ornado/gen.py", line 591>) 46 LOAD_CONST 5 ('with_timeout.<locals>.error_callback') 48 MAKE_FUNCTION 12 (annotations, closure) 50 STORE_DEREF 0 (error_callback) ``` I have not posted pyperformance result because this pull request is based on #41835 and I don't measure each performance. But #41835 + this is 16% faster than master on bm_tornado_http.
msg379437 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2020-10-23 14:08
It sounds like the root cause is that annotations are repeatedly being computed. We should address the underlying issue, IMO.
msg380215 - (view)	Author: Inada Naoki (methane) *	Date: 2020-11-02 12:04
OK. Microbenchmark don't justify adding complexity to the eval loop and the compiler.

History
Date	User	Action	Args
2022-04-11 14:59:37	admin	set	github: 86292
2020-11-02 12:04:38	methane	set	status: open -> closed resolution: rejected messages: + msg380215 stage: patch review -> resolved
2020-10-23 14:08:29	Mark.Shannon	set	messages: + msg379437
2020-10-23 13:53:46	methane	set	messages: + msg379434
2020-10-23 12:20:41	Mark.Shannon	set	nosy: + Mark.Shannon messages: + msg379431
2020-10-23 11:46:06	methane	set	messages: + msg379430
2020-10-23 09:05:06	methane	set	keywords: + patch stage: patch review pull_requests: + pull_request21842
2020-10-23 08:46:51	methane	create