Memory leak when using set_type_codec and Python 3.10 #874

@roman-g

Description

  • asyncpg version: 0.25.0
  • PostgreSQL version: 13, 14
  • Do you use a PostgreSQL SaaS? If so, which? Can you reproduce the issue with a local PostgreSQL install?: Reproduced locally
  • Python version: 3.10.1
  • Platform: Linux
  • Do you use pgbouncer?: no
  • Did you install asyncpg with pip?: yes
  • If you built asyncpg locally, which version of Cython did you use?:
  • Can the issue be reproduced under both asyncio and uvloop?:

The test I'm running: https://github.com/roman-g/asyncpg-memory-leak

The table has a JSONB column. After I call await connection.set_type_codec("jsonb", encoder=json.dumps, decoder=json.loads, schema="pg_catalog"), every SELECT that includes that column leaks memory.

The test fetches all rows 2000 times, calls set_type_codec, repeats the same reads, and prints the process memory usage (RSS, in MB) between stages.
Typical output:

24
24
89

Memory stays at 24 MB through the first round of reads and grows to 89 MB only during the reads that follow set_type_codec.

SQLAlchemy always calls set_type_codec, so the issue affects it by default.

import asyncio
import os
import psutil
import asyncpg
import json


async def main():
    connection = await asyncpg.connect('postgresql://postgres:some_secret@postgresql:5432/postgres')
    await prepare_data(connection)

    print_memory_usage_in_mb()
    await read(connection)
    print_memory_usage_in_mb()

    await connection.set_type_codec("jsonb", encoder=json.dumps, decoder=json.loads, schema="pg_catalog")

    await read(connection)
    print_memory_usage_in_mb()

    await connection.close()


async def prepare_data(connection):
    await connection.execute("DROP TABLE IF EXISTS some_table")
    await connection.execute("""
        CREATE TABLE some_table(
            id serial PRIMARY KEY,
            data jsonb
        )
    """)
    for i in range(1000):
        await connection.execute("INSERT INTO some_table(data) VALUES($1)", '{"key":"value"}')


async def read(connection):
    for i in range(2000):
        result = await connection.fetch("SELECT data FROM some_table")
        assert len(result) > 0


def print_memory_usage_in_mb():
    process = psutil.Process(os.getpid())
    print(round(process.memory_info().rss / 1000000))


asyncio.run(main())
