Python Memory Management — Reference Counting, GC & the GIL

February 20, 2026 · 10 min read

Python's memory management is often taken for granted. You create objects, use them, and they magically disappear when you're done. But under the hood, CPython uses a sophisticated system of reference counting, cyclic garbage collection, and memory pools to keep your program running efficiently.

Reference Counting

Every Python object has a reference count — a number tracking how many variables or data structures point to it. When you assign an object to a variable, the count goes up. When the variable goes out of scope or is reassigned, the count goes down. When the count reaches zero, the object is immediately freed.

import sys
a = [1, 2, 3]
print(sys.getrefcount(a)) # 2 (a + getrefcount arg)
b = a # refcount increases to 3
del b # refcount back to 2
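
To watch an object disappear the instant its count hits zero, you can use a weak reference, which observes an object without keeping it alive. A minimal sketch (the `Node` class is just a stand-in; plain lists don't support weak references):

```python
import weakref

class Node:
    pass

obj = Node()
ref = weakref.ref(obj)   # weak reference: does not increase the refcount
print(ref() is obj)      # True while obj is alive
del obj                  # refcount drops to zero -> freed immediately
print(ref())             # None: the object is already gone
```

In CPython this is deterministic: no garbage collector pass is needed, because reference counting frees the object the moment the last strong reference goes away.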

Cyclic Garbage Collection

Reference counting has a weakness: circular references. If object A references object B and B references A, their reference counts never reach zero — even if no other code uses them.

CPython solves this with a generational garbage collector. It tracks objects in three generations (0, 1, 2). New objects start in generation 0. Objects that survive a GC cycle get promoted to the next generation. Older generations are collected less frequently, since long-lived objects are less likely to become garbage.
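A minimal sketch of the cycle problem and how the collector resolves it (the `Node` class is illustrative, not from the original):

```python
import gc

class Node:
    def __init__(self):
        self.partner = None

a, b = Node(), Node()
a.partner, b.partner = b, a   # A references B and B references A
del a, b                      # each refcount stays at 1: the cycle keeps both alive

collected = gc.collect()      # force a full sweep of all three generations
print(collected >= 2)         # True: both unreachable Node instances were found
```

`gc.get_threshold()` shows the allocation counts that trigger each generation's automatic collection; you rarely need to call `gc.collect()` yourself.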

Memory Pools (pymalloc)

For small objects (≤512 bytes), CPython doesn't use the system allocator directly. Instead, it maintains its own memory pool system called pymalloc. Memory is divided into arenas, pools, and blocks (historically 256 KB arenas and 4 KB pools, though the exact sizes vary across CPython versions) — optimized for frequent allocation and deallocation of small objects like integers and short strings.
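You can't call pymalloc directly, but you can check that everyday objects fall under its 512-byte cutoff. A small sketch:

```python
import sys

# These are full Python objects (header + payload), yet each is
# comfortably under pymalloc's 512-byte small-object threshold:
print(sys.getsizeof(1))     # a small int
print(sys.getsizeof("hi"))  # a short string
print(sys.getsizeof([]))    # an empty list header

# CPython-only: dump raw pymalloc statistics (arenas, pools, blocks)
# to stderr if you want to see the allocator's internal state:
# sys._debugmallocstats()
```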

The Global Interpreter Lock (GIL)

The GIL is a mutex that ensures only one thread executes Python bytecode at a time. It exists because CPython's reference counting is not thread-safe. Without the GIL, two threads could modify a reference count simultaneously, causing memory corruption.

The GIL means CPU-bound multi-threaded Python code won't benefit from multiple cores. For CPU parallelism, use multiprocessing or concurrent.futures.ProcessPoolExecutor instead, which sidestep the GIL by running separate interpreter processes.
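A sketch of the problem: CPU-bound work split across threads still produces correct results, but the GIL serializes the bytecode, so four threads take roughly as long as one (the `cpu_task` function is a made-up stand-in for real work):

```python
from concurrent.futures import ThreadPoolExecutor

def cpu_task(n):
    """Pure-Python CPU-bound work: sum of squares up to n."""
    return sum(i * i for i in range(n))

# Threads give correct answers but no parallel speedup under the GIL.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(cpu_task, [100_000] * 4))

print(results[0] == sum(i * i for i in range(100_000)))  # True
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` gives each task its own interpreter (and its own GIL), at the cost of pickling arguments and results between processes.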

Visualize It

Use Code Visualizer's Python Visualizer to see reference counts change in real time as you create, assign, and delete objects.

Common Memory Pitfalls

  • Mutable default arguments: Default lists/dicts are shared across function calls
  • Large global variables: They live for the entire program lifetime
  • Circular references with __del__: Before Python 3.4 (PEP 442), cycles containing objects with finalizers were never collected; modern CPython collects them, but finalizer ordering within a cycle is undefined
  • Caching without limits: Unbounded caches cause memory to grow forever
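
The first pitfall is easy to reproduce. The default list is evaluated once, at function definition time, so every call that omits the argument shares it (the function names here are illustrative):

```python
def append_to(item, target=[]):   # BUG: the default list is created once
    target.append(item)
    return target

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2] -- same list as the first call!

def append_to_fixed(item, target=None):
    if target is None:
        target = []               # fresh list on every call
    target.append(item)
    return target

print(append_to_fixed(1))  # [1]
print(append_to_fixed(2))  # [2]
```

The `None` sentinel idiom is the standard fix for mutable defaults (lists, dicts, sets).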