Efficient Memoization with functools.lru_cache in Real-World Python Applications

October 22, 2025

Learn how to leverage Python's built-in functools.lru_cache to drastically speed up repeated computations, avoid redundant I/O, and simplify code. This guide walks you from basics to advanced patterns, includes practical code examples, and explains how lru_cache interacts with multiprocessing, concurrent.futures, and web apps (Dash) in production scenarios.

Introduction

Have you ever written a function that re-computes the same expensive result over and over? Enter memoization — caching function results so repeated calls with the same arguments return instantly. Python's functools.lru_cache provides a simple, well-tested way to memoize pure functions.

In this post we'll:

  • Explain the core concepts and prerequisites.
  • Walk through hands-on, real-world examples.
  • Show how lru_cache interacts with parallelism tools such as concurrent.futures and multiprocessing.
  • Demonstrate patterns for using lru_cache in interactive dashboards (Dash).
  • Share best practices, pitfalls, and advanced tips.
This guide assumes intermediate Python knowledge (decorators, concurrency basics). Let's get started.

Prerequisites

  • Python 3.7+ recommended (behaviour consistent across 3.7–3.12).
  • Familiarity with functions, decorators, and basic concurrency (threads/processes).
  • Optional: basic exposure to Dash and web frameworks if you plan to integrate caching in an app.

Core Concepts

  • Memoization: Storing results of expensive function calls and returning the cached result when inputs repeat.
  • LRU (Least Recently Used): lru_cache evicts the least recently used entry when the cache is full.
  • Cache key: Based on function arguments; arguments must be hashable (immutable types).
  • Thread-safety: in CPython, lru_cache's internal bookkeeping is protected by a lock, so concurrent calls won't corrupt the cache; note that the wrapped function can still run more than once in parallel for the same arguments before the first result is stored.
  • Process-safety: lru_cache is not shared between processes; each process has its own cache.
Analogy: Think of lru_cache as a small bookshelf (cache) where you keep copies of frequently read books (results). If the shelf is full, you remove the least recently used book to make room for a new one.
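
To make the eviction behaviour concrete, here is a minimal sketch (the square function is just illustrative) that uses a deliberately tiny maxsize so you can watch the least recently used entry get evicted via cache_info():

from functools import lru_cache

@lru_cache(maxsize=2)  # a tiny "bookshelf": only two results fit
def square(n: int) -> int:
    print(f"computing square({n})")
    return n * n

square(1)   # miss: computed and cached
square(2)   # miss: the shelf now holds 1 and 2
square(1)   # hit: 1 becomes the most recently used entry
square(3)   # miss: the shelf is full, so 2 (least recently used) is evicted
square(2)   # miss: 2 was evicted and must be recomputed
print(square.cache_info())  # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)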

Basic Example: Fibonacci with lru_cache

A classic demonstration: recursive Fibonacci without memoization is exponential time.

from functools import lru_cache

@lru_cache(maxsize=128)
def fib(n: int) -> int:
    """Return the nth Fibonacci number (naive recursive with memoization)."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

Explanation (line-by-line):

  1. from functools import lru_cache — import the decorator.
  2. @lru_cache(maxsize=128) — wrap fib with an LRU cache holding up to 128 results.
  3. def fib(n: int) -> int: — define a typed function for clarity.
  4. Base case: if n < 2, return n.
  5. Recursive case: return sum of previous two Fibonacci numbers; repeated subcalls are cached.
Edge cases:
  • Non-integer or negative n should be validated in production code.
  • maxsize=None would create an unbounded cache (use carefully).
Try it:
  • fib(35) completes almost instantly because sub-results are memoized during the recursion; subsequent calls with the same argument return straight from the cache.
Inspecting the cache:

print(fib.cache_info())

Output: CacheInfo(hits=..., misses=..., maxsize=128, currsize=...)

cache_info() helps you understand hit/miss ratio; good for optimization feedback.

Real-World Example: Caching Expensive I/O or Computations

Imagine a function that queries a slow external API or performs a heavy computation.

Example: simulated heavy computation (e.g., image processing, ML inference):

import time
from functools import lru_cache

@lru_cache(maxsize=256)
def heavy_compute(key: str) -> dict:
    """Simulate a heavy computation or API call returning JSON-like data."""
    # Simulate latency
    time.sleep(2)
    # Example result
    return {"key": key, "value": hash(key) % 1000}

Explanation:

  • Each unique key costs ~2 seconds once; further calls return immediately from cache.
  • maxsize=256 constrains memory use.
Usage and timing:

import time

start = time.perf_counter()
print(heavy_compute("alpha"))  # ~2s
print("Elapsed:", time.perf_counter() - start)

start = time.perf_counter()
print(heavy_compute("alpha"))  # ~0s (cached)
print("Elapsed:", time.perf_counter() - start)

Edge cases:

  • If the underlying data changes (e.g., the API result for "alpha" changes), the cache will serve stale results. Consider cache invalidation strategies (timeouts, versioned keys), as sketched below.
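
A versioned key is the lightest-weight of those strategies: make a version number part of the arguments so that bumping it forces fresh results. A minimal sketch (fetch_report and data_version are illustrative names; a TTL-style variant appears later in this post):

from functools import lru_cache
import time

@lru_cache(maxsize=256)
def fetch_report(key: str, version: int) -> dict:
    # `version` is part of the cache key; bump it to invalidate stale entries
    time.sleep(2)  # simulate the slow call
    return {"key": key, "generated_at": time.time()}

data_version = 1
fetch_report("alpha", data_version)   # slow: computed and cached
fetch_report("alpha", data_version)   # fast: served from the cache

data_version += 1                     # upstream data changed
fetch_report("alpha", data_version)   # slow again: new key, the old entry is ignored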

Measuring Impact: Profiling Cache Effectiveness

Use cache_info() to measure hits/misses.

print(heavy_compute.cache_info())

Example output: CacheInfo(hits=1, misses=1, maxsize=256, currsize=1)

High hit ratio indicates effective memoization.
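
A small helper can turn those counters into a ratio you can log or alert on (a sketch built only on the CacheInfo fields shown above):

info = heavy_compute.cache_info()
total = info.hits + info.misses
hit_ratio = info.hits / total if total else 0.0
print(f"hit ratio: {hit_ratio:.1%} ({info.currsize} of {info.maxsize} slots used)")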

Integrating lru_cache with concurrent.futures

concurrent.futures provides ThreadPoolExecutor and ProcessPoolExecutor for parallelism. lru_cache behaves differently across these:
  • With ThreadPoolExecutor, threads share the same memory space; cache is shared across threads and thread-safe.
  • With ProcessPoolExecutor or multiprocessing, each process has its own memory; each process has its own cache, so benefits are per-process only.
Example with ThreadPoolExecutor (cache shared):
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
import time

@lru_cache(maxsize=128)
def fib_cached(n: int) -> int:
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

def call_fib(n):
    return fib_cached(n)

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as ex:
        results = list(ex.map(call_fib, [30, 31, 32, 30, 31]))
    print("Cache info:", fib_cached.cache_info())

Explanation:

  • Threads share the same process memory and the same fib_cached cache.
  • Concurrent calls can benefit from previously computed results.
Example with ProcessPoolExecutor (cache not shared):

from concurrent.futures import ProcessPoolExecutor
from functools import lru_cache

@lru_cache(maxsize=128)
def fib_cached(n: int) -> int:
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

def call_fib_in_process(n):
    # Runs inside a worker process; each process holds its own copy of fib_cached's cache
    return fib_cached(n)

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as ex:
        results = list(ex.map(call_fib_in_process, [30, 31, 32, 30, 31]))
    print(results)

Explanation:

  • Each process builds its own cache; repeated values across tasks may be recomputed in each worker, losing the benefit of a shared cache.
When to use which:
  • Use ThreadPool when your task is I/O-bound and you want a shared cache.
  • Use ProcessPool for CPU-bound tasks (to bypass GIL), but be mindful that lru_cache will be per-process.

Sharing Cached Results Across Processes

If you need a cache shared across processes, consider:

  • A multiprocessing.Manager().dict() as a shared cache.
  • Using an external cache like Redis, Memcached, or in-process web server caches (e.g., Flask-Caching).
  • Using file-based caches or sqlite.
Example: simple shared cache using Manager (naive, for demonstration):

from multiprocessing import Manager, Pool
import time
import hashlib
import json

def init(shared_cache):
    global cache
    cache = shared_cache

def make_key(args):
    # Deterministic key for arguments - use careful hashing in real cases
    return hashlib.sha256(json.dumps(args, sort_keys=True).encode()).hexdigest()

def compute(data):
    k = make_key(data)
    if k in cache:
        return cache[k]  # shared across processes
    # Simulate heavy work
    time.sleep(2)
    result = {"result": sum(data)}  # placeholder
    cache[k] = result
    return result

if __name__ == "__main__":
    manager = Manager()
    shared_cache = manager.dict()
    with Pool(initializer=init, initargs=(shared_cache,)) as p:
        inputs = [[1, 2, 3], [4, 5, 6], [1, 2, 3]]
        results = p.map(compute, inputs)
    print(results)

Notes:

  • Manager-based caches add IPC overhead; good for moderate sharing, not high-performance low-latency caching.
  • In production, use Redis or a dedicated caching layer if you need high throughput and persistence.
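
For comparison, here is a minimal sketch of the same idea backed by Redis via the redis-py client. It assumes a Redis server on localhost and `pip install redis`; cached_compute and the key prefix are illustrative names:

import hashlib
import json
import time

import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def make_key(data) -> str:
    return "compute:" + hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()

def cached_compute(data):
    key = make_key(data)
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)              # shared across every process and host
    time.sleep(2)                            # simulate heavy work
    result = {"result": sum(data)}
    r.set(key, json.dumps(result), ex=300)   # expire after 5 minutes
    return result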

Using lru_cache in a Dash App

Dash apps often benefit from caching computed results for interactive callbacks. If your Dash app runs in a single process (development or a single worker), lru_cache can be a quick win for caching pure callbacks.

Example (conceptual snippet):

# app.py
from functools import lru_cache
import dash
from dash import html, dcc
import time

app = dash.Dash(__name__)

@lru_cache(maxsize=128)
def expensive_query(param: str):
    time.sleep(2)  # simulate expensive work
    return f"Result for {param}"

app.layout = html.Div([
    dcc.Input(id="input", value="alpha"),
    html.Button("Compute", id="button"),
    html.Div(id="output")
])

@app.callback(
    dash.dependencies.Output("output", "children"),
    [dash.dependencies.Input("button", "n_clicks")],
    [dash.dependencies.State("input", "value")]
)
def on_click(n_clicks, value):
    if not n_clicks:
        return "Press compute"
    return expensive_query(value)

if __name__ == "__main__":
    app.run_server(debug=True)

Important considerations:

  • For multi-worker deployments (Gunicorn/WSGI with multiple processes), each worker maintains its own lru_cache. For a centralized cache, use Flask-Caching with a Redis or filesystem backend (see the sketch after this list).
  • For async callbacks or async frameworks, lru_cache is synchronous. For async caching consider aiocache or server-side caches.
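
A minimal sketch of that centralized approach with Flask-Caching (it assumes `pip install flask-caching` and a reachable Redis instance; the configuration keys follow Flask-Caching's documented names, and expensive_query mirrors the function above):

from flask_caching import Cache

# Dash exposes the underlying Flask app as app.server
cache = Cache(app.server, config={
    "CACHE_TYPE": "RedisCache",
    "CACHE_REDIS_URL": "redis://localhost:6379/0",
})

@cache.memoize(timeout=300)  # shared by all workers through Redis
def expensive_query(param: str):
    time.sleep(2)
    return f"Result for {param}"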

Advanced: Custom Keys, Unhashable Arguments, and Typed Option

  • lru_cache requires function arguments to be hashable. Mutable types (lists, dicts) will raise TypeError.
  • You can convert mutable inputs to immutable representations (tuples, frozensets, or serialized JSON) for caching keys.
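
The simplest form of that conversion happens at the call site, before the cached function ever sees the value — a small sketch:

from functools import lru_cache

@lru_cache(maxsize=128)
def summarize(values: tuple) -> float:
    # a tuple is hashable, so it works as a cache key
    return sum(values) / len(values)

readings = [3.2, 4.8, 5.1]         # a list is unhashable...
print(summarize(tuple(readings)))  # ...so convert it to a tuple before calling

The wrapper below generalizes this idea to arbitrary argument structures.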
Example: wrapper that canonicalizes arguments:
from functools import lru_cache, wraps
import json

def hashable_args(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Convert args/kwargs to a string representation; ensure deterministic ordering
        key = json.dumps({"args": args, "kwargs": kwargs}, sort_keys=True, default=str)
        return func(key)
    return wrapper

@hashable_args           # outermost: canonicalizes arbitrary arguments into a hashable key
@lru_cache(maxsize=128)  # innermost: caches on the key string it receives
def compute_from_key(key_str):
    # compute_from_key receives a key string; it can parse key_str if needed,
    # or simply use the key to look up / store results.
    return f"Computed {key_str}"

Caveats:

  • Using JSON for keys must ensure deterministic serialization (use sort_keys=True).
  • For large argument structures, building keys can add overhead.
The typed parameter:
  • @lru_cache(maxsize=128, typed=True) treats arguments of different types as distinct keys (e.g., 1 and 1.0 are different).
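
A quick way to see the difference (a tiny sketch):

from functools import lru_cache

@lru_cache(maxsize=128, typed=True)
def identity(x):
    return x

identity(1)
identity(1.0)
print(identity.cache_info())  # misses=2: int 1 and float 1.0 are cached separately

With the default typed=False, the second call would be a hit, because 1 == 1.0 and they hash equally.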

Common Pitfalls and How to Avoid Them

  1. Unhashable Arguments: Convert to immutable types or compute a stable key.
  2. Stale Data: Cache doesn't know when the underlying data changes. Strategies: time-based invalidation, versioned keys, or explicit cache_clear().
  3. Memory Growth: Unbounded caches (maxsize=None) can cause OOM. Always consider a reasonable maxsize.
  4. Side Effects: Functions with side effects (writing to DB, network calls) are poor candidates for memoization.
  5. Exception Handling: If a function raises an exception, nothing is cached for that call, so the next call retries it (demonstrated after this list). You may want to catch and handle expected exceptions inside the function.
  6. Multiprocessing Misunderstanding: Expecting a single shared cache across processes leads to surprises — consider external caches.
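
To illustrate pitfall 5, here is a short sketch showing that a raised exception never enters the cache, so the next call simply retries (flaky_fetch and the module-level counter are illustrative):

from functools import lru_cache

attempts = 0

@lru_cache(maxsize=32)
def flaky_fetch(key: str) -> str:
    global attempts
    attempts += 1
    if attempts == 1:
        raise ConnectionError("transient failure")  # nothing is cached for this call
    return f"data for {key}"

try:
    flaky_fetch("alpha")
except ConnectionError:
    pass

print(flaky_fetch("alpha"))      # retried and now cached
print(flaky_fetch.cache_info())  # hits=0, misses=2, currsize=1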

Best Practices

  • Cache pure functions (no side effects, deterministic).
  • Choose sensible maxsize values and monitor cache_info().
  • Use cache_clear() in tests or when underlying state changes (see the fixture sketch after this list).
  • For web apps deployed with multiple workers, prefer a centralized cache (Redis) or use frameworks' caching utilities (e.g., Flask-Caching).
  • Profile before and after caching; not all functions benefit.
  • Use typed=True only if you intentionally want type distinctions in keys.
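
For the cache_clear() testing practice above, a minimal pytest-style sketch (mymodule and expensive_query are hypothetical stand-ins for your own code):

import pytest
from mymodule import expensive_query  # hypothetical module holding the cached function

@pytest.fixture(autouse=True)
def clear_cache():
    # Start every test with a cold cache and leave it clean afterwards
    expensive_query.cache_clear()
    yield
    expensive_query.cache_clear()

def test_expensive_query_is_cached():
    expensive_query("alpha")
    expensive_query("alpha")
    assert expensive_query.cache_info().hits == 1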

Combining lru_cache with concurrent.futures and multiprocessing — Patterns

Pattern 1: Use lru_cache + ThreadPoolExecutor for I/O-bound tasks where threads share cache.

Pattern 2: For CPU-bound work, use ProcessPoolExecutor but rely on shared external cache for cross-process memoization (Redis, disk-based, or Manager dict for small workloads).

Pattern 3: Use concurrent.futures as the high-level abstraction for parallel execution — it's simpler than raw multiprocessing and works nicely with thread-shared lru_cache.

Example: Hybrid pipeline

  • Use lru_cache in main process to avoid re-submitting known work.
  • Submit remaining unique work to ProcessPoolExecutor for heavy CPU work.
  • Collect and optionally store results in external cache.
from concurrent.futures import ProcessPoolExecutor, as_completed
from functools import lru_cache

@lru_cache(maxsize=1024)
def preprocess_key(k):
    # cheap canonicalization, cached in the main process
    return k.strip().lower()

def heavy_task(k):
    # executed in worker processes, no shared lru_cache here
    import time
    time.sleep(2)
    return k, len(k)

def batch_process(keys):
    unique = {}
    # Preprocess and deduplicate using the main-process cache
    for k in keys:
        pk = preprocess_key(k)
        unique.setdefault(pk, []).append(k)

    results = {}
    with ProcessPoolExecutor() as ex:
        futures = {ex.submit(heavy_task, pk): pk for pk in unique}
        for fut in as_completed(futures):
            pk = futures[fut]
            results[pk] = fut.result()
    return results

This pattern reduces duplicate heavy tasks before distributing work to workers.

Advanced Tips

  • For async functions, consider asynchronous caches or wrap synchronous caches carefully.
  • If you need more control (time-based expiration, eviction policies other than LRU), look at cachetools (TTLCache, LFU, etc.), as shown in the sketch after this list.
  • Use logging and cache_info() to decide maxsize. If hits are low, your cache keys may be too variable.
  • On long-running services, periodically cache_clear() if the function depends on external mutable resources.
  • When unit-testing, call myfunc.cache_clear() in test setup/teardown.
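
For the cachetools tip above, a minimal sketch of time-based expiration with TTLCache (assumes `pip install cachetools`; fetch_rates is an illustrative name):

import time

from cachetools import TTLCache, cached

# Up to 1024 entries, each expiring 60 seconds after it was stored
@cached(cache=TTLCache(maxsize=1024, ttl=60))
def fetch_rates(currency: str) -> dict:
    time.sleep(2)  # simulate a slow external call
    return {"currency": currency, "fetched_at": time.time()}

Unlike lru_cache, cachetools' cached decorator does no locking by default; pass lock=threading.RLock() if several threads will share the cache.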

Example: Implementing a TTL-like Behavior with lru_cache

lru_cache doesn't support TTL natively, but you can approximate it by folding a bucketed timestamp into the cache key:
import time
from functools import lru_cache

def with_ttl(ttl_seconds):
    def decorator(func):
        @lru_cache(maxsize=1024)
        def wrapper(*args, _ts, **kwargs):
            return func(*args, **kwargs)

        def wrapped(*args, **kwargs):
            # compute a bucketed timestamp to invalidate the cache periodically
            ts = int(time.time() / ttl_seconds)
            return wrapper(*args, _ts=ts, **kwargs)

        wrapped.cache_clear = wrapper.cache_clear
        wrapped.cache_info = wrapper.cache_info
        return wrapped
    return decorator

@with_ttl(60)
def data_fetch(x):
    # expensive fetch
    return x * 2

Note: this adds a synthetic _ts component to the cache key so that entries effectively expire every ttl_seconds (old buckets simply age out of the LRU cache).

Conclusion

functools.lru_cache is a powerful, simple tool for efficient memoization in Python. Use it for pure, expensive, or frequently repeated computations to reduce runtime and simplify logic. However, remember:
  • It's per-process: use external caches for multi-process sharing.
  • Arguments must be hashable or normalized.
  • Consider memory limits and invalidation needs.
By combining lru_cache smartly with concurrent.futures, careful process design, or external caching layers, you can build responsive, performant Python applications — from CLI tools to interactive Dash dashboards.

Call to Action

Try the examples in this post:

  • Experiment with cache_info() and adjust maxsize.
  • Convert a real slow function in your project to use lru_cache and measure speedups.
  • If you deploy a Dash app, try lru_cache for local caches and evaluate a Redis backend for multi-worker deployments.
If you’d like, paste a sample function from your project and I’ll help you design an appropriate caching strategy.

Happy caching — keep your code fast and your caches coherent!

