Performance Optimization

Guide to maximizing ElectoralSim performance for large-scale simulations.

Benchmarks

Scale	Create Time	Election Time	Memory
10K voters	18ms	5ms	52 MB
100K voters	109ms	35ms	15 MB
500K voters	608ms	176ms	95 MB
1M voters	1.2s	316ms	148 MB
2M voters	2.4s	672ms	181 MB

Batch processing: ~200ms/election at 500K voters (~5 elections/sec)
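
To read the table as a scaling curve, it can help to normalize election time by electorate size. The sketch below only does arithmetic on the figures quoted above. The dictionary and helper names are illustrative and not part of ElectoralSim.

```python
# Election times copied from the benchmark table above (voters -> milliseconds).
ELECTION_TIME_MS = {
    10_000: 5,
    100_000: 35,
    500_000: 176,
    1_000_000: 316,
    2_000_000: 672,
}


def microseconds_per_voter(n_voters: int, election_ms: float) -> float:
    """Convert a whole-election time into an average cost per voter."""
    return round(election_ms * 1000 / n_voters, 3)


def elections_per_second(ms_per_election: float) -> float:
    """Convert a per-election latency into throughput."""
    return 1000 / ms_per_election


per_voter = {n: microseconds_per_voter(n, ms) for n, ms in ELECTION_TIME_MS.items()}
for n, cost in per_voter.items():
    print(f"{n:>9,} voters: {cost:.3f} us per voter")

# Batch figure quoted above: ~200 ms per election at 500K voters.
print(f"Batch throughput: {elections_per_second(200):.0f} elections/sec")
```

From 100K voters upward the per-voter cost stays between roughly 0.32 and 0.35 microseconds, which suggests election time grows about linearly with electorate size over this range.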

Numba JIT Acceleration

Core loops are accelerated with Numba, which is used by default when it is installed:

from electoral_sim.engine.numba_accel import NUMBA_AVAILABLE

print(f"Numba available: {NUMBA_AVAILABLE}")  # True if installed

Accelerated Functions

vote_mnl_fast — Multinomial logit voting
fptp_count_fast — FPTP vote counting
compute_utilities_numba — Utility matrix calculation
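
As a rough mental model of what the first two functions compute, here is a plain-Python sketch of multinomial logit voting followed by a first-past-the-post count. It is not the library's Numba implementation: the function names, the list-of-lists utility layout, and the use of the standard `random` module are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mnl_probabilities(utilities):
    """Softmax over one voter's utilities (one entry per party)."""
    top = max(utilities)  # subtract the max so exp() cannot overflow
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def cast_votes(utility_rows, rng):
    """Draw one party index per voter from that voter's choice probabilities."""
    votes = []
    for row in utility_rows:
        probs = mnl_probabilities(row)
        votes.append(rng.choices(range(len(row)), weights=probs, k=1)[0])
    return votes


def fptp_winner(votes):
    """Count votes and return (winning party index, full tally)."""
    tally = Counter(votes)
    return tally.most_common(1)[0][0], tally


rng = random.Random(42)
# Three voters, two parties; higher utility means a more attractive party.
utilities = [[1.0, 0.2], [0.1, 0.9], [0.5, 0.5]]
votes = cast_votes(utilities, rng)
winner, tally = fptp_winner(votes)
print(votes, winner, dict(tally))
```

The per-voter Python loop is exactly the kind of code that benefits from JIT compilation or vectorization once the electorate reaches hundreds of thousands of voters.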

First-Run Compilation

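Numba compiles each accelerated function the first time it is called, so expect a one-time delay of roughly 500ms on the first run. Exclude that warm-up call when benchmarking.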
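
One way to keep compilation time out of a measurement is to time the first call separately from the calls that follow. The helper below is a generic sketch using only the standard library. `FakeJitKernel` is a hypothetical stand-in that sleeps once to imitate compilation, so the example runs without Numba installed.

```python
import statistics
import time


def time_with_warmup(fn, *args, repeats=5):
    """Return (first_call_seconds, median_steady_seconds) for fn(*args)."""
    start = time.perf_counter()
    fn(*args)  # first call: includes any one-time compilation cost
    first = time.perf_counter() - start

    steady = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        steady.append(time.perf_counter() - start)
    return first, statistics.median(steady)


class FakeJitKernel:
    """Stand-in for a JIT-compiled function: slow once, fast afterwards."""

    def __init__(self):
        self.compiled = False

    def __call__(self, n):
        if not self.compiled:
            time.sleep(0.05)  # pretend compilation
            self.compiled = True
        return sum(range(n))


first, steady = time_with_warmup(FakeJitKernel(), 10_000)
print(f"first call: {first * 1000:.1f} ms, steady state: {steady * 1000:.3f} ms")
```

In a real benchmark, `fn` would be whatever call triggers the accelerated code path, for example a small election run before the timed one.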

GPU Acceleration (CuPy)

For simulations with 1M+ voters, enable GPU acceleration with use_gpu=True:

model = ElectionModel(
    n_voters=2_000_000,
    use_gpu=True,
    seed=42
)

Requirements

pip install cupy-cuda12x  # For CUDA 12
# or
pip install cupy-cuda11x  # For CUDA 11

Checking GPU Availability

from electoral_sim.engine.gpu_accel import is_gpu_available

if is_gpu_available():
    print("GPU acceleration enabled")
else:
    print("Falling back to CPU/Numba")
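
The availability check can be combined with the 1M-voter guideline to pick a backend up front. The sketch below is an assumption about how such a decision could be written, not ElectoralSim's own logic: `choose_backend` and `GPU_THRESHOLD` are hypothetical names, and an installed `cupy` package does not by itself guarantee a working GPU, so a real check should still call `is_gpu_available()`.

```python
from importlib.util import find_spec

GPU_THRESHOLD = 1_000_000  # voter count from which this guide suggests the GPU


def module_installed(name: str) -> bool:
    """True if the named top-level package can be imported."""
    return find_spec(name) is not None


def choose_backend(n_voters: int, installed=module_installed) -> str:
    """Pick 'gpu', 'numba', or 'numpy' from electorate size and installed packages."""
    if n_voters >= GPU_THRESHOLD and installed("cupy"):
        return "gpu"
    if installed("numba"):
        return "numba"
    return "numpy"


for n in (100_000, 2_000_000):
    print(f"{n:>9,} voters -> {choose_backend(n)}")
```

Passing the `installed` predicate as a parameter keeps the decision testable on machines without a GPU.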

GPU-Accelerated Operations

Utility matrix computation
Multinomial logit sampling
Distance calculations

Batch Elections

For Monte Carlo analysis, use batch processing:

model = ElectionModel(n_voters=500_000, seed=42)

# Run 100 elections efficiently
results = model.run_elections_batch(
    n_elections=100,
    reset_voters=False  # Reuse voter frame
)

# Get aggregate statistics
stats = model.get_aggregate_stats(results)

reset_voters Parameter

False: Reuse same voter population (faster, same demographics)
True: Regenerate stochastic columns (ideology, turnout) each run
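
For Monte Carlo work, the quantity of interest is usually the spread of outcomes across runs. The shape of the objects returned by `run_elections_batch` is not described here, so the sketch below assumes a hypothetical layout (one dict of party to seats per run) and made-up numbers, purely to illustrate the aggregation step that something like `get_aggregate_stats` might perform.

```python
import statistics
from collections import defaultdict


def summarize_seats(results):
    """Mean and sample standard deviation of seats per party across runs."""
    parties = set()
    for run in results:
        parties |= run.keys()

    seats_by_party = defaultdict(list)
    for run in results:
        for party in parties:
            # A party missing from a run is treated as winning zero seats.
            seats_by_party[party].append(run.get(party, 0))

    return {
        party: {
            "mean": statistics.mean(seats),
            "stdev": statistics.stdev(seats) if len(seats) > 1 else 0.0,
        }
        for party, seats in seats_by_party.items()
    }


# Toy data: three runs, two parties, 543 seats per run.
runs = [
    {"A": 300, "B": 243},
    {"A": 290, "B": 253},
    {"A": 310, "B": 233},
]
summary = summarize_seats(runs)
for party in sorted(summary):
    print(party, summary[party])
```

With `reset_voters=False` the variation across runs comes only from stochastic voting, while `reset_voters=True` also folds in variation from the regenerated voter columns.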

Data Type Optimization

Voter data uses optimized dtypes:

Column	dtype	Memory per value
Integers (age, knowledge)	int8/int16	1-2 bytes
Floats (ideology, probabilities)	float32	4 bytes
IDs	int64	8 bytes
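
The per-value sizes translate directly into a lower bound on frame size. The following back-of-the-envelope sketch multiplies bytes per row by the row count for an example schema with one int16, three float32, and one int64 column. The helper name is illustrative, and the estimate ignores any per-column overhead the DataFrame library adds.

```python
BYTES_PER_VALUE = {"int8": 1, "int16": 2, "float32": 4, "int64": 8, "float64": 8}


def frame_bytes(schema: dict, n_rows: int) -> int:
    """Raw data size: rows times the summed width of all columns."""
    return n_rows * sum(BYTES_PER_VALUE[dtype] for dtype in schema.values())


compact = {
    "constituency": "int16",
    "ideology_x": "float32",
    "ideology_y": "float32",
    "turnout_prob": "float32",
    "unique_id": "int64",
}
# Same columns stored at default 64-bit width, for comparison.
wide = {name: ("int64" if "int" in dtype else "float64") for name, dtype in compact.items()}

n = 1_000_000
print(f"compact: {frame_bytes(compact, n) / 1e6:.0f} MB")
print(f"wide:    {frame_bytes(wide, n) / 1e6:.0f} MB")
```

At 22 bytes per row instead of 40, the compact schema needs a little over half the memory of the 64-bit version for the same million voters.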

Custom Voter Frame

For maximum control:

import polars as pl
import numpy as np

# Create a minimal voter frame with a seeded generator so it is reproducible
n = 1_000_000
rng = np.random.default_rng(42)
voter_frame = pl.DataFrame({
    "constituency": rng.integers(0, 543, n, dtype=np.int16),
    "ideology_x": rng.standard_normal(n, dtype=np.float32),
    "ideology_y": rng.standard_normal(n, dtype=np.float32),
    "turnout_prob": rng.uniform(0.6, 0.9, n).astype(np.float32),
    "unique_id": np.arange(n, dtype=np.int64),
})

model = ElectionModel(voter_frame=voter_frame)

Memory Management

For very large simulations:

import gc

# Run election
results = model.run_election()

# Clear cached data
model.voters.invalidate_cache()

# Force garbage collection
gc.collect()
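
To confirm that clearing caches actually frees memory, the standard library's `tracemalloc` can report traced allocations before and after the cleanup. This is a generic sketch in which a large list stands in for cached simulation data. Memory allocated outside Python's allocator (for example by native extensions) may not show up in these numbers.

```python
import gc
import tracemalloc


def traced_mb() -> float:
    """Currently traced Python allocations, in megabytes."""
    current, _peak = tracemalloc.get_traced_memory()
    return current / 1e6


tracemalloc.start()
baseline = traced_mb()

big = [float(i) for i in range(200_000)]  # stand-in for cached simulation data
held = traced_mb()

del big          # drop the last reference
gc.collect()     # collect anything kept alive only by reference cycles
released = traced_mb()
tracemalloc.stop()

print(f"baseline {baseline:.2f} MB, held {held:.2f} MB, after cleanup {released:.2f} MB")
```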

Parallel Processing

For multiple independent simulations:

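from concurrent.futures import ProcessPoolExecutor

def run_simulation(seed):
    # Each worker builds its own model; a distinct seed per run keeps runs independent
    model = ElectionModel(n_voters=100_000, seed=seed)
    return model.run_election()

# The __main__ guard is required on platforms that spawn worker processes (Windows, macOS)
if __name__ == "__main__":
    # Run 10 simulations across 4 worker processes
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(run_simulation, range(10)))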

Profiling

Identify bottlenecks:

import cProfile
import pstats

with cProfile.Profile() as pr:
    model = ElectionModel(n_voters=100_000)
    model.run_election()

stats = pstats.Stats(pr)
stats.sort_stats('cumulative')
stats.print_stats(20)
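
Profiles of a full election can be long, so it is often useful to capture the report as text and restrict it to the functions of interest. The helper below is a sketch built on `cProfile` and `pstats` only. `profile_report` and `workload` are hypothetical names, and the `pattern` argument relies on `print_stats` accepting a regular expression as a restriction.

```python
import cProfile
import io
import pstats


def profile_report(fn, *args, pattern=None, limit=20) -> str:
    """Profile fn(*args) and return the cumulative-time report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        fn(*args)
    finally:
        profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
    if pattern is None:
        stats.print_stats(limit)
    else:
        stats.print_stats(pattern, limit)  # keep only entries matching the regex
    return buffer.getvalue()


def workload(n):
    """Stand-in for an expensive simulation step."""
    return sum(i * i for i in range(n))


report = profile_report(workload, 50_000, pattern="workload")
print(report)
```

For an actual run, a pattern such as `"electoral_sim"` would narrow the report to library code, assuming the package's file paths contain that string.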

Tips

Start small: Test with 10K voters before scaling up
Use seeds: Ensure reproducibility while benchmarking
Batch when possible: run_elections_batch is faster than repeated run_election calls
Profile first: Identify actual bottlenecks before optimizing
GPU for 1M+: Below 1M voters, CPU/Numba is often faster because host-to-device transfer overhead outweighs the GPU speedup