Performance Benchmarks

Latest Benchmark Data: Updated February 2026 with measurements on both local SSD and network storage.

⚠️ DISCLAIMER: Benchmark results vary based on hardware, filesystem, and directory structure. Always run your own benchmarks on your target systems for accurate performance data specific to your use case.

Benchmark Methodology

Cache Management & Fair Testing

All benchmarks use controlled cache clearing to represent real-world performance:

--clear-cache: Clears filesystem cache before each backend (prevents warm-cache bias)
--warmup: Single warmup iteration before timed runs (eliminates cold-start artifacts)
--shuffle: Randomizes backend execution order (prevents first-run advantage)
Iterations: Multiple runs per backend, results reported as median
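
The options above map onto a simple measurement loop. The following is a minimal sketch of that loop, not the code in benchmarks/benchmark.py: `run_trials`, the backend callables, and the `clear_cache` hook are hypothetical names, and the ordering of cache clearing and warmup is an assumption.

```python
import random
import statistics
import time


def run_trials(backends, iterations=5, warmup=True, shuffle=True, clear_cache=None):
    """Time each backend callable and return its median wall-clock time.

    backends: dict mapping a name to a zero-argument callable (hypothetical).
    clear_cache: optional zero-argument hook, run once before each backend.
    """
    names = list(backends)
    if shuffle:
        random.shuffle(names)  # avoid a fixed first-run advantage
    medians = {}
    for name in names:
        run = backends[name]
        if clear_cache is not None:
            clear_cache()  # assumed order: clear first, then warm up
        if warmup:
            run()  # single untimed warmup iteration
        times = []
        for _ in range(iterations):
            start = time.perf_counter()
            run()
            times.append(time.perf_counter() - start)
        medians[name] = statistics.median(times)  # median resists outliers
    return medians


if __name__ == "__main__":
    demo = {
        "small": lambda: sum(range(1_000)),
        "large": lambda: sum(range(200_000)),
    }
    for name, seconds in sorted(run_trials(demo).items(), key=lambda kv: kv[1]):
        print(f"{name}: {seconds:.6f}s")
```

Reporting the median rather than the mean keeps a single slow run (for example, one hit by background I/O) from skewing the result.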

Test Datasets

Dataset	Files	Storage	Hardware
XXLarge	1,000,000	Local NVMe SSD	MacBook Air M4 (16GB RAM)
XLarge	200,000	Network NFS	Linux VM: Intel Xeon Gold 6442Y (4 cores), 24GB RAM

Performance Results

Latest Results (XXLarge Dataset, Local SSD, M4 MacBook Air 16GB)

1,000,000 files - XXLarge dataset on Apple Silicon - Median of 5 trials

Hardware: MacBook Air M4 (16GB RAM), Local NVMe SSD

Test Command:

uv run python benchmarks/benchmark.py \
    --path /tmp/tmp_yk/bench-test \
    --dataset-size xxlarge \
    --backend profiling \
    -n 5 --clear-cache --warmup --shuffle

Profiling Backends (Full Metadata Collection)

Backend      │ Median Time │ Files/sec  │ Relative Speed
─────────────┼─────────────┼────────────┼───────────────
Rust         │ 7.335s      │ 136,331    │ 1.00x (baseline)
Async        │ 11.495s     │ 86,994     │ 0.64x
Rust-seq     │ 22.475s     │ 44,495     │ 0.33x
fd           │ 33.329s     │ 30,004     │ 0.22x
Python       │ 35.512s     │ 28,160     │ 0.21x
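
The derived columns can be recomputed from the median times. The sketch below assumes that Files/sec is the file count divided by the median time and that Relative Speed is the baseline time divided by the backend's time; `derive_columns` is a hypothetical helper, not part of the benchmark tool. Recomputed rates can differ from the table by a few files/sec because the printed times are rounded.

```python
def derive_columns(median_times, total_files):
    """Return {backend: (files_per_sec, relative_speed)}.

    The fastest backend (smallest median time) is used as the 1.00x baseline.
    """
    baseline = min(median_times.values())
    return {
        name: (total_files / seconds, baseline / seconds)
        for name, seconds in median_times.items()
    }


# Median times copied from the profiling table above (1,000,000 files).
profiling = {
    "Rust": 7.335,
    "Async": 11.495,
    "Rust-seq": 22.475,
    "fd": 33.329,
    "Python": 35.512,
}

for name, (rate, relative) in derive_columns(profiling, 1_000_000).items():
    print(f"{name:<9} {rate:>10,.0f} files/sec  {relative:.2f}x")
```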

Traversal Backends (Discovery Only)

Backend      │ Median Time │ Files/sec  │ Relative Speed
─────────────┼─────────────┼────────────┼───────────────
os.walk      │ 0.565s      │ 1,771,045  │ 1.00x (baseline)
rust-fast    │ 4.385s      │ 228,035    │ 0.13x
async-fast   │ 10.839s     │ 92,262     │ 0.05x
pathlib      │ 19.305s     │ 51,799     │ 0.03x

Key Observations (Local SSD):

Rust profiling dominates at 1M files (7.3s vs 35.5s for Python)
os.walk is fastest for pure traversal (0.565s), validating Python's efficiency for discovery
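Rust's parallelism pays off more on local SSD than on network storage: Rust is about 1.6x faster than Async here (7.335s vs 11.495s), vs about 1.2x on NFS (2.331s vs 2.842s, on a smaller dataset)
Async is a viable alternative at 0.64x of the Rust baseline on local SSD, vs 0.82x on network storage
In fast-path (traversal-only) mode the parallel backends carry significant overhead: rust-fast (4.385s) and async-fast (10.839s) are both far slower than the os.walk baseline (0.565s)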

Earlier Results (XLarge Dataset, Network Storage, Profiling)

200,000 files - Full metadata collection - Median of 3 trials

Test Command:

uv run python benchmarks/benchmark.py \
    --path /XYZdata/tmp_yk/bench-test \
    --dataset-size xlarge \
    --backend profiling \
    -n 3 --clear-cache --warmup --shuffle

Backend      │ Median Time │ Files/sec  │ Relative Speed
─────────────┼─────────────┼────────────┼───────────────
Rust         │ 2.331s      │ 85,807     │ 1.00x (baseline)
Async        │ 2.842s      │ 70,384     │ 0.82x
Rust-seq     │ 8.584s      │ 23,300     │ 0.27x
fd           │ 14.386s     │ 13,902     │ 0.16x
Python       │ 15.146s     │ 13,205     │ 0.15x

Key Observations:

Rust parallel remains fastest on network storage (2.331s)
Async backend (Tokio) performs well on high-latency NFS operations (0.82x of the Rust baseline)
Sequential backends (Rust-seq at 0.27x, Python at 0.15x) show their limitations at scale
Results come from an actual network filesystem (/XYZdata) and reflect real-world performance

Traversal Performance (XLarge Dataset, Fast-Path Only)

200,000 files - File path discovery without metadata collection - Median of 3 trials

Test Command:

uv run python benchmarks/benchmark.py \
    --path /XYZdata/tmp_yk/bench-test \
    --dataset-size xlarge \
    --backend traversal \
    -n 3 --clear-cache --warmup --shuffle

Backend      │ Median Time │ Files/sec  │ Relative Speed
─────────────┼─────────────┼────────────┼───────────────
os.walk      │ 0.412s      │ 485,171    │ 1.00x (baseline)
rust-fast    │ 0.640s      │ 312,698    │ 0.64x
async-fast   │ 2.742s      │ 72,931     │ 0.15x
pathlib      │ 9.933s      │ 20,136     │ 0.04x

Key Observations:

Rust fast-path is slower than pure os.walk for simple path discovery (0.64x)
Python os.walk is very efficient for discovery-only workloads
Async-fast introduces significant overhead when metadata collection is skipped
Pathlib's recursive glob remains slowest at scale
Fast-path variants gain less from parallelism than the profiling backends do, because skipping metadata collection leaves less per-file work to parallelize

Key Insights

🦀 Rust excels at metadata collection - Fastest profiling backend on both tested storage types (local SSD and NFS)
⚡ Async is a strong alternative - 0.64x of the Rust baseline on local SSD and 0.82x on NFS
🐍 Python dominates pure discovery - os.walk is the fastest traversal-only backend on both tested storage types
🎯 Backend selection depends on task - Use Rust for metadata collection, os.walk for discovery-only
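
The task-based selection rule can be written as a small decision helper. This is an illustrative sketch only: `pick_backend` is a hypothetical function, not part of filoma, and the returned strings are simply the backend labels used in the tables above. Falling back to Python when Rust is unavailable is an assumption based on Python being the always-available baseline.

```python
def pick_backend(needs_metadata: bool, rust_available: bool = True) -> str:
    """Pick a backend label from the task type (hypothetical helper)."""
    if not needs_metadata:
        # Discovery-only: os.walk was fastest in both traversal benchmarks.
        return "os.walk"
    # Full metadata collection: Rust was fastest in both profiling benchmarks.
    return "Rust" if rust_available else "Python"


if __name__ == "__main__":
    for needs_metadata in (True, False):
        print(f"needs_metadata={needs_metadata}: {pick_backend(needs_metadata)}")
```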

Backend Groups for Fair Comparisons

The benchmark tool separates backends into groups to ensure fair performance comparisons:

Profiling Backends

These backends perform full metadata collection (permissions, sizes, timestamps, etc.):

Rust - Rayon parallel scanner, most optimized for metadata collection
Rust-seq - Sequential Rust baseline for comparison
Async - Tokio async scanner, excellent for high-latency network operations
fd - External tool, traversal optimized but good reference point
Python - Pure Python with os.walk, always available baseline

Use profiling backends to benchmark complete directory scanning with all metadata:

python benchmarks/benchmark.py --path /path -n 3 --backend profiling

Traversal Backends

These backends only discover file paths (fast-path mode - no metadata collection):

os.walk - Python standard library baseline
pathlib - Python pathlib.Path.rglob
rust-fast - Rust with fast_path_only=True for pure discovery
async-fast - Async with fast_path_only=True for pure discovery
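
The two pure-Python traversal baselines correspond to standard-library calls. The sketch below shows one way to count files with each; how benchmarks/benchmark.py actually invokes them is an assumption, and `count_with_os_walk`, `count_with_rglob`, and `make_tree` are hypothetical helpers written for this example.

```python
import os
import tempfile
import time
from pathlib import Path


def count_with_os_walk(root):
    """Count files by summing the filename lists that os.walk yields."""
    return sum(len(filenames) for _dirpath, _dirnames, filenames in os.walk(root))


def count_with_rglob(root):
    """Count files via Path.rglob, which builds a Path object per entry."""
    return sum(1 for p in Path(root).rglob("*") if p.is_file())


def make_tree(root, dirs=20, files_per_dir=50):
    """Create a small flat test tree: dirs subdirectories with empty files."""
    for d in range(dirs):
        sub = Path(root) / f"dir_{d:03d}"
        sub.mkdir()
        for f in range(files_per_dir):
            (sub / f"file_{f:03d}.txt").touch()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_tree(root)
        for label, fn in [("os.walk", count_with_os_walk), ("pathlib", count_with_rglob)]:
            start = time.perf_counter()
            n = fn(root)
            elapsed = time.perf_counter() - start
            print(f"{label}: {n} files in {elapsed:.4f}s")
```

A tree this small only checks that both approaches agree on the count; meaningful timing differences need datasets closer to the sizes used above.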

Use traversal backends to measure raw discovery performance:

python benchmarks/benchmark.py --path /path -n 3 --backend traversal

This separation is important because:

Fair comparisons - Profiling and traversal measure different capabilities
Realistic expectations - Fast-path shows what's possible with just path discovery
Use case matching - Choose comparison group matching your actual use case
Methodology clarity - Eliminates confusion about what's being measured

Network Storage Notes

The XLarge results above are from actual network-mounted storage (/XYZdata) using the NFS protocol:

Rust parallel excels even on network storage, thanks to bounded concurrency and a work-queue pattern
Async backend is a competitive alternative with lower CPU overhead for high-latency operations
fd ran at 0.16x of the Rust baseline in the NFS profiling benchmark above (14.386s), close to the pure Python backend

Network I/O patterns differ from local storage; your results may vary based on NFS server, latency, and network configuration.
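
To illustrate what "bounded concurrency and a work-queue pattern" means, here is a minimal Python sketch using a thread pool. It is not the Rust backend's implementation: `scan_dir` and `parallel_count` are hypothetical names, and the worker count is an arbitrary example value. Each directory is one unit of work, and at most `max_workers` directories are listed at the same time, so slow NFS round-trips overlap without flooding the server.

```python
import os
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def scan_dir(path):
    """List one directory and return (subdirectory paths, file count)."""
    subdirs, files = [], 0
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                else:
                    files += 1
    except OSError:
        pass  # unreadable directory: skip it rather than abort the scan
    return subdirs, files


def parallel_count(root, max_workers=8):
    """Count files under root, scanning at most max_workers directories at once."""
    total = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {pool.submit(scan_dir, root)}
        while pending:
            # Block until at least one directory listing has finished.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                subdirs, files = future.result()
                total += files
                # Newly discovered directories go back onto the work queue.
                for sub in subdirs:
                    pending.add(pool.submit(scan_dir, sub))
    return total


if __name__ == "__main__":
    print(parallel_count(os.getcwd()))
```

Threads suit this sketch because directory listing is I/O-bound; the pool size is the bound on concurrent requests.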

Benchmarking Best Practices

Accurate Performance Testing

import subprocess
import time
# DirectoryProfilerConfig is assumed to be exported alongside DirectoryProfiler
from filoma.directories import DirectoryProfiler, DirectoryProfilerConfig

def clear_filesystem_cache():
    """Clear OS filesystem cache for realistic benchmarks (Linux only, needs sudo)."""
    subprocess.run(['sync'], check=True)  # Flush dirty pages to disk first
    # Writing 3 to drop_caches frees the page cache plus dentries and inodes
    subprocess.run(['sudo', 'tee', '/proc/sys/vm/drop_caches'],
                   input='3\n', text=True, stdout=subprocess.DEVNULL, check=True)
    time.sleep(1)  # Let cache clear settle

def benchmark_backend(backend_name, path, iterations=3):
    """Benchmark a specific backend with cold cache."""
    profiler = DirectoryProfiler(DirectoryProfilerConfig(search_backend=backend_name, show_progress=False))

    # Check if the specific backend is available
    available = ((backend_name == "rust" and profiler.is_rust_available()) or
                (backend_name == "fd" and profiler.is_fd_available()) or
                (backend_name == "python"))  # Python always available
    if not available:
        return None

    times = []
    result = None
    for _ in range(iterations):
        clear_filesystem_cache()
        start = time.perf_counter()  # Monotonic, high-resolution timer
        result = profiler.probe(path)
        elapsed = time.perf_counter() - start
        times.append(elapsed)

    avg_time = sum(times) / len(times)
    files_per_sec = result['summary']['total_files'] / avg_time

    return {
        'backend': backend_name,
        'avg_time': avg_time,
        'files_per_sec': files_per_sec,
        'total_files': result['summary']['total_files']
    }

# Example usage
results = []
for backend in ['rust', 'fd', 'python']:
    result = benchmark_backend(backend, '/test/directory')
    if result:
        results.append(result)
        print(f"{backend}: {result['avg_time']:.3f}s ({result['files_per_sec']:.0f} files/sec)")

# Find fastest
if results:
    fastest = min(results, key=lambda x: x['avg_time'])
    print(f"\n🏆 Fastest: {fastest['backend']}")

Performance Tips

Disable progress bars for benchmarking: show_progress=False
Use fast path only for discovery benchmarks: fast_path_only=True
Clear filesystem cache between runs for realistic results
Run multiple iterations and report the median (as in the tables above) or the mean
Test on your target storage - results vary by filesystem type

Warm vs Cold Cache Comparison

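# Reuses the imports and clear_filesystem_cache() from the example above
profiler = DirectoryProfiler(DirectoryProfilerConfig(show_progress=False))

# Cold cache (realistic)
clear_filesystem_cache()
start = time.perf_counter()
result = profiler.probe("/test/directory")
cold_time = time.perf_counter() - start

# Warm cache (for comparison only): the cold run above has just populated the cache
start = time.perf_counter()
result = profiler.probe("/test/directory")
warm_time = time.perf_counter() - start

print(f"Cold cache: {cold_time:.3f}s (realistic)")
print(f"Warm cache: {warm_time:.3f}s (cached, {cold_time/warm_time:.1f}x slower when cold)")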

⚠️ Important: Always use cold cache for realistic benchmarks. Warm cache results can be 2-8x faster but don't represent real-world performance for first-time directory access.

Backend Selection Recommendations

Use Case	Recommended Backend	Why
Large directories	Auto (Rust if available)	Best overall performance
Network filesystems	Auto (Rust if available) or Async	Fastest profiling backends in the NFS benchmarks above
CI/CD environments	Auto	Reliable with graceful fallbacks
Maximum compatibility	`python`	Always works, no dependencies
DataFrame analysis	Auto (Rust if available)	Fastest DataFrame building
Pattern matching	`fd`	Advanced regex/glob support

Your Results May Vary

Performance depends on:

Storage type - NVMe SSD > SATA SSD > HDD
Filesystem - ext4, NTFS, APFS, NFS all behave differently
Directory structure - Deep vs wide, file size distribution
System load - CPU, memory, I/O contention
Network latency - Critical for NFS/network storage

Run your own benchmarks on your target systems for accurate performance data.
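
As a starting point for testing the effect of directory structure, the sketch below builds a "deep" and a "wide" tree with the same number of files, so the two can be scanned with any backend. `build_tree` is a hypothetical helper and the sizes are deliberately tiny; scale `depth`, `width`, and `files_per_dir` up before drawing any timing conclusions.

```python
import os
import tempfile
from pathlib import Path


def build_tree(root, depth, width, files_per_dir):
    """Create files_per_dir files in root, then width subdirectories per level.

    Recurses depth levels below root and returns the number of files created.
    """
    root = Path(root)
    for i in range(files_per_dir):
        (root / f"file_{i:04d}.dat").touch()
    created = files_per_dir
    if depth > 0:
        for w in range(width):
            sub = root / f"sub_{w:03d}"
            sub.mkdir()
            created += build_tree(sub, depth - 1, width, files_per_dir)
    return created


def count_files(root):
    """Count files under root with os.walk."""
    return sum(len(filenames) for _dirpath, _dirnames, filenames in os.walk(root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as deep, tempfile.TemporaryDirectory() as wide:
        # Deep: a chain of 10 nested directories. Wide: 9 siblings under one root.
        n_deep = build_tree(deep, depth=9, width=1, files_per_dir=10)
        n_wide = build_tree(wide, depth=1, width=9, files_per_dir=10)
        print(f"deep tree: {n_deep} files, wide tree: {n_wide} files")
        print(f"os.walk sees {count_files(deep)} and {count_files(wide)} files")
```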