June 16, 2026 · 2 min read · open-source, contributing, dev

Parallel Document Uploads: From Sequential Bottleneck to Bounded Concurrency

A draft PR for xai-sdk-python adds upload_documents() to both the sync and async clients. The method uploads documents concurrently under a configurable parallelism limit, removing sequential processing as a scaling bottleneck while staying backward compatible.

What

The PR adds a new upload_documents() method to the SDK's sync and async clients. It uploads multiple documents in parallel using a bounded ThreadPoolExecutor, returns results in the same order as the input, respects a configurable max_workers limit, and re-raises the first error it encounters without suppressing others. Existing code keeps working unchanged.

Why it matters

Sequential document uploads put a hard ceiling on throughput for users processing large batches. Uploads are network-bound, so each one spends most of its time waiting on I/O. Bounded concurrency lets several uploads proceed at once without overwhelming the client or the server. This addresses the performance constraint reported in issue 77.

Who it's for

SDK users who batch-upload documents and need predictable performance at scale. Because the parallelism is bounded, the method guards against resource exhaustion and is safe for both small and large workloads. Both the sync and the async client expose the method.

When & where

This contribution is currently in draft PR 166 and has not yet been merged into the main branch. The implementation includes 98 passing sync tests and 91 passing async tests, and benchmarks show a 3.9x to 8x speedup in mocked scenarios. The code passes ruff linting.
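To give a feel for what a mocked benchmark of this kind measures, here is a minimal sketch. It is not the PR's benchmark: mock_upload, the 50 ms latency, the batch of 8 documents, and the pool size of 4 are all assumptions chosen for illustration, so the speedup it prints will not match the figures above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def mock_upload(doc: str, latency: float = 0.05) -> str:
    """Hypothetical stand-in for a network upload: sleep, then return an ID."""
    time.sleep(latency)
    return f"id-{doc}"


def timed(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


docs = [f"doc-{i}" for i in range(8)]

# Baseline: one upload at a time.
sequential, t_seq = timed(lambda: [mock_upload(d) for d in docs])

# Bounded concurrency: at most 4 uploads in flight; map() keeps input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel, t_par = timed(lambda: list(pool.map(mock_upload, docs)))

print(f"sequential {t_seq:.2f}s, parallel {t_par:.2f}s, speedup {t_seq / t_par:.1f}x")
```

With sleep-based mocks the speedup tends toward the worker count, which is why mocked numbers are best read as an upper bound rather than a prediction for real network conditions.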

How

The method accepts a list of documents and an optional max_workers parameter, which falls back to an SDK-level default when omitted. Internally, it uses a ThreadPoolExecutor to manage a bounded pool of worker threads: it submits every upload task, then collects the results in input order. On failure, it captures the first exception and re-raises it after cleanup.
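The approach described above can be sketched roughly as follows. This is not the PR's code: upload_all, upload_one, fake_upload, and the default of 4 workers are hypothetical names and values chosen for illustration, and "first error" here means the first failure in input order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

D = TypeVar("D")
R = TypeVar("R")


def upload_all(
    upload_one: Callable[[D], R],
    documents: Sequence[D],
    max_workers: int = 4,  # assumed default; the PR's actual default is not stated
) -> List[R]:
    """Upload documents in parallel and return results in input order."""
    if not documents:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front; the pool runs at most max_workers at once.
        futures = [pool.submit(upload_one, doc) for doc in documents]
        # Collecting in submission order preserves input order. result()
        # re-raises a failed task's exception, so the first failure in input
        # order propagates. Leaving the with block waits for remaining tasks.
        return [future.result() for future in futures]


def fake_upload(name: str) -> str:
    """Hypothetical uploader used only to exercise the sketch."""
    if name == "bad.pdf":
        raise ValueError(f"cannot upload {name}")
    return f"id-{name}"


print(upload_all(fake_upload, ["a.pdf", "b.pdf", "c.pdf"], max_workers=2))
# ['id-a.pdf', 'id-b.pdf', 'id-c.pdf']
```

Submitting all tasks and then reading the futures in order is one simple way to get ordered results; it trades a little latency on error reporting (a late failure is only seen once earlier futures finish) for very little code.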

Takeaway

Bounded concurrency is a practical middle ground between sequential processing and unbounded parallelism. The pattern is especially useful in SDKs, where it improves throughput without requiring users to manage thread pools themselves. Because the PR is still a draft, the API may change before merge, so early feedback is valuable.
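The same pattern carries over to pure asyncio code. The sketch below is a generic illustration of bounded concurrency with asyncio.Semaphore, a different mechanism from the ThreadPoolExecutor the PR uses; upload_all_async, fake_upload, and max_concurrency are hypothetical names, not part of the SDK.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

D = TypeVar("D")
R = TypeVar("R")


async def upload_all_async(
    upload_one: Callable[[D], Awaitable[R]],
    documents: Sequence[D],
    max_concurrency: int = 4,  # assumed limit, for illustration only
) -> List[R]:
    """Run uploads concurrently with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(doc: D) -> R:
        # Each task waits here until one of the slots is free.
        async with semaphore:
            return await upload_one(doc)

    # gather() returns results in argument order and propagates the first
    # exception raised by any task.
    return await asyncio.gather(*(bounded(doc) for doc in documents))


async def fake_upload(name: str) -> str:
    """Hypothetical coroutine standing in for a real upload call."""
    await asyncio.sleep(0.01)
    return f"id-{name}"


print(asyncio.run(upload_all_async(fake_upload, ["a.pdf", "b.pdf"], max_concurrency=2)))
# ['id-a.pdf', 'id-b.pdf']
```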

Draft PR: https://github.com/xai-org/xai-sdk-python/pull/166
