Documentation

Batch Processing

Process multiple files efficiently with automatic chunking, progress tracking, and retry handling

Batch Methods

  • SDK auto-chunking: Up to 10,000 files, split into 100-file batches automatically
  • Streaming batch (REST): Up to 50 files per request, server-managed concurrency
  • Client-side parallel (REST): User-controlled concurrency with rate limit handling
  • Presigned batch: For files >100MB (max 5 per batch)
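To illustrate what the SDK's auto-chunking does behind the scenes, here is a minimal sketch of splitting a large file list into 100-file batches. This is illustrative only; the real chunker lives inside the SDK and you never call it yourself.

```python
def chunk(files, batch_size=100):
    """Yield successive batches of at most batch_size files."""
    for i in range(0, len(files), batch_size):
        yield files[i:i + batch_size]

# 250 files become 3 batches: 100 + 100 + 50
batches = list(chunk([f"img{i}.jpg" for i in range(250)]))
print(len(batches))      # 3
print(len(batches[-1]))  # 50
```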

Quick Start

Upload Multiple Files

results = await client.upload(
    ["img1.jpg", "img2.jpg", "img3.jpg"],
    wait_for_descriptions=True,
)
for result in results:
    print(f"{result.filename}: {result.description[:50]}...")

Large Batch Upload

SDK only — up to 10,000 files

from pathlib import Path

image_files = list(Path("/path/to/large/dataset").glob("**/*.jpg"))
print(f"Found {len(image_files)} images")

results = await client.upload(
    image_files,
    wait_for_descriptions=True,
)
print(f"Successfully uploaded {len(results)} images")
print(f"Summary: {results.summary()}")

Client-Side Parallel Upload

REST API — user-controlled concurrency

import asyncio
import aiohttp
from pathlib import Path

async def upload_batch(file_paths, max_concurrent=20):
    semaphore = asyncio.Semaphore(max_concurrent)

    async def upload_one(session, path):
        async with semaphore:
            data = aiohttp.FormData()
            # Read the file up front so no handle is left open
            data.add_field("files", Path(path).read_bytes(), filename=Path(path).name)
            async with session.post(
                "https://api.aionvision.tech/api/v2/user-files/upload/stream-batch",
                headers={"Authorization": "Bearer YOUR_API_KEY"},
                data=data,
            ) as resp:
                if resp.status == 429:
                    # Back off for the server-suggested interval, then retry
                    retry_after = int(resp.headers.get("Retry-After", 5))
                    await asyncio.sleep(retry_after)
                    return await upload_one(session, path)
                return await resp.json()

    async with aiohttp.ClientSession() as session:
        tasks = [upload_one(session, p) for p in file_paths]
        return await asyncio.gather(*tasks)

# Usage
files = list(Path("photos").glob("*.jpg"))
results = asyncio.run(upload_batch(files, max_concurrent=20))
print(f"Uploaded {len(results)} files")

Progress Tracking

SDK Progress Callbacks

from aion import AionVision

def on_progress(event: AionVision.UploadProgressEvent):
    print(f"File {event.file_index}: {event.progress_percent:.1f}%")

def on_file_complete(event: AionVision.FileCompleteEvent):
    print(f"Completed: {event.result.filename}")

results = await client.upload(
    files,
    on_progress=on_progress,
    on_file_complete=on_file_complete,
)

REST Batch Status

# Get session status
curl https://api.aionvision.tech/api/v2/uploads/sessions/{session_id}/status \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response:
# {
#   "session_id": "550e8400-...",
#   "status": "processing",
#   "total_files": 20,
#   "completed_files": 15,
#   "failed_files": 1,
#   "progress_percentage": 80.0
# }

# Get detailed results
curl "https://api.aionvision.tech/api/v2/uploads/sessions/{session_id}/results?limit=100" \
  -H "Authorization: Bearer YOUR_API_KEY"
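A small helper can interpret the status payload shown above. The progress formula matches the sample response (16 of 20 files finished is 80%); the set of terminal statuses is an assumption here, so check the API reference for the authoritative list.

```python
# Assumed terminal statuses; verify against the API reference.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}

def batch_progress(status: dict) -> float:
    """Percentage of files that have finished (completed or failed)."""
    done = status["completed_files"] + status["failed_files"]
    return 100.0 * done / status["total_files"]

def is_terminal(status: dict) -> bool:
    return status["status"] in TERMINAL_STATUSES

sample = {
    "status": "processing",
    "total_files": 20,
    "completed_files": 15,
    "failed_files": 1,
}
print(batch_progress(sample))  # 80.0
print(is_terminal(sample))     # False
```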

Presigned URL Batch

For files >100MB — max 5 per batch

Monitor Presigned Batch

import asyncio

status = await client.uploads.get_batch_status(batch_id="batch_xyz")
print(f"Status: {status.overall_status}")
print(f"Progress: {status.completion_percentage}%")
print(f"Completed: {status.completed}")
print(f"Failed: {status.failed}")

# Poll until complete
while not status.is_terminal:
    await asyncio.sleep(5.0)
    status = await client.uploads.get_batch_status(batch_id="batch_xyz")

Cancel Batch

# Batch cancellation is available via the REST API only;
# refer to the REST API reference for the cancellation endpoint.

Failure Handling

SDK only

from aion import AionVision

def on_description_failed(event: AionVision.DescriptionFailedEvent):
    print(f"Failed: {event.result.filename} - {event.result.description_error}")

results = await client.upload(
    files,
    raise_on_failure=False,
    on_description_failed=on_description_failed,
)

if results.has_failures:
    print(f"Summary: {results.summary()}")
    for r in results.retryable():
        print(f"Can retry: {r.image_id}")

Best Practices

Avoid duplicate uploads

When using the REST API, include skip_duplicates=true in the form data to skip identical files (compared by content hash). The SDK handles deduplication automatically.
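If you want to avoid even sending duplicate bytes over the wire, you can pre-filter on the client with the same idea the server uses: a content hash. This is a sketch, not part of the API; `skip_duplicates=true` already covers server-side deduplication.

```python
import hashlib
import tempfile
from pathlib import Path

def dedupe_by_content(paths):
    """Keep only the first file seen for each distinct content hash."""
    seen, unique = set(), []
    for p in paths:
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(p)
    return unique

# Demo: two byte-identical files and one distinct file
with tempfile.TemporaryDirectory() as d:
    a, b, c = (Path(d) / n for n in ("a.jpg", "b.jpg", "c.jpg"))
    a.write_bytes(b"same"); b.write_bytes(b"same"); c.write_bytes(b"other")
    unique = dedupe_by_content([a, b, c])
    print(len(unique))  # 2
```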

Handle 429 errors gracefully

Implement exponential backoff using the Retry-After header. The SDK handles this automatically.
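The backoff policy described above can be sketched as a small delay function: honor `Retry-After` when the server sends it, otherwise fall back to exponential backoff with jitter. The function name and defaults here are illustrative, not part of the SDK.

```python
import random

def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Server told us exactly how long to wait
        return float(retry_after)
    # Exponential backoff, capped, with jitter in [0.5x, 1.0x)
    return min(cap, base * 2 ** attempt) * (0.5 + random.random() / 2)

print(retry_delay(0, retry_after="5"))  # 5.0
```

Jitter spreads retries out so many clients that were rate-limited at the same moment do not all retry in the same instant.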

Limit concurrent requests

Keep concurrent REST requests between 10 and 20 for optimal throughput. Too many can trigger rate limiting.

Tier Limits

  • Free: 10 files per batch
  • Starter: 20 files per batch
  • Professional: 20 files per batch
  • Enterprise: 50 files per batch
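The tier limits above can be folded into a helper that splits a file list into request-sized chunks for REST batch uploads. The dict and function are hypothetical conveniences, not part of the API; the sizes come straight from the list above.

```python
# Per-request batch limits, mirroring the tier table above
TIER_BATCH_LIMITS = {"free": 10, "starter": 20, "professional": 20, "enterprise": 50}

def rest_batches(files, tier):
    """Split files into chunks no larger than the tier's batch limit."""
    limit = TIER_BATCH_LIMITS[tier.lower()]
    return [files[i:i + limit] for i in range(0, len(files), limit)]

files = [f"img{i}.jpg" for i in range(45)]
print(len(rest_batches(files, "free")))        # 5 requests of <= 10 files
print(len(rest_batches(files, "enterprise")))  # 1 request
```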