Documentation

Batch Processing

Process multiple files efficiently with automatic chunking, progress tracking, and retry handling

Batch Methods

  • SDK auto-chunking: Up to 10,000 files, split into 100-file batches automatically
  • Streaming batch (REST): Up to 50 files per request, server-managed concurrency
  • Client-side parallel (REST): User-controlled concurrency with rate limit handling
  • Presigned batch: For files >100MB (max 5 per batch)
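To illustrate what the SDK's auto-chunking does behind the scenes, here is a minimal sketch of splitting a large file list into 100-file batches. This is illustrative only; the real chunker lives inside the SDK and you never call it yourself.

```python
def chunk(files, batch_size=100):
    """Yield successive batches of at most batch_size files."""
    for i in range(0, len(files), batch_size):
        yield files[i:i + batch_size]

# 250 files become 3 batches: 100 + 100 + 50
batches = list(chunk([f"img{i}.jpg" for i in range(250)]))
print(len(batches))      # 3
print(len(batches[-1]))  # 50
```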

Quick Start

Upload Multiple Files

results = await client.upload(
    ["img1.jpg", "img2.jpg", "img3.jpg"],
    wait_for_descriptions=True,
)
for result in results:
    print(f"{result.filename}: {result.description[:50]}...")

Large Batch Upload

SDK only — up to 10,000 files

from pathlib import Path

image_files = list(Path("/path/to/large/dataset").glob("**/*.jpg"))
print(f"Found {len(image_files)} images")

results = await client.upload(
    image_files,
    wait_for_descriptions=True,
)
print(f"Successfully uploaded {len(results)} images")
print(f"Summary: {results.summary()}")

Client-Side Parallel Upload

REST API — user-controlled concurrency

import asyncio
import aiohttp
from pathlib import Path

async def upload_batch(file_paths, max_concurrent=20):
    semaphore = asyncio.Semaphore(max_concurrent)

    async def upload_one(session, path):
        async with semaphore:
            data = aiohttp.FormData()
            # Read the file up front so no handle is left open
            data.add_field("files", Path(path).read_bytes(), filename=Path(path).name)
            async with session.post(
                "https://api.aionvision.tech/api/v2/user-files/upload/stream-batch",
                headers={"Authorization": "Bearer YOUR_API_KEY"},
                data=data,
            ) as resp:
                if resp.status == 429:
                    # Back off for the server-suggested interval, then retry
                    retry_after = int(resp.headers.get("Retry-After", 5))
                    await asyncio.sleep(retry_after)
                    return await upload_one(session, path)
                return await resp.json()

    async with aiohttp.ClientSession() as session:
        tasks = [upload_one(session, p) for p in file_paths]
        return await asyncio.gather(*tasks)

# Usage
files = list(Path("photos").glob("*.jpg"))
results = asyncio.run(upload_batch(files, max_concurrent=20))
print(f"Uploaded {len(results)} files")

Progress Tracking

SDK Progress Callbacks

from aion import AionVision

def on_progress(event: AionVision.UploadProgressEvent):
    print(f"File {event.file_index}: {event.progress_percent:.1f}%")

def on_file_complete(event: AionVision.FileCompleteEvent):
    print(f"Completed: {event.result.filename}")

results = await client.upload(
    files,
    on_progress=on_progress,
    on_file_complete=on_file_complete,
)

REST Batch Status

# Get session status
curl https://api.aionvision.tech/api/v2/uploads/sessions/{session_id}/status \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response:
# {
#   "session_id": "550e8400-...",
#   "status": "processing",
#   "total_files": 20,
#   "completed_files": 15,
#   "failed_files": 1,
#   "progress_percentage": 80.0
# }

# Get detailed results
curl "https://api.aionvision.tech/api/v2/uploads/sessions/{session_id}/results?limit=100" \
  -H "Authorization: Bearer YOUR_API_KEY"
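A small helper can interpret the status payload shown above. The progress formula matches the sample response (16 of 20 files finished is 80%); the set of terminal statuses is an assumption here, so check the API reference for the authoritative list.

```python
# Assumed terminal statuses; verify against the API reference.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}

def batch_progress(status: dict) -> float:
    """Percentage of files that have finished (completed or failed)."""
    done = status["completed_files"] + status["failed_files"]
    return 100.0 * done / status["total_files"]

def is_terminal(status: dict) -> bool:
    return status["status"] in TERMINAL_STATUSES

sample = {
    "status": "processing",
    "total_files": 20,
    "completed_files": 15,
    "failed_files": 1,
}
print(batch_progress(sample))  # 80.0
print(is_terminal(sample))     # False
```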

Presigned URL Batch

For files >100MB — max 5 per batch

Monitor Presigned Batch

import asyncio

status = await client.uploads.get_batch_status(batch_id="batch_xyz")
print(f"Status: {status.overall_status}")
print(f"Progress: {status.completion_percentage}%")
print(f"Completed: {status.completed}")
print(f"Failed: {status.failed}")

# Poll until complete
while not status.is_terminal:
    await asyncio.sleep(5.0)
    status = await client.uploads.get_batch_status(batch_id="batch_xyz")

Cancel Batch

# Batch cancellation is available via the REST API only;
# refer to the REST API reference for the cancellation endpoint.

Failure Handling

SDK only

from aion import AionVision

def on_description_failed(event: AionVision.DescriptionFailedEvent):
    print(f"Failed: {event.result.filename} - {event.result.description_error}")

results = await client.upload(
    files,
    raise_on_failure=False,
    on_description_failed=on_description_failed,
)

if results.has_failures:
    print(f"Summary: {results.summary()}")
    for r in results.retryable():
        print(f"Can retry: {r.image_id}")

Best Practices

Avoid duplicate uploads

When using the REST API, include skip_duplicates=true in the form data to skip identical files (compared by content hash). The SDK handles deduplication automatically.
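If you want to avoid even sending duplicate bytes over the wire, you can pre-filter on the client with the same idea the server uses: a content hash. This is a sketch, not part of the API; `skip_duplicates=true` already covers server-side deduplication.

```python
import hashlib
import tempfile
from pathlib import Path

def dedupe_by_content(paths):
    """Keep only the first file seen for each distinct content hash."""
    seen, unique = set(), []
    for p in paths:
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(p)
    return unique

# Demo: two byte-identical files and one distinct file
with tempfile.TemporaryDirectory() as d:
    a, b, c = (Path(d) / n for n in ("a.jpg", "b.jpg", "c.jpg"))
    a.write_bytes(b"same"); b.write_bytes(b"same"); c.write_bytes(b"other")
    unique = dedupe_by_content([a, b, c])
    print(len(unique))  # 2
```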

Handle 429 errors gracefully

Implement exponential backoff using the Retry-After header. The SDK handles this automatically.
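The backoff policy described above can be sketched as a small delay function: honor `Retry-After` when the server sends it, otherwise fall back to exponential backoff with jitter. The function name and defaults here are illustrative, not part of the SDK.

```python
import random

def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Server told us exactly how long to wait
        return float(retry_after)
    # Exponential backoff, capped, with jitter in [0.5x, 1.0x)
    return min(cap, base * 2 ** attempt) * (0.5 + random.random() / 2)

print(retry_delay(0, retry_after="5"))  # 5.0
```

Jitter spreads retries out so many clients that were rate-limited at the same moment do not all retry in the same instant.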

Limit concurrent requests

Keep concurrent REST requests between 10 and 20 for optimal throughput. Too many can trigger rate limiting.

Tier Limits

  • Free: 10 files per batch
  • Starter: 20 files per batch
  • Professional: 20 files per batch
  • Enterprise: 50 files per batch
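The tier limits above can be folded into a helper that splits a file list into request-sized chunks for REST batch uploads. The dict and function are hypothetical conveniences, not part of the API; the sizes come straight from the list above.

```python
# Per-request batch limits, mirroring the tier table above
TIER_BATCH_LIMITS = {"free": 10, "starter": 20, "professional": 20, "enterprise": 50}

def rest_batches(files, tier):
    """Split files into chunks no larger than the tier's batch limit."""
    limit = TIER_BATCH_LIMITS[tier.lower()]
    return [files[i:i + limit] for i in range(0, len(files), limit)]

files = [f"img{i}.jpg" for i in range(45)]
print(len(rest_batches(files, "free")))        # 5 requests of <= 10 files
print(len(rest_batches(files, "enterprise")))  # 1 request
```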