File Types
Types for file management operations
UserFile
Summary of a user file (in list responses)
python
@dataclass(frozen=True)
class UserFile:
    id: str  # Unique file identifier
    size_bytes: int  # File size
    has_full_description: bool  # Whether full descriptions exist
    title: Optional[str]  # User or auto-generated title
    filename: Optional[str]  # Original filename
    thumbnail_url: Optional[str]  # URL to thumbnail
    upload_description: Optional[str]  # Quick AI description
    visible_text: Optional[str]  # OCR text
    tags: Optional[list[str]]  # Tags
    created_at: Optional[datetime]  # Upload timestamp
    content_created_at: Optional[datetime]  # EXIF date
    dimensions: Optional[dict[str, int]]  # {width, height}
    format: Optional[str]  # jpeg, png, etc.
    variant_status: Optional[str]  # Variant generation status
    variant_count: Optional[int]  # Number of variants generated
    medium_url: Optional[str]  # URL to medium variant
    full_url: Optional[str]  # URL to full-size
    blur_hash: Optional[str]  # BlurHash for placeholders
    description_status: Optional[str]  # Description generation status
    description_error: Optional[str]  # Error if description failed
    variant_error: Optional[str]  # Error if variant generation failed
    confidence_score: Optional[float]  # Always null for descriptions
    content_type: Optional[str]  # MIME type (image/jpeg, etc.)
    media_type: Optional[str]  # "image" | "document" | "link"
    content_category: Optional[str]  # Content category for tailored AI analysis

    # Domain extraction data (populated by content_category)
    legend_data: Optional[LegendData]
    architectural_design_data: Optional[ArchitecturalDesignData]
    ce_plan_data: Optional[CEPlanData]
    layout_region_data: Optional[dict[str, Any]]
    real_estate_data: Optional[dict[str, Any]]
    mining_data: Optional[dict[str, Any]]
    pid_data: Optional[dict[str, Any]]
    pfd_data: Optional[dict[str, Any]]
    schedule_data: Optional[dict[str, Any]]
    extraction_corrections: Optional[dict[str, Any]]  # Per-domain review corrections

    # Document-specific fields
    document_type: Optional[str]  # pdf, docx, txt, etc.
    page_count: Optional[int]  # Number of pages
    text_extraction_status: Optional[str]  # Text extraction status
    chunk_count: Optional[int]  # Number of text chunks
    document_url: Optional[str]  # URL to document file

    # Link-specific fields
    source_url: Optional[str]  # Original URL
    domain: Optional[str]  # Domain of the link
    og_metadata: Optional[dict[str, Any]]  # Open Graph metadata
    favicon_url: Optional[str]  # URL to favicon
    crawl_status: Optional[str]  # Web crawl status
    extracted_images: Optional[dict[str, Any]]  # Images extracted from link
    extracted_images_count: Optional[int]  # Number of extracted images
UserFileDetails
Full file details (from get response)
python
@dataclass(frozen=True)
class UserFileDetails:
    id: str
    size_bytes: int
    content_type: str  # MIME type
    hash: str  # File hash
    title: Optional[str]
    tags: Optional[list[str]]
    dimensions: Optional[dict[str, int]]
    format: Optional[str]
    full_url: Optional[str]  # 1024px variant URL
    thumbnail_url: Optional[str]
    medium_url: Optional[str]
    original_url: Optional[str]  # Always available fallback
    upload_description: Optional[str]
    visible_text: Optional[str]  # OCR text (plain string)
    text_regions: Optional[list[dict[str, Any]]]  # OCR text with bounding boxes (0-1 normalized)
    description_generated_at: Optional[datetime]
    full_descriptions: Optional[list[FullDescription]]
    processing_history: Optional[list[ProcessingHistory]]
    created_at: Optional[datetime]
    updated_at: Optional[datetime]
    content_created_at: Optional[datetime]  # EXIF metadata date
    original_filename: Optional[str]  # Original filename from upload
    variant_status: Optional[str]  # pending | processing | completed | failed
    variant_count: Optional[int]  # Number of variants generated
    blur_hash: Optional[str]  # BlurHash for placeholders
    description_status: Optional[str]  # pending | processing | completed | failed
    content_category: Optional[str]  # Content category for tailored AI analysis
FileList
Paginated list of files
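Because FileList carries a has_more flag alongside the current page of files, a caller can keep requesting pages until the flag turns False. The sketch below illustrates that loop under stated assumptions: the Page class is a reduced stand-in for FileList, and fetch_page with its offset and limit arguments is hypothetical, since this page does not document how the list call is paginated (the real SDK may use cursors or different parameter names).

```python
from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass(frozen=True)
class Page:
    """Stand-in for FileList, reduced to the fields the loop needs."""
    files: list[str]
    total_count: int
    has_more: bool


def iter_all_files(
    fetch_page: Callable[[int, int], Page], page_size: int = 2
) -> Iterator[str]:
    """Yield every file by requesting pages until has_more is False."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page.files
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.has_more or not page.files:
            break
        offset += len(page.files)


# Hypothetical in-memory backend, used only to exercise the loop.
_ALL_FILES = ["a", "b", "c", "d", "e"]


def fake_fetch(offset: int, limit: int) -> Page:
    chunk = _ALL_FILES[offset:offset + limit]
    return Page(
        files=chunk,
        total_count=len(_ALL_FILES),
        has_more=offset + limit < len(_ALL_FILES),
    )


collected = list(iter_all_files(fake_fetch))
print(collected)
```

The empty-page check is a defensive choice: it keeps the loop from spinning if a backend ever reports has_more as True while returning no files.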
python
@dataclass(frozen=True)
class FileList:
    files: list[UserFile]  # File summaries
    total_count: int  # Total files matching query
    has_more: bool  # More files available
FullDescription
Detailed AI-generated description
python
@dataclass(frozen=True)
class FullDescription:
    id: str  # Description identifier
    description: str  # Full description text
    visible_text: Optional[str]  # OCR text (plain string)
    text_regions: Optional[list[dict[str, Any]]]  # OCR text with bounding boxes (0-1 normalized)
    confidence_score: Optional[float]  # Always null for descriptions
    processing_time_ms: Optional[int]
    created_at: Optional[datetime]
Text Regions (Bounding Boxes)
The text_regions field on UserFileDetails and FullDescription contains structured OCR data with bounding box coordinates for each detected text region. Coordinates are normalized to the 0-1 range relative to the image dimensions, so x values are fractions of the image width and y values are fractions of the image height.
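Since the coordinates are fractions of the image size, drawing or cropping a region requires scaling them by the pixel dimensions. The following is a minimal sketch of that conversion; region_to_pixels is a hypothetical helper, not part of the SDK, and the 1000x500 image size is made up for illustration.

```python
from typing import Any


def region_to_pixels(
    region: dict[str, Any], width: int, height: int
) -> tuple[int, int, int, int]:
    """Convert one normalized text region to (left, top, right, bottom) pixels."""
    bbox = region["bounding_box"]
    left = round(bbox["x_min"] * width)
    top = round(bbox["y_min"] * height)
    right = round(bbox["x_max"] * width)
    bottom = round(bbox["y_max"] * height)
    return left, top, right, bottom


# Illustrative region in the documented shape, on an assumed 1000x500 image.
region = {
    "text": "EXIT",
    "bounding_box": {"x_min": 0.12, "y_min": 0.05, "x_max": 0.28, "y_max": 0.14},
}
pixel_box = region_to_pixels(region, width=1000, height=500)
print(pixel_box)  # (120, 25, 280, 70)
```

Rounding to the nearest pixel is one reasonable choice; code that must never clip text may prefer to floor the minimum edges and ceil the maximum edges instead.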
python
# Each text region is a dict with this structure:
{
    "text": "EXIT",  # Verbatim text as it appears
    "bounding_box": {
        "x_min": 0.12,  # Left edge (0-1)
        "y_min": 0.05,  # Top edge (0-1)
        "x_max": 0.28,  # Right edge (0-1)
        "y_max": 0.14   # Bottom edge (0-1)
    }
}

# Access bounding boxes from file details:
details = await client.files.get(file_id)
for region in details.text_regions or []:
    text = region["text"]
    bbox = region["bounding_box"]
    print(f"'{text}' at ({bbox['x_min']:.2f}, {bbox['y_min']:.2f})"
          f" → ({bbox['x_max']:.2f}, {bbox['y_max']:.2f})")
ProcessingHistory
Processing history entry for a file
python
@dataclass(frozen=True)
class ProcessingHistory:
    id: str  # History entry identifier
    status: str  # Processing status
    created_at: Optional[datetime]  # When the operation started
    completed_at: Optional[datetime]  # When the operation completed
    error_message: Optional[str]  # Error message if failed
UpdateFileResult
Result of file update operation, returned by files.update()
python
@dataclass(frozen=True)
class UpdateFileResult:
    id: str  # File identifier
    title: Optional[str]  # Updated title
    tags: Optional[list[str]]  # Updated tags
    updated_at: Optional[datetime]  # Update timestamp
DeleteFileResult
Result of file deletion operation, returned by files.delete()
python
@dataclass(frozen=True)
class DeleteFileResult:
    id: str  # Deleted file identifier
    deleted_at: Optional[datetime]  # Deletion timestamp
    message: str  # Confirmation message
BatchDeleteFileResult
Result for a single file in a batch delete operation
python
@dataclass(frozen=True)
class BatchDeleteFileResult:
    id: str  # File identifier
    status: str  # "deleted", "skipped", or "failed"
    message: Optional[str]  # Additional details about the operation
    deleted_at: Optional[datetime]  # Deletion timestamp (if deleted)
BatchDeleteFilesResponse
Response for batch delete operation, returned by files.batch_delete()
python
@dataclass(frozen=True)
class BatchDeleteFilesResponse:
    deleted: list[BatchDeleteFileResult]  # Successfully deleted files
    skipped: list[BatchDeleteFileResult]  # Files skipped (e.g., currently processing)
    failed: list[BatchDeleteFileResult]  # Files that failed to delete
    summary: dict[str, int]  # Stats: {total, deleted, skipped, failed}
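
A batch delete can partially succeed, so callers will usually want to inspect the skipped and failed lists rather than treat the response as all-or-nothing. The sketch below shows one way to do that. It redeclares the two types locally so it runs on its own (the None defaults are a convenience of this sketch, not part of the documented types), the file IDs are invented, and the check that summary["total"] equals the sum of the three lists is an assumption about how the summary is computed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


# Local copies of the documented types so the sketch is self-contained.
@dataclass(frozen=True)
class BatchDeleteFileResult:
    id: str
    status: str
    message: Optional[str] = None
    deleted_at: Optional[datetime] = None


@dataclass(frozen=True)
class BatchDeleteFilesResponse:
    deleted: list[BatchDeleteFileResult]
    skipped: list[BatchDeleteFileResult]
    failed: list[BatchDeleteFileResult]
    summary: dict[str, int]


def ids_to_retry(response: BatchDeleteFilesResponse) -> list[str]:
    """Collect IDs of files that were not deleted (skipped first, then failed)."""
    return [r.id for r in response.skipped] + [r.id for r in response.failed]


def summary_is_consistent(response: BatchDeleteFilesResponse) -> bool:
    """Check the summary counts against the lists (assumes total is their sum)."""
    counts = {
        "deleted": len(response.deleted),
        "skipped": len(response.skipped),
        "failed": len(response.failed),
    }
    counts["total"] = sum(counts.values())
    return all(response.summary.get(key) == value for key, value in counts.items())


# Invented response for illustration only.
response = BatchDeleteFilesResponse(
    deleted=[BatchDeleteFileResult("file_1", "deleted", deleted_at=datetime(2024, 1, 1))],
    skipped=[BatchDeleteFileResult("file_2", "skipped", message="currently processing")],
    failed=[BatchDeleteFileResult("file_3", "failed", message="storage error")],
    summary={"total": 3, "deleted": 1, "skipped": 1, "failed": 1},
)
retry_ids = ids_to_retry(response)
print(retry_ids, summary_is_consistent(response))
```

Skipped files are grouped with failed ones for retry here because the documented skip reason (a file that is currently processing) is typically transient; a caller with stricter needs could handle the two lists separately.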
