Semantic Search

Search images, documents, and video scenes using natural-language queries with AI-powered understanding

How It Works

Semantic search uses AI to understand the meaning of your query rather than matching keywords alone. A search for "damaged equipment" will find images described as "broken machinery" even though the two phrases share no words. The system refines results iteratively to improve accuracy.

Multiple embeddings are used under the hood: visual embeddings for image similarity search, and text embeddings for document and description search.
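
To make the idea of embedding-based matching concrete, here is a minimal sketch using cosine similarity. The three-dimensional vectors are invented purely for illustration; real embeddings come from a model and have far more dimensions, and nothing here is meant to reflect how Scopix computes or stores its embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, hand-picked for illustration only.
EMBEDDINGS = {
    "damaged equipment": [0.9, 0.8, 0.1],
    "broken machinery": [0.8, 0.9, 0.2],
    "sunny beach": [0.1, 0.0, 0.9],
}


def rank(query: str) -> list[tuple[str, float]]:
    """Rank every other phrase by similarity to the query, best first."""
    query_vec = EMBEDDINGS[query]
    scored = [
        (text, cosine_similarity(query_vec, vec))
        for text, vec in EMBEDDINGS.items()
        if text != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for text, score in rank("damaged equipment"):
    print(f"{text}: {score:.2f}")
```

With these toy vectors, "broken machinery" ranks well above "sunny beach" even though it shares no words with the query, which is the behavior described above.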

Tip: More Token-Efficient Than Chat

Standalone search agents are the most token-efficient way to search your data. If you don't need multi-turn conversation, use these endpoints directly instead of agentic chat to save on token usage.

Image Search

Search Your Image Library

python
import asyncio
from scopix import Scopix

async def main():
    # The client is an async context manager, so it must be used inside a coroutine
    async with Scopix(api_key="scopix_...") as client:
        result = await client.agent_search.images("damaged utility poles")
        print(f"Found {result.count} images")
        print(f"Summary: {result.summary}")
        # Show the first five results
        for img in result.results[:5]:
            print(f"- {img.filename}: score={img.score:.2f}")
            print(f"  Description: {img.description[:100]}...")

asyncio.run(main())

Document Search

Search Across Documents

python
# Run inside an async function with an open client, as in the image search example
result = await client.agent_search.documents("safety inspection procedures")
print(f"Summary: {result.summary}")
# document_ids lists the unique documents that produced matching chunks
print(f"Documents found: {len(result.document_ids)}")
for chunk in result.results:
    print(f"\n{chunk.document_filename} (pages {chunk.page_numbers}):")
    print(f"  Score: {chunk.score:.2f}")
    print(f"  Text: {chunk.text[:200]}...")
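
Because document search returns chunks rather than whole documents, a results page often needs one entry per document. The sketch below shows one way to collapse chunks to the best-scoring chunk per file. It relies only on the `document_filename` and `score` attributes used above; the `SimpleNamespace` objects are stand-ins for real results so that the example runs without the SDK.

```python
from types import SimpleNamespace


def best_chunk_per_document(chunks):
    """Keep the highest-scoring chunk of each document, best document first."""
    best = {}
    for chunk in chunks:
        current = best.get(chunk.document_filename)
        if current is None or chunk.score > current.score:
            best[chunk.document_filename] = chunk
    return sorted(best.values(), key=lambda c: c.score, reverse=True)


# Hypothetical stand-in data shaped like the chunks in the example above.
sample_chunks = [
    SimpleNamespace(document_filename="manual.pdf", score=0.61, text="Check valves weekly."),
    SimpleNamespace(document_filename="policy.pdf", score=0.74, text="Inspections are quarterly."),
    SimpleNamespace(document_filename="manual.pdf", score=0.88, text="Inspect harnesses daily."),
]

for chunk in best_chunk_per_document(sample_chunks):
    print(f"{chunk.document_filename}: {chunk.score:.2f} {chunk.text}")
```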

Video Scene Search

Each analyzed video is indexed scene-by-scene with its own description, tags, and vector embedding. Scene search returns matching clips with start/end timestamps so you can deep-link into a specific moment.

Search Scenes Across All Videos

python
result = await client.agent_search.videos(
    "closeup of a fire truck at night",
    limit=10,
)
for video in result.results:
    print(f"{video.video_filename} score={video.score:.3f}")
    for scene in video.matched_scenes or []:
        print(f"  scene {scene.scene_index} "
              f"{scene.time_range_formatted} score={scene.score:.3f}")
        print(f"    {scene.description}")
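
Deep-linking into a scene can be sketched as follows. This assumes you have the scene's start and end times as seconds and a video URL served to a player that honors media-fragment time ranges (`#t=start,end`); both the numeric inputs and the URL are assumptions for illustration, since the example above only shows the pre-formatted `time_range_formatted` field.

```python
from typing import Optional


def format_timestamp(seconds: float) -> str:
    """Render a time offset as M:SS, or H:MM:SS once it reaches an hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def scene_deep_link(video_url: str, start: float, end: Optional[float] = None) -> str:
    """Append a media-fragment time range to a video URL that has no fragment yet."""
    fragment = f"t={start:g}"
    if end is not None:
        fragment += f",{end:g}"
    return f"{video_url}#{fragment}"


# Hypothetical URL and times, for illustration only.
print(format_timestamp(75))
print(scene_deep_link("https://example.com/clip.mp4", 75, 90))
```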

Simple Chunk Search

Faster, simpler document search without AI reasoning

python
results = await client.files.search(
    query="safety inspection requirements",
    limit=20,
    similarity_threshold=0.3,
)
for chunk in results.results:
    print(f"{chunk.document_filename}: {chunk.content[:200]}...")
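
To build intuition for how `similarity_threshold` and `limit` interact, here is a local sketch of the filtering they suggest. It assumes the threshold is a minimum score, that higher scores mean closer matches, and that the limit is applied after filtering; the service's actual behavior may differ, and `ScoredChunk` is a hypothetical stand-in rather than an SDK type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredChunk:
    # Hypothetical stand-in for a search hit; not part of the SDK.
    document_filename: str
    content: str
    score: float


def apply_threshold(chunks, similarity_threshold=0.3, limit=20):
    """Keep chunks at or above the threshold, best first, capped at limit."""
    kept = [c for c in chunks if c.score >= similarity_threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:limit]


hits = [
    ScoredChunk("manual.pdf", "Inspect harnesses before each shift.", 0.82),
    ScoredChunk("memo.pdf", "Parking lot resurfacing schedule.", 0.12),
    ScoredChunk("policy.pdf", "Inspections are required quarterly.", 0.47),
]

for chunk in apply_threshold(hits, similarity_threshold=0.3, limit=2):
    print(f"{chunk.document_filename}: {chunk.score:.2f}")
```

Here `memo.pdf` falls below the threshold and is dropped, while the other two hits are returned in descending score order. Raising the threshold trades recall for precision.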

Filtering Results

Filter by Folder

python
result = await client.agent_search.images(
    "corrosion or rust damage",
    folder_id="fld_abc123",
    limit=20,
)

Search Specific Images

python
result = await client.agent_search.images(
    "equipment with visible damage",
    image_ids=["img_001", "img_002", "img_003"],
)
matching_ids = result.result_ids
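
Since `result_ids` is a flat list of image IDs and `image_ids` accepts one, the two can be chained to narrow a broad search in a second pass. The sketch below assumes that chaining them this way is acceptable for your use case. The helper `narrow_search` and the fake client are hypothetical and exist only so the example runs without the SDK; with a real client you would pass it in directly.

```python
import asyncio
from types import SimpleNamespace


async def narrow_search(client, broad_query: str, narrow_query: str, limit: int = 20):
    """Run a broad image search, then search again within its matches only."""
    broad = await client.agent_search.images(broad_query, limit=limit)
    if not broad.result_ids:
        return broad  # nothing matched, so there is nothing to narrow
    return await client.agent_search.images(narrow_query, image_ids=broad.result_ids)


class FakeAgentSearch:
    """Stand-in for client.agent_search that records the calls it receives."""

    def __init__(self):
        self.calls = []

    async def images(self, query, **kwargs):
        self.calls.append((query, kwargs))
        # Pretend the first two candidate images always match.
        candidates = kwargs.get("image_ids") or ["img_001", "img_002", "img_003"]
        return SimpleNamespace(result_ids=candidates[:2], count=len(candidates[:2]))


fake_client = SimpleNamespace(agent_search=FakeAgentSearch())
result = asyncio.run(narrow_search(fake_client, "utility poles", "visible rust"))
print(result.result_ids)
```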

Filter by Document Type

python
result = await client.agent_search.documents(
    "maintenance schedules",
    document_types=["pdf"],
    limit=25,
)

Using Result Refs for UI

SDK only — build interactive interfaces with result references

python
result = await client.agent_search.images("vehicles in parking lot")
# summary_raw keeps the [[ref:N]] patterns, which you can make clickable
# (summary is the human-readable version; see Response Types)
print(result.summary_raw)
# "Found 3 vehicles: a red sedan [[ref:1]], a blue truck [[ref:2]]..."
# Map refs to actual results for click handling
for ref_key, ref in result.result_refs.items():
    print(f"[[ref:{ref_key}]] -> {ref.ids}")
    # ref.count: number of items
    # ref.ids: ["img_abc123", ...]
    # ref.id_type: "image"
    # ref.label: "3 vehicles"
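
One possible way to turn the `[[ref:N]]` patterns into clickable elements is sketched below. It replaces each marker with an HTML link carrying the referenced IDs and uses the ref's label as the link text. The HTML shape (`data-ids`, `href="#"`) is an arbitrary choice for illustration, and the `SimpleNamespace` refs are stand-ins exposing only the `ids` and `label` attributes shown above.

```python
import re
from html import escape
from types import SimpleNamespace

REF_PATTERN = re.compile(r"\[\[ref:([^\]]+)\]\]")


def render_refs(text: str, result_refs: dict) -> str:
    """Return HTML in which every [[ref:N]] marker becomes a link."""
    # With one capturing group, split() alternates plain text and ref keys.
    parts = REF_PATTERN.split(text)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            out.append(escape(part))  # plain text between markers
            continue
        ref = result_refs.get(part)
        if ref is None:
            continue  # drop markers that have no matching ref
        ids = escape(",".join(ref.ids))
        out.append(f'<a href="#" data-ids="{ids}">{escape(ref.label)}</a>')
    return "".join(out)


# Hypothetical stand-in data shaped like the refs in the example above.
refs = {
    "1": SimpleNamespace(ids=["img_abc123"], label="1 image"),
    "2": SimpleNamespace(ids=["img_def456", "img_ghi789"], label="2 images"),
}
text = "Found vehicles: a red sedan [[ref:1]], blue trucks [[ref:2]]."
print(render_refs(text, refs))
```

A click handler can then read `data-ids` from the link to highlight or open the matching results.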

Response Types

ImageSearchAgentResult

python
@dataclass(frozen=True)
class ImageSearchAgentResult:
    success: bool
    results: list[ImageSearchResultItem]   # Full image objects
    count: int                             # Total matches
    result_ids: list[str]                  # Flat list of image IDs
    summary: str                           # Human-readable summary
    summary_raw: str                       # Original with [[ref:X]] patterns
    result_refs: dict[str, ResultRefData]  # For UI ref linking
    search_mode: str                       # Search strategy used
    execution_time_ms: int
    iterations: int

DocumentSearchAgentResult

python
@dataclass(frozen=True)
class DocumentSearchAgentResult:
    success: bool
    results: list[DocumentChunkResultItem]  # Matching chunks
    count: int                              # Total chunk matches
    chunk_ids: list[str]
    document_ids: list[str]                 # Unique documents found
    summary: str                            # Human-readable summary
    summary_raw: str                        # Original with [[ref:X]] patterns
    result_refs: dict[str, ResultRefData]   # For UI ref linking
    search_mode: str                        # Search strategy used
    execution_time_ms: int
    iterations: int

Writing Effective Queries

Be descriptive, not keyword-focused

✓ "Photos showing water damage on ceiling tiles"
✗ "water damage ceiling"

Include context

✓ "Construction workers without proper safety equipment"
✗ "safety violation"

Use natural language

✓ "Show me all the images from the March site inspection"
✗ "date:march type:inspection"