Examples
GCP Examples¶
Basic Geometric example using objectHashes¶
A simple example of how you might utilise the objectHashes sent to your agent:
from typing import Annotated
from encord.objects.ontology_labels_impl import LabelRowV2
from encord.objects.ontology_object_instance import ObjectInstance
from encord_agents.core.data_model import FrameData
from encord_agents.core.dependencies import Depends
from encord_agents.gcp.dependencies import dep_objects
from encord_agents.gcp.wrappers import editor_agent
@editor_agent
def handle_object_hashes(
frame_data: FrameData,
lr: LabelRowV2,
object_instances: Annotated[list[ObjectInstance], Depends(dep_objects)],
) -> None:
for object_inst in object_instances:
print(object_inst)
An example use case of the above: suppose you have your own OCR model and want to selectively run OCR on objects you have selected in the Encord app. You can trigger your agent from the app, and it sends a list of objectHashes to your agent. The `dep_objects` dependency then gives the agent immediate access to the corresponding object instances, making it easier to integrate your OCR model.
Test the Agent

1. Save the above code as agent.py.
2. In your current terminal, run the following command to run the agent in debug mode.
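The command itself is not included in this extract; a minimal sketch, assuming the standard functions-framework workflow that encord-agents' GCP wrapper builds on (the target is the decorated function's name):

```shell
functions-framework --target=handle_object_hashes --debug --port 8080
```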
3. Open your Project in the Encord platform and navigate to a frame with an object that you want to act on. Choose an object from the bottom left sidebar and click Copy URL as shown:

Tip
The URL should have roughly this format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}/0?other_query_params&objectHash={objectHash}".
4. In another shell operating from the same working directory, source your virtual environment and test the agent.
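The test command is not shown here; a sketch, assuming the encord-agents CLI's local test helper (the agent name and URL placeholder are yours to substitute):

```shell
source venv/bin/activate
encord-agents test local handle_object_hashes '<the URL you copied>'
```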
5. To see if the test is successful, refresh your browser to see the action taken by the agent. Once the test runs successfully, you are ready to deploy your agent. Visit the deployment documentation to learn more.
Nested Classification using Claude 3.5 Sonnet¶
The goals of this example are:
- Create an editor agent that automatically adds frame-level classifications.
- Demonstrate how to use the `OntologyDataModel` for classifications.
Prerequisites
Before you begin, ensure you have:
- Created a virtual Python environment.
- Installed all necessary dependencies.
- Obtained an Anthropic API key.
- Set up authentication with Encord.
Run the following commands to set up your environment:
python -m venv venv # Create a virtual Python environment
source venv/bin/activate # Activate the virtual environment
python -m pip install encord-agents anthropic # Install required dependencies
export ANTHROPIC_API_KEY="<your_api_key>" # Set your Anthropic API key
export ENCORD_SSH_KEY_FILE="/path/to/your/private/key" # Define your Encord SSH key
Project Setup
Create a Project with visual content (images, image groups, image sequences, or videos) in Encord. This example uses the following Ontology, but any Ontology containing classifications can be used.
See the ontology JSON
{
"objects": [],
"classifications": [
{
"id": "1",
"featureNodeHash": "TTkHMtuD",
"attributes": [
{
"id": "1.1",
"featureNodeHash": "+1g9I9Sg",
"type": "text",
"name": "scene summary",
"required": false,
"dynamic": false
}
]
},
{
"id": "2",
"featureNodeHash": "xGV/wCD0",
"attributes": [
{
"id": "2.1",
"featureNodeHash": "k3EVexk7",
"type": "radio",
"name": "is there a person in the frame?",
"required": false,
"options": [
{
"id": "2.1.1",
"featureNodeHash": "EkGwhcO4",
"label": "yes",
"value": "yes",
"options": [
{
"id": "2.1.1.1",
"featureNodeHash": "mj9QCDY4",
"type": "text",
"name": "What is the person doing?",
"required": false
}
]
},
{
"id": "2.1.2",
"featureNodeHash": "37rMLC/v",
"label": "no",
"value": "no",
"options": []
}
],
"dynamic": false
}
]
}
]
}
To construct the same Ontology as used in this example, run the following script.
import json
from encord.objects.ontology_structure import OntologyStructure
from encord_agents.core.utils import get_user_client
encord_client = get_user_client()
structure = OntologyStructure.from_dict(json.loads("{the_json_above}"))
ontology = encord_client.create_ontology(
title="Your ontology title",
structure=structure
)
print(ontology.ontology_hash)
The aim is to trigger an agent that transforms a labeling task from Figure A to Figure B. (Hint: Click the images and use the keyboard arrows to toggle between them.)


Create the Agent
Here is the full code, but a section-by-section explanation follows.
The full code for agent.py
1. Import dependencies and set up the Project.

Info
Ensure you insert your Project's unique identifier.

agent.py
```python
import os

from anthropic import Anthropic
from encord.objects.ontology_labels_impl import LabelRowV2
from numpy.typing import NDArray
from typing_extensions import Annotated

from encord_agents.core.ontology import OntologyDataModel
from encord_agents.core.utils import get_user_client
from encord_agents.core.video import Frame
from encord_agents.gcp import Depends, editor_agent
from encord_agents.gcp.dependencies import FrameData, dep_single_frame

client = get_user_client()
project = client.get_project("<your_project_hash>")
```
2. Create a data model and a system prompt based on the Project Ontology to tell Claude how to structure its response:

agent.py
````python
data_model = OntologyDataModel(project.ontology_structure.classifications)

system_prompt = f"""
You're a helpful assistant that's supposed to help fill in
json objects according to this schema:

```json
{data_model.model_json_schema_str}
```

Please only respond with valid json.
"""
````
??? "See the result of
data_model.model_json_schema_str
for the given example"{ "$defs": { "IsThereAPersonInTheFrameRadioModel": { "properties": { "feature_node_hash": { "const": "k3EVexk7", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "k3EVexk7" ], "title": "Feature Node Hash", "type": "string" }, "choice": { "description": "Choose exactly one answer from the given options.", "discriminator": { "mapping": { "37rMLC/v": "#/$defs/NoNestedRadioModel", "EkGwhcO4": "#/$defs/YesNestedRadioModel" }, "propertyName": "feature_node_hash" }, "oneOf": [ { "$ref": "#/$defs/YesNestedRadioModel" }, { "$ref": "#/$defs/NoNestedRadioModel" } ], "title": "Choice" } }, "required": [ "feature_node_hash", "choice" ], "title": "IsThereAPersonInTheFrameRadioModel", "type": "object" }, "NoNestedRadioModel": { "properties": { "feature_node_hash": { "const": "37rMLC/v", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "37rMLC/v" ], "title": "Feature Node Hash", "type": "string" }, "title": { "const": "no", "default": "Constant value - should be included as-is.", "enum": [ "no" ], "title": "Title", "type": "string" } }, "required": [ "feature_node_hash" ], "title": "NoNestedRadioModel", "type": "object" }, "SceneSummaryTextModel": { "properties": { "feature_node_hash": { "const": "+1g9I9Sg", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "+1g9I9Sg" ], "title": "Feature Node Hash", "type": "string" }, "value": { "description": "Please describe the image as accurate as possible focusing on 'scene summary'", "maxLength": 1000, "minLength": 0, "title": "Value", "type": "string" } }, "required": [ "feature_node_hash", "value" ], "title": "SceneSummaryTextModel", "type": "object" }, "WhatIsThePersonDoingTextModel": { "properties": { "feature_node_hash": { "const": "mj9QCDY4", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "mj9QCDY4" ], "title": "Feature Node Hash", "type": "string" }, "value": { "description": "Please describe the image as accurate as possible focusing on 'What is the person doing?'", "maxLength": 1000, "minLength": 0, "title": "Value", "type": "string" } }, "required": [ "feature_node_hash", "value" ], "title": "WhatIsThePersonDoingTextModel", "type": "object" }, "YesNestedRadioModel": { "properties": { "feature_node_hash": { "const": "EkGwhcO4", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "EkGwhcO4" ], "title": "Feature Node Hash", "type": "string" }, "what_is_the_person_doing": { "$ref": "#/$defs/WhatIsThePersonDoingTextModel", "description": "A text attribute with carefully crafted text to describe the property." } }, "required": [ "feature_node_hash", "what_is_the_person_doing" ], "title": "YesNestedRadioModel", "type": "object" } }, "properties": { "scene_summary": { "$ref": "#/$defs/SceneSummaryTextModel", "description": "A text attribute with carefully crafted text to describe the property." }, "is_there_a_person_in_the_frame": { "$ref": "#/$defs/IsThereAPersonInTheFrameRadioModel", "description": "A mutually exclusive radio attribute to choose exactly one option that best matches to the give visual input." } }, "required": [ "scene_summary", "is_there_a_person_in_the_frame" ], "title": "ClassificationModel", "type": "object" }
3. Create an Anthropic API client to communicate with Claude.
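The snippet for this step is not shown in this extract; it presumably mirrors the equivalent lines in the FastAPI example later on this page:

```python
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
anthropic_client = Anthropic(api_key=ANTHROPIC_API_KEY)
```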
4. Define the editor agent.

agent.py
```python
@editor_agent()
def agent(
    frame_data: FrameData,
    lr: LabelRowV2,
    content: Annotated[NDArray, Depends(dep_single_frame)],
):
    frame = Frame(frame_data.frame, content=content)
    message = anthropic_client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system=system_prompt,
        messages=[
            {
                "role": "user",
                "content": [frame.b64_encoding(output_format="anthropic")],
            }
        ],
    )
    try:
        classifications = data_model(message.content[0].text)
        for clf in classifications:
            clf.set_for_frames(frame_data.frame, confidence=0.5, manual_annotation=False)
            lr.add_classification_instance(clf)
    except Exception:
        import traceback

        traceback.print_exc()
        print(f"Response from model: {message.content[0].text}")
    lr.save()
```
The agent follows these steps:
- Automatically retrieves the frame content using the `dep_single_frame` dependency.
- Sends the frame image to Claude for analysis.
- Parses Claude's response into classification instances using the predefined data model.
- Adds the classifications to the label row and saves the results.
Test the Agent
1. In your current terminal, run the following command to run the agent in debug mode.
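The command is not included in this extract; a minimal sketch, assuming the functions-framework workflow used by encord-agents' GCP wrapper (the target matches the function name `agent`):

```shell
functions-framework --target=agent --debug --port 8080
```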
2. Open your Project in the Encord platform and navigate to a frame you want to add a classification to. Copy the URL from your browser.

Tip
The URL should have roughly this format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}".

3. In another shell operating from the same working directory, source your virtual environment and test the agent.
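The test command is not shown here; a sketch, assuming the encord-agents CLI's local test helper (agent name and URL placeholder are yours to substitute):

```shell
source venv/bin/activate
encord-agents test local agent '<the URL you copied>'
```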
4. To see if the test is successful, refresh your browser to view the classifications generated by Claude. Once the test runs successfully, you are ready to deploy your agent. Visit the deployment documentation to learn more.
Nested Attributes using Claude 3.5 Sonnet¶
The goals of this example are:
- Create an editor agent that can convert generic object annotations (class-less coordinates) into class specific annotations with nested attributes like descriptions, radio buttons, and checklists.
- Demonstrate how to use both the `OntologyDataModel` and the `dep_object_crops` dependency.
Prerequisites
Before you begin, ensure you have:
- Created a virtual Python environment.
- Installed all necessary dependencies.
- Obtained an Anthropic API key.
- Set up authentication with Encord.
Run the following commands to set up your environment:
python -m venv venv # Create a virtual Python environment
source venv/bin/activate # Activate the virtual environment
python -m pip install encord-agents anthropic # Install required dependencies
export ANTHROPIC_API_KEY="<your_api_key>" # Set your Anthropic API key
export ENCORD_SSH_KEY_FILE="/path/to/your/private/key" # Define your Encord SSH key
Project Setup
Create a Project with visual content (images, image groups, image sequences, or videos) in Encord. This example uses the following Ontology, but any Ontology can be used provided the object types are the same and there is one entry called "generic".
See the ontology JSON
{
"objects": [
{
"id": "1",
"name": "person",
"color": "#D33115",
"shape": "bounding_box",
"featureNodeHash": "2xlDPPAG",
"required": false,
"attributes": [
{
"id": "1.1",
"featureNodeHash": "aFCN9MMm",
"type": "text",
"name": "activity",
"required": false,
"dynamic": false
}
]
},
{
"id": "2",
"name": "animal",
"color": "#E27300",
"shape": "bounding_box",
"featureNodeHash": "3y6JxTUX",
"required": false,
"attributes": [
{
"id": "2.1",
"featureNodeHash": "2P7LTUZA",
"type": "radio",
"name": "type",
"required": false,
"options": [
{
"id": "2.1.1",
"featureNodeHash": "gJvcEeLl",
"label": "dolphin",
"value": "dolphin",
"options": []
},
{
"id": "2.1.2",
"featureNodeHash": "CxrftGS4",
"label": "monkey",
"value": "monkey",
"options": []
},
{
"id": "2.1.3",
"featureNodeHash": "OQyWm7Sm",
"label": "dog",
"value": "dog",
"options": []
},
{
"id": "2.1.4",
"featureNodeHash": "CDKmYJK/",
"label": "cat",
"value": "cat",
"options": []
}
],
"dynamic": false
},
{
"id": "2.2",
"featureNodeHash": "5fFgrM+E",
"type": "text",
"name": "description",
"required": false,
"dynamic": false
}
]
},
{
"id": "3",
"name": "vehicle",
"color": "#16406C",
"shape": "bounding_box",
"featureNodeHash": "llw7qdWW",
"required": false,
"attributes": [
{
"id": "3.1",
"featureNodeHash": "79mo1G7Q",
"type": "text",
"name": "type - short and concise",
"required": false,
"dynamic": false
},
{
"id": "3.2",
"featureNodeHash": "OFrk07Ds",
"type": "checklist",
"name": "visible",
"required": false,
"options": [
{
"id": "3.2.1",
"featureNodeHash": "KmX/HjRT",
"label": "wheels",
"value": "wheels"
},
{
"id": "3.2.2",
"featureNodeHash": "H6qbEcdj",
"label": "frame",
"value": "frame"
},
{
"id": "3.2.3",
"featureNodeHash": "gZ9OucoQ",
"label": "chain",
"value": "chain"
},
{
"id": "3.2.4",
"featureNodeHash": "cit3aZSz",
"label": "head lights",
"value": "head_lights"
},
{
"id": "3.2.5",
"featureNodeHash": "qQ3PieJ/",
"label": "tail lights",
"value": "tail_lights"
}
],
"dynamic": false
}
]
},
{
"id": "4",
"name": "generic",
"color": "#FE9200",
"shape": "bounding_box",
"featureNodeHash": "jootTFfQ",
"required": false,
"attributes": []
}
],
"classifications": []
}
To construct the Ontology used in this example, run the following script:
```python
import json
from encord.objects.ontology_structure import OntologyStructure
from encord_agents.core.utils import get_user_client
encord_client = get_user_client()
structure = OntologyStructure.from_dict(json.loads("{the_json_above}"))
ontology = encord_client.create_ontology(
title="Your ontology title",
structure=structure
)
print(ontology.ontology_hash)
```
The goal is to create an agent that takes a labeling task from Figure A to Figure B, below. (Hint: You can click the images and use the keyboard arrows to toggle between them.)


Create the Agent
Warning
Some code blocks in this section have incorrect indentation. If you plan to copy and paste, we strongly recommend using the full code below instead of the individual sub-sections.
Here is the full code, but a section-by-section explanation follows.
The full code for agent.py
1. Create a file called "agent.py".
2. Run the following imports and read the Project Ontology. Ensure that you replace <project_hash> with the unique identifier of your Project.

agent.py
```python
import os

from anthropic import Anthropic
from encord.objects.ontology_labels_impl import LabelRowV2
from typing_extensions import Annotated

from encord_agents.core.ontology import OntologyDataModel
from encord_agents.core.utils import get_user_client
from encord_agents.gcp import Depends, editor_agent
from encord_agents.gcp.dependencies import FrameData, InstanceCrop, dep_object_crops

# User client
client = get_user_client()
project = client.get_project("<project_hash>")
```
3. Extract the generic Ontology object and the Ontology objects that we are interested in. The code referenced below sorts the Ontology objects based on whether they have the title "generic" or not.
We use the generic object to query image crops within the agent. Before doing so, we utilize `other_objects` to communicate the specific information we want Claude to focus on. To facilitate this, the `OntologyDataModel` class helps translate Encord Ontology `Object`s into a Pydantic model, as well as convert JSON objects into Encord `ObjectInstance`s.
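The sorting code itself is missing from this extract; a minimal sketch of the described behaviour (the exact expression is an assumption):

```python
# Sort so the object titled "generic" comes first; the rest become `other_objects`.
generic_ont_obj, *other_objects = sorted(
    project.ontology_structure.objects,
    key=lambda o: o.title.lower() != "generic",
)
```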
4. Prepare the system prompt to go along with every object crop. For that, we use the `data_model` from above to create the JSON schema. It is worth noticing that we pass in just the `other_objects`, so that the model is only allowed to choose between the object types that are not the generic one.

agent.py
```python
data_model = OntologyDataModel(other_objects)
system_prompt = f"""
You're a helpful assistant that's supposed to help fill in
json objects according to this schema:

`{data_model.model_json_schema_str}`

Please only respond with valid json.
"""
```
See the result of `data_model.model_json_schema_str` for the given example:

```json
{ "$defs": { "ActivityTextModel": { "properties": { "feature_node_hash": { "const": "aFCN9MMm", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "aFCN9MMm" ], "title": "Feature Node Hash", "type": "string" }, "value": { "description": "Please describe the image as accurate as possible focusing on 'activity'", "maxLength": 1000, "minLength": 0, "title": "Value", "type": "string" } }, "required": [ "feature_node_hash", "value" ], "title": "ActivityTextModel", "type": "object" }, "AnimalNestedModel": { "properties": { "feature_node_hash": { "const": "3y6JxTUX", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "3y6JxTUX" ], "title": "Feature Node Hash", "type": "string" }, "type": { "$ref": "#/$defs/TypeRadioModel", "description": "A mutually exclusive radio attribute to choose exactly one option that best matches to the give visual input." }, "description": { "$ref": "#/$defs/DescriptionTextModel", "description": "A text attribute with carefully crafted text to describe the property." } }, "required": [ "feature_node_hash", "type", "description" ], "title": "AnimalNestedModel", "type": "object" }, "DescriptionTextModel": { "properties": { "feature_node_hash": { "const": "5fFgrM+E", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "5fFgrM+E" ], "title": "Feature Node Hash", "type": "string" }, "value": { "description": "Please describe the image as accurate as possible focusing on 'description'", "maxLength": 1000, "minLength": 0, "title": "Value", "type": "string" } }, "required": [ "feature_node_hash", "value" ], "title": "DescriptionTextModel", "type": "object" }, "PersonNestedModel": { "properties": { "feature_node_hash": { "const": "2xlDPPAG", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "2xlDPPAG" ], "title": "Feature Node Hash", "type": "string" }, "activity": { "$ref": "#/$defs/ActivityTextModel", "description": "A text attribute with carefully crafted text to describe the property." } }, "required": [ "feature_node_hash", "activity" ], "title": "PersonNestedModel", "type": "object" }, "TypeRadioEnum": { "enum": [ "dolphin", "monkey", "dog", "cat" ], "title": "TypeRadioEnum", "type": "string" }, "TypeRadioModel": { "properties": { "feature_node_hash": { "const": "2P7LTUZA", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "2P7LTUZA" ], "title": "Feature Node Hash", "type": "string" }, "choice": { "$ref": "#/$defs/TypeRadioEnum", "description": "Choose exactly one answer from the given options." } }, "required": [ "feature_node_hash", "choice" ], "title": "TypeRadioModel", "type": "object" }, "TypeShortAndConciseTextModel": { "properties": { "feature_node_hash": { "const": "79mo1G7Q", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "79mo1G7Q" ], "title": "Feature Node Hash", "type": "string" }, "value": { "description": "Please describe the image as accurate as possible focusing on 'type - short and concise'", "maxLength": 1000, "minLength": 0, "title": "Value", "type": "string" } }, "required": [ "feature_node_hash", "value" ], "title": "TypeShortAndConciseTextModel", "type": "object" }, "VehicleNestedModel": { "properties": { "feature_node_hash": { "const": "llw7qdWW", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "llw7qdWW" ], "title": "Feature Node Hash", "type": "string" }, "type__short_and_concise": { "$ref": "#/$defs/TypeShortAndConciseTextModel", "description": "A text attribute with carefully crafted text to describe the property." }, "visible": { "$ref": "#/$defs/VisibleChecklistModel", "description": "A collection of boolean values indicating which concepts are applicable according to the image content." } }, "required": [ "feature_node_hash", "type__short_and_concise", "visible" ], "title": "VehicleNestedModel", "type": "object" }, "VisibleChecklistModel": { "properties": { "feature_node_hash": { "const": "OFrk07Ds", "description": "UUID for discrimination. Must be included in json as is.", "enum": [ "OFrk07Ds" ], "title": "Feature Node Hash", "type": "string" }, "wheels": { "description": "Is 'wheels' applicable or not?", "title": "Wheels", "type": "boolean" }, "frame": { "description": "Is 'frame' applicable or not?", "title": "Frame", "type": "boolean" }, "chain": { "description": "Is 'chain' applicable or not?", "title": "Chain", "type": "boolean" }, "head_lights": { "description": "Is 'head lights' applicable or not?", "title": "Head Lights", "type": "boolean" }, "tail_lights": { "description": "Is 'tail lights' applicable or not?", "title": "Tail Lights", "type": "boolean" } }, "required": [ "feature_node_hash", "wheels", "frame", "chain", "head_lights", "tail_lights" ], "title": "VisibleChecklistModel", "type": "object" } }, "properties": { "choice": { "description": "Choose exactly one answer from the given options.", "discriminator": { "mapping": { "2xlDPPAG": "#/$defs/PersonNestedModel", "3y6JxTUX": "#/$defs/AnimalNestedModel", "llw7qdWW": "#/$defs/VehicleNestedModel" }, "propertyName": "feature_node_hash" }, "oneOf": [ { "$ref": "#/$defs/PersonNestedModel" }, { "$ref": "#/$defs/AnimalNestedModel" }, { "$ref": "#/$defs/VehicleNestedModel" } ], "title": "Choice" } }, "required": [ "choice" ], "title": "ObjectsRadioModel", "type": "object" }
```
5. With the system prompt ready, instantiate an API client for Claude.
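The corresponding code is missing from this extract; a sketch matching the "# Claude setup" block in the FastAPI example below:

```python
# Claude setup
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
anthropic_client = Anthropic(api_key=ANTHROPIC_API_KEY)
```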
6. Define the editor agent (a sketch of the signature follows the list below).

- All arguments are automatically injected when the agent is called. For details on dependency injection, see here.
- The `dep_object_crops` dependency allows filtering. In this case, it includes only "generic" object crops, excluding those already converted to actual labels.
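The agent definition itself is not shown in this extract; a sketch consistent with the imports above and the FastAPI endpoint later on this page (the body is elided):

```python
@editor_agent()
def agent(
    frame_data: FrameData,
    lr: LabelRowV2,
    crops: Annotated[
        list[InstanceCrop],
        Depends(dep_object_crops(filter_ontology_objects=[generic_ont_obj])),
    ],
):
    # The steps below (querying Claude, parsing, saving) form the body.
    ...
```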
7. Call Claude using the image crops. Notice how the `crop` variable has a convenient `b64_encoding` method to produce an input that Claude understands.
# Query Claude
changes = False
for crop in crops:
message = anthropic_client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
system=system_prompt,
messages=[
{
"role": "user",
"content": [crop.b64_encoding(output_format="anthropic")],
}
],
)
8. To parse the message from Claude, the `data_model` is again useful. When called with a JSON string, it attempts to parse it against the JSON schema we saw above to create an Encord object instance. If successful, the old generic object can be removed and the newly classified object added.

agent.py
```python
try:
    instance = data_model(message.content[0].text)
    coordinates = crop.instance.get_annotation(frame=frame_data.frame).coordinates
    instance.set_for_frames(
        coordinates=coordinates,
        frames=frame_data.frame,
        confidence=0.5,
        manual_annotation=False,
    )
    lr.remove_object(crop.instance)
    lr.add_object_instance(instance)
    changes = True
except Exception:
    import traceback

    traceback.print_exc()
    print(f"Response from model: {message.content[0].text}")
```
9. Save the labels with Encord.
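The save snippet is omitted here; the FastAPI version at the end of this page saves only when something changed:

```python
# Save changes
if changes:
    lr.save()
```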
Test the Agent
1. In your current terminal, run the following command to run the agent in debug mode.
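The command is not included in this extract; a minimal sketch, assuming the functions-framework workflow used by encord-agents' GCP wrapper (the target matches the function name `agent`):

```shell
functions-framework --target=agent --debug --port 8080
```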
2. Open your Project in the Encord platform and navigate to a frame you want to add a generic object to. Copy the URL from your browser.

Hint
The URL should have the following format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}".

3. In another shell operating from the same working directory, source your virtual environment and test the agent.
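The test command is not shown here; a sketch, assuming the encord-agents CLI's local test helper (agent name and URL placeholder are yours to substitute):

```shell
source venv/bin/activate
encord-agents test local agent '<the URL you copied>'
```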
4. To see if the test is successful, refresh your browser to view the classifications generated by Claude. Once the test runs successfully, you are ready to deploy your agent. Visit the deployment documentation to learn more.
FastAPI Examples¶
Basic Geometric example using objectHashes¶
A simple example of how you might utilise the objectHashes sent to your agent:
from typing import Annotated
from encord.objects.ontology_labels_impl import LabelRowV2
from encord.objects.ontology_object_instance import ObjectInstance
from fastapi import Depends, FastAPI
from encord_agents.fastapi.cors import EncordCORSMiddleware
from encord_agents.fastapi.dependencies import (
FrameData,
dep_label_row,
dep_objects,
)
# Initialize FastAPI app
app = FastAPI()
app.add_middleware(EncordCORSMiddleware)
@app.post("/handle-object-hashes")
def handle_object_hashes(
frame_data: FrameData,
lr: Annotated[LabelRowV2, Depends(dep_label_row)],
object_instances: Annotated[list[ObjectInstance], Depends(dep_objects)],
) -> None:
for object_inst in object_instances:
print(object_inst)
An example use case of the above: suppose you have your own OCR model and want to selectively run OCR on objects you have selected in the Encord app. You can trigger your agent from the app, and it sends a list of objectHashes to your agent. The `dep_objects` dependency then gives the agent immediate access to the corresponding object instances, making it easier to integrate your OCR model.
Test the Agent
1. First, save the above code as main.py. Then, in your current terminal, run the following command to run the FastAPI server in development mode with auto-reload enabled.
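The command is not shown here; one standard option, using uvicorn's auto-reload (port 8080 is an assumption, chosen to match the local test tooling):

```shell
uvicorn main:app --reload --port 8080
```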
2. Open your Project in the Encord platform and navigate to a frame with an object that you want to act on. Choose an object from the bottom left sidebar and click Copy URL as shown:

Tip
The URL should have roughly this format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}/0?other_query_params&objectHash={objectHash}".
3. In another shell operating from the same working directory, source your virtual environment and test the agent.
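The test command is not shown here; a sketch, assuming the encord-agents CLI's local test helper with the FastAPI route name as the target (an assumption), plus the URL you copied:

```shell
source venv/bin/activate
encord-agents test local handle-object-hashes '<the URL you copied>'
```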
4. To see if the test is successful, refresh your browser to see the action taken by the agent. Once the test runs successfully, you are ready to deploy your agent. Visit the deployment documentation to learn more.
Nested Classification using Claude 3.5 Sonnet¶
The goals of this example are to:
- Create an editor agent that can automatically fill in frame-level classifications in the Label Editor.
- Demonstrate how to use the `OntologyDataModel` for classifications.
- Demonstrate how to build an agent using FastAPI that can be self-hosted.
Prerequisites
Before you begin, ensure you have:
- Created a virtual Python environment.
- Installed all necessary dependencies.
- Obtained an Anthropic API key.
- Set up authentication with Encord.
Run the following commands to set up your environment:
python -m venv venv # Create a virtual Python environment
source venv/bin/activate # Activate the virtual environment
python -m pip install "fastapi[standard]" encord-agents anthropic # Install required dependencies
export ANTHROPIC_API_KEY="<your_api_key>" # Set your Anthropic API key
export ENCORD_SSH_KEY_FILE="/path/to/your/private/key" # Define your Encord SSH key
Project Setup
Create a Project with visual content (images, image groups, image sequences, or videos) in Encord. This example uses the following Ontology, but any Ontology containing classifications can be used.
See the ontology JSON
[Same JSON as in the GCP Nested Classification example above]
The aim is to trigger an agent that transforms a labeling task from Figure A to Figure B. (Hint: Click the images and use the keyboard arrows to toggle between them.)
Create the FastAPI agent
Here is the full code, but a section-by-section explanation follows.
The full code for main.py
1. Import dependencies and set up the Project. The CORS middleware is crucial as it allows the Encord platform to make requests to your API.

main.py
```python
import os

import numpy as np
from anthropic import Anthropic
from encord.objects.ontology_labels_impl import LabelRowV2
from fastapi import Depends, FastAPI, Form
from numpy.typing import NDArray
from typing_extensions import Annotated

from encord_agents.core.data_model import Frame
from encord_agents.core.ontology import OntologyDataModel
from encord_agents.core.utils import get_user_client
from encord_agents.fastapi.cors import EncordCORSMiddleware
from encord_agents.fastapi.dependencies import (
    FrameData,
    dep_label_row,
    dep_single_frame,
)

# Initialize FastAPI app
app = FastAPI()
app.add_middleware(EncordCORSMiddleware)
```
2. Set up the Project and create a data model based on the Ontology.
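The snippet for this step is not shown here; a minimal sketch mirroring the GCP example above (replace the project hash placeholder with your own):

```python
client = get_user_client()
project = client.get_project("<project_hash>")
data_model = OntologyDataModel(project.ontology_structure.classifications)
```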
3. Create the system prompt that tells Claude how to structure its response.

main.py
````python
system_prompt = f"""
You're a helpful assistant that's supposed to help fill in
json objects according to this schema:

```json
{data_model.model_json_schema_str}
```

Please only respond with valid json.
"""

ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
anthropic_client = Anthropic(api_key=ANTHROPIC_API_KEY)
````
4. Define the endpoint to handle the classification:

main.py
```python
@app.post("/frame_classification")
async def classify_frame(
    frame_data: FrameData,
    lr: Annotated[LabelRowV2, Depends(dep_label_row)],
    content: Annotated[NDArray[np.uint8], Depends(dep_single_frame)],
):
    """Classify a frame using Claude."""
    frame = Frame(frame=frame_data.frame, content=content)
    message = anthropic_client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system=system_prompt,
        messages=[
            {
                "role": "user",
                "content": [frame.b64_encoding(output_format="anthropic")],
            }
        ],
    )
    try:
        classifications = data_model(message.content[0].text)
        for clf in classifications:
            clf.set_for_frames(frame_data.frame, confidence=0.5, manual_annotation=False)
            lr.add_classification_instance(clf)
    except Exception:
        import traceback

        traceback.print_exc()
        print(f"Response from model: {message.content[0].text}")
    lr.save()
```
The endpoint:
- Receives frame data via FastAPI's Form dependency.
- Retrieves the label row and frame content using Encord agents' dependencies.
- Constructs a `Frame` object with the content.
- Sends the frame image to Claude for analysis.
- Parses Claude's response into classification instances.
- Adds classifications to the label row and saves the updated data.
Test the Agent
1. In your current terminal, run the following command to run the FastAPI server in development mode with auto-reload enabled.
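The command is not shown here; one standard option, using uvicorn's auto-reload (port 8080 is an assumption, chosen to match the local test tooling):

```shell
uvicorn main:app --reload --port 8080
```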
2. Open your Project in the Encord platform and navigate to a frame you want to add a classification to. Copy the URL from your browser.

Tip
The URL should have the following format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}".

3. In another shell operating from the same working directory, source your virtual environment and test the agent.
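The test command is not shown here; a sketch, assuming the encord-agents CLI's local test helper with the FastAPI route name as the target (an assumption):

```shell
source venv/bin/activate
encord-agents test local frame_classification '<the URL you copied>'
```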
4. To see if the test is successful, refresh your browser to view the classifications generated by Claude. Once the test runs successfully, you are ready to deploy your agent. Visit the deployment documentation to learn more.
Nested Attributes using Claude 3.5 Sonnet¶
The goals of this example are:
- Create an editor agent that can convert generic object annotations (class-less coordinates) into class specific annotations with nested attributes like descriptions, radio buttons, and checklists.
- Demonstrate how to use both the `OntologyDataModel` and the `dep_object_crops` dependency.
Prerequisites
Before you begin, ensure you have:
- Created a virtual Python environment.
- Installed all necessary dependencies.
- Obtained an Anthropic API key.
- Set up authentication with Encord.
Run the following commands to set up your environment:
python -m venv venv # Create a virtual Python environment
source venv/bin/activate # Activate the virtual environment
python -m pip install encord-agents anthropic # Install required dependencies
export ANTHROPIC_API_KEY="<your_api_key>" # Set your Anthropic API key
export ENCORD_SSH_KEY_FILE="/path/to/your/private/key" # Define your Encord SSH key
Project Setup
Create a Project with visual content (images, image groups, image sequences, or videos) in Encord. This example uses the following Ontology, but any Ontology can be used provided the object types are the same and there is one entry called "generic".
See the ontology JSON
[Same JSON as in the GCP Nested Attributes example above]
The goal is to trigger an agent that takes a labeling task from Figure A to Figure B, below:


Create the FastAPI Agent
Here is the full code, but a section-by-section explanation follows.
The full code for main.py
1. Set up the FastAPI app and CORS middleware.

main.py
```python
import os

from anthropic import Anthropic
from encord.objects.ontology_labels_impl import LabelRowV2
from fastapi import Depends, FastAPI
from typing_extensions import Annotated

from encord_agents.core.data_model import InstanceCrop
from encord_agents.core.ontology import OntologyDataModel
from encord_agents.core.utils import get_user_client
from encord_agents.fastapi.cors import EncordCORSMiddleware
from encord_agents.fastapi.dependencies import (
    FrameData,
    dep_label_row,
    dep_object_crops,
)

# Initialize FastAPI app
app = FastAPI()
app.add_middleware(EncordCORSMiddleware)
```
2. Set up the client, Project, and extract the generic Ontology object.
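The code for this step is missing from this extract; a sketch mirroring the GCP example above (the "generic"-first sort is an assumption):

```python
client = get_user_client()
project = client.get_project("<project_hash>")

# Put the object titled "generic" first; the rest become `other_objects`.
generic_ont_obj, *other_objects = sorted(
    project.ontology_structure.objects,
    key=lambda o: o.title.lower() != "generic",
)
```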
3. Create the data model and system prompt for Claude.

main.py
```python
data_model = OntologyDataModel(other_objects)
system_prompt = f"""
You're a helpful assistant that's supposed to help fill in
json objects according to this schema:

`{data_model.model_json_schema_str}`

Please only respond with valid json.
"""

# Claude setup
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
anthropic_client = Anthropic(api_key=ANTHROPIC_API_KEY)
```
4. Define the attribute endpoint:
@app.post("/object_classification")
async def classify_objects(
frame_data: FrameData,
lr: Annotated[LabelRowV2, Depends(dep_label_row)],
crops: Annotated[
list[InstanceCrop],
Depends(dep_object_crops(filter_ontology_objects=[generic_ont_obj])),
],
):
"""Classify generic objects using Claude."""
# Query Claude for each crop
changes = False
for crop in crops:
message = anthropic_client.messages.create(
model="claude-3-haiku-20240307",
max_tokens=1024,
system=system_prompt,
messages=[
{
"role": "user",
"content": [crop.b64_encoding(output_format="anthropic")],
}
],
)
# Parse result
try:
instance = data_model(message.content[0].text)
coordinates = crop.instance.get_annotation(frame=frame_data.frame).coordinates
instance.set_for_frames(
coordinates=coordinates,
frames=frame_data.frame,
confidence=0.5,
manual_annotation=False,
)
lr.remove_object(crop.instance)
lr.add_object_instance(instance)
changes = True
except Exception:
import traceback
traceback.print_exc()
print(f"Response from model: {message.content[0].text}")
# Save changes
if changes:
lr.save()
The endpoint:
- Receives frame data using FastAPI's Form dependency.
- Retrieves the label row using `dep_label_row`.
- Fetches object crops, filtered to include only "generic" objects, using `dep_object_crops`.
- For each crop:
- Sends the cropped image to Claude for analysis.
- Parses the response into an object instance.
- Replaces the generic object with the classified instance.
- Saves the updated label row.
Testing the Agent
1. In your current terminal, run the following command to run the FastAPI server in development mode with auto-reload enabled.
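The command is not shown here; one standard option, using uvicorn's auto-reload (port 8080 is an assumption, chosen to match the local test tooling):

```shell
uvicorn main:app --reload --port 8080
```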
2. Open your Project in the Encord platform and navigate to a frame you want to add a classification to. Copy the URL from your browser.

Tip
The URL should have roughly this format: "https://app.encord.com/label_editor/{project_hash}/{data_hash}/{frame}".

3. In another shell operating from the same working directory, source your virtual environment and test the agent:
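The test command is not shown here; a sketch, assuming the encord-agents CLI's local test helper with the FastAPI route name as the target (an assumption):

```shell
source venv/bin/activate
encord-agents test local object_classification '<the URL you copied>'
```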
4. To see if the test is successful, refresh your browser to view the classifications generated by Claude. Once the test runs successfully, you are ready to deploy your agent. Visit the deployment documentation to learn more.
Agent Examples in the Making¶
The following examples are being worked on:
- Tightening Bounding Boxes with SAM
- Extrapolating labels with DINOv
- Triggering internal notification system
- Label assertion