{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use the Ultralytics YOLO Model\n",
"\n",
"This notebook demonstrates how to use a task agent to pre-label images with predictions, specifically using a bounding box prediction model.\n",
"\n",
"Before we begin, ensure that all dependencies are installed and that you are authenticated with Encord.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set Up"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"Ensure that you have the `encord-agents` library installed:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python -m pip install \"encord-agents[vision]\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Encord Authentication\n",
"\n",
"Encord uses SSH keys for authentication. The following code cell sets the `ENCORD_SSH_KEY` environment variable to the raw content of your private SSH key file.\n",
"\n",
"If you have not set up an SSH key yet, see our [documentation](https://agents-docs.encord.com/authentication/).\n",
"\n",
"> 💡 In Colab, you can set the key once in the secrets in the left sidebar and load it in new notebooks. If you are not running the code in Colab, you must set the environment variable directly:\n",
"> ```python\n",
"> os.environ[\"ENCORD_SSH_KEY\"] = \"\"\"paste-private-key-here\"\"\"\n",
"> ```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.colab import userdata\n",
"\n",
"key_content = userdata.get(\"ENCORD_SSH_KEY\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"ENCORD_SSH_KEY\"] = key_content\n",
"# or you can set a path to a file\n",
"# os.environ[\"ENCORD_SSH_KEY_FILE\"] = \"/path/to/your/private/key\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### [Alternative] Temporary Key\n",
"Alternatively, you can generate a temporary (fresh) SSH key pair by running the code cell below.\n",
"Follow the instructions printed when you execute the code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# ⚠️ Safe to skip if you have authenticated already\n",
"import os\n",
"\n",
"from encord_agents.utils.colab import generate_public_private_key_pair_with_instructions\n",
"\n",
"private_key_path, public_key_path = generate_public_private_key_pair_with_instructions()\n",
"os.environ[\"ENCORD_SSH_KEY_FILE\"] = private_key_path.as_posix()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up your Ontology\n",
"\n",
"Create an Ontology that matches the expected output of your pre-labeling agent.\n",
"For example, if your model predicts classes `surfboard`, `person`, and `car`, then the Ontology should look like this:\n",
"\n",
"> 💡 The YOLO model predicts more object classes than these, but this example focuses on the car predictions."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Figure 1: Project Ontology."
]
},
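{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can create this Ontology in the Encord app or programmatically. The following cell is a minimal sketch (one possible approach, not required) that assumes you are authenticated and only need a single bounding-box object named `car`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch: create a bounding-box Ontology via the Encord SDK.\n",
"# Assumes the ENCORD_SSH_KEY environment variable is set (see above).\n",
"import os\n",
"\n",
"from encord.objects.common import Shape\n",
"from encord.objects.ontology_structure import OntologyStructure\n",
"from encord.user_client import EncordUserClient\n",
"\n",
"user_client = EncordUserClient.create_with_ssh_private_key(os.environ[\"ENCORD_SSH_KEY\"])\n",
"\n",
"structure = OntologyStructure()\n",
"structure.add_object(name=\"car\", shape=Shape.BOUNDING_BOX)\n",
"ontology = user_client.create_ontology(title=\"Pre-labeling example\", structure=structure)"
]
},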
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define an Ontology Map\n",
"\n",
"We need to map the model predictions to their respective Encord Ontology items. The simplest way to do this is by using the `featureNodeHash` of the target, which can be found either through the Ontology preview JSON in the app or by using the Encord SDK.\n"
]
},
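{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you prefer the SDK over the preview JSON, the following sketch prints the name and `featureNodeHash` of every object in your Project's Ontology. It assumes you are authenticated and have replaced the placeholder with your Project's unique ID:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: look up featureNodeHashes via the Encord SDK\n",
"import os\n",
"\n",
"from encord.user_client import EncordUserClient\n",
"\n",
"user_client = EncordUserClient.create_with_ssh_private_key(os.environ[\"ENCORD_SSH_KEY\"])\n",
"project = user_client.get_project(\"<project-hash>\")  # placeholder: your Project's unique ID\n",
"for obj in project.ontology_structure.objects:\n",
"    print(obj.name, obj.feature_node_hash)"
]
},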
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ontology_map = {\"car\": \"uFmVW/cr\"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define a Model for Predictions\n",
"\n",
"Define a model that predicts labels, bounding boxes, and confidence scores, and use it to detect objects in images.\n",
"\n",
"We use the YOLO11 model from Ultralytics; see the [prediction docs](https://docs.ultralytics.com/modes/predict/) for details."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python -m pip install ultralytics"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from ultralytics import YOLO\n",
"\n",
"model = YOLO(\"yolo11n.pt\")"
]
},
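{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional sanity check, you can run the model on a random image to see the shape of the output we rely on later (`boxes.cls`, `boxes.conf`, and the normalized `boxes.xywhn`), as well as the index-to-name mapping in `model.names`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Random noise just to exercise the API; real frames come from Encord later\n",
"dummy = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)\n",
"result = model(dummy)[0]  # one Results object per input image\n",
"print(result.boxes.cls, result.boxes.conf, result.boxes.xywhn)\n",
"print(model.names)  # e.g. {0: \"person\", 2: \"car\", ...}"
]
},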
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[📖 See our documentation here](https://docs.encord.com/platform-documentation/GettingStarted/gettingstarted-create-ontology) to learn how to create Ontologies."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dataclasses import dataclass\n",
"\n",
"import numpy as np\n",
"from encord.objects.coordinates import BoundingBoxCoordinates\n",
"from numpy.typing import NDArray\n",
"\n",
"\n",
"# Data class to hold predictions from our model\n",
"@dataclass\n",
"class ModelPrediction:\n",
" featureHash: str\n",
" coords: BoundingBoxCoordinates\n",
" conf: float\n",
"\n",
"\n",
"def YOLO_predict(image: NDArray[np.uint8]) -> list[ModelPrediction]:\n",
"    # Ultralytics returns one `Results` object per input image\n",
"    output = model(image)[0]\n",
"\n",
"    # Map predicted class indices to class names, e.g. 2 -> \"car\"\n",
"    class_names = [model.names[int(c)] for c in output.boxes.cls.tolist()]\n",
"    confs = output.boxes.conf.tolist()\n",
"    # `xywhn` boxes are normalized (center_x, center_y, width, height);\n",
"    # Encord's BoundingBoxCoordinates expects the top-left corner instead\n",
"    boxes = output.boxes.xywhn.tolist()\n",
"\n",
"    predictions = []\n",
"    for name, (cx, cy, w, h), conf in zip(class_names, boxes, confs):\n",
"        if name not in ontology_map:\n",
"            continue  # skip classes that are not part of our Ontology\n",
"        coords = BoundingBoxCoordinates(height=h, width=w, top_left_x=cx - w / 2, top_left_y=cy - h / 2)\n",
"        predictions.append(ModelPrediction(featureHash=ontology_map[name], coords=coords, conf=conf))\n",
"    return predictions\n",
"\n",
"\n",
"agent = YOLO_predict"
]
},
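{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that Ultralytics' `xywhn` format is centre-based: the first two values are the normalized box centre, while Encord's `BoundingBoxCoordinates` expects the top-left corner. A quick check of the centre-to-corner conversion on a hand-picked box:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A box centred at (0.5, 0.5) with width 0.5 and height 0.25\n",
"cx, cy, w, h = 0.5, 0.5, 0.5, 0.25\n",
"top_left_x, top_left_y = cx - w / 2, cy - h / 2\n",
"assert (top_left_x, top_left_y) == (0.25, 0.375)"
]
},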
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a Workflow\n",
"\n",
"Create a Project in the Encord platform with the following Workflow.\n",
"\n",
"The Workflow includes a pre-labeling agent node before the annotation stage. This agent node automatically pre-labels tasks with model predictions, where your custom code for pre-labeling is executed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Figure 2: Project workflow."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[📖 See our documentation here](https://docs.encord.com/platform-documentation/Annotate/annotate-projects/annotate-workflows-and-templates#creating-workflows) to learn how to create a Workflow in Encord."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define the Pre-Labeling Agent\n",
"\n",
"The following code provides a template for defining an agent that does pre-labeling. The code uses the [`dep_single_frame` dependency](../../reference/task_agents.md#encord_agents.tasks.dependencies.dep_single_frame) to automatically load the first frame of the video as an RGB numpy array.\n",
"\n",
"Ensure that you fill in the `project_hash` with the unique ID of your Encord Project.\n",
"\n",
"> ⚠️ This example assumes:\n",
"> - Your Project contains only videos, and you want to pre-label the first frame of each video.\n",
"> - The agent stage is named \"pre-label\" and the pathway is named \"annotate.\" If your setup differs, update the `stage` parameter of the decorator and the returned string accordingly.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from encord.objects.ontology_labels_impl import LabelRowV2\n",
"from encord.project import Project\n",
"from numpy.typing import NDArray\n",
"from typing_extensions import Annotated\n",
"\n",
"from encord_agents.tasks import Depends, Runner\n",
"from encord_agents.tasks.dependencies import dep_single_frame\n",
"\n",
"# a. Define a runner that executes the agent on every task in the agent stage\n",
"runner = Runner(project_hash=\"\")  # <- fill in your Project's unique ID\n",
"\n",
"\n",
"# b. Specify the logic that goes into the \"pre-label\" agent node.\n",
"@runner.stage(stage=\"pre-label\")\n",
"def pre_segment(\n",
" lr: LabelRowV2,\n",
" project: Project,\n",
" frame: Annotated[NDArray[np.uint8], Depends(dep_single_frame)],\n",
") -> str:\n",
" ontology = project.ontology_structure\n",
"\n",
"    # c. Predict - we could do batching here to speed up the process\n",
"    outputs = agent(frame)\n",
"\n",
"    # d. Store the results as object instances on the label row\n",
"    for output in outputs:\n",
"        ins = ontology.get_child_by_hash(output.featureHash).create_instance()\n",
"        ins.set_for_frames(frames=0, coordinates=output.coords, confidence=output.conf)\n",
"        lr.add_object_instance(ins)\n",
"\n",
"    lr.save()\n",
"    return \"annotate\"  # The pathway along which the task proceeds"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Running the Agent\n",
"The `runner` object is callable, which means you can simply call it to process your tasks."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Run the agent\n",
"runner()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Outcome\n",
"\n",
"Your agent now assigns labels to the videos and routes them appropriately through the Workflow to the annotation stage.\n",
"As a result, every annotation task should already have pre-existing labels (predictions) included.\n",
"\n",
"> 💡 To run this as a command-line interface, save the code in an `agent.py` file and replace: \n",
"> ```python\n",
"> runner()\n",
"> ``` \n",
"> with: \n",
"> ```python\n",
"> if __name__ == \"__main__\":\n",
"> runner.run()\n",
"> ``` \n",
"> This lets you set parameters like the project hash from the command line: \n",
"> ```bash\n",
"> python agent.py --project-hash \"...\"\n",
"> ```\n",
"\n",
"\n",
"\n"
]
}
],
"metadata": {
"colab": {
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "encord-agents-Cw_LL1Rx-py3.11",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}