{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pre-Label Videos with Bounding Boxes\n",
"\n",
"This notebook demonstrates how to use a task agent to pre-label videos with predictions. To simplify the process, we'll use \"fake\" predictions, though the methodology remains the same when applying the notebook with an actual model."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation\n",
"\n",
"Ensure that you have the `encord-agents` library installed:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python -m pip install \"encord-agents[vision]\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Authentication\n",
"\n",
"The library authenticates via ssh-keys. Below, is a code cell for setting the `ENCORD_SSH_KEY` environment variable. It should contain the raw content of your private ssh key file.\n",
"\n",
"If you have not yet setup an ssh key, please follow the [documentation](https://agents-docs.encord.com/authentication/).\n",
"\n",
"> 💡 **Colab users**: In colab, you can set the key once in the secrets in the left sidebar and load it in new notebooks with\n",
"> ```python\n",
"> from google.colab import userdata\n",
"> key_content = userdata.get(\"ENCORD_SSH_KEY\")\n",
"> ```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"ENCORD_SSH_KEY\"] = \"private_key_file_content\"\n",
"# or you can set a path to a file\n",
"# os.environ[\"ENCORD_SSH_KEY_FILE\"] = \"/path/to/your/private/key\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### [Alternative] Temporary Key\n",
"There's also the option of generating a temporary (fresh) ssh key pair via the code cell below.\n",
"Please follow the instructions printed when executing the code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# ⚠️ Safe to skip if you have authenticated already\n",
"import os\n",
"\n",
"from encord_agents.utils.colab import generate_public_private_key_pair_with_instructions\n",
"\n",
"private_key_path, public_key_path = generate_public_private_key_pair_with_instructions()\n",
"os.environ[\"ENCORD_SSH_KEY_FILE\"] = private_key_path.as_posix()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define a _fake_ Model for Predictions\n",
"\n",
"To keep things minimal, we will define a \"_fake_\" model which predicts labels, bounding boxes, and confidences.\n",
"We'll use the model to simulate predictiong objects on images below.\n",
"\n",
"> 💡 This should be the main place for you to change code in order to integrate your own detection model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"from dataclasses import dataclass\n",
"\n",
"import numpy as np\n",
"from encord.objects.coordinates import BoundingBoxCoordinates\n",
"from numpy.typing import NDArray\n",
"\n",
"\n",
"# Data class to hold predictions from our \"model\"\n",
"@dataclass\n",
"class ModelPrediction:\n",
" label: int\n",
" coords: BoundingBoxCoordinates\n",
" conf: float\n",
"\n",
"\n",
"# Model \"simulation\"\n",
"def fake_predict(image: NDArray[np.uint8]) -> list[ModelPrediction]:\n",
" \"\"\"\n",
" Simple function that takes in an nd array of pixel values ([h, w, c], RGB)\n",
" And return a list of random bounding boxes. Random in location and with\n",
" three different\n",
" \"\"\"\n",
" return [\n",
" ModelPrediction(\n",
" label=random.choice(range(3)),\n",
" coords=BoundingBoxCoordinates(\n",
" top_left_x=random.random() * 0.5,\n",
" top_left_y=random.random() * 0.5,\n",
" width=random.random() * 0.5,\n",
" height=random.random() * 0.5,\n",
" ),\n",
" conf=random.random(),\n",
" )\n",
" for _ in range(10)\n",
" ]\n",
"\n",
"\n",
"model = fake_predict"
]
},
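{
"cell_type": "markdown",
"metadata": {},
"source": [
"Real detection models typically emit pixel-space boxes, whereas `BoundingBoxCoordinates` expects values normalized to `[0, 1]` relative to the frame size. If you integrate your own model, the conversion could look like the following sketch (the helper name and dict output are our own, purely for illustration — not part of the Encord API):\n",
"\n",
"```python\n",
"def normalize_box(x: float, y: float, w: float, h: float, img_w: int, img_h: int) -> dict[str, float]:\n",
"    \"\"\"Convert a pixel-space box to normalized [0, 1] values.\n",
"\n",
"    The returned dict mirrors the keyword arguments of BoundingBoxCoordinates.\n",
"    \"\"\"\n",
"    return {\n",
"        \"top_left_x\": x / img_w,\n",
"        \"top_left_y\": y / img_h,\n",
"        \"width\": w / img_w,\n",
"        \"height\": h / img_h,\n",
"    }\n",
"\n",
"\n",
"# e.g. a 320x180 pixel box with top-left corner (64, 90) in a 640x360 frame\n",
"normalize_box(64, 90, 320, 180, 640, 360)\n",
"```"
]
},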
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 3: Set up your Ontology\n",
"\n",
"Create an Ontology that matches the expected output of your pre-labeling agent. \n",
"For example, if your model predicts classes `surfboard`, `person`, and `car` with class labels 0, 1, and 2, respectively, then the ontology should look like this:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
" \n",
" Figure 1: Project ontology.\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[📖 Here](https://docs.encord.com/platform-documentation/GettingStarted/gettingstarted-create-ontology) is the documentation for creating ontologies."
]
},
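{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ontologies can also be built programmatically. Below is a hedged sketch (assuming `OntologyStructure` and `Shape` from the `encord` SDK). Note that the object order must match your model's class indices, because the agent later looks up ontology objects by index:\n",
"\n",
"```python\n",
"from encord.objects import OntologyStructure\n",
"from encord.objects.common import Shape\n",
"\n",
"structure = OntologyStructure()\n",
"# Order matters: index 0 -> surfboard, 1 -> person, 2 -> car\n",
"for name in [\"surfboard\", \"person\", \"car\"]:\n",
"    structure.add_object(name=name, shape=Shape.BOUNDING_BOX)\n",
"\n",
"[o.name for o in structure.objects]\n",
"```"
]
},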
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a Workflow with a Pre-Labeling Agent Node\n",
"\n",
"Create a Project in the Encord platform with a workflow that includes a pre-labeling agent node before the annotation stage. This node, called **\"pre-label,\"** runs custom code to generate model predictions, automatically pre-labeling tasks before they are sent for annotation.\n",
"\n",
"[📖 Here](https://docs.encord.com/platform-documentation/Annotate/annotate-projects/annotate-workflows-and-templates#creating-workflows) is the documentation for creating Workflows in Encord."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
" \n",
" Figure 2: Project workflow.\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define the Pre-Labeling Agent\n",
"\n",
"The following code provides a template for defining an agent that does pre-labeling.\n",
"We assume that the project only contains videos and the we want to do pre-labeling on all frames in each video.\n",
"\n",
"If your agent node is named \"pre-label\" and the pathway to the annotation stage is named \"annotate,\" you will only have to change the `` to your actual project hash to make it work.\n",
"If your naming, on the other hand, is different, then you can update the `stage` parameter of the decorator and the returned string, respectively, to comply with your own setup.\n",
"\n",
"Note that this code uses the [`dep_video_iterator` dependency](../../reference/task_agents/#encord_agents.tasks.dependencies.dep_video_iterator) to automatically load an iterator of frames as RGB numpy arrays from the video."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from typing import Iterable\n",
"\n",
"from encord.objects.ontology_labels_impl import LabelRowV2\n",
"from encord.project import Project\n",
"from typing_extensions import Annotated\n",
"\n",
"from encord_agents.core.data_model import Frame\n",
"from encord_agents.tasks import Depends, Runner\n",
"from encord_agents.tasks.dependencies import dep_video_iterator\n",
"\n",
"# a. Define a runner that will execute the agent on every task in the agent stage\n",
"runner = Runner(project_hash=\"\")\n",
"\n",
"\n",
"# b. Specify the logic that goes into the \"pre-label\" agent node.\n",
"@runner.stage(stage=\"pre-label\")\n",
"def run_something(\n",
" lr: LabelRowV2,\n",
" project: Project,\n",
" frames: Annotated[Iterable[Frame], Depends(dep_video_iterator)],\n",
") -> str:\n",
" ontology = project.ontology_structure\n",
"\n",
" # c. Loop over the frames in the video\n",
" for frame in frames: # For every frame in the video\n",
" # d. Predict - we could do batching here to speed up the process\n",
" outputs = model(frame.content)\n",
"\n",
" # e. Store the results\n",
" for output in outputs:\n",
" ins = ontology.objects[output.label].create_instance()\n",
" ins.set_for_frames(frames=frame.frame, coordinates=output.coords, confidence=output.conf)\n",
"\n",
" lr.add_object_instance(ins)\n",
"\n",
" lr.save()\n",
" return \"annotate\" # Tell where the task should go"
]
},
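{
"cell_type": "markdown",
"metadata": {},
"source": [
"The agent above predicts frame by frame; as the comment in step d. suggests, batching frames before calling the model can speed things up considerably. A minimal, standard-library-only sketch of such a batching helper (our own illustration, not part of `encord-agents`):\n",
"\n",
"```python\n",
"from itertools import islice\n",
"from typing import Iterable, Iterator, List, TypeVar\n",
"\n",
"T = TypeVar(\"T\")\n",
"\n",
"\n",
"def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:\n",
"    \"\"\"Yield consecutive fixed-size chunks from any iterable.\"\"\"\n",
"    it = iter(items)\n",
"    while chunk := list(islice(it, size)):\n",
"        yield chunk\n",
"\n",
"\n",
"# e.g. group frame indices into batches of 4 before running the model\n",
"list(batched(range(10), 4))  # -> [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]\n",
"```\n",
"\n",
"Inside the agent loop, you would then run the model once per batch of frame contents instead of once per frame."
]
},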
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Running the Agent\n",
"\n",
"The `runner` object is callable, allowing you to use it to prioritize tasks efficiently."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Run the agent\n",
"runner()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Outcome\n",
"\n",
"Your agent assigns labels to videos and routes them through the workflow to the annotation stage. As a result, each annotation task includes pre-labeled predictions. \n",
"\n",
"> 💡 To run this as a command-line interface, save the code in an `agents.py` file and replace: \n",
"> ```python\n",
"> runner()\n",
"> ``` \n",
"> with: \n",
"> ```python\n",
"> if __name__ == \"__main__\":\n",
"> runner.run()\n",
"> ``` \n",
"> This lets you set parameters like the project hash from the command line: \n",
"> ```bash\n",
"> python agent.py --project-hash \"...\"\n",
"> ```\n"
]
}
],
"metadata": {
"colab": {
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "encord-agents-Cw_LL1Rx-py3.11",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}