{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Pre-Label Videos with Bounding Boxes\n", "\n", "This notebook demonstrates how to use a task agent to pre-label videos with predictions. To keep things simple, we use \"fake\" predictions, but the methodology stays the same when you plug in an actual model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installation\n", "\n", "Ensure that you have the `encord-agents` library installed:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!python -m pip install \"encord-agents[vision]\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Authentication\n", "\n", "The library authenticates via SSH keys. Below is a code cell that sets the `ENCORD_SSH_KEY` environment variable; it should contain the raw content of your private SSH key file.\n", "\n", "If you have not yet set up an SSH key, please follow the [documentation](https://agents-docs.encord.com/authentication/).\n", "\n", "> 💡 **Colab users**: In Colab, you can store the key once under Secrets in the left sidebar and load it in new notebooks with\n", "> ```python\n", "> from google.colab import userdata\n", "> key_content = userdata.get(\"ENCORD_SSH_KEY\")\n", "> ```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "os.environ[\"ENCORD_SSH_KEY\"] = \"private_key_file_content\"\n", "# or you can set a path to a file\n", "# os.environ[\"ENCORD_SSH_KEY_FILE\"] = \"/path/to/your/private/key\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [Alternative] Temporary Key\n", "There's also the option of generating a temporary (fresh) SSH key pair via the code cell below.\n", "Please follow the instructions printed when executing the code."
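, "\n", "Whichever method you use, you can sanity-check that a key is configured before moving on. Below is a minimal helper (illustrative only, not part of the library):\n", "\n", "```python\n", "import os\n", "\n", "\n", "def ssh_key_configured() -> bool:\n", "    # True if either the raw key content or a path to a key file is set\n", "    return bool(os.environ.get(\"ENCORD_SSH_KEY\") or os.environ.get(\"ENCORD_SSH_KEY_FILE\"))\n", "```"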
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ⚠️ Safe to skip if you have authenticated already\n", "import os\n", "\n", "from encord_agents.utils.colab import generate_public_private_key_pair_with_instructions\n", "\n", "private_key_path, public_key_path = generate_public_private_key_pair_with_instructions()\n", "os.environ[\"ENCORD_SSH_KEY_FILE\"] = private_key_path.as_posix()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define a _fake_ Model for Predictions\n", "\n", "To keep things minimal, we define a \"_fake_\" model that predicts labels, bounding boxes, and confidences.\n", "We'll use the model to simulate predicting objects on images below.\n", "\n", "> 💡 This should be the main place for you to change code in order to integrate your own detection model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import random\n", "from dataclasses import dataclass\n", "\n", "import numpy as np\n", "from encord.objects.coordinates import BoundingBoxCoordinates\n", "from numpy.typing import NDArray\n", "\n", "\n", "# Data class to hold predictions from our \"model\"\n", "@dataclass\n", "class ModelPrediction:\n", "    label: int\n", "    coords: BoundingBoxCoordinates\n", "    conf: float\n", "\n", "\n", "# Model \"simulation\"\n", "def fake_predict(image: NDArray[np.uint8]) -> list[ModelPrediction]:\n", "    \"\"\"\n", "    Simple function that takes an ndarray of pixel values ([h, w, c], RGB)\n", "    and returns a list of random bounding boxes. 
Boxes are random in location and size,\n", "    with random labels (0-2) and confidences.\n", "    \"\"\"\n", "    return [\n", "        ModelPrediction(\n", "            label=random.choice(range(3)),\n", "            coords=BoundingBoxCoordinates(\n", "                top_left_x=random.random() * 0.5,\n", "                top_left_y=random.random() * 0.5,\n", "                width=random.random() * 0.5,\n", "                height=random.random() * 0.5,\n", "            ),\n", "            conf=random.random(),\n", "        )\n", "        for _ in range(10)\n", "    ]\n", "\n", "\n", "model = fake_predict" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set up your Ontology\n", "\n", "Create an Ontology that matches the expected output of your pre-labeling agent.\n", "For example, if your model predicts classes `surfboard`, `person`, and `car` with class labels 0, 1, and 2, respectively, then the ontology should look like this:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
*Figure 1: Project ontology.*\n
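\n", "\n", "The pre-labeling agent defined later in this notebook indexes `ontology.objects` directly with the model's integer label, so the order of objects in your ontology must match the model's class order. A small illustrative check (the class names are the example values from above):\n", "\n", "```python\n", "# Model class index -> expected ontology object title (example values)\n", "CLASS_NAMES = [\"surfboard\", \"person\", \"car\"]\n", "\n", "\n", "def ontology_matches_model(ontology_titles: list[str]) -> bool:\n", "    # The agent uses ontology.objects[label] as the lookup, so titles must\n", "    # appear in exactly the model's class order\n", "    return ontology_titles[: len(CLASS_NAMES)] == CLASS_NAMES\n", "```\n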
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[📖 Here](https://docs.encord.com/platform-documentation/GettingStarted/gettingstarted-create-ontology) is the documentation for creating ontologies." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a Workflow with a Pre-Labeling Agent Node\n", "\n", "Create a Project in the Encord platform with a workflow that includes a pre-labeling agent node before the annotation stage. This node, called **\"pre-label,\"** runs custom code to generate model predictions, automatically pre-labeling tasks before they are sent for annotation.\n", "\n", "[📖 Here](https://docs.encord.com/platform-documentation/Annotate/annotate-projects/annotate-workflows-and-templates#creating-workflows) is the documentation for creating Workflows in Encord." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
*Figure 2: Project workflow.*\n
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define the Pre-Labeling Agent\n", "\n", "The following code provides a template for defining an agent that does pre-labeling.\n", "We assume that the project only contains videos and that we want to do pre-labeling on all frames in each video.\n", "\n", "If your agent node is named \"pre-label\" and the pathway to the annotation stage is named \"annotate,\" you only have to change the `<project-hash>` placeholder to your actual project hash to make it work.\n", "If your naming differs, update the `stage` parameter of the decorator and the returned string, respectively, to match your own setup.\n", "\n", "Note that this code uses the [`dep_video_iterator` dependency](../../reference/task_agents/#encord_agents.tasks.dependencies.dep_video_iterator) to automatically load an iterator of frames as RGB numpy arrays from the video." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from typing import Iterable\n", "\n", "from encord.objects.ontology_labels_impl import LabelRowV2\n", "from encord.project import Project\n", "from typing_extensions import Annotated\n", "\n", "from encord_agents.core.data_model import Frame\n", "from encord_agents.tasks import Depends, Runner\n", "from encord_agents.tasks.dependencies import dep_video_iterator\n", "\n", "# a. Define a runner that will execute the agent on every task in the agent stage\n", "runner = Runner(project_hash=\"<project-hash>\")\n", "\n", "\n", "# b. Specify the logic that goes into the \"pre-label\" agent node.\n", "@runner.stage(stage=\"pre-label\")\n", "def run_something(\n", "    lr: LabelRowV2,\n", "    project: Project,\n", "    frames: Annotated[Iterable[Frame], Depends(dep_video_iterator)],\n", ") -> str:\n", "    ontology = project.ontology_structure\n", "\n", "    # c. Loop over the frames in the video\n", "    for frame in frames:\n", "        # d. 
Predict - we could do batching here to speed up the process\n", "        outputs = model(frame.content)\n", "\n", "        # e. Store the results\n", "        for output in outputs:\n", "            ins = ontology.objects[output.label].create_instance()\n", "            ins.set_for_frames(frames=frame.frame, coordinates=output.coords, confidence=output.conf)\n", "\n", "            lr.add_object_instance(ins)\n", "\n", "    lr.save()\n", "    return \"annotate\"  # Tell where the task should go" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Running the Agent\n", "\n", "The `runner` object is callable; calling it executes the agent on every task currently waiting in the agent stage." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Run the agent\n", "runner()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Outcome\n", "\n", "Your agent assigns labels to videos and routes them through the workflow to the annotation stage. As a result, each annotation task includes pre-labeled predictions.\n", "\n", "> 💡 To run this as a command-line interface, save the code in an `agent.py` file and replace:\n", "> ```python\n", "> runner()\n", "> ```\n", "> with:\n", "> ```python\n", "> if __name__ == \"__main__\":\n", ">     runner.run()\n", "> ```\n", "> This lets you set parameters like the project hash from the command line:\n", "> ```bash\n", "> python agent.py --project-hash \"...\"\n", "> ```\n" ] } ], "metadata": { "colab": { "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "encord-agents-Cw_LL1Rx-py3.11", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 0 }