Overview

Editor Agents allow you to integrate your own API endpoint with Encord, enhancing your annotation processes. For example, this could be a model hosted on a server or a cloud function. Annotators can trigger Editor Agents while annotating in the Label Editor.

Some common use-cases are:

  • Validate the current state of the annotations within a frame, video, or image group. You might, for example, want to give the labelers an option to validate the current state of the labels before submitting.
  • Do custom conversions or modifications of labels before they are submitted. For example, you could simplify polygons with the Ramer-Douglas-Peucker (RDP) algorithm.
  • Employ custom prompting models like DINOv or T-Rex2 to speed up annotations.
  • Trigger notifications internally related to the given task.
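As an illustration of the polygon-simplification use case, here is a minimal, standard-library-only sketch of the Ramer-Douglas-Peucker algorithm. It is not part of encord-agents; the function names `rdp` and `_perpendicular_distance` and the `epsilon` tolerance parameter are chosen for this example only. It treats the input as an open polyline, so a closed polygon would need to be split at a chosen start vertex first.

```python
import math

Point = tuple[float, float]


def _perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the infinite line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def rdp(points: list[Point], epsilon: float) -> list[Point]:
    """Simplify an open polyline, dropping points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)

    first, last = points[0], points[-1]

    # Find the interior point farthest from the chord first -> last.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = _perpendicular_distance(points[i], first, last)
        if dist > max_dist:
            index, max_dist = i, dist

    if max_dist > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point

    # Every interior point is within tolerance: keep only the endpoints.
    return [first, last]
```

An Editor Agent along these lines would read the polygon vertices from the label row, pass them through `rdp` with a tolerance that suits the image resolution, and write the reduced vertex list back.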

Editor Agents are actions your annotators can trigger while they are labeling.

Info

Editor Agents are API endpoints triggered on individual tasks within the Label Editor. They differ from Task Agents, which are Workflow components that activate on all tasks passing through the Agent stage.

General Concepts

Editor Agents work in the following way:

sequenceDiagram
    participant A as Annotator
    participant B as Encord Label Editor #5F47FE
    participant C as Editor Agent [custom API]
    participant D as Encord SDK #5F47FE

    A->>B: Trigger Agent
    B->>C: Call with project, data, and frame num
    C->>D: Perform edits to the underlying label row
    D->>C: confirmation
    C->>B: 200 response
    B->>A: Label editor refresh

Use encord-agents to define the logic for the "Editor Agent [custom API]" section of the diagram. You are responsible for programmatically determining what happens when your custom API receives a project_hash, data_hash, and potentially a frame number and objectHashes.
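To make the request side of the diagram concrete, the following standard-library sketch shows one way a custom API could parse the incoming identifiers and answer with a status code. Everything here is an assumption for illustration: the camelCase JSON keys (`projectHash`, `dataHash`, `frame`, `objectHashes`), the `EditorAgentPayload` class, and the `parse_payload` and `handle_trigger` helpers are hypothetical and are not the encord-agents API. In practice the library provides its own request models and wrappers.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class EditorAgentPayload:
    """Hypothetical container for the identifiers sent by the Label Editor."""

    project_hash: str
    data_hash: str
    frame: Optional[int] = None  # only present when a frame is in focus
    object_hashes: list[str] = field(default_factory=list)


def parse_payload(raw: str) -> EditorAgentPayload:
    """Parse a JSON body; the key names are assumed, not confirmed."""
    body = json.loads(raw)
    return EditorAgentPayload(
        project_hash=body["projectHash"],
        data_hash=body["dataHash"],
        frame=body.get("frame"),
        object_hashes=list(body.get("objectHashes") or []),
    )


def handle_trigger(raw: str) -> tuple[int, str]:
    """Return an (HTTP status, message) pair for one trigger of the agent."""
    try:
        payload = parse_payload(raw)
    except (KeyError, TypeError, json.JSONDecodeError):
        return 400, "invalid payload"

    # Placeholder: this is where the label row identified by
    # payload.project_hash / payload.data_hash would be fetched,
    # edited, and saved through the Encord SDK.
    return 200, f"processed {payload.data_hash}"
```

The 200 response corresponds to the step in the diagram after which the Label Editor refreshes.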

We support three different ways of building such custom APIs:

  1. Using Google Cloud Run functions, which is Google's way of building cloud functions.
  2. Using FastAPI, which is a flexible (self-hosted) Python library for building custom APIs.
  3. Using Modal, which provides a serverless cloud for engineers and researchers who want to build compute-intensive applications without thinking about infrastructure.

Tip

The encord-agents library takes a lot of inspiration from FastAPI. Specifically, we have adopted the idea of dependency injection from that library. While our injection scheme is not as sophisticated, it should feel familiar.
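For readers unfamiliar with the pattern, here is a toy, standard-library-only sketch of the general idea behind `Depends`-style dependency injection: a function declares what it needs in its signature, and a small resolver supplies the values. This is not how FastAPI or encord-agents implement it; the `Depends` marker class, `call_with_dependencies`, and the example functions are all hypothetical names written for this illustration.

```python
import inspect
from typing import Any, Callable


class Depends:
    """Marker meaning: call this provider and pass its result as the argument."""

    def __init__(self, provider: Callable[..., Any]) -> None:
        self.provider = provider


def call_with_dependencies(func: Callable[..., Any], context: dict[str, Any]) -> Any:
    """Call func, filling each parameter from a provider or from the context."""
    kwargs: dict[str, Any] = {}
    for name, param in inspect.signature(func).parameters.items():
        default = param.default
        if isinstance(default, Depends):
            # Resolve the provider recursively; it may have its own needs.
            kwargs[name] = call_with_dependencies(default.provider, context)
        elif name in context:
            kwargs[name] = context[name]
        elif default is inspect.Parameter.empty:
            raise TypeError(f"cannot resolve parameter {name!r}")
        # Otherwise the parameter keeps its ordinary default value.
    return func(**kwargs)


def frame_index(frame: int) -> int:
    """Example provider: reads 'frame' from the request context."""
    return frame


def describe(data_hash: str, index: int = Depends(frame_index)) -> str:
    """Example handler: asks for a data hash and an injected frame index."""
    return f"{data_hash}@{index}"
```

Calling `call_with_dependencies(describe, {"data_hash": "abc", "frame": 3})` returns `"abc@3"`: the handler never touches the raw request, it only declares what it wants.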

Google Cloud Run functions are ideal for lightweight operations, such as serving as proxies for model inference APIs or making minor label adjustments. In contrast, FastAPI and Modal apps are better suited for hosting your own models and handling resource-intensive tasks.

In the next section, we include a GCP example. If you need to build a FastAPI (or Modal) application, feel free to skip it.