{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook demonstrates how to use the Encord Agents Task Runner to create a video recaptioning workflow. We'll use the GPT-4o-mini model to automatically generate multiple captions based on a human-written description.\n",
"\n",
"In the notebook, we follow the Workflow below.\n",
"Every video is being annotated with a caption by a human (the pink node).\n",
"Successively, a data agent produces multiple new captions automatically (the purple node).\n",
"Finally, a humans reviews all four captions (the yellow node) before the item is complete.\n",
"If there are no human captions when the task reaches the data agent, it'll send it back for annotation.\n",
"Similarly, if the task is rejected during review, it's also sent back for another round of annotation.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Annotations are done following the labeling ontology below.\n",
"The first field `\"Caption\"` is for the human to fill in.\n",
"The latter three recaptions are for the data agent to fill in.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
Expand to see ontology JSON
\n",
"\n",
"{\n",
" \"objects\": [],\n",
" \"classifications\": [\n",
" {\n",
" \"id\": \"1\",\n",
" \"featureNodeHash\": \"GCH8VHIK\",\n",
" \"attributes\": [\n",
" {\n",
" \"id\": \"1.1\",\n",
" \"name\": \"Caption\",\n",
" \"type\": \"text\",\n",
" \"required\": false,\n",
" \"featureNodeHash\": \"Yg7xXEfC\"\n",
" }\n",
" ]\n",
" },\n",
" {\n",
" \"id\": \"2\",\n",
" \"featureNodeHash\": \"PwQAwYid\",\n",
" \"attributes\": [\n",
" {\n",
" \"id\": \"2.1\",\n",
" \"name\": \"Caption Rephrased 1\",\n",
" \"type\": \"text\",\n",
" \"required\": false,\n",
" \"featureNodeHash\": \"aQdXJwbG\"\n",
" }\n",
" ]\n",
" },\n",
" {\n",
" \"id\": \"3\",\n",
" \"featureNodeHash\": \"3a/aSnHO\",\n",
" \"attributes\": [\n",
" {\n",
" \"id\": \"3.1\",\n",
" \"name\": \"Caption Rephrased 2\",\n",
" \"type\": \"text\",\n",
" \"required\": false,\n",
" \"featureNodeHash\": \"8zY6H62x\"\n",
" }\n",
" ]\n",
" },\n",
" {\n",
" \"id\": \"4\",\n",
" \"featureNodeHash\": \"FNjXp5TU\",\n",
" \"attributes\": [\n",
" {\n",
" \"id\": \"4.1\",\n",
" \"name\": \"Caption Rephrased 3\",\n",
" \"type\": \"text\",\n",
" \"required\": false,\n",
" \"featureNodeHash\": \"sKg1Kq/m\"\n",
" }\n",
" ]\n",
" }\n",
" ]\n",
"}\n",
"
\n",
"Code for generating ontology
\n",
" \n",
"import json\n",
"from encord.objects.ontology_structure import OntologyStructure\n",
"from encord.objects.attributes import TextAttribute\n",
"\n",
"structure = OntologyStructure()\n",
"caption = structure.add_classification()\n",
"caption.add_attribute(TextAttribute, \"Caption\")\n",
"re1 = structure.add_classification()\n",
"re1.add_attribute(TextAttribute, \"Recaption 1\")\n",
"re2 = structure.add_classification()\n",
"re2.add_attribute(TextAttribute, \"Recaption 2\")\n",
"re3 = structure.add_classification()\n",
"re3.add_attribute(TextAttribute, \"Recaption 3\")\n",
"\n",
"print(json.dumps(structure.to_dict(), indent=2))\n",
"\n",
"create_ontology = False\n",
"if create_ontology:\n",
" from encord.user_client import EncordUserClient\n",
" client = EncordUserClient.create_with_ssh_private_key() # Look in auth section for authentication\n",
" client.create_ontology(\"title\", \"description\", structure)\n",
"
\n",
"