"This notebook tutorial shows how to detect COTS using a pre-trained COTS detector implemented in TensorFlow. On top of just running the model on each frame of the video, the tracking code in this notebook aligns detections from frame to frame creating a consistent track for each COTS. Each track is given an id and frame count. Here is an example image from a video of a reef showing labeled COTS starfish.\n",
"This notebook tutorial shows how to detect COTS using a pre-trained COTS detector implemented in TensorFlow. On top of just running the model on each frame of the video, the tracking code in this notebook aligns detections from frame to frame creating a consistent track for each COTS. Each track is given an id and frame count. Here is an example image from a video of a reef showing labeled COTS starfish.\n",
"\n",
"\n",
...
@@ -86,6 +86,8 @@
...
@@ -86,6 +86,8 @@
"id": "a4R2T97u442o"
"id": "a4R2T97u442o"
},
},
"source": [
"source": [
"## Setup \n",
"\n",
"Install all needed packages."
"Install all needed packages."
]
]
},
},
...
@@ -99,7 +101,8 @@
...
@@ -99,7 +101,8 @@
"source": [
"source": [
"# remove the existing datascience package to avoid package conflicts in the colab environment\n",
"# remove the existing datascience package to avoid package conflicts in the colab environment\n",
"Re-encode the video, and reduce its size (Colab crashes if you try to embed the full size video)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_li0qe-gh1iT"
},
"outputs": [],
"source": [
"subprocess.check_call([\n",
" \"ffmpeg\", \"-y\", \"-i\", tmp_video_path,\n",
" \"-vf\",\"scale=800:-1\",\n",
" \"-crf\", \"18\",\n",
" \"-preset\", \"veryfast\",\n",
" \"-vcodec\", \"libx264\", preview_video_path])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2ItoiHyYQGya"
},
"source": [
"The images you downloaded are frames of a movie showing a top view of a coral reef with crown-of-thorns starfish. The movie looks like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "SiOsbr8xePkg"
},
"outputs": [],
"source": [
"embed_video_file(preview_video_path)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9Z0DTbWrZMZ-"
},
"source": [
"Can you se them? there are lots. The goal of the model is to put boxes around all of the starfish. Each starfish will get its own ID, and that ID will be stable as the camera passes over it."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "d0iALUwM0g2p"
},
"source": [
"## Load the model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fVq6vNBTxM62"
},
"source": [
"Download the trained COTS detection model that matches your preferences from earlier."
"That works well for one frame, but to count the number of COTS in a video you'll need to track the detections from frame to frame. The raw detection indices are not stable, they're just sorted by the detection score. Below both sets of detections are overlaid on the second image with the first frame's detections in white and the second frame's in orange, the indices are not aligned. The positions are shifted because of camera motion between teh two frames:"
"# Load the model and perform inference and tracking on sample data\n",
"Now keep the white boxes for the initial detections, and the the orange boxes for the new set of detections. But add the the optical-flow propagated tracks in green. You can see that by using optical-flow to propagate the old detections to the new frame the alignment is quite good. It's this alignment between the old and new detections (between the green and orange boxes) that allows the tracker to make a persistemt track for each COTS. "
"Load trained model from disk and create the inference function `model_fn()`. This might take a little while."
"# Define **OpticalFlowTracker** class and its related classes\n",
"# Define **OpticalFlowTracker** class and its related classes\n",
"\n",
"\n",
"These help track the movement of each COTS object throughout the image frames."
"These help track the movement of each COTS object across the video frames.\n",
"\n",
"The tracker collects related detections into `Track` objects. \n",
"\n",
"The class's init is defined below, it's methods are defined in the following cells.\n",
"\n",
"The `__init__` method just initializes the track counter (`track_id`), and sets some default values for the tracking and optical flow configurations. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3j2Ka1uGEoz4"
},
"outputs": [],
"source": [
"class OpticalFlowTracker:\n",
"    \"\"\"Optical flow tracker.\"\"\"\n",
"\n",
"    @classmethod\n",
"    def add_method(cls, fun):\n",
"        \"\"\"Attach a new method to the class.\"\"\"\n",
"        # Register `fun` so that methods defined in later cells become\n",
"        # part of the class.\n",
"        setattr(cls, fun.__name__, fun)\n",
"        return fun"
]
},
"Internally the tracker will use small `Track` and `Tracklet` classes to organize the data. The `Tracklet` class is just a `Detection` with a timestamp, while a `Track` is a track ID, the most recent detection and a list of `Tracklet` objects forming the history of the track."
]
]
},
{
...
@@ -302,16 +884,15 @@
},
"outputs": [],
"source": [
"@dataclasses.dataclass(frozen=True)\n",
"class Tracklet:\n",
"class Tracklet:\n",
" def __init__(self, timestamp, detection):\n",
" timestamp:float\n",
" self.timestamp = timestamp\n",
" detection:Detection\n",
" # Store a copy here to make sure the coordinates will not be updated\n",
" # when the optical flow propagation runs using another reference to this\n",
"The tracker keeps a list of active `Track` objects.\n",
"\n",
"The main `update` method takes an image, along with the list of detections and the timestamp for that image. On each frame step it performs the following sub-tasks:\n",
"\n",
"\n",
" def __repr__(self):\n",
"* The tracker uses optical flow to calculate where each `Track` expects to see a new `Detection`.\n",
" result = f'Track {self.id}'\n",
"* The tracker matches up the actual detections for the frame to the expected detections for each Track.\n",
" for linked_det in self.linked_dets:\n",
"* If a detection doesn't get matched to an existing track, a new track is created for the detection.\n",
" result += '\\n' + linked_det.__repr__()\n",
"* If a track stops getting assigned new detections, it is eventually deactivated. "
" for track, det in zip(self.tracks, detections)]\n"
" img = cv2.imread(filename)\n",
" video_writer.write(img)\n",
"cv2.destroyAllWindows()\n",
"video_writer.release()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uLbVeetwD0ph"
},
"source": [
"Re-encode the video, and reduce its size (Colab crashes if you try to embed the full size video)."
"The `apply_detections_to_tracks` method compares each detection to the updated bounding box for each track. The detection is added to the track that matches best, if the match is better than the `overlap_threshold`. If no track is better than the threshold, the detection is used to create a new track. \n",
"\n",
"If a track has no new detection assigned to it the predicted-detection is used."
"The goal of the model is to put boxes around all of the starfish. Each starfish gets its own ID, and that ID will be stable as the camera passes over it."