"This notebook tutorial shows how to detect COTS using a pre-trained COTS detector implemented in TensorFlow. On top of just running the model on each frame of the video, the tracking code in this notebook aligns detections from frame to frame creating a consistent track for each COTS. Each track is given an id and frame count. Here is an example image from a video of a reef showing labeled COTS starfish.\n",
"This notebook tutorial shows how to detect COTS using a pre-trained COTS detector implemented in TensorFlow. On top of just running the model on each frame of the video, the tracking code in this notebook aligns detections from frame to frame creating a consistent track for each COTS. Each track is given an id and frame count. Here is an example image from a video of a reef showing labeled COTS starfish.\n",
"\n",
"\n",
...
@@ -77,7 +77,7 @@
"id": "YxCF1t-Skag8"
"id": "YxCF1t-Skag8"
},
},
"source": [
"source": [
"It is recommended to enable GPU to accelerate the inference. On CPU, this runs for about 40 minutes, but on GPU it takes only 10 minutes. (from colab menu: *Runtime > Change runtime type > Hardware accelerator > select \"GPU\"*)."
"It is recommended to enable GPU to accelerate the inference. On CPU, this runs for about 40 minutes, but on GPU it takes only 10 minutes. (In Colab it should already be set to GPU in the Runtime menu: *Runtime > Change runtime type > Hardware accelerator > select \"GPU\"*)."
]
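To confirm that a GPU is actually visible before kicking off the long inference loop, a quick check like the one below can help. This snippet is not part of the notebook diff itself; it only uses the standard `tf.config` API.

```python
import tensorflow as tf

# An empty list means TensorFlow will silently fall back to CPU,
# so expect the ~40 minute runtime instead of ~10 minutes.
gpus = tf.config.list_physical_devices('GPU')
print('GPUs visible to TensorFlow:', gpus or 'none')
```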
},
{
...
@@ -402,6 +402,8 @@
"id": "KSOf4V8WhTHF"
"id": "KSOf4V8WhTHF"
},
},
"source": [
"source": [
"## Raw model outputs\n",
"\n",
"Try running the model on the image. The model expects a batch of images so add an outer `batch` dimension before calling the model.\n",
"Try running the model on the image. The model expects a batch of images so add an outer `batch` dimension before calling the model.\n",
"\n",
"\n",
"Note: The model only runs correctly with a batch size of 1.\n",
"Note: The model only runs correctly with a batch size of 1.\n",
"The two sets of bounding boxes above don't line up because of camera movement. \n",
"The two sets of bounding boxes above don't line up because of camera movement. \n",
"To see in more detail how tracks are aligned, initialize the tracker with the first image, and then run the optical flow step, `propagate_tracks`. "
"To see in more detail how tracks are aligned, initialize the tracker with the first image, and then run the optical flow step, `propagate_tracks`. "
]
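As a sketch of what this step looks like in code: the batch dimension can be added with `tf.newaxis`, and the tracker then stepped through its optical-flow stage. Only `propagate_tracks` is named in the notebook text; `detector`, the `OpticalFlowTracker` constructor arguments, and the image variables below are assumptions for illustration.

```python
import tensorflow as tf

# The model expects a batch, and only batch size 1 runs correctly,
# so add a single outer batch dimension: (H, W, 3) -> (1, H, W, 3).
batched = image[tf.newaxis, ...]
detections = detector(batched)  # 'detector' is a placeholder name

# Initialize the tracker with the first image, then align its tracks
# to the next frame with the optical flow step, propagate_tracks.
tracker = OpticalFlowTracker(first_image)   # constructor args assumed
tracker.propagate_tracks(second_image)
```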
...
@@ -811,7 +819,7 @@
"id": "jbZ-7ICCENWG"
"id": "jbZ-7ICCENWG"
},
},
"source": [
"source": [
"# Define **OpticalFlowTracker** class and its related classes\n",
"# Define **OpticalFlowTracker** class\n",
"\n",
"\n",
"These help track the movement of each COTS object across the video frames.\n",
"These help track the movement of each COTS object across the video frames.\n",
"\n",
"\n",
...
@@ -1120,6 +1128,8 @@
"id": "gY0AH-KUHPlC"
"id": "gY0AH-KUHPlC"
},
},
"source": [
"source": [
"## Test run the tracker\n",
"\n",
"So reload the test images, and run the detections to test out the tracker.\n",
"So reload the test images, and run the detections to test out the tracker.\n",
"\n",
"\n",
"On the first frame it creates and returns one track per detection:"
"On the first frame it creates and returns one track per detection:"
...
@@ -1337,6 +1347,7 @@
},
"source": [
"# Output the detection results and play the result video\n",
"# Output the detection results and play the result video\n",
"\n",
"Once the inference is done, we use OpenCV to draw the bounding boxes (Line 9-10) and write the tracked COTS's information (Line 13-20: `COTS ID` `(sequence index/ sequence length)`) on each frame's image. Finally, we combine all frames into a video for visualisation."
"Once the inference is done, we use OpenCV to draw the bounding boxes (Line 9-10) and write the tracked COTS's information (Line 13-20: `COTS ID` `(sequence index/ sequence length)`) on each frame's image. Finally, we combine all frames into a video for visualisation."