Commit 36b403d0, authored by Fan Yang, committed by A. Unique TensorFlower

Internal change

PiperOrigin-RevId: 460830316
parent 68d6c14b
# Waste Identification ML - ( Mask RCNN with TF Lite )
This project aims to develop a TensorFlow Lite based Mask RCNN instance
segmentation model for on-device inference.
## Background
The sustainability team at Google wants to build a computer vision based ML
model for waste identification: a model that detects trash objects in images
and identifies their material type and packaging type. This project aims to
accelerate innovation in the waste management industry by providing no-cost,
open-sourced ML models. This would help reduce barriers to technology adoption
and provide much needed efficiency, traceability and transparency, which in
turn can help increase recycling rates.
## Code Structure
This is an implementation of Mask RCNN based on Python 3 and TensorFlow 2.x. The
model generates bounding boxes and segmentation masks for each instance of an
object in the image. The repository includes:
* Source code for training a Mask RCNN model.
* Inference code
* Pre-trained weights for inferencing
* Docker to deploy the model in any operating system and run.
* Jupyter notebook to visualize the detection pipeline at every step.
* Evaluation metric of the validation dataset.
* Example of training on your own custom dataset.
The code is designed so that it can be extended. If you use it in your research
or industrial solutions, please consider citing this repository.
## Pre-requisites
## Prepare dataset
## Setup virtual systems for training
### ***Start a TPU v3-32 instance***
- [x] Set up a Google cloud account on GCP
- [x] Go to the cloud console and create a new project.
- [x] While setting up your project, you will be asked to set up a billing
account. You will only be charged after you start using it.
- [x] Create a cloud TPU project
- [x] Link for the above 4 steps can be
[found here](https://cloud.google.com/tpu/docs/setup-gcp-account)
- [x] Once the project is created, select the project from the cloud console.
- [x] On the top right, click cloud shell to open the terminal. See
[TPU Quickstart](https://cloud.google.com/tpu/docs/quick-starts) for
instructions.
An example command would look like:
```bash
ctpu up --name <tpu-name> --zone <zone> --tpu-size=v3-32 \
  --tf-version nightly --project <project ID>
```
**Example** -
- This model requires TF version >= 2.5. Currently, that is only available via
a nightly build on Cloud.
- You can check TPU types with their cores and memory
[here](https://cloud.google.com/tpu/docs/types-zones#tpu-vm) and select
accordingly.
- Carefully choose a TPU type that can be turned on and off after usage. The
preferred one is:
`ctpu up --name waste-identification --zone us-central1-a --tpu-size=v3-8 --tf-version nightly --project waste-identification-ml`
- After executing the above command, you will see two virtual devices named
"waste-identification", one in the TPU section and one in the COMPUTE ENGINE
section.
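The TF version requirement above can be checked on the TPU host before training starts. The following is a minimal sketch, not part of this repository: the helper names `parse_major_minor` and `meets_minimum` are hypothetical, and the nightly version string is only an illustrative example of the format.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
  """Extracts (major, minor) from a string such as '2.10.0-dev20220628'."""
  match = re.match(r'(\d+)\.(\d+)', version)
  if match is None:
    raise ValueError(f'Unrecognized version string: {version!r}')
  return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: Tuple[int, int] = (2, 5)) -> bool:
  """Returns True if `version` is at least `minimum` (major, minor)."""
  return parse_major_minor(version) >= minimum


# On the TPU host you would pass the installed version, for example:
#   import tensorflow as tf
#   meets_minimum(tf.__version__)
print(meets_minimum('2.4.1'))               # False
print(meets_minimum('2.10.0-dev20220628'))  # True
```

Comparing integer tuples avoids the pitfall of comparing version strings lexicographically, where '2.10' would sort before '2.5'.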
### ***Get into the virtual machine***
The virtual machine that acts as the TPU host appears in the COMPUTE ENGINE
section of GCP. We will use this virtual machine to start the training process.
It uses a separate TPU instance, which is listed in the TPU section of GCP. To
get inside the TPU host virtual machine:
- Go to the COMPUTE ENGINE section in GCP.
- Find your instance there.
- Under the "Connect" tab of your instance you will see "SSH".
- Click on SSH; it will open another window which takes you inside the
virtual machine.
- Use the following commands inside the virtual machine window:
```bash
$ git clone https://github.com/tensorflow/models.git
$ cd models
$ pip3 install -r official/requirements.txt
```
## Roadmap
- Provide ML model pre-trained weights with Docker to run for detection of
Material type.
- Deploy a model to detect the packaging type of the objects.
- Deploy a model to detect the brands of the object.
# Copyright 2022 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Create a list of dictionaries for categories according to the taxonomy.
Example usage-
build_material(MATERIAL_LIST,'material-types')
build_material(MATERIAL_FORM_LIST,'material-form-types')
build_material(MATERIAL_SUBCATEGORY_LIST,'material-subcategory-types')
build_material(MATERIAL_FORM_SUBCATEGORY_LIST,'material-form-subcategory-types')
"""
#! /usr/bin/env python
from typing import List, Dict, Union
MATERIAL_LIST = [
    'Inorganic-wastes', 'Textiles', 'Rubber-and-Leather', 'Wood', 'Food',
    'Plastics', 'Yard-trimming', 'Fiber', 'Glass', 'Metals'
]
MATERIAL_FORM_LIST = [
    'Flexibles', 'Bottle', 'Jar', 'Carton', 'Sachets-&-Pouch', 'Blister-pack',
    'Tray', 'Tube', 'Can', 'Tub', 'Cosmetic', 'Box', 'Clothes', 'Bulb',
    'Cup-&-glass', 'Book-&-magazine', 'Bag', 'Lid', 'Clamshell', 'Mirror',
    'Tangler', 'Cutlery', 'Cassette-&-tape', 'Electronic-devices', 'Battery',
    'Pen-&-pencil', 'Paper-products', 'Foot-wear', 'Scissor', 'Toys', 'Brush',
    'Pipe', 'Foil', 'Hangers'
]
MATERIAL_SUBCATEGORY_LIST = [
    'HDPE_Flexible_Color', 'HDPE_Rigid_Color', 'LDPE_Flexible_Color',
    'LDPE_Rigid_Color', 'PP_Flexible_Color', 'PP_Rigid_Color', 'PETE', 'PS',
    'PVC', 'Others-MLP', 'Others-Tetrapak', 'Others-HIPC', 'Aluminium',
    'Ferrous_Iron', 'Ferrous_Steel', 'Non-ferrous_Lead', 'Non-ferrous_Copper',
    'Non-ferrous_Zinc'
]
def build_material(category_list: List[str],
                   supercategory: str) -> List[Dict[str, Union[int, str]]]:
  """Creates a list of dictionaries for the category classes.

  Args:
    category_list: list of categories from MATERIAL_LIST, MATERIAL_FORM_LIST,
      MATERIAL_SUBCATEGORY_LIST
    supercategory: supercategory can be 'material-types', 'material-form-types',
      'material-subcategory-types', 'material-form-subcategory-types'

  Returns:
    List of dictionaries returning categories with their IDs
  """
  list_of_dictionaries = []
  for num, m in enumerate(category_list, start=1):
    list_of_dictionaries.append({
        'id': num,
        'name': m,
        'supercategory': supercategory
    })
  return list_of_dictionaries
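To make the output of `build_material` concrete, here is a small usage sketch. It repeats the function in a compact form so the snippet runs on its own, and it uses a shortened three-item list (`sample_materials`, a name introduced only for this illustration) rather than the full `MATERIAL_LIST`.

```python
from typing import Dict, List, Union


def build_material(category_list: List[str],
                   supercategory: str) -> List[Dict[str, Union[int, str]]]:
  """Same logic as the function above: IDs start at 1 and follow list order."""
  return [{'id': num, 'name': name, 'supercategory': supercategory}
          for num, name in enumerate(category_list, start=1)]


# A shortened, illustrative subset of MATERIAL_LIST.
sample_materials = ['Plastics', 'Glass', 'Metals']
categories = build_material(sample_materials, 'material-types')
for category in categories:
  print(category)
# {'id': 1, 'name': 'Plastics', 'supercategory': 'material-types'}
# {'id': 2, 'name': 'Glass', 'supercategory': 'material-types'}
# {'id': 3, 'name': 'Metals', 'supercategory': 'material-types'}
```

Because the IDs are assigned purely by position, reordering a category list changes the IDs of every category after the moved entry.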
[config]
config_folder_path = /mydrive/TFVision/pre-processing/config/
[paths]
annotation_path = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/18012022/annotations_18012022_coco.json
images_folder_path = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/18012022/images
new_annotation_path = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/18012022/material_annotations_18012022_coco.json
[merge]
input_files = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/20122021/material_annotations_20122021_coco.json,/mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/27122021/material_annotations_27122021_coco.json,/mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/03012022/material_annotations_03012022_coco.json,/mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/10012022/material_annotations_10012022_coco.json,/mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/18012022/material_annotations_18012022_coco.json
output_file = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/output.json
[split]
input_file = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/output.json
output_folder = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/
[tfrecord]
tensorflow_model_folder = /mydrive/TFVision/
training_data_folder = /mydrive/TFVision/tfrecords/train/
validation_data_folder = /mydrive/TFVision/tfrecords/val/
training_images_folder = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/Total_images/train/
training_annotation_file = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/_train.json
validation_images_folder = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/Total_images/validation/
validation_annotation_file = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/_val.json
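The config above uses INI syntax, so one plausible way to read it is Python's standard `configparser`. The sketch below is an assumption about how such a file could be consumed, not the repository's actual loading code; the paths in `SAMPLE_CONFIG` are shortened, hypothetical stand-ins for the real ones.

```python
import configparser

# Hypothetical, shortened paths standing in for the real ones above.
SAMPLE_CONFIG = """
[merge]
input_files = /data/week1_coco.json,/data/week2_coco.json
output_file = /data/output.json

[tfrecord]
training_data_folder = /data/tfrecords/train/
validation_data_folder = /data/tfrecords/val/
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONFIG)

# The [merge] input_files value is a single comma-separated string.
merge_inputs = [p.strip() for p in parser['merge']['input_files'].split(',')]
merge_output = parser['merge']['output_file']
train_folder = parser.get('tfrecord', 'training_data_folder')

print(merge_inputs)
print(merge_output)
print(train_folder)
```

To load a file from disk instead of a string, `parser.read(path)` takes the path to the config file.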
# Copyright 2022 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Visualize the category distribution in an annotated JSON file."""
#! /usr/bin/env python3
import json
from absl import app
from absl import flags
import numpy as np
import pandas as pd
# Define the flags
FLAGS = flags.FLAGS
# path to annotated JSON file whose distribution needs to be plotted
_PATH = flags.DEFINE_string(
    'path', None, 'path to the annotated JSON file', required=True)


def visualize_annotation_file(path: str) -> None:
  """Plots a bar graph showing the category distribution.

  Args:
    path: path to the annotated JSON file.
  """
  # get annotation file data into a variable
  with open(path) as json_file:
    data = json.load(json_file)

  # count the occurrence of each category in the annotation file
  category_names = [i['name'] for i in data['categories']]
  category_ids = [i['category_id'] for i in data['annotations']]
  values, counts = np.unique(category_ids, return_counts=True)

  # create a dataframe with all possible values
  # with their counts and visualize it.
  df = pd.DataFrame(counts, index=values, columns=['counts'])
  df = df.reindex(range(1, len(data['categories']) + 1), fill_value=0)
  df.index = category_names
  df.plot.bar(
      figsize=(20, 5),
      width=0.5,
      xlabel='Material types',
      ylabel='count of material types')


def main(_):
  visualize_annotation_file(_PATH.value)


if __name__ == '__main__':
  app.run(main)
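The counting step in the script above can be illustrated without any plotting. This sketch swaps in the standard library's `collections.Counter` for NumPy and pandas, and looks counts up by each category's `id` field instead of assuming IDs run from 1 to N; the helper name `count_categories` and the sample data are hypothetical.

```python
import collections
from typing import Any, Dict


def count_categories(data: Dict[str, Any]) -> Dict[str, int]:
  """Maps each category name to its number of annotations (0 if unused)."""
  counts = collections.Counter(a['category_id'] for a in data['annotations'])
  return {c['name']: counts.get(c['id'], 0) for c in data['categories']}


# Tiny synthetic COCO-style annotation data, for illustration only.
sample = {
    'categories': [
        {'id': 1, 'name': 'Plastics'},
        {'id': 2, 'name': 'Glass'},
        {'id': 3, 'name': 'Metals'},
    ],
    'annotations': [
        {'category_id': 1},
        {'category_id': 3},
        {'category_id': 1},
    ],
}

print(count_categories(sample))
# {'Plastics': 2, 'Glass': 0, 'Metals': 1}
```

Categories with no annotations still appear with a count of zero, which mirrors what the `reindex(..., fill_value=0)` call does in the script.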