Internal change

PiperOrigin-RevId: 460830316

Commit 36b403d0 authored Jul 13, 2022 by Fan Yang Committed by A. Unique TensorFlower Jul 13, 2022
8 changed files
--- a/official/projects/waste_identification_ml/README.md
+++ b/official/projects/waste_identification_ml/README.md
+# Waste Identification ML (Mask RCNN with TF Lite)
+This project aims to develop a TensorFlow Lite based Mask RCNN instance
+segmentation model for on-device inference.
+## Background
+The sustainability team at Google wants to build a computer vision based ML
+model for waste identification: a model that detects trash objects in images
+and identifies their material type and packaging type. This project aims to
+accelerate innovation in the waste management industry by providing no-cost,
+open-sourced ML models. This would help reduce barriers to technology
+adoption and provide much-needed efficiency, traceability, and transparency,
+which in turn can help increase recycling rates.
+## Code Structure
+This is an implementation of Mask RCNN based on Python 3 and TensorFlow 2.x. The
+model generates bounding boxes and segmentation masks for each instance of an
+object in the image. The repository includes:
+* Source code for training a Mask RCNN model.
+* Inference code
+* Pre-trained weights for inferencing
+* Docker setup to deploy and run the model on any operating system.
+* Jupyter notebook to visualize the detection pipeline at every step.
+* Evaluation metric of the validation dataset.
+* Example of training on your own custom dataset.
+The code is designed so that it can be extended. If you use it in your
+research or industrial solutions, please consider citing this repository.
+## Pre-requisites
+## Prepare dataset
+## Setup virtual systems for training
+### ***Start a TPU v3-32 instance***
+-   [x] Set up a Google cloud account on GCP
+-   [x] Go to the cloud console and create a new project.
+-   [x] While setting up your project, you will be asked to set up a billing
+    account. You will only be charged after you start using it.
+-   [x] Create a cloud TPU project
+-   [x] Instructions for the four steps above can be
+    [found here](https://cloud.google.com/tpu/docs/setup-gcp-account)
+-   [x] Once the project is created, select the project from the cloud console.
+-   [x] On the top right, click cloud shell to open the terminal. See
+    [TPU Quickstart](https://cloud.google.com/tpu/docs/quick-starts) for
+    instructions.
+    An example command would look like:
+    ```bash
+    ctpu up --name <tpu-name> --zone <zone> --tpu-size=v3-32 --tf-version nightly --project <project ID>
+    ```
+-   This model requires TF version >= 2.5. Currently, that is only available via
+    a nightly build on Cloud.
+-   You can check TPU types with their cores and memory
+    [here](https://cloud.google.com/tpu/docs/types-zones#tpu-vm) and select
+    accordingly.
+-   CAREFULLY choose a TPU type which can be turned ON and OFF after usage.
+    The preferred one is: `ctpu up --name waste-identification --zone
+    us-central1-a --tpu-size=v3-8 --tf-version nightly --project
+    waste-identification-ml`. After executing this command you will see two
+    virtual devices named "waste-identification", one in the TPU section and
+    one in the COMPUTE ENGINE section.
+### ***Get into the virtual machine***
+The virtual machine that acts as the TPU host can be seen in the COMPUTE ENGINE
+section of GCP. We will use this virtual machine to start the training process.
+This machine will use another virtual instance, the TPU itself, which is found
+in the TPU section of GCP. To get inside the TPU host virtual machine:
+-   Go to the COMPUTE ENGINE section in GCP.
+-   Find your instance there.
+-   Under the "Connect" tab of your instance you will see "SSH".
+-   Click on SSH and it will open another window which will take you inside the
+    virtual machine.
+-   Use the following commands inside the virtual machine window:
+```bash
+$ git clone https://github.com/tensorflow/models.git
+$ cd models
+$ pip3 install -r official/requirements.txt
+```
+## Roadmap
+-   Provide ML model pre-trained weights with Docker to run for detection of
+    Material type.
+-   Deploy a model to detect the packaging type of the objects.
+-   Deploy a model to detect the brands of the object.
--- a/official/projects/waste_identification_ml/pre_processing/coco_to_tfrecord.ipynb
+++ b/official/projects/waste_identification_ml/pre_processing/coco_to_tfrecord.ipynb
--- a/official/projects/waste_identification_ml/pre_processing/config/categories_list_of_dictionaries.py
+++ b/official/projects/waste_identification_ml/pre_processing/config/categories_list_of_dictionaries.py
+# Copyright 2022 The TensorFlow Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Create a list of dictionaries for categories according to the taxonomy.
+Example usage-
+    build_material(MATERIAL_LIST,'material-types')
+    build_material(MATERIAL_FORM_LIST,'material-form-types')
+    build_material(MATERIAL_SUBCATEGORY_LIST,'material-subcategory-types')
+    build_material(MATERIAL_FORM_SUBCATEGORY_LIST,'material-form-subcategory-types')
+"""
+#! /usr/bin/env python
+from typing import List, Dict, Union
+MATERIAL_LIST = [
+    'Inorganic-wastes', 'Textiles', 'Rubber-and-Leather', 'Wood', 'Food',
+    'Plastics', 'Yard-trimming', 'Fiber', 'Glass', 'Metals'
+]
+MATERIAL_FORM_LIST = [
+    'Flexibles', 'Bottle', 'Jar', 'Carton', 'Sachets-&-Pouch', 'Blister-pack',
+    'Tray', 'Tube', 'Can', 'Tub', 'Cosmetic', 'Box', 'Clothes', 'Bulb',
+    'Cup-&-glass', 'Book-&-magazine', 'Bag', 'Lid', 'Clamshell', 'Mirror',
+    'Tangler', 'Cutlery', 'Cassette-&-tape', 'Electronic-devices', 'Battery',
+    'Pen-&-pencil', 'Paper-products', 'Foot-wear', 'Scissor', 'Toys', 'Brush',
+    'Pipe', 'Foil', 'Hangers'
+]
+MATERIAL_SUBCATEGORY_LIST = [
+    'HDPE_Flexible_Color', 'HDPE_Rigid_Color', 'LDPE_Flexible_Color',
+    'LDPE_Rigid_Color', 'PP_Flexible_Color', 'PP_Rigid_Color', 'PETE', 'PS',
+    'PVC', 'Others-MLP', 'Others-Tetrapak', 'Others-HIPC', 'Aluminium',
+    'Ferrous_Iron', 'Ferrous_Steel', 'Non-ferrous_Lead', 'Non-ferrous_Copper',
+    'Non-ferrous_Zinc'
+]
+def build_material(category_list: List[str],
+                   supercategory: str) -> List[Dict[str, Union[int, str]]]:
+  """Creates a list of dictionaries for the category classes.
+  Args:
+    category_list: list of categories from MATERIAL_LIST, MATERIAL_FORM_LIST,
+      MATERIAL_SUBCATEGORY_LIST
+    supercategory: supercategory can be 'material-types', 'material-form-types',
+      'material-subcategory-types', 'material-form-subcategory-types'
+  Returns:
+    List of dictionaries returning categories with their IDs
+  """
+  list_of_dictionaries = []
+  for num, m in enumerate(category_list, start=1):
+    list_of_dictionaries.append({
+        'id': num,
+        'name': m,
+        'supercategory': supercategory
+    })
+  return list_of_dictionaries
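Note on `build_material`: a minimal sketch of how the function might be called and what it returns. The function body is restated here (as an equivalent list comprehension) only so the snippet runs on its own, and `sample_materials` is a shortened, hypothetical stand-in for `MATERIAL_LIST`.

```python
from typing import Dict, List, Union


def build_material(category_list: List[str],
                   supercategory: str) -> List[Dict[str, Union[int, str]]]:
  """Restated from the diff above: assigns 1-based IDs in list order."""
  return [{
      'id': num,
      'name': name,
      'supercategory': supercategory
  } for num, name in enumerate(category_list, start=1)]


# Shortened stand-in for MATERIAL_LIST (illustration only).
sample_materials = ['Plastics', 'Glass', 'Metals']
categories = build_material(sample_materials, 'material-types')
for category in categories:
  print(category)
```

Each entry carries the `id`, `name`, and `supercategory` keys, and IDs start at 1, so a category's ID depends on its position in the input list.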
--- a/official/projects/waste_identification_ml/pre_processing/config/config.ini
+++ b/official/projects/waste_identification_ml/pre_processing/config/config.ini
+[config]
+config_folder_path = /mydrive/TFVision/pre-processing/config/
+[paths]
+annotation_path = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/18012022/annotations_18012022_coco.json
+images_folder_path = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/18012022/images
+new_annotation_path = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/18012022/material_annotations_18012022_coco.json
+[merge]
+input_files = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/20122021/material_annotations_20122021_coco.json,/mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/27122021/material_annotations_27122021_coco.json,/mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/03012022/material_annotations_03012022_coco.json,/mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/10012022/material_annotations_10012022_coco.json,/mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/18012022/material_annotations_18012022_coco.json
+output_file = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/output.json
+[split]
+input_file = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/output.json
+output_folder = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/
+[tfrecord]
+tensorflow_model_folder = /mydrive/TFVision/
+training_data_folder = /mydrive/TFVision/tfrecords/train/
+validation_data_folder = /mydrive/TFVision/tfrecords/val/
+training_images_folder = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/Total_images/train/
+training_annotation_file = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/_train.json
+validation_images_folder = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/Total_images/validation/
+validation_annotation_file = /mydrive/gtech/MRFs/Recykal/Latest_sharing_by_sanket/Google_Recykal/Taxonomy_version_2/_val.json
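Note on `config.ini`: the diff does not show how the notebooks read this file, so the following is only a sketch under the assumption that Python's standard `configparser` is used. `SAMPLE_INI` and its `/data/...` paths are hypothetical placeholders, not the real paths above.

```python
import configparser

# Hypothetical, shortened stand-in for config.ini (placeholder paths).
SAMPLE_INI = """
[merge]
input_files = /data/a_coco.json,/data/b_coco.json
output_file = /data/output.json
[tfrecord]
training_data_folder = /data/tfrecords/train/
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_INI)

# The [merge] section stores several paths in one comma-separated value,
# so it has to be split into a list before use.
input_files = [p.strip() for p in parser['merge']['input_files'].split(',')]
output_file = parser.get('merge', 'output_file')
print(input_files)
print(output_file)
```

Splitting on commas works only as long as none of the paths contains a comma.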
--- a/official/projects/waste_identification_ml/pre_processing/config/visualization.py
+++ b/official/projects/waste_identification_ml/pre_processing/config/visualization.py
+# Copyright 2022 The TensorFlow Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Visualizes the category distribution in an annotated JSON file."""
+#! /usr/bin/env python3
+import json
+from absl import app
+from absl import flags
+import numpy as np
+import pandas as pd
+# Define the flags
+FLAGS = flags.FLAGS
+# path to annotated JSON file whose distribution needs to be plotted
+_PATH = flags.DEFINE_string(
+    'path', None, 'path to the annotated JSON file', required=True)
+def visualize_annotation_file(path: str) -> None:
+  """Plot a bar graph showing the category distribution.
+  Args:
+    path: path to the annotated JSON file.
+  """
+  # get annotation file data into a variable
+  with open(path) as json_file:
+    data = json.load(json_file)
+    # count the occurrences of each category in the annotation file
+    category_names = [i['name'] for i in data['categories']]
+    category_ids = [i['category_id'] for i in data['annotations']]
+    values, counts = np.unique(category_ids, return_counts=True)
+    # create a dataframe with all possible values
+    # with their counts and visualize it.
+    df = pd.DataFrame(counts, index=values, columns=['counts'])
+    df = df.reindex(range(1, len(data['categories']) + 1), fill_value=0)
+    df.index = category_names
+    df.plot.bar(
+        figsize=(20, 5),
+        width=0.5,
+        xlabel='Material types',
+        ylabel='count of material types')
+def main(_):
+  visualize_annotation_file(_PATH.value)
+if __name__ == '__main__':
+  app.run(main)
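Note on `visualize_annotation_file`: the counting step can be checked without plotting. The sketch below is a dependency-free variant that uses `collections.Counter` in place of `np.unique` and pandas. The `data` dictionary is a hypothetical miniature COCO-style annotation set, not taken from the real dataset.

```python
from collections import Counter

# Hypothetical miniature COCO-style annotation data (illustration only).
data = {
    'categories': [
        {'id': 1, 'name': 'Plastics'},
        {'id': 2, 'name': 'Glass'},
        {'id': 3, 'name': 'Metals'},
    ],
    'annotations': [
        {'category_id': 1},
        {'category_id': 3},
        {'category_id': 1},
    ],
}

# Count how often each category ID occurs among the annotations.
counts = Counter(a['category_id'] for a in data['annotations'])
# Look counts up by category ID so unused categories appear with a zero.
distribution = {c['name']: counts.get(c['id'], 0) for c in data['categories']}
print(distribution)
```

Unlike the script above, this variant matches counts to names through each category's `id`, so it does not rely on IDs running from 1 to N in list order.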
--- a/official/projects/waste_identification_ml/pre_processing/merge_coco_files.ipynb
+++ b/official/projects/waste_identification_ml/pre_processing/merge_coco_files.ipynb
--- a/official/projects/waste_identification_ml/pre_processing/split_coco_files.ipynb
+++ b/official/projects/waste_identification_ml/pre_processing/split_coco_files.ipynb
--- a/official/projects/waste_identification_ml/pre_trained_models.md.html
+++ b/official/projects/waste_identification_ml/pre_trained_models.md.html