Release SuperBench v0.2.0

SuperBench v0.2.0 Release Notes
===============================

SuperBench Framework
--------------------

* Implemented a command line interface (CLI).
* Implemented Runner for node control and management.
* Implemented Executor.
* Implemented Benchmark framework.

Supported Benchmarks
--------------------

* Supported Micro-benchmarks
  * GEMM FLOPS (GFLOPS, TensorCore, cuBLAS, cuDNN)
  * Kernel Launch Time (Kernel_Launch_Event_Time, Kernel_Launch_Wall_Time)
  * Operator Performance (MatMul, Sharding_MatMul)
* Supported Model-benchmarks
  * CNN models (reference: [torchvision models](https://github.com/pytorch/vision/tree/v0.8.0/torchvision/models))
    * ResNet (ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152)
    * DenseNet (DenseNet-161, DenseNet-169, DenseNet-201)
    * VGG (VGG-11, VGG-13, VGG-16, VGG-19, VGG11_bn, VGG13_bn, VGG16_bn, VGG19_bn)
    * MNASNet (mnasnet0_5, mnasnet0_75, mnasnet1_0, mnasnet1_3)
    * AlexNet
    * GoogLeNet
    * Inception_v3
    * mobilenet_v2
    * ResNeXt (resnext50_32x4d, resnext101_32x8d)
    * Wide ResNet (wide_resnet50_2, wide_resnet101_2)
    * ShuffleNet (shufflenet_v2_x0_5, shufflenet_v2_x1_0, shufflenet_v2_x1_5, shufflenet_v2_x2_0)
    * SqueezeNet (squeezenet1_0, squeezenet1_1)
  * LSTM model
  * BERT models (BERT-Base, BERT-Large)
  * GPT-2 model (specify which config)
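
As a rough illustration of what a GEMM FLOPS figure measures, the sketch below counts the floating-point operations in a matrix multiply and divides by the elapsed time. It is not SuperBench's implementation: the helper names (`gemm_flop_count`, `gflops`, `naive_matmul`) are hypothetical, and the pure-Python multiply is only a stand-in for a cuBLAS or cuDNN kernel.

```python
import time


def gemm_flop_count(m: int, n: int, k: int) -> int:
    """FLOPs for C = A @ B with A of shape (m, k) and B of shape (k, n).

    Each of the m * n outputs needs k multiplies and k adds (the usual
    2*m*n*k convention).
    """
    return 2 * m * n * k


def gflops(m: int, n: int, k: int, elapsed_seconds: float) -> float:
    """Convert a measured GEMM duration into GFLOPS (hypothetical helper)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return gemm_flop_count(m, n, k) / elapsed_seconds / 1e9


def naive_matmul(a, b):
    """Reference matrix multiply on nested lists; a stand-in for a GPU kernel."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][p] * b[p][j] for p in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


if __name__ == "__main__":
    m = n = k = 64
    a = [[1.0] * k for _ in range(m)]
    b = [[1.0] * n for _ in range(k)]

    start = time.perf_counter()
    naive_matmul(a, b)
    elapsed = time.perf_counter() - start

    print(f"{gflops(m, n, k, elapsed):.6f} GFLOPS for a {m}x{k} by {k}x{n} GEMM")
```

A real GPU measurement would also need warm-up iterations and device synchronization before reading the timer, which this sketch leaves out.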

Examples and Documents
----------------------

* Added examples for running each benchmark.
* Added tutorial documents (introduction, getting-started, developer-guides, APIs, benchmarks).
* Built SuperBench [website](https://aka.ms/superbench/).