remove bentoml doc in favor of blogpost (#4182)

7a47df22 · Will Berman · GitHub · d620070b · 7a47df22 · d620070b
Unverified Commit 7a47df22 authored Jul 20, 2023 by Will Berman Committed by GitHub Jul 21, 2023
Show whitespace changes
Inline Side-by-side

Showing with 0 additions and 202 deletions

docs/source/en/_toctree.yml docs/source/en/_toctree.yml +0 -2

docs/source/en/optimization/bentoml.mdx docs/source/en/optimization/bentoml.mdx +0 -200

No files found.
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -117,8 +117,6 @@
    title: Habana Gaudi
  - local: optimization/tome
    title: Token Merging
-  - local: optimization/bentoml
-    title: BentoML Integration
  title: Optimization/Special Hardware
 - sections:
  - local: conceptual/philosophy

--- a/docs/source/en/optimization/bentoml.mdx
+++ b/docs/source/en/optimization/bentoml.mdx
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
-Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
-the License. You may obtain a copy of the License at
-http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
-an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
-specific language governing permissions and limitations under the License.
-->
-# BentoML Integration Guide
-[[open-in-colab]]
-[BentoML](https://github.com/bentoml/BentoML/) is an open-source framework designed for building,
-shipping, and scaling AI applications. It allows users to easily package and serve diffusion models
-for production, ensuring reliable and efficient deployments. It features out-of-the-box operational
-management tools like monitoring and tracing, and facilitates the deployment to various cloud platforms
-with ease. BentoML's distributed architecture and the separation of API server logic from
-model inference logic enable efficient scaling of deployments, even with budget constraints.
-As a result, integrating it with Diffusers provides a valuable tool for real-world deployments.
-This tutorial demonstrates how to integrate BentoML with Diffusers.
-## Prerequisites
- Install [Diffusers](https://huggingface.co/docs/diffusers/installation).
- Install BentoML by running `pip install bentoml`. For more information, see the [BentoML documentation](https://docs.bentoml.com).
-## Import a diffusion model
-First, you need to prepare the model. BentoML has its own [Model Store](https://docs.bentoml.com/en/latest/concepts/model.html)
-for model management. Create a `download_model.py` file as below to import a diffusion model into BentoML's Model
-Store:
-```py
-import bentoml
-bentoml.diffusers.import_model(
-    "sd2.1",  # Model tag in the BentoML Model Store
-    "stabilityai/stable-diffusion-2-1",  # Hugging Face model identifier
-)
-```
-This code snippet downloads the Stable Diffusion 2.1 model (using it's repo id
-`stabilityai/stable-diffusion-2-1`) from the Hugging Face Hub (or use the cached download
-files if the model is already downloaded) and imports it into the BentoML Model
-Store with the name `sd2.1`.
-For models already fine-tuned and stored on disk, you can provide the path instead of
-the repo id.
-```py
-import bentoml
-bentoml.diffusers.import_model(
-    "sd2.1-local",
-    "./local_stable_diffusion_2.1/",
-)
-```
-You can view the model in the Model Store:
-```
-bentoml models list
-Tag                                                                 Module                              Size       Creation Time       
-sd2.1:ysrlmubascajwnry                                              bentoml.diffusers                   33.85 GiB  2023-07-12 16:47:44 
-```
-## Turn a diffusion model into a RESTful service with BentoML
-Once the diffusion model is in BentoML's Model Store, you can implement a text-to-image
-service with it. The Stable Diffusion model accepts various arguments
-in addition to the required prompt to guide the image generation process.
-To validate these input arguments, use BentoML's [pydantic](https://github.com/pydantic/pydantic) integration.
-Create a `sdargs.py` file with an example pydantic model:
-```py
-import typing as t
-from pydantic import BaseModel
-class SDArgs(BaseModel):
-    prompt: str
-    negative_prompt: t.Optional[str] = None
-    height: t.Optional[int] = 512
-    width: t.Optional[int] = 512
-    class Config:
-        extra = "allow"
-```
-This pydantic model requires a string field `prompt` and three optional fields: `height`, `width`, and `negative_prompt`,
-each with corresponding types. The `extra = "allow"` line supports adding additional fields not defined in the `SDArgs` class.
-In a real-world scenario, you may define all the desired fields and not allow extra ones.
-Next, create a BentoML Service file that defines a Stable Diffusion service:
-```py
-import bentoml
-from bentoml.io import Image, JSON
-from sdargs import SDArgs
-bento_model = bentoml.diffusers.get("sd2.1:latest")
-sd21_runner = bento_model.to_runner(name="sd21-runner")
-svc = bentoml.Service("stable-diffusion-21", runners=[sd21_runner])
-@svc.api(input=JSON(pydantic_model=SDArgs), output=Image())
-async def txt2img(input_data):
-    kwargs = input_data.dict()
-    res = await sd21_runner.async_run(**kwargs)
-    images = res[0]
-    return images[0]
-```
-Save the file as `service.py`, and spin up a BentoML Service endpoint using:
-```
-bentoml serve service:svc
-```
-An HTTP server with `/txt2img` endpoint that accepts a JSON dictionary should be up at
-port 3000. Go to <http://127.0.0.1:3000> in your web browser to access the Swagger UI.
-You can also test the text-to-image generation using `curl` and write the returned image to
-`output.jpg`.
-```
-curl -X POST http://127.0.0.1:3000/txt2img \
-     -H 'Content-Type: application/json' \
-     -d "{\"prompt\":\"a black cat\", \"height\":768, \"width\":768}" \
-     --output output.jpg
-```
-## Package a BentoML Service for cloud deployment
-To deploy a BentoML Service, you need to pack it into a BentoML
-[Bento](https://docs.bentoml.com/en/latest/concepts/bento.html), a file archive with all the source code,
-models, data files, and dependencies. This can be done by providing a `bentofile.yaml` file as follows:
-```yaml
-service: "service.py:svc"
-include:
-  - "service.py"
-python:
-  packages:
-    - torch
-    - transformers
-    - accelerate
-    - diffusers
-    - triton
-    - xformers
-    - pydantic
-docker:
-    distro: debian
-    cuda_version: "11.6"
-```
-The `bentofile.yaml` file contains [Bento build
-options](https://docs.bentoml.com/en/latest/concepts/bento.html#bento-build-options),
-such as package dependencies and Docker options.
-Then you build a Bento using:
-```
-bentoml build
-```
-The output looks like:
-```
-Successfully built Bento(tag="stable-diffusion-21:crkuh7a7rw5bcasc").
-Possible next steps:
- * Containerize your Bento with `bentoml containerize`:
-    $ bentoml containerize stable-diffusion-21:crkuh7a7rw5bcasc
- * Push to BentoCloud with `bentoml push`:
-    $ bentoml push stable-diffusion-21:crkuh7a7rw5bcasc
-```
-You can create a Docker image based on the Bento by running the following command and deploy it to a cloud provider.
-```
-bentoml containerize stable-diffusion-21:crkuh7a7rw5bcasc
-```
-If you want an end-to-end solution for deploying and managing models, you can push the Bento to [Yatai](https://github.com/bentoml/Yatai) or
-[BentoCloud](https://bentoml.com/cloud) for a distributed deployment.
-For more information about BentoML's integration with Diffusers, see the [BentoML Diffusers
-Guide](https://docs.bentoml.com/en/latest/frameworks/diffusers.html).