@@ -10,19 +10,46 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
...
specific language governing permissions and limitations under the License.
-->
-->
# Models
# BaseOutputs
Diffusers contains pretrained models for popular algorithms and modules for creating the next set of diffusion models.
All models have outputs that are instances of subclasses of [`~utils.BaseOutput`]. These are
The primary function of these models is to denoise an input sample by modeling the distribution $p_\theta(\mathbf{x}_{t-1}|\mathbf{x}_t)$.
data structures containing all the information returned by the model, which can also be used as tuples or
The models are built on the base class ['ModelMixin'], which is a `torch.nn.Module` with basic functionality for saving and loading models both locally and from the Hugging Face Hub.
dictionaries.
## API
Let's see how this looks in an example:
Models should implement the `forward` method and the initialization of the model.
```python
All saving, loading, and utilities should be in the base ['ModelMixin'] class.
- The ['UNetModel'] was proposed in [TODO](https://arxiv.org/) and has been used in paper1, paper2, paper3.
The `outputs` object is a [`~pipeline_utils.ImagePipelineOutput`], as we can see in the
- Extensions of the ['UNetModel'] include the ['UNetGlideModel'], which uses attention and timestep embeddings for the [GLIDE](https://arxiv.org/abs/2112.10741) paper; the ['UNetGradTTS'] model from this [paper](https://arxiv.org/abs/2105.06337) for text-to-speech; the ['UNetLDMModel'] for latent-diffusion models from this [paper](https://arxiv.org/abs/2112.10752); and the ['TemporalUNet'] used for time-series prediction in this reinforcement learning [paper](https://arxiv.org/abs/2205.09991).
documentation of that class below, which means it has an `images` attribute.
- TODO: mention VAE / SDE score estimation
\ No newline at end of file
You can access each attribute as you normally would; if the model did not return that attribute, you will get `None`:
```python
outputs.images
```
or via key lookup:
```python
outputs["images"]
```
When the `outputs` object is treated as a tuple, only the attributes that are not `None` are included.
Here, for instance, we could retrieve the images by slicing:
```python
outputs[:1]
```
which will return the one-element tuple `(outputs.images,)`.
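
To make the three access styles above concrete, here is a minimal, self-contained sketch of a container that behaves the way this page describes. It is an illustration only and not the diffusers implementation: the class names `SketchOutput` and `SketchImageOutput` and the `aux` field are hypothetical, and the real [`~utils.BaseOutput`] may differ in its details.

```python
from collections import OrderedDict
from dataclasses import dataclass, fields
from typing import Any, Optional


class SketchOutput(OrderedDict):
    """Hypothetical stand-in for an output container (not the library class)."""

    def __post_init__(self):
        # Store only the fields that were actually set, in declaration order,
        # so that the dictionary view never contains None values.
        for f in fields(self):
            value = getattr(self, f.name)
            if value is not None:
                super().__setitem__(f.name, value)

    def __getitem__(self, key):
        if isinstance(key, str):
            # Dictionary-style lookup by attribute name.
            return super().__getitem__(key)
        # Integer or slice: index into the tuple of non-None values.
        return self.to_tuple()[key]

    def to_tuple(self):
        return tuple(self.values())


@dataclass
class SketchImageOutput(SketchOutput):
    images: Optional[Any] = None
    aux: Optional[Any] = None  # hypothetical extra field, for illustration


outputs = SketchImageOutput(images=["img0", "img1"])
first = outputs.images       # attribute access
same = outputs["images"]     # key lookup
as_tuple = outputs[:1]       # tuple-style slicing, skips None fields
missing = outputs.aux        # not returned, so None
```

Because unset fields are left out of the tuple view, positions depend on which attributes were returned: in this sketch, `SketchImageOutput(aux="meta")[0]` yields `"meta"`, not `None`. Attribute or key access is therefore the safer choice when some fields are optional.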