"tests/vscode:/vscode.git/clone" did not exist on "0efbb6e93e9bb5307e1925746980e102b94e7254"
Unverified commit cf028d0c, authored by NielsRogge and committed by GitHub

Add batch of resources (#20647)



* Add resources

* Add more resources

* Add more resources

* Add TAPAS

* Fix pipeline tag

* Fix pipeline tags

* Remove pipeline tag

* Remove depth-estimation tag

* Update docs/source/en/model_doc/segformer.mdx
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Apply suggestion

* Fix segformer
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
parent bb300ac6
@@ -91,14 +91,21 @@ In Computer Vision:
- [Image classification with ViT](https://huggingface.co/google/vit-base-patch16-224)
- [Object Detection with DETR](https://huggingface.co/facebook/detr-resnet-50)
- [Semantic Segmentation with SegFormer](https://huggingface.co/nvidia/segformer-b0-finetuned-ade-512-512)
- [Panoptic Segmentation with MaskFormer](https://huggingface.co/facebook/maskformer-swin-small-coco)
- [Depth Estimation with DPT](https://huggingface.co/docs/transformers/model_doc/dpt)
- [Video Classification with VideoMAE](https://huggingface.co/docs/transformers/model_doc/videomae)
In Audio:
- [Automatic Speech Recognition with Wav2Vec2](https://huggingface.co/facebook/wav2vec2-base-960h)
- [Keyword Spotting with Wav2Vec2](https://huggingface.co/superb/wav2vec2-base-superb-ks)
- [Audio Classification with Audio Spectrogram Transformer](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)
In Multimodal tasks:
- [Table Question Answering with TAPAS](https://huggingface.co/google/tapas-base-finetuned-wtq)
- [Visual Question Answering with ViLT](https://huggingface.co/dandelin/vilt-b32-finetuned-vqa)
- [Zero-shot Image Classification with CLIP](https://huggingface.co/openai/clip-vit-large-patch14)
- [Document Question Answering with LayoutLM](https://huggingface.co/impira/layoutlm-document-qa)
- [Zero-shot Video Classification with X-CLIP](https://huggingface.co/docs/transformers/model_doc/xclip)
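All of these README entries can be exercised through the `pipeline` API. As a quick illustration for the CLIP entry, a zero-shot image classification call looks roughly like this (the candidate labels and COCO test image are made up for the example):

```python
from transformers import pipeline

# Zero-shot image classification with CLIP; the pipeline handles
# downloading the checkpoint plus all pre- and post-processing.
classifier = pipeline("zero-shot-image-classification", model="openai/clip-vit-large-patch14")
predictions = classifier(
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    candidate_labels=["two cats", "a dog", "an airplane"],
)
print(predictions[0])  # highest-scoring label comes first
```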
**[Write With Transformer](https://transformer.huggingface.co)**, built by the Hugging Face team, is the official demo of this repo’s text generation capabilities.
@@ -67,6 +67,15 @@ alt="drawing" width="600"/>
This model was contributed by [nielsr](https://huggingface.co/nielsr). The JAX/FLAX version of this model was
contributed by [kamalkraj](https://huggingface.co/kamalkraj). The original code can be found [here](https://github.com/microsoft/unilm/tree/master/beit).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with BEiT.
<PipelineTag pipeline="image-classification"/>
- [`BeitForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
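As a companion to the example script and notebook, a minimal inference sketch for BEiT might look as follows (the `microsoft/beit-base-patch16-224` checkpoint and the COCO test image are illustrative choices):

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, BeitForImageClassification

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("microsoft/beit-base-patch16-224")
model = BeitForImageClassification.from_pretrained("microsoft/beit-base-patch16-224")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# map the highest logit back to an ImageNet class name
print(model.config.id2label[logits.argmax(-1).item()])
```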
## BEiT specific outputs
@@ -30,7 +30,6 @@ impact on transfer learning.
This model was contributed by [nielsr](https://huggingface.co/nielsr).
The original code can be found [here](https://github.com/google-research/big_transfer).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with BiT.
@@ -45,19 +44,16 @@ If you're interested in submitting a resource to be included here, please feel f
[[autodoc]] BitConfig
## BitImageProcessor
[[autodoc]] BitImageProcessor
- preprocess
## BitModel
[[autodoc]] BitModel
- forward
## BitForImageClassification
[[autodoc]] BitForImageClassification
@@ -77,22 +77,13 @@ This model was contributed by [valhalla](https://huggingface.co/valhalla). The o
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with CLIP.
- A blog post on [How to fine-tune CLIP on 10,000 image-text pairs](https://huggingface.co/blog/fine-tune-clip-rsicd).
- CLIP is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/contrastive-image-text).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we will review it.
The resource should ideally demonstrate something new instead of duplicating an existing resource.
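For readers who want to see what the contrastively trained model computes, here is a minimal image-text similarity sketch along the lines of the model's documented usage (checkpoint and captions chosen for illustration):

```python
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(
    text=["a photo of two cats", "a photo of a dog"],
    images=image,
    return_tensors="pt",
    padding=True,
)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=1)  # image-text match probabilities
```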
## CLIPConfig
@@ -40,16 +40,24 @@ alt="drawing" width="600"/>
This model was contributed by [nielsr](https://huggingface.co/nielsr). The TensorFlow version of the model was contributed by [ariG23498](https://github.com/ariG23498),
[gante](https://github.com/gante), and [sayakpaul](https://github.com/sayakpaul) (equal contribution). The original code can be found [here](https://github.com/facebookresearch/ConvNeXt).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with ConvNeXT.
<PipelineTag pipeline="image-classification"/>
- [`ConvNextForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
## ConvNextConfig
[[autodoc]] ConvNextConfig
## ConvNextFeatureExtractor
[[autodoc]] ConvNextFeatureExtractor
## ConvNextImageProcessor
[[autodoc]] ConvNextImageProcessor
@@ -60,7 +68,6 @@ This model was contributed by [nielsr](https://huggingface.co/nielsr). TensorFlo
[[autodoc]] ConvNextModel
- forward
## ConvNextForImageClassification
[[autodoc]] ConvNextForImageClassification
@@ -38,6 +38,16 @@ Tips:
This model was contributed by [anugunj](https://huggingface.co/anugunj). The original code can be found [here](https://github.com/microsoft/CvT).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with CvT.
<PipelineTag pipeline="image-classification"/>
- [`CvtForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
## CvtConfig
[[autodoc]] CvtConfig
@@ -37,9 +37,6 @@ Tips:
- For Data2VecAudio, preprocessing is identical to [`Wav2Vec2Model`], including feature extraction.
- For Data2VecText, preprocessing is identical to [`RobertaModel`], including tokenization.
- For Data2VecVision, preprocessing is identical to [`BeitModel`], including feature extraction.
This model was contributed by [edugp](https://huggingface.co/edugp) and [patrickvonplaten](https://huggingface.co/patrickvonplaten).
[sayakpaul](https://github.com/sayakpaul) and [Rocketknight1](https://github.com/Rocketknight1) contributed Data2Vec for vision in TensorFlow.
@@ -48,6 +45,17 @@ The original code (for NLP and Speech) can be found [here](https://github.com/py
The original code for vision can be found [here](https://github.com/facebookresearch/data2vec_vision/tree/main/beit).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with Data2Vec.
<PipelineTag pipeline="image-classification"/>
- [`Data2VecVisionForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
- To fine-tune [`TFData2VecVisionForImageClassification`] on a custom dataset, see [this notebook](https://colab.research.google.com/github/sayakpaul/TF-2.0-Hacks/blob/master/data2vec_vision_image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
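Loading the TensorFlow vision classifier that the notebook above fine-tunes should look roughly like this (the `facebook/data2vec-vision-base-ft1k` checkpoint is an assumed example from the Hub):

```python
import requests
from PIL import Image
from transformers import AutoImageProcessor, TFData2VecVisionForImageClassification

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("facebook/data2vec-vision-base-ft1k")
model = TFData2VecVisionForImageClassification.from_pretrained("facebook/data2vec-vision-base-ft1k")

inputs = processor(images=image, return_tensors="tf")
logits = model(**inputs).logits
print(model.config.id2label[int(logits.numpy().argmax())])
```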
## Data2VecTextConfig
[[autodoc]] Data2VecTextConfig
@@ -24,7 +24,7 @@ The abstract from the paper is the following:
Tips:
- One can use [`DeformableDetrImageProcessor`] to prepare images (and optional targets) for the model.
- Training Deformable DETR is equivalent to training the original [DETR](detr) model. See the [resources](#resources) section below for demo notebooks.
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/deformable_detr_architecture.png"
alt="drawing" width="600"/>
@@ -33,6 +33,16 @@ alt="drawing" width="600"/>
This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/fundamentalvision/Deformable-DETR).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with Deformable DETR.
<PipelineTag pipeline="object-detection"/>
- Demo notebooks regarding inference + fine-tuning on a custom dataset for [`DeformableDetrForObjectDetection`] can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/Deformable-DETR).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
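Alongside the notebooks, a minimal inference sketch for [`DeformableDetrForObjectDetection`], mirroring the model's documented usage (the checkpoint and the 0.7 confidence threshold are illustrative):

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, DeformableDetrForObjectDetection

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("SenseTime/deformable-detr")
model = DeformableDetrForObjectDetection.from_pretrained("SenseTime/deformable-detr")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# rescale boxes to the original image size and keep confident detections
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(outputs, target_sizes=target_sizes, threshold=0.7)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 3), box.tolist())
```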
## DeformableDetrImageProcessor
[[autodoc]] DeformableDetrImageProcessor
@@ -47,18 +57,15 @@ This model was contributed by [nielsr](https://huggingface.co/nielsr). The origi
- pad_and_create_pixel_mask
- post_process_object_detection
## DeformableDetrConfig
[[autodoc]] DeformableDetrConfig
## DeformableDetrModel
[[autodoc]] DeformableDetrModel
- forward
## DeformableDetrForObjectDetection
[[autodoc]] DeformableDetrForObjectDetection
@@ -71,6 +71,19 @@ Tips:
This model was contributed by [nielsr](https://huggingface.co/nielsr). The TensorFlow version of this model was added by [amyeroberts](https://huggingface.co/amyeroberts).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DeiT.
<PipelineTag pipeline="image-classification"/>
- [`DeiTForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
Besides that:
- [`DeiTForMaskedImageModeling`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-pretraining).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
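The image-pretraining script exercises the masked-image-modeling forward pass; in isolation it looks roughly like this (random mask and checkpoint chosen for illustration):

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, DeiTForMaskedImageModeling

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("facebook/deit-base-distilled-patch16-224")
model = DeiTForMaskedImageModeling.from_pretrained("facebook/deit-base-distilled-patch16-224")

pixel_values = processor(images=image, return_tensors="pt").pixel_values

# randomly mask patches; the model is trained to reconstruct the missing pixels
num_patches = (model.config.image_size // model.config.patch_size) ** 2
bool_masked_pos = torch.randint(low=0, high=2, size=(1, num_patches)).bool()

outputs = model(pixel_values, bool_masked_pos=bool_masked_pos)
loss, reconstructed = outputs.loss, outputs.reconstruction
```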
## DeiTConfig
@@ -37,9 +37,6 @@ baselines.*
This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/facebookresearch/detr).
Here's a TLDR explaining how [`~transformers.DetrForObjectDetection`] works:
First, an image is sent through a pre-trained convolutional backbone (in the paper, the authors use
@@ -153,6 +150,15 @@ outputs of the model using one of the postprocessing methods of [`~transformers.
be provided to either `CocoEvaluator` or `PanopticEvaluator`, which allow you to calculate metrics like
mean Average Precision (mAP) and Panoptic Quality (PQ). The latter objects are implemented in the [original repository](https://github.com/facebookresearch/detr). See the [example notebooks](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/DETR) for more info regarding evaluation.
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DETR.
<PipelineTag pipeline="object-detection"/>
- All example notebooks illustrating fine-tuning [`DetrForObjectDetection`] and [`DetrForSegmentation`] on a custom dataset can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/DETR).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
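To make the segmentation side concrete, here is a sketch of panoptic inference with [`DetrForSegmentation`] and the processor's post-processing (checkpoint from the Hub; thresholds left at their defaults):

```python
import requests
import torch
from PIL import Image
from transformers import DetrForSegmentation, DetrImageProcessor

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50-panoptic")
model = DetrForSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# merge predicted masks into a single panoptic map at the original resolution
result = processor.post_process_panoptic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
panoptic_map = result["segmentation"]  # (height, width) tensor of segment ids
for segment in result["segments_info"]:
    print(segment["id"], model.config.id2label[segment["label_id"]])
```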
## DETR specific outputs
@@ -61,12 +61,20 @@ Taken from the <a href="https://arxiv.org/abs/2209.15001">original paper</a>.</s
This model was contributed by [Ali Hassani](https://huggingface.co/alihassanijr).
The original code can be found [here](https://github.com/SHI-Labs/Neighborhood-Attention-Transformer).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DiNAT.
<PipelineTag pipeline="image-classification"/>
- [`DinatForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
## DinatConfig
[[autodoc]] DinatConfig
## DinatModel
[[autodoc]] DinatModel
@@ -65,3 +65,13 @@ A notebook that illustrates inference for document image classification can be f
As DiT's architecture is equivalent to that of BEiT, one can refer to [BEiT's documentation page](beit) for all tips, code examples and notebooks.
This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/microsoft/unilm/tree/master/dit).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DiT.
<PipelineTag pipeline="image-classification"/>
- [`BeitForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
@@ -28,37 +28,40 @@ alt="drawing" width="600"/>
This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/isl-org/DPT).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DPT.
- Demo notebooks for [`DPTForDepthEstimation`] can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/DPT).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
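A minimal depth-estimation sketch with [`DPTForDepthEstimation`], following the model's documented usage (the `Intel/dpt-large` checkpoint and the bicubic upsampling step are illustrative choices):

```python
import numpy as np
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, DPTForDepthEstimation

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("Intel/dpt-large")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    predicted_depth = model(**inputs).predicted_depth

# upsample to the original resolution and scale to an 8-bit depth map
prediction = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1), size=image.size[::-1], mode="bicubic", align_corners=False
).squeeze()
depth = (prediction / prediction.max() * 255).cpu().numpy().astype(np.uint8)
Image.fromarray(depth).save("depth.png")
```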
## DPTConfig
[[autodoc]] DPTConfig
## DPTFeatureExtractor
[[autodoc]] DPTFeatureExtractor
- __call__
- post_process_semantic_segmentation
## DPTImageProcessor
[[autodoc]] DPTImageProcessor
- preprocess
- post_process_semantic_segmentation
## DPTModel
[[autodoc]] DPTModel
- forward
## DPTForDepthEstimation
[[autodoc]] DPTForDepthEstimation
- forward
## DPTForSemanticSegmentation
[[autodoc]] DPTForSemanticSegmentation
@@ -31,7 +31,6 @@ The abstract from the paper is the following:
Tips:
- One can use [`GLPNImageProcessor`] to prepare images for the model.
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/glpn_architecture.jpg"
@@ -41,6 +40,12 @@ alt="drawing" width="600"/>
This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/vinvino02/GLPDepth).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with GLPN.
- Demo notebooks for [`GLPNForDepthEstimation`] can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/GLPN).
## GLPNConfig
[[autodoc]] GLPNConfig
@@ -24,11 +24,16 @@ The abstract from the paper is the following:
Tips:
- You may specify `output_segmentation=True` in the forward of `GroupViTModel` to get the segmentation logits of input texts.
This model was contributed by [xvjiarui](https://huggingface.co/xvjiarui). The TensorFlow version was contributed by [ariG23498](https://huggingface.co/ariG23498) with the help of [Yih-Dar SHIEH](https://huggingface.co/ydshieh), [Amy Roberts](https://huggingface.co/amyeroberts), and [Joao Gante](https://huggingface.co/joaogante).
The original code can be found [here](https://github.com/NVlabs/GroupViT).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with GroupViT.
- The quickest way to get started with GroupViT is by checking the [example notebooks](https://github.com/xvjiarui/GroupViT/blob/main/demo/GroupViT_hf_inference_notebook.ipynb) (which showcase zero-shot segmentation inference).
- One can also check out the [HuggingFace Spaces demo](https://huggingface.co/spaces/xvjiarui/GroupViT) to play with GroupViT.
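Putting the `output_segmentation=True` tip into code, a zero-shot segmentation sketch might look like this (text prompts invented for illustration):

```python
import requests
from PIL import Image
from transformers import AutoProcessor, GroupViTModel

processor = AutoProcessor.from_pretrained("nvidia/groupvit-gcc-yfcc")
model = GroupViTModel.from_pretrained("nvidia/groupvit-gcc-yfcc")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

texts = ["a photo of a cat", "a photo of a remote control"]
inputs = processor(text=texts, images=image, padding=True, return_tensors="pt")

outputs = model(**inputs, output_segmentation=True)
seg_logits = outputs.segmentation_logits  # one low-resolution mask per input text
```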
## GroupViTConfig
@@ -38,8 +38,6 @@ This model was contributed by [nielsr](https://huggingface.co/nielsr), based on
Tips:
- ImageGPT is almost exactly the same as [GPT-2](gpt2), with the exception that a different activation
function is used (namely "quick gelu"), and the layer normalization layers don't mean center the inputs. ImageGPT
also doesn't have tied input- and output embeddings.
@@ -71,6 +69,17 @@ Tips:
| MiT-b4 | [3, 8, 27, 3] | [64, 128, 320, 512] | 768 | 62.6 | 83.6 |
| MiT-b5 | [3, 6, 40, 3] | [64, 128, 320, 512] | 768 | 82.0 | 83.8 |
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with ImageGPT.
<PipelineTag pipeline="image-classification"/>
- Demo notebooks for ImageGPT can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/ImageGPT).
- [`ImageGPTForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
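Beyond classification, ImageGPT can sample images autoregressively. A rough unconditional-generation sketch (small checkpoint, default sampling knobs; the sampled ids are color-cluster indices that `ImageGPTImageProcessor`'s `clusters` palette maps back to RGB):

```python
import torch
from transformers import ImageGPTForCausalImageModeling

model = ImageGPTForCausalImageModeling.from_pretrained("openai/imagegpt-small")

# start from the SOS token (the last vocab id) and sample 32*32 cluster ids
context = torch.full((1, 1), model.config.vocab_size - 1, dtype=torch.long)
samples = model.generate(
    input_ids=context, max_length=model.config.n_positions + 1, do_sample=True, top_k=40
)
```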
## ImageGPTConfig
[[autodoc]] ImageGPTConfig
@@ -61,6 +61,15 @@ Tips:
This model was contributed by [anugunj](https://huggingface.co/anugunj). The original code can be found [here](https://github.com/facebookresearch/LeViT).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with LeViT.
<PipelineTag pipeline="image-classification"/>
- [`LevitForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
## LevitConfig
@@ -37,7 +37,6 @@ model.push_to_hub("name_of_repo_on_the_hub")
- When preparing data for the model, make sure to use the token vocabulary that corresponds to the RoBERTa checkpoint you combined with the Layout Transformer.
- As [lilt-roberta-en-base](https://huggingface.co/SCUT-DLVCLab/lilt-roberta-en-base) uses the same vocabulary as [LayoutLMv3](layoutlmv3), one can use [`LayoutLMv3TokenizerFast`] to prepare data for the model.
The same is true for [lilt-infoxlm-base](https://huggingface.co/SCUT-DLVCLab/lilt-infoxlm-base): one can use [`LayoutXLMTokenizerFast`] for that model.
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/lilt_architecture.jpg"
alt="drawing" width="600"/>
@@ -47,6 +46,13 @@ alt="drawing" width="600"/>
This model was contributed by [nielsr](https://huggingface.co/nielsr).
The original code can be found [here](https://github.com/jpwang/lilt).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with LiLT.
- Demo notebooks for LiLT can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/LiLT).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
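A minimal sketch of the tokenizer-plus-bounding-box workflow described in the tips above (the words and boxes are made up; boxes use the 0-1000 normalized format):

```python
from transformers import AutoModel, AutoTokenizer

# resolves to LayoutLMv3TokenizerFast for this checkpoint
tokenizer = AutoTokenizer.from_pretrained("SCUT-DLVCLab/lilt-roberta-en-base")
model = AutoModel.from_pretrained("SCUT-DLVCLab/lilt-roberta-en-base")

words = ["HELLO", "WORLD"]
boxes = [[100, 100, 200, 120], [210, 100, 310, 120]]  # one box per word

encoding = tokenizer(words, boxes=boxes, return_tensors="pt")
outputs = model(**encoding)
last_hidden_states = outputs.last_hidden_state
```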
## LiltConfig
@@ -44,6 +44,16 @@ Unsupported features:
This model was contributed by [matthijs](https://huggingface.co/Matthijs). The original code and weights can be found [here](https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with MobileNetV1.
<PipelineTag pipeline="image-classification"/>
- [`MobileNetV1ForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
## MobileNetV1Config
[[autodoc]] MobileNetV1Config
@@ -48,6 +48,16 @@ Unsupported features:
This model was contributed by [matthijs](https://huggingface.co/Matthijs). The original code and weights can be found [here for the main model](https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet) and [here for DeepLabV3+](https://github.com/tensorflow/models/tree/master/research/deeplab).
## Resources
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with MobileNetV2.
<PipelineTag pipeline="image-classification"/>
- [`MobileNetV2ForImageClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
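For a quick sanity check of an ImageNet checkpoint before fine-tuning, the classification pipeline works; a minimal sketch (the checkpoint name is an assumed example from the Hub):

```python
from transformers import pipeline

classifier = pipeline("image-classification", model="google/mobilenet_v2_1.0_224")
predictions = classifier("http://images.cocodataset.org/val2017/000000039769.jpg")
print(predictions[0])  # top-1 label and score
```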
## MobileNetV2Config
[[autodoc]] MobileNetV2Config