[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)

* rm all model cards
* Update the .rst
  @sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler
* Add a rootlevel README.md with simple instructions/context
* Update docs/source/model_sharing.rst
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
* rm all model cards

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Unverified Commit 3552d0e0 authored Dec 12, 2020 by Julien Chaumond Committed by GitHub Dec 11, 2020
20 changed files
--- a/docs/source/model_sharing.rst
+++ b/docs/source/model_sharing.rst
@@ -60,7 +60,7 @@ Basic steps
 In order to upload a model, you'll need to first create a git repo. This repo will live on the model hub, allowing
 users to clone it and you (and your organization members) to push to it.

-You can create a model repo directly from the website, `here <https://huggingface.co/new>`.
+You can create a model repo directly from `the /new page on the website <https://huggingface.co/new>`__.

 Alternatively, you can use the ``transformers-cli``. The next steps describe that process:

@@ -82,12 +82,12 @@ This creates a repo on the model hub, which can be cloned.

 .. code-block:: bash

-    git clone https://huggingface.co/username/your-model-name
-
    # Make sure you have git-lfs installed
    # (https://git-lfs.github.com/)
    git lfs install

+    git clone https://huggingface.co/username/your-model-name
+
 When you have your local clone of your repo and lfs installed, you can then add/remove from that clone as you would
 with any other git repo.

@@ -98,8 +98,12 @@ with any other git repo.
    echo "hello" >> README.md
    git add . && git commit -m "Update from $USER"

-We are intentionally not wrapping git too much, so as to stay intuitive and easy-to-use.
+We are intentionally not wrapping git too much, so that you can go on with the workflow you're used to and the tools
+you already know.

+The only learning curve you might have compared to regular git is the one for git-lfs. The documentation at
+`git-lfs.github.com <https://git-lfs.github.com/>`__ is decent, but we'll work on a tutorial with some tips and tricks
+in the coming weeks!

 Make your model work on all frameworks
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -110,7 +114,7 @@ Make your model work on all frameworks
 You probably have your favorite framework, but so will other users! That's why it's best to upload your model with both
 PyTorch `and` TensorFlow checkpoints to make it easier to use (if you skip this step, users will still be able to load
 your model in another framework, but it will be slower, as it will have to be converted on the fly). Don't worry, it's
-super easy to do (and in a future version, it will all be automatic). You will need to install both PyTorch and
+super easy to do (and in a future version, it might all be automatic). You will need to install both PyTorch and
 TensorFlow for this step, but you don't need to worry about the GPU, so it should be very easy. Check the `TensorFlow
 installation page <https://www.tensorflow.org/install/pip#tensorflow-2.0-rc-is-available>`__ and/or the `PyTorch
 installation page <https://pytorch.org/get-started/locally/#start-locally>`__ to see how.
@@ -192,7 +196,7 @@ status`` command:
    git add --all
    git status

-Finally, the files should be comitted:
+Finally, the files should be committed:

 .. code-block:: bash

@@ -210,23 +214,20 @@ This will upload the folder containing the weights, tokenizer and configuration
 Add a model card
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-To make sure everyone knows what your model can do, what its limitations and potential bias or ethetical
-considerations, please add a README.md model card to the 🤗 Transformers repo under `model_cards/`. It should then be
-placed in a subfolder with your username or organization, then another subfolder named like your model
-(`awesome-name-you-picked`). Or just click on the "Create a model card on GitHub" button on the model page, it will get
-you directly to the right location. If you need one, `here <https://github.com/huggingface/model_card>`__ is a model
-card template (meta-suggestions are welcome).
+To make sure everyone knows what your model can do, what its limitations, potential bias or ethical considerations are,
+please add a README.md model card to your model repo. You can just create it, or there's also a convenient button
+titled "Add a README.md" on your model page. A model card template can be found
+`here <https://github.com/huggingface/model_card>`__ (meta-suggestions are
+welcome).

-If your model is fine-tuned from another model coming from the model hub (all 🤗 Transformers pretrained models do),
-don't forget to link to its model card so that people can fully trace how your model was built.
+.. note::

-If you have never made a pull request to the 🤗 Transformers repo, look at the :doc:`contributing guide <contributing>`
-to see the steps to follow.
+    Model cards used to live in the 🤗 Transformers repo under `model_cards/`, but for consistency and scalability we
+    migrated every model card from the repo to its corresponding huggingface.co model repo.

-.. note::
+If your model is fine-tuned from another model coming from the model hub (all 🤗 Transformers pretrained models do),
+don't forget to link to its model card so that people can fully trace how your model was built.

-    You can also send your model card in the folder you uploaded with the CLI by placing it in a `README.md` file
-    inside `path/to/awesome-name-you-picked/`.

 Using your model
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -262,7 +263,8 @@ First you need to install `git-lfs` in the environment used by the notebook:

    sudo apt-get install git-lfs

-Then you can use the :obj:`transformers-cli` to create your new repo:
+Then you can either create a repo directly from `huggingface.co <https://huggingface.co/>`__, or use the
+:obj:`transformers-cli` to create it:


 .. code-block:: bash
@@ -274,13 +276,14 @@ Once it's created, you can clone it and configure it (replace username by your u

 .. code-block:: bash

+    git lfs install
+
    git clone https://username:password@huggingface.co/username/your-model-name
    # Alternatively if you have a token,
    # you can use it instead of your password
    git clone https://username:token@huggingface.co/username/your-model-name

    cd your-model-name
-    git lfs install
    git config --global user.email "email@example.com"
    # Tip: using the same email as for your huggingface.co account will link your commits to your profile
    git config --global user.name "Your name"

--- a/model_cards/Cinnamon/electra-small-japanese-discriminator/README.md
+++ b/model_cards/Cinnamon/electra-small-japanese-discriminator/README.md
---
-language: ja
-license: apache-2.0
---
-
-## Japanese ELECTRA-small
-
-We provide a Japanese **ELECTRA-Small** model, as described in [ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://openreview.net/pdf?id=r1xMH1BtvB).
-
-Our pretraining process employs subword units derived from the [Japanese Wikipedia](https://dumps.wikimedia.org/jawiki/latest), using the [Byte-Pair Encoding](https://www.aclweb.org/anthology/P16-1162.pdf) method and building on an initial tokenization with [mecab-ipadic-NEologd](https://github.com/neologd/mecab-ipadic-neologd). For optimal performance, please take care to set your MeCab dictionary appropriately.
-
-## How to use the discriminator in `transformers`
-
-```
-from transformers import BertJapaneseTokenizer, ElectraForPreTraining
-
-tokenizer = BertJapaneseTokenizer.from_pretrained('Cinnamon/electra-small-japanese-discriminator', mecab_kwargs={"mecab_option": "-d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd"})
-
-model = ElectraForPreTraining.from_pretrained('Cinnamon/electra-small-japanese-discriminator')
-```
--- a/model_cards/Cinnamon/electra-small-japanese-generator/README.md
+++ b/model_cards/Cinnamon/electra-small-japanese-generator/README.md
---
-language: ja
---
-## Japanese ELECTRA-small
-
-We provide a Japanese **ELECTRA-Small** model, as described in [ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://openreview.net/pdf?id=r1xMH1BtvB).
-
-Our pretraining process employs subword units derived from the [Japanese Wikipedia](https://dumps.wikimedia.org/jawiki/latest), using the [Byte-Pair Encoding](https://www.aclweb.org/anthology/P16-1162.pdf) method and building on an initial tokenization with [mecab-ipadic-NEologd](https://github.com/neologd/mecab-ipadic-neologd). For optimal performance, please take care to set your MeCab dictionary appropriately.
-
-```
-# ELECTRA-small generator usage
-
-from transformers import BertJapaneseTokenizer, ElectraForMaskedLM
-
-tokenizer = BertJapaneseTokenizer.from_pretrained('Cinnamon/electra-small-japanese-generator', mecab_kwargs={"mecab_option": "-d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd"})
-
-model = ElectraForMaskedLM.from_pretrained('Cinnamon/electra-small-japanese-generator')
-```
--- a/model_cards/DJSammy/bert-base-danish-uncased_BotXO,ai/README.md
+++ b/model_cards/DJSammy/bert-base-danish-uncased_BotXO,ai/README.md
---
-language: da
-tags:
- bert
- masked-lm
-license: cc-by-4.0
-datasets:
- common_crawl
- wikipedia
-pipeline_tag: fill-mask
-widget:
- text: "København er [MASK] i Danmark."
---
-
-# Danish BERT (uncased) model 
-
-[BotXO.ai](https://www.botxo.ai/) developed this model. For data and training details see their [GitHub repository](https://github.com/botxo/nordic_bert).  
-
-The original model was trained in TensorFlow then I converted it to Pytorch using [transformers-cli](https://huggingface.co/transformers/converting_tensorflow_models.html?highlight=cli).
-
-For TensorFlow version download here: https://www.dropbox.com/s/19cjaoqvv2jicq9/danish_bert_uncased_v2.zip?dl=1
-
-
-## Architecture
-
-```python
-from transformers import AutoModelForPreTraining
-
-model = AutoModelForPreTraining.from_pretrained("DJSammy/bert-base-danish-uncased_BotXO,ai")
-
-params = list(model.named_parameters())
-print('danish_bert_uncased_v2 has {:} different named parameters.\n'.format(len(params)))
-
-print('==== Embedding Layer ====\n')
-for p in params[0:5]:
-    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))
-
-print('\n==== First Transformer ====\n')
-for p in params[5:21]:
-    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))
-
-print('\n==== Last Transformer ====\n')
-for p in params[181:197]:
-    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))
-
-print('\n==== Output Layer ====\n')
-for p in params[197:]:
-    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))
-
-# danish_bert_uncased_v2 has 206 different named parameters.
-
-# ==== Embedding Layer ====
-
-# bert.embeddings.word_embeddings.weight                  (32000, 768)
-# bert.embeddings.position_embeddings.weight                (512, 768)
-# bert.embeddings.token_type_embeddings.weight                (2, 768)
-# bert.embeddings.LayerNorm.weight                              (768,)
-# bert.embeddings.LayerNorm.bias                                (768,)
-
-# ==== First Transformer ====
-
-# bert.encoder.layer.0.attention.self.query.weight          (768, 768)
-# bert.encoder.layer.0.attention.self.query.bias                (768,)
-# bert.encoder.layer.0.attention.self.key.weight            (768, 768)
-# bert.encoder.layer.0.attention.self.key.bias                  (768,)
-# bert.encoder.layer.0.attention.self.value.weight          (768, 768)
-# bert.encoder.layer.0.attention.self.value.bias                (768,)
-# bert.encoder.layer.0.attention.output.dense.weight        (768, 768)
-# bert.encoder.layer.0.attention.output.dense.bias              (768,)
-# bert.encoder.layer.0.attention.output.LayerNorm.weight        (768,)
-# bert.encoder.layer.0.attention.output.LayerNorm.bias          (768,)
-# bert.encoder.layer.0.intermediate.dense.weight           (3072, 768)
-# bert.encoder.layer.0.intermediate.dense.bias                 (3072,)
-# bert.encoder.layer.0.output.dense.weight                 (768, 3072)
-# bert.encoder.layer.0.output.dense.bias                        (768,)
-# bert.encoder.layer.0.output.LayerNorm.weight                  (768,)
-# bert.encoder.layer.0.output.LayerNorm.bias                    (768,)
-
-# ==== Last Transformer ====
-
-# bert.encoder.layer.11.attention.self.query.weight         (768, 768)
-# bert.encoder.layer.11.attention.self.query.bias               (768,)
-# bert.encoder.layer.11.attention.self.key.weight           (768, 768)
-# bert.encoder.layer.11.attention.self.key.bias                 (768,)
-# bert.encoder.layer.11.attention.self.value.weight         (768, 768)
-# bert.encoder.layer.11.attention.self.value.bias               (768,)
-# bert.encoder.layer.11.attention.output.dense.weight       (768, 768)
-# bert.encoder.layer.11.attention.output.dense.bias             (768,)
-# bert.encoder.layer.11.attention.output.LayerNorm.weight       (768,)
-# bert.encoder.layer.11.attention.output.LayerNorm.bias         (768,)
-# bert.encoder.layer.11.intermediate.dense.weight          (3072, 768)
-# bert.encoder.layer.11.intermediate.dense.bias                (3072,)
-# bert.encoder.layer.11.output.dense.weight                (768, 3072)
-# bert.encoder.layer.11.output.dense.bias                       (768,)
-# bert.encoder.layer.11.output.LayerNorm.weight                 (768,)
-# bert.encoder.layer.11.output.LayerNorm.bias                   (768,)
-
-# ==== Output Layer ====
-
-# bert.pooler.dense.weight                                  (768, 768)
-# bert.pooler.dense.bias                                        (768,)
-# cls.predictions.bias                                        (32000,)
-# cls.predictions.transform.dense.weight                    (768, 768)
-# cls.predictions.transform.dense.bias                          (768,)
-# cls.predictions.transform.LayerNorm.weight                    (768,)
-# cls.predictions.transform.LayerNorm.bias                      (768,)
-# cls.seq_relationship.weight                                 (2, 768)
-# cls.seq_relationship.bias                                       (2,)
-```
-
-## Example Pipeline
-
-```python
-from transformers import pipeline
-unmasker = pipeline('fill-mask', model='DJSammy/bert-base-danish-uncased_BotXO,ai')
-
-unmasker('København er [MASK] i Danmark.')
-
-# Copenhagen is the [MASK] of Denmark.
-# =>
-
-# [{'score': 0.788068950176239,
-#  'sequence': '[CLS] københavn er hovedstad i danmark. [SEP]',
-#  'token': 12610,
-#  'token_str': 'hovedstad'},
-# {'score': 0.07606703042984009,
-#  'sequence': '[CLS] københavn er hovedstaden i danmark. [SEP]',
-#  'token': 8108,
-#  'token_str': 'hovedstaden'},
-# {'score': 0.04299738258123398,
-#  'sequence': '[CLS] københavn er metropol i danmark. [SEP]',
-#  'token': 23305,
-#  'token_str': 'metropol'},
-# {'score': 0.008163209073245525,
-#  'sequence': '[CLS] københavn er ikke i danmark. [SEP]',
-#  'token': 89,
-#  'token_str': 'ikke'},
-# {'score': 0.006238455418497324,
-#  'sequence': '[CLS] københavn er ogsa i danmark. [SEP]',
-#  'token': 25253,
-#  'token_str': 'ogsa'}]
-```
--- a/model_cards/DeepPavlov/bert-base-bg-cs-pl-ru-cased/README.md
+++ b/model_cards/DeepPavlov/bert-base-bg-cs-pl-ru-cased/README.md
---
-language:
- bg
- cs
- pl
- ru
---
-
-# bert-base-bg-cs-pl-ru-cased
-
-SlavicBERT\[1\] \(Slavic \(bg, cs, pl, ru\), cased, 12‑layer, 768‑hidden, 12‑heads, 180M parameters\) was trained on Russian News and four Wikipedias: Bulgarian, Czech, Polish, and Russian. Subtoken vocabulary was built using this data. Multilingual BERT was used as an initialization for SlavicBERT.
-
-
-\[1\]: Arkhipov M., Trofimova M., Kuratov Y., Sorokin A. \(2019\). [Tuning Multilingual Transformers for Language-Specific Named Entity Recognition](https://www.aclweb.org/anthology/W19-3712/). ACL anthology W19-3712.
--- a/model_cards/DeepPavlov/bert-base-cased-conversational/README.md
+++ b/model_cards/DeepPavlov/bert-base-cased-conversational/README.md
---
-language: en
---
-
-# bert-base-cased-conversational
-
-Conversational BERT \(English, cased, 12‑layer, 768‑hidden, 12‑heads, 110M parameters\) was trained on the English part of Twitter, Reddit, DailyDialogues\[1\], OpenSubtitles\[2\], Debates\[3\], Blogs\[4\], Facebook News Comments. We used this training data to build the vocabulary of English subtokens and took English cased version of BERT‑base as an initialization for English Conversational BERT.
-
-
-\[1\]: Yanran Li, Hui Su, Xiaoyu Shen, Wenjie Li, Ziqiang Cao, and Shuzi Niu. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. IJCNLP 2017.
-
-\[2\]: P. Lison and J. Tiedemann, 2016, OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation \(LREC 2016\)
-
-\[3\]: Justine Zhang, Ravi Kumar, Sujith Ravi, Cristian Danescu-Niculescu-Mizil. Proceedings of NAACL, 2016.
-
-\[4\]: J. Schler, M. Koppel, S. Argamon and J. Pennebaker \(2006\). Effects of Age and Gender on Blogging in Proceedings of 2006 AAAI Spring Symposium on Computational Approaches for Analyzing Weblogs.
--- a/model_cards/DeepPavlov/bert-base-multilingual-cased-sentence/README.md
+++ b/model_cards/DeepPavlov/bert-base-multilingual-cased-sentence/README.md
---
-language:
- multilingual
---
-
-# bert-base-multilingual-cased-sentence
-
-Sentence Multilingual BERT \(101 languages, cased, 12‑layer, 768‑hidden, 12‑heads, 180M parameters\) is a representation‑based sentence encoder for 101 languages of Multilingual BERT. It is initialized with Multilingual BERT and then fine‑tuned on english MultiNLI\[1\] and on dev set of multilingual XNLI\[2\]. Sentence representations are mean pooled token embeddings in the same manner as in Sentence‑BERT\[3\].
-
-
-\[1\]: Williams A., Nangia N. & Bowman S. \(2017\) A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. arXiv preprint [arXiv:1704.05426](https://arxiv.org/abs/1704.05426)
-
-\[2\]: Williams A., Bowman S. \(2018\) XNLI: Evaluating Cross-lingual Sentence Representations. arXiv preprint [arXiv:1809.05053](https://arxiv.org/abs/1809.05053)
-
-\[3\]: N. Reimers, I. Gurevych \(2019\) Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. arXiv preprint [arXiv:1908.10084](https://arxiv.org/abs/1908.10084)
--- a/model_cards/DeepPavlov/rubert-base-cased-conversational/README.md
+++ b/model_cards/DeepPavlov/rubert-base-cased-conversational/README.md
---
-language:
- ru
---
-
-# rubert-base-cased-conversational
-
-Conversational RuBERT \(Russian, cased, 12‑layer, 768‑hidden, 12‑heads, 180M parameters\) was trained on OpenSubtitles\[1\], [Dirty](https://d3.ru/), [Pikabu](https://pikabu.ru/), and a Social Media segment of Taiga corpus\[2\]. We assembled a new vocabulary for Conversational RuBERT model on this data and initialized the model with [RuBERT](../rubert-base-cased).
-
-
-\[1\]: P. Lison and J. Tiedemann, 2016, OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation \(LREC 2016\)
-
-\[2\]: Shavrina T., Shapovalova O. \(2017\) TO THE METHODOLOGY OF CORPUS CONSTRUCTION FOR MACHINE LEARNING: «TAIGA» SYNTAX TREE CORPUS AND PARSER. in proc. of “CORPORA2017”, international conference , Saint-Petersbourg, 2017.
--- a/model_cards/DeepPavlov/rubert-base-cased-sentence/README.md
+++ b/model_cards/DeepPavlov/rubert-base-cased-sentence/README.md
---
-language:
- ru
---
-
-# rubert-base-cased-sentence
-
-Sentence RuBERT \(Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters\) is a representation‑based sentence encoder for Russian. It is initialized with RuBERT and fine‑tuned on SNLI\[1\] google-translated to russian and on russian part of XNLI dev set\[2\]. Sentence representations are mean pooled token embeddings in the same manner as in Sentence‑BERT\[3\].
-
-
-\[1\]: S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning. \(2015\) A large annotated corpus for learning natural language inference. arXiv preprint [arXiv:1508.05326](https://arxiv.org/abs/1508.05326)
-
-\[2\]: Williams A., Bowman S. \(2018\) XNLI: Evaluating Cross-lingual Sentence Representations. arXiv preprint [arXiv:1809.05053](https://arxiv.org/abs/1809.05053)
-
-\[3\]: N. Reimers, I. Gurevych \(2019\) Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. arXiv preprint [arXiv:1908.10084](https://arxiv.org/abs/1908.10084)
--- a/model_cards/DeepPavlov/rubert-base-cased/README.md
+++ b/model_cards/DeepPavlov/rubert-base-cased/README.md
---
-language:
- ru
---
-
-# rubert-base-cased
-
-RuBERT \(Russian, cased, 12‑layer, 768‑hidden, 12‑heads, 180M parameters\) was trained on the Russian part of Wikipedia and news data. We used this training data to build a vocabulary of Russian subtokens and took a multilingual version of BERT‑base as an initialization for RuBERT\[1\].
-
-
-\[1\]: Kuratov, Y., Arkhipov, M. \(2019\). Adaptation of Deep Bidirectional Multilingual Transformers for Russian Language. arXiv preprint [arXiv:1905.07213](https://arxiv.org/abs/1905.07213).
--- a/model_cards/Geotrend/bert-base-15lang-cased/README.md
+++ b/model_cards/Geotrend/bert-base-15lang-cased/README.md
---
-language: multilingual
-
-datasets: wikipedia
-
-license: apache-2.0
-
-widget:
- text: "Google generated 46 billion [MASK] in revenue."
- text: "Paris is the capital of [MASK]."
- text: "Algiers is the largest city in [MASK]."
- text: "Paris est la [MASK] de la France."
- text: "Paris est la capitale de la [MASK]."
- text: "L'élection américaine a eu [MASK] en novembre 2020."
- text: "تقع سويسرا في [MASK] أوروبا"
- text: "إسمي محمد وأسكن في [MASK]."
---
-
-# bert-base-15lang-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-The measurements below have been computed on a [Google Cloud n1-standard-1 machine (1 vCPU, 3.75 GB)](https://cloud.google.com/compute/docs/machine-types\#n1_machine_type):
-
-|             Model               | Num parameters |   Size   |  Memory  | Loading time |
-| ------------------------------- | -------------- | -------- | -------- | ------------ |
-| bert-base-multilingual-cased    |   178 million  |  714 MB  | 1400 MB  |    4.2 sec   |
-| Geotrend/bert-base-15lang-cased |   141 million  |  564 MB  | 1098 MB  |    3.1 sec   |
-
-Handled languages: en, fr, es, de, zh, ar, ru, vi, el, bg, th, tr, hi, ur and sw.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-15lang-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-15lang-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Mutlilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-ar-cased/README.md
+++ b/model_cards/Geotrend/bert-base-ar-cased/README.md
---
-language: ar
-
-datasets: wikipedia
-
-license: apache-2.0
-
-widget:
- text: "تقع سويسرا في [MASK] أوروبا"
- text: "إسمي محمد وأسكن في [MASK]."
---
-
-# bert-base-ar-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-ar-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-ar-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Mutlilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-bg-cased/README.md
+++ b/model_cards/Geotrend/bert-base-bg-cased/README.md
---
-language: bg
-
-datasets: wikipedia
-
-license: apache-2.0
---
-
-# bert-base-bg-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-bg-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-bg-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Mutlilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-de-cased/README.md
+++ b/model_cards/Geotrend/bert-base-de-cased/README.md
---
-language: de
-
-datasets: wikipedia
-
-license: apache-2.0
---
-
-# bert-base-de-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-de-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-de-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Mutlilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-el-cased/README.md
+++ b/model_cards/Geotrend/bert-base-el-cased/README.md
---
-language: el
-
-datasets: wikipedia
-
-license: apache-2.0
---
-
-# bert-base-el-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-el-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-el-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Mutlilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-en-ar-cased/README.md
+++ b/model_cards/Geotrend/bert-base-en-ar-cased/README.md
---
-language: multilingual
-
-datasets: wikipedia
-
-license: apache-2.0
-
-widget:
- text: "Google generated 46 billion [MASK] in revenue."
- text: "Paris is the capital of [MASK]."
- text: "Algiers is the largest city in [MASK]."
- text: "تقع سويسرا في [MASK] أوروبا"
- text: "إسمي محمد وأسكن في [MASK]."
---
-
-# bert-base-en-ar-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-ar-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-en-ar-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Multilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-en-bg-cased/README.md
+++ b/model_cards/Geotrend/bert-base-en-bg-cased/README.md
---
-language: multilingual
-
-datasets: wikipedia
-
-license: apache-2.0
-
-widget:
- text: "Google generated 46 billion [MASK] in revenue."
- text: "Paris is the capital of [MASK]."
- text: "Algiers is the largest city in [MASK]."
---
-
-# bert-base-en-bg-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-bg-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-en-bg-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Multilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-en-cased/README.md
+++ b/model_cards/Geotrend/bert-base-en-cased/README.md
---
-language: en
-
-datasets: wikipedia
-
-license: apache-2.0
-
-widget:
- text: "Google generated 46 billion [MASK] in revenue."
- text: "Paris is the capital of [MASK]."
- text: "Algiers is the largest city in [MASK]."
---
-
-# bert-base-en-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-en-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Multilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-en-de-cased/README.md
+++ b/model_cards/Geotrend/bert-base-en-de-cased/README.md
---
-language: multilingual
-
-datasets: wikipedia
-
-license: apache-2.0
-
-widget:
- text: "Google generated 46 billion [MASK] in revenue."
- text: "Paris is the capital of [MASK]."
- text: "Algiers is the largest city in [MASK]."
---
-
-# bert-base-en-de-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-de-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-en-de-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Multilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.
--- a/model_cards/Geotrend/bert-base-en-el-cased/README.md
+++ b/model_cards/Geotrend/bert-base-en-el-cased/README.md
---
-language: multilingual
-
-datasets: wikipedia
-
-license: apache-2.0
-
-widget:
- text: "Google generated 46 billion [MASK] in revenue."
- text: "Paris is the capital of [MASK]."
- text: "Algiers is the largest city in [MASK]."
---
-
-# bert-base-en-el-cased
-
-We are sharing smaller versions of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handle a custom number of languages.
-
-Unlike [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), our versions give exactly the same representations produced by the original model which preserves the original accuracy.
-
-For more information please visit our paper: [Load What You Need: Smaller Versions of Multilingual BERT](https://www.aclweb.org/anthology/2020.sustainlp-1.16.pdf).
-
-## How to use
-
-```python
-from transformers import AutoTokenizer, AutoModel
-
-tokenizer = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-el-cased")
-model = AutoModel.from_pretrained("Geotrend/bert-base-en-el-cased")
-
-```
-
-To generate other smaller versions of multilingual transformers please visit [our Github repo](https://github.com/Geotrend-research/smaller-transformers).
-
-### How to cite
-
-```bibtex
-@inproceedings{smallermbert,
-  title={Load What You Need: Smaller Versions of Multilingual BERT},
-  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
-  booktitle={SustaiNLP / EMNLP},
-  year={2020}
-}
-```
-
-## Contact 
-
-Please contact amine@geotrend.fr for any question, feedback or request.