*This model was released on 2020-04-06 and added to Hugging Face Transformers on 2020-11-16.*
PyTorch
# MobileBERT

[MobileBERT](https://huggingface.co/papers/2004.02984) is a lightweight and efficient variant of BERT, specifically designed for resource-limited devices such as mobile phones. It retains BERT's architecture but significantly reduces model size and inference latency while maintaining strong performance on NLP tasks. MobileBERT achieves this through a bottleneck structure and carefully balanced self-attention and feedforward networks. The model is trained by knowledge transfer from a large BERT model with an inverted bottleneck structure.

You can find the original MobileBERT checkpoint under the [Google](https://huggingface.co/google/mobilebert-uncased) organization.

> [!TIP]
> Click on the MobileBERT models in the right sidebar for more examples of how to apply MobileBERT to different language tasks.

The example below demonstrates how to predict the `[MASK]` token with [`Pipeline`], [`AutoModel`], and from the command line.

```py
import torch
from transformers import pipeline

pipeline = pipeline(
    task="fill-mask",
    model="google/mobilebert-uncased",
    dtype=torch.float16,
    device=0
)
pipeline("The capital of France is [MASK].")
```

```py
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "google/mobilebert-uncased",
)
model = AutoModelForMaskedLM.from_pretrained(
    "google/mobilebert-uncased",
    dtype=torch.float16,
    device_map="auto",
)
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = outputs.logits

masked_index = torch.where(inputs['input_ids'] == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)

print(f"The predicted token is: {predicted_token}")
```

```bash
echo -e "The capital of France is [MASK]." | transformers run --task fill-mask --model google/mobilebert-uncased --device 0
```

## Notes

- Inputs should be padded on the right because BERT uses absolute position embeddings (see the padding sketch at the end of this page).

## MobileBertConfig

[[autodoc]] MobileBertConfig

## MobileBertTokenizer

[[autodoc]] MobileBertTokenizer

## MobileBertTokenizerFast

[[autodoc]] MobileBertTokenizerFast

## MobileBert specific outputs

[[autodoc]] models.mobilebert.modeling_mobilebert.MobileBertForPreTrainingOutput

## MobileBertModel

[[autodoc]] MobileBertModel
    - forward

## MobileBertForPreTraining

[[autodoc]] MobileBertForPreTraining
    - forward

## MobileBertForMaskedLM

[[autodoc]] MobileBertForMaskedLM
    - forward

## MobileBertForNextSentencePrediction

[[autodoc]] MobileBertForNextSentencePrediction
    - forward

## MobileBertForSequenceClassification

[[autodoc]] MobileBertForSequenceClassification
    - forward

## MobileBertForMultipleChoice

[[autodoc]] MobileBertForMultipleChoice
    - forward

## MobileBertForTokenClassification

[[autodoc]] MobileBertForTokenClassification
    - forward

## MobileBertForQuestionAnswering

[[autodoc]] MobileBertForQuestionAnswering
    - forward
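
The right-padding behavior mentioned in the Notes section can be checked with a minimal sketch. The example below is illustrative rather than part of the original documentation: it assumes the default fast tokenizer for `google/mobilebert-uncased`, for which `padding_side="right"` is already the default and is only passed explicitly for clarity.

```py
from transformers import AutoTokenizer

# padding_side="right" is the default for this tokenizer; set explicitly here for illustration
tokenizer = AutoTokenizer.from_pretrained("google/mobilebert-uncased", padding_side="right")

batch = tokenizer(
    ["The capital of France is [MASK].", "Short input."],
    padding=True,  # pads the shorter sequence on the right with [PAD] tokens
    return_tensors="pt",
)
print(batch["input_ids"])       # pad token ids fill the right-hand positions of the shorter sequence
print(batch["attention_mask"])  # 0s mark the padded positions
```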