Commit ac66df47 authored by Myle Ott, committed by Facebook Github Bot

Update README

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/826

Differential Revision: D16830402

Pulled By: myleott

fbshipit-source-id: 25afaa6d9de7b51cc884e3f417c8e6b349f5a7bc
parent 1d44cc85
@@ -2,7 +2,7 @@

https://arxiv.org/abs/1907.11692

## Introduction

RoBERTa iterates on BERT's pretraining procedure, including training the model longer, with bigger batches over more data; removing the next sentence prediction objective; training on longer sequences; and dynamically changing the masking pattern applied to the training data. See the associated paper for more details.
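The dynamic-masking point is easiest to see in code. The sketch below is purely illustrative (it is not fairseq's implementation, and the toy `VOCAB` list is made up): it re-samples the masked positions every time a sentence is drawn, so each epoch sees a different masking pattern, whereas static masking fixes the pattern once during preprocessing.

```python
import random

MASK = "<mask>"
VOCAB = ["the", "cat", "sat", "on", "mat", "dog"]  # toy vocabulary (assumption)

def dynamic_mask(tokens, mask_prob=0.15):
    """Return a freshly masked copy of `tokens` using the BERT-style
    80/10/10 split (mask token / random token / keep original)."""
    out = list(tokens)
    for i in range(len(out)):
        if random.random() < mask_prob:
            r = random.random()
            if r < 0.8:
                out[i] = MASK                  # 80%: replace with <mask>
            elif r < 0.9:
                out[i] = random.choice(VOCAB)  # 10%: replace with a random token
            # else: 10%: keep the original token
    return out

sentence = "the cat sat on the mat".split()
print(dynamic_mask(sentence))  # a different masking pattern on every call
```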
@@ -10,7 +10,7 @@ RoBERTa iterates on BERT's pretraining procedure, including training the model l

- August 2019: Added [tutorial for pretraining RoBERTa using your own data](README.pretraining.md).

## Pre-trained models

Model | Description | # params | Download
---|---|---|---
@@ -19,9 +19,10 @@ Model | Description | # params | Download
`roberta.large.mnli` | `roberta.large` finetuned on [MNLI](http://www.nyu.edu/projects/bowman/multinli) | 355M | [roberta.large.mnli.tar.gz](https://dl.fbaipublicfiles.com/fairseq/models/roberta.large.mnli.tar.gz)
`roberta.large.wsc` | `roberta.large` finetuned on [WSC](wsc/README.md) | 355M | [roberta.large.wsc.tar.gz](https://dl.fbaipublicfiles.com/fairseq/models/roberta.large.wsc.tar.gz)

## Results

**[GLUE (Wang et al., 2019)](https://gluebenchmark.com/)**
_(dev set, single model, single-task finetuning)_

Model | MNLI | QNLI | QQP | RTE | SST-2 | MRPC | CoLA | STS-B
---|---|---|---|---|---|---|---|---
@@ -29,26 +30,51 @@ Model | MNLI | QNLI | QQP | RTE | SST-2 | MRPC | CoLA | STS-B
`roberta.large` | 90.2 | 94.7 | 92.2 | 86.6 | 96.4 | 90.9 | 68.0 | 92.4
`roberta.large.mnli` | 90.2 | - | - | - | - | - | - | -

**[SuperGLUE (Wang et al., 2019)](https://super.gluebenchmark.com/)**
_(dev set, single model, single-task finetuning)_
Model | BoolQ | CB | COPA | MultiRC | RTE | WiC | WSC
---|---|---|---|---|---|---|---
`roberta.large` | 86.9 | 98.2 | 94.0 | 85.7 | 89.5 | 75.6 | -
`roberta.large.wsc` | - | - | - | - | - | - | 91.3

**[SQuAD (Rajpurkar et al., 2018)](https://rajpurkar.github.io/SQuAD-explorer/)**
_(dev set, no additional data used)_
Model | SQuAD 1.1 EM/F1 | SQuAD 2.0 EM/F1
---|---|---
`roberta.large` | 88.9/94.6 | 86.5/89.4

**[RACE (Lai et al., 2017)](http://www.qizhexie.com/data/RACE_leaderboard.html)**
_(test set)_
Model | Accuracy | Middle | High
---|---|---|---
`roberta.large` | 83.2 | 86.5 | 81.3

**[HellaSwag (Zellers et al., 2019)](https://rowanzellers.com/hellaswag/)**
_(test set)_
Model | Overall | In-domain | Zero-shot | ActivityNet | WikiHow
---|---|---|---|---|---
`roberta.large` | 85.2 | 87.3 | 83.1 | 74.6 | 90.9
**[Commonsense QA (Talmor et al., 2019)](https://www.tau-nlp.org/commonsenseqa)**
_(test set)_
Model | Accuracy
---|---
`roberta.large` (single model) | 72.1
`roberta.large` (ensemble) | 72.5
**[Winogrande (Sakaguchi et al., 2019)](https://arxiv.org/abs/1907.10641)**
_(test set)_
Model | Accuracy
---|---
`roberta.large` | 78.1
## Example usage
##### Load RoBERTa from torch.hub (PyTorch >= 1.1):

```python
@@ -124,7 +150,7 @@ roberta.cuda()
roberta.predict('new_task', tokens)  # tensor([[-1.1050, -1.0672, -1.1245]], device='cuda:0', grad_fn=<LogSoftmaxBackward>)
```
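For orientation, here is a minimal sketch of the torch.hub workflow referenced by the heading above, assuming the `pytorch/fairseq` hub entry and the `roberta.large` model name from the pre-trained models table; the weights are downloaded on first use.

```python
import torch

# Download (on first use) and load the pre-trained model
roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
roberta.eval()  # disable dropout for deterministic outputs

# Encode with the GPT-2 BPE and extract the final-layer features
tokens = roberta.encode('Hello world!')      # LongTensor of token ids, incl. <s> and </s>
features = roberta.extract_features(tokens)  # shape: (1, number_of_tokens, 1024) for roberta.large
print(tokens.tolist(), features.shape)
```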
## Advanced usage

#### Filling masks:

@@ -216,7 +242,7 @@ print('| Accuracy: ', float(ncorrect)/float(nsamples))
# Expected output: 0.9060
```
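The mask-filling utility named in the `Filling masks` heading above can be exercised with a short snippet like the following. This is a sketch: `fill_mask` is the hub-interface method fairseq exposes for this, but the exact shape of its return value may vary between versions.

```python
import torch

roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
roberta.eval()

# Ask the masked-LM head for the top-3 completions of the <mask> position
results = roberta.fill_mask('The first Star Wars film was released in <mask>.', topk=3)
for filled_sentence, score, predicted_token in results:
    print(f'{score:.3f}\t{predicted_token}\t{filled_sentence}')
```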
## Finetuning

- [Finetuning on GLUE](README.glue.md)
- [Finetuning on custom classification tasks (e.g., IMDB)](README.custom_classification.md)
@@ -224,11 +250,11 @@ print('| Accuracy: ', float(ncorrect)/float(nsamples))
- [Finetuning on Commonsense QA (CQA)](commonsense_qa/README.md)
- Finetuning on SQuAD: coming soon

## Pretraining using your own data

See the [tutorial for pretraining RoBERTa using your own data](README.pretraining.md).

## Citation

```bibtex
@article{liu2019roberta,
...
@@ -2,7 +2,7 @@

This tutorial will walk you through pretraining RoBERTa over your own data.

### 1) Preprocess the data

Data should be preprocessed following the [language modeling format](/examples/language_model).
...
@@ -11,45 +11,57 @@ Model | Description | Dataset | Download

## Training a new model on WMT'16 En-De

First download the [preprocessed WMT'16 En-De data provided by Google](https://drive.google.com/uc?export=download&id=0B_bZck-ksdkpM25jRUN2X2UxMm8).

Then:

##### 1. Extract the WMT'16 En-De data

```bash
TEXT=wmt16_en_de_bpe32k
mkdir -p $TEXT
tar -xzvf wmt16_en_de.tar.gz -C $TEXT
```
##### 2. Preprocess the dataset with a joined dictionary

```bash
fairseq-preprocess \
    --source-lang en --target-lang de \
    --trainpref $TEXT/train.tok.clean.bpe.32000 \
    --validpref $TEXT/newstest2013.tok.bpe.32000 \
    --testpref $TEXT/newstest2014.tok.bpe.32000 \
    --destdir data-bin/wmt16_en_de_bpe32k \
    --nwordssrc 32768 --nwordstgt 32768 \
    --joined-dictionary \
    --workers 20
```
##### 3. Train a model

```bash
fairseq-train \
    data-bin/wmt16_en_de_bpe32k \
    --arch transformer_vaswani_wmt_en_de_big --share-all-embeddings \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 0.0005 --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr 1e-07 \
    --dropout 0.3 --weight-decay 0.0 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --max-tokens 3584 \
    --fp16
```
Note that the `--fp16` flag requires CUDA 9.1 or greater and a Volta GPU or newer.

If you want to train the above model with big batches (assuming your machine has 8 GPUs):
- add `--update-freq 16` to simulate training on 8x16=128 GPUs (see the sketch below)
- increase the learning rate; 0.001 works well for big batches
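To make the `--update-freq` arithmetic concrete, the sketch below computes the effective batch size: gradients are accumulated over `update_freq` forward/backward passes on each GPU before a single parameter update, so each update sees roughly `max_tokens x num_gpus x update_freq` tokens.

```python
# Effective batch size when simulating a large cluster with gradient accumulation.
max_tokens = 3584    # tokens per GPU per forward/backward pass (from the command above)
num_gpus = 8         # physical GPUs on the machine
update_freq = 16     # accumulated steps per parameter update

simulated_gpus = num_gpus * update_freq
tokens_per_update = max_tokens * num_gpus * update_freq
print(f'{simulated_gpus} simulated GPUs, ~{tokens_per_update:,} tokens per update')
# -> 128 simulated GPUs, ~458,752 tokens per update
```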
##### 4. Evaluate
```bash
fairseq-generate \
data-bin/wmt16_en_de_bpe32k \
--path checkpoints/checkpoint_best.pt \
--beam 4 --lenpen 0.6 --remove-bpe
```
## Citation

```bibtex
...
# Neural Machine Translation
This README contains instructions for [using pretrained translation models](#example-usage-torchhub)
as well as [training new models](#training-a-new-model).
## Pre-trained models

Model | Description | Dataset | Download
@@ -56,132 +59,119 @@ fairseq-score --sys /tmp/gen.out.sys --ref /tmp/gen.out.ref
# BLEU4 = 40.83, 67.5/46.9/34.4/25.5 (BP=1.000, ratio=1.006, syslen=83262, reflen=82787)
```
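As a hedged sketch of the "using pretrained translation models" workflow mentioned in the introduction: the hub entry name and the tokenizer/BPE arguments below are assumptions, so list the available entries first and pick the one matching the model you want from the table above.

```python
import torch

# See which models the fairseq hub entry point currently exposes
print(torch.hub.list('pytorch/fairseq'))

# Load an English-German Transformer (the entry name and the tokenizer/bpe
# arguments are assumptions -- adjust them to match an entry from the list above)
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt16.en-de',
                       tokenizer='moses', bpe='subword_nmt')
en2de.eval()
print(en2de.translate('Machine learning is great!'))
```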
## Training a new model

### IWSLT'14 German to English (Transformer)

The following instructions can be used to train a Transformer model on the [IWSLT'14 German to English dataset](http://workshop2014.iwslt.org/downloads/proceeding.pdf).

First download and preprocess the data:

```bash
# Download and prepare the data
cd examples/translation/
bash prepare-iwslt14.sh
cd ../..

# Preprocess/binarize the data
TEXT=examples/translation/iwslt14.tokenized.de-en
fairseq-preprocess --source-lang de --target-lang en \
    --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
    --destdir data-bin/iwslt14.tokenized.de-en \
    --workers 20
```

Next we'll train a Transformer translation model over this data:

```bash
CUDA_VISIBLE_DEVICES=0 fairseq-train \
    data-bin/iwslt14.tokenized.de-en \
    --arch transformer_iwslt_de_en --share-decoder-input-output-embed \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 5e-4 --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --dropout 0.3 --weight-decay 0.0001 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --max-tokens 4096
```
Finally we can evaluate our trained model:

```bash
fairseq-generate data-bin/iwslt14.tokenized.de-en \
    --path checkpoints/checkpoint_best.pt \
    --batch-size 128 --beam 5 --remove-bpe
```
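Once training has produced `checkpoints/checkpoint_best.pt`, the model can also be queried interactively from Python. This is a sketch built on fairseq's `from_pretrained` helper; the BPE code path is an assumption about where `prepare-iwslt14.sh` writes its codes, so adjust it to your setup.

```python
from fairseq.models.transformer import TransformerModel

# Load the checkpoint trained above together with its binarized dictionaries
de2en = TransformerModel.from_pretrained(
    'checkpoints/',
    checkpoint_file='checkpoint_best.pt',
    data_name_or_path='data-bin/iwslt14.tokenized.de-en',
    tokenizer='moses',
    bpe='subword_nmt',
    bpe_codes='examples/translation/iwslt14.tokenized.de-en/code',  # assumed location of the BPE codes
)
de2en.eval()
print(de2en.translate('Maschinelles Lernen ist großartig!'))
```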
### WMT'14 English to German (Convolutional)

The following instructions can be used to train a Convolutional translation model on the WMT English to German dataset.
See the [Scaling NMT README](../scaling_nmt/README.md) for instructions to train a Transformer translation model on this data.

The WMT English to German dataset can be preprocessed using the `prepare-wmt14en2de.sh` script.
By default it will produce a dataset that was modeled after [Attention Is All You Need (Vaswani et al., 2017)](https://arxiv.org/abs/1706.03762), but with additional news-commentary-v12 data from WMT'17.

To use only data available in WMT'14 or to replicate results obtained in the original [Convolutional Sequence to Sequence Learning (Gehring et al., 2017)](https://arxiv.org/abs/1705.03122) paper, please use the `--icml17` option.

```bash
# Download and prepare the data
cd examples/translation/
# WMT'17 data:
bash prepare-wmt14en2de.sh
# or to use WMT'14 data:
# bash prepare-wmt14en2de.sh --icml17
cd ../..

# Binarize the dataset
TEXT=examples/translation/wmt17_en_de
fairseq-preprocess \
    --source-lang en --target-lang de \
    --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
    --destdir data-bin/wmt17_en_de --thresholdtgt 0 --thresholdsrc 0 \
    --workers 20

# Train the model
mkdir -p checkpoints/fconv_wmt_en_de
fairseq-train \
    data-bin/wmt17_en_de \
    --arch fconv_wmt_en_de \
    --lr 0.5 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --lr-scheduler fixed --force-anneal 50 \
    --save-dir checkpoints/fconv_wmt_en_de

# Evaluate
fairseq-generate data-bin/wmt17_en_de \
    --path checkpoints/fconv_wmt_en_de/checkpoint_best.pt \
    --beam 5 --remove-bpe
```
### WMT'14 English to French

```bash
# Download and prepare the data
cd examples/translation/
bash prepare-wmt14en2fr.sh
cd ../..

# Binarize the dataset
TEXT=examples/translation/wmt14_en_fr
fairseq-preprocess \
    --source-lang en --target-lang fr \
    --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
    --destdir data-bin/wmt14_en_fr --thresholdtgt 0 --thresholdsrc 0 \
    --workers 60

# Train the model
mkdir -p checkpoints/fconv_wmt_en_fr
fairseq-train \
    data-bin/wmt14_en_fr \
    --lr 0.5 --clip-norm 0.1 --dropout 0.1 --max-tokens 3000 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --lr-scheduler fixed --force-anneal 50 \
    --arch fconv_wmt_en_fr \
    --save-dir checkpoints/fconv_wmt_en_fr

# Evaluate (the binarized data lives in data-bin/wmt14_en_fr, created above)
fairseq-generate \
    data-bin/wmt14_en_fr \
    --path checkpoints/fconv_wmt_en_fr/checkpoint_best.pt \
    --beam 5 --remove-bpe
```
## Multilingual Translation
@@ -253,7 +243,8 @@ grep ^H iwslt17.test.${SRC}-en.en.sys | cut -f3 \
  | sacrebleu --test-set iwslt17 --language-pair ${SRC}-en
```

##### Argument format during inference

During inference it is required to specify a single `--source-lang` and
`--target-lang`, which indicate the inference language direction.
`--lang-pairs`, `--encoder-langtok`, `--decoder-langtok` have to be set to
...