The model was fine-tuned on a combination of multiple datasets totaling more than 1 billion sentence pairs. Because the full data is very large, only the [stsbenchmark](https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/datasets/stsbenchmark.tsv.gz) and [Simple Wikipedia Version 1.0](https://cs.pomona.edu/~dkauchak/simplification/) datasets are used here for demonstration; the latter is provided under `datasets/simple_wikipedia_v1`. For details on the full training data, see the Model card of the [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) model.
This example trains BERT (or any other transformer model such as RoBERTa or DistilBERT) on the STSbenchmark data from scratch. It produces sentence embeddings that can be compared with cosine similarity to measure semantic similarity.
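Below is a minimal training sketch using the `sentence-transformers` API. The base checkpoint (`bert-base-uncased`), batch size, and epoch count are illustrative assumptions, not values from the original script; only the STSbenchmark URL comes from the text above.

```python
import csv
import gzip

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models, util

# Download the STSbenchmark data (sentence pairs scored 0-5 for similarity).
sts_path = "stsbenchmark.tsv.gz"
util.http_get(
    "https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/datasets/stsbenchmark.tsv.gz",
    sts_path,
)

# Read the training split and normalize the gold scores to [0, 1].
train_examples = []
with gzip.open(sts_path, "rt", encoding="utf8") as f:
    reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row["split"] == "train":
            score = float(row["score"]) / 5.0
            train_examples.append(
                InputExample(texts=[row["sentence1"], row["sentence2"]], label=score)
            )

# Build a sentence encoder from a plain transformer checkpoint (assumed here
# to be bert-base-uncased) with mean pooling on top of the token embeddings.
word_embedding_model = models.Transformer("bert-base-uncased", max_seq_length=256)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Fine-tune so that the cosine similarity of the two embeddings matches the score.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=100,
)

# Embeddings from the fine-tuned model can then be compared with cosine similarity.
emb = model.encode(["A man is playing a guitar.", "Someone plays an instrument."])
print(util.cos_sim(emb[0], emb[1]))
```

`CosineSimilarityLoss` is a natural fit for STSbenchmark because the labels are graded similarity scores rather than binary pairs; for the billion-pair multi-dataset setup described above, a contrastive objective would typically be used instead.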