Unverified commit fb7330b3, authored by Jim Regan, committed by GitHub

update with #s of sentences/tokens (#6546)

parent 63144701
@@ -15,6 +15,8 @@ tags:
* Newscrawl 300k portion of the [Leipzig Corpora](https://wortschatz.uni-leipzig.de/en/download/irish)
* Private news corpus crawled with [Corpus Crawler](https://github.com/google/corpuscrawler)
(2125804 sentences, 47419062 tokens, as reckoned by wc)
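The counts above are described as "reckoned by wc". As a rough illustration of what that kind of count measures, here is a minimal sketch that assumes the corpus is stored one sentence per line and that a token is any whitespace-delimited string, mirroring `wc -l` and `wc -w`. The function name `wc_style_counts` and the sample sentences are hypothetical and not part of the original commit; Python's `str.split()` may also treat a few Unicode whitespace characters differently from `wc`.

```python
from typing import Iterable, Tuple


def wc_style_counts(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (sentences, tokens) for an iterable of text lines.

    Assumption: one sentence per line, tokens split on whitespace only.
    No linguistic tokenization is performed.
    """
    sentences = 0
    tokens = 0
    for line in lines:
        sentences += 1               # one line is counted as one sentence
        tokens += len(line.split())  # whitespace-delimited words
    return sentences, tokens


if __name__ == "__main__":
    # Hypothetical sample; for a real corpus, pass an open file handle instead.
    sample = ["Tá an aimsir go maith inniu.\n", "Is maith liom tae.\n"]
    print(wc_style_counts(sample))
```

Because punctuation stays attached to the neighbouring word under this scheme, such a count will generally be lower than what a linguistic tokenizer would report.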
```
from transformers import pipeline
fill_mask = pipeline("fill-mask", model="jimregan/BERTreach", tokenizer="jimregan/BERTreach")
......