---
language: catalan
---

# CALBERT: a Catalan Language Model

## Introduction

CALBERT is an open-source language model for Catalan based on the ALBERT architecture.

It is now available on Hugging Face in its `base-uncased` version, and was pretrained on the [OSCAR dataset](https://traces1.inria.fr/oscar/).

For further information or requests, please go to the [GitHub repository](https://github.com/codegram/calbert).

## Pre-trained models

| Model                           | Architecture    | Training data           |
|---------------------------------|-----------------|-------------------------|
| `codegram/calbert-base-uncased` | Base (uncased)  | OSCAR (4.3 GB of text)  |
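
## Usage sketch

The following is a minimal sketch of how the model listed above might be loaded from Hugging Face and used to get contextual embeddings. It assumes the `transformers` library (with `torch` and `sentencepiece`) is installed and that the model identifier is `codegram/calbert-base-uncased`, as the table suggests. The helper names `load_calbert` and `embed` are hypothetical and only serve this illustration.

```python
# Assumption: the Hub identifier is the organisation and model name from the table.
MODEL_ID = "codegram/calbert-base-uncased"


def load_calbert(model_id: str = MODEL_ID):
    """Return (tokenizer, model); downloads the weights on first use."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


def embed(text, tokenizer, model):
    """Return the last hidden state for `text` (one vector per token)."""
    inputs = tokenizer(text, return_tensors="pt")  # PyTorch tensors
    outputs = model(**inputs)
    return outputs.last_hidden_state
```

A call could then look like `tokenizer, model = load_calbert()` followed by `embed("M'agrada la ciutat de Barcelona.", tokenizer, model)`. For the officially supported usage, the GitHub repository remains the reference.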


## Authors 

CALBERT was trained and evaluated by [Txus Bach](https://twitter.com/txustice), as part of [Codegram](https://www.codegram.com)'s applied research.