Commit 18ff696d authored by chenzk's avatar chenzk
Browse files

v1.0.5

parents
Pipeline #2036 failed with stages
in 0 seconds
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.vscode
checkpoints/
wandb/
repos:
- repo: https://github.com/psf/black
rev: 22.12.0
hooks:
- id: black
language_version: python3
args:
- --line-length=119
- --check
- repo: https://github.com/charliermarsh/ruff-pre-commit
# Ruff version.
rev: 'v0.0.271'
hooks:
- id: ruff
args:
- --no-fix
- repo: local
hooks:
- id: pylint-nanotron
name: pylint nanotron core
entry: pylint --init-hook='import sys; sys.path.append(".")'
exclude: ^examples/.*$ # ignore examples as for each example we need to go in and look
language: system
types: [ python ]
args:
- --errors-only
- id: pylint-example-dataloading
name: pylint example dataloading
entry: pylint --init-hook='import sys; sys.path.append(".")'
files: ^examples/dataloading/.*$
language: system
types: [ python ]
args:
- --errors-only
- id: pylint-example-gpt2-mqa
name: pylint example gpt2_mqa
entry: pylint --init-hook='import sys; sys.path.append(".")'
files: ^examples/gpt2_mqa/.*$
language: system
types: [ python ]
args:
- --errors-only
- id: pylint-example-gpt2
name: pylint example gpt2
entry: pylint --init-hook='import sys; sys.path.append(".")'
files: ^examples/gpt2/.*$
language: system
types: [ python ]
args:
- --errors-only
- id: pylint-example-llama
name: pylint example llama
entry: pylint --init-hook='import sys; sys.path.append(".")'
files: ^examples/llama/.*$
language: system
types: [ python ]
args:
- --errors-only
- id: pylint-example-p2p
name: pylint example p2p
entry: pylint --init-hook='import sys; sys.path.append(".")'
files: ^examples/p2p/.*$
language: system
types: [ python ]
args:
- --errors-only
- id: pylint-example-t5
name: pylint example t5
entry: pylint --init-hook='import sys; sys.path.append(".")'
files: ^examples/t5/.*$
language: system
types: [ python ]
args:
- --errors-only
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- repo: https://github.com/psf/black
rev: 22.12.0
hooks:
- id: black
language_version: python3
args:
- --line-length=119
- repo: https://github.com/charliermarsh/ruff-pre-commit
# Ruff version.
rev: 'v0.0.271'
hooks:
- id: ruff
args:
- --fix
- --exit-non-zero-on-fix
# - repo: https://github.com/PyCQA/isort
# rev: 5.12.0
# hooks:
# - id: isort
# args:
# - --profile=black
# - --skip-glob=wandb/**/*
# - --thirdparty=wandb
- repo: https://github.com/codespell-project/codespell
rev: v2.1.0
hooks:
- id: codespell
args:
- -w
- --ignore-words-list=nd,reacher,thist,ths,magent,ba,fo,doesnt
[MASTER]
# Use multiprocessing for pylint
jobs=0
# List of members which are set dynamically and missed by Pylint inference
# system, and so shouldn't trigger E1101 when accessed.
ignore-paths=
load-plugins=linter.pylint.ban_rank,
[MESSAGES CONTROL]
# Disable list of rules
disable=
no-member, # E1101: Module 'torch' has no 'allclose' member (no-member)
no-name-in-module, # E0611: No name 'HFTensorBoardLogger' in module 'huggingface_hub' (no-name-in-module)
import-error, # E0401: Unable to import 'tensorboardX' (import-error)
relative-beyond-top-level # E0402: Attempted relative import beyond top-level package (relative-beyond-top-level)
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
feedback@huggingface.co.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series
of actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within
the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.
<!---
Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# How to contribute to 🤗 Nanotron?
Everyone is welcome to contribute, and we value everybody's contribution. Code
is thus not the only way to help the community. Answering questions, helping
others, reaching out and improving the documentations are immensely valuable to
the community.
It also helps us if you spread the word: reference the library from blog posts
on the awesome projects it made possible, shout out on Twitter every time it has
helped you, or simply star the repo to say "thank you".
Whichever way you choose to contribute, please be mindful to respect our
[code of conduct](CODE_OF_CONDUCT.md).
## You can contribute in so many ways!
Some of the ways you can contribute to nanotron:
* Fixing outstanding issues with the existing code;
* Contributing to the examples or to the documentation;
* Submitting issues related to bugs or desired new features.
## Submitting a new issue or feature request
Do your best to follow these guidelines when submitting an issue or a feature
request. It will make it easier for us to come back to you quickly and with good
feedback.
### Did you find a bug?
The 🤗 Nanotron library is robust and reliable thanks to the users who notify us of
the problems they encounter. So thank you for reporting an issue.
First, we would really appreciate it if you could **make sure the bug was not
already reported** (use the search bar on Github under Issues).
Did not find it? :( So we can act quickly on it, please follow these steps:
* Include your **OS type and version**, the versions of **Python** and **PyTorch**.
* A short, self-contained, code snippet that allows us to reproduce the bug in
less than 30s;
* Provide your Nanotron configuration used for the run;
* Describe the expected behavior and the actual behavior;
### Do you want a new feature?
A good feature request addresses the following points:
1. Motivation first:
* Is it related to a problem/frustration with the library? If so, please explain
why. Providing a code snippet that demonstrates the problem is best.
* Is it related to something you would need for a project? We'd love to hear
about it!
* Is it something you worked on and think could benefit the community?
Awesome! Tell us what problem it solved for you.
2. Write a *full paragraph* describing the feature;
3. Provide a **code snippet** that demonstrates its future use;
4. In case this is related to a paper, please attach a link;
5. Attach any additional information (drawings, screenshots, etc.) you think may help.
If your issue is well written we're already 80% of the way there by the time you
post it.
## Submitting a pull request (PR)
Before writing code, we strongly advise you to search through the existing PRs or
issues to make sure that nobody is already working on the same thing. If you are
unsure, it is always a good idea to open an issue to get some feedback.
You will need basic `git` proficiency to be able to contribute to
🤗 Nanotron. `git` is not the easiest tool to use but it has the greatest
manual. Type `git --help` in a shell and enjoy. If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference.
Follow these steps to start contributing:
1. Fork the [repository](https://github.com/huggingface/nanotron) by
clicking on the 'Fork' button on the repository's page. This creates a copy of the code
under your GitHub user account.
2. Clone your fork to your local disk, and add the base repository as a remote. The following command
assumes you have your public SSH key uploaded to GitHub. See the following guide for more
[information](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository).
```bash
$ git clone git@github.com:<your Github handle>/nanotron.git
$ cd nanotron
$ git remote add upstream https://github.com/huggingface/nanotron.git
```
3. Create a new branch to hold your development changes, and do this for every new PR you work on.
Start by synchronizing your `main` branch with the `upstream/main` branch (ore details in the [GitHub Docs](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/syncing-a-fork)):
```bash
$ git checkout main
$ git fetch upstream
$ git merge upstream/main
```
Once your `main` branch is synchronized, create a new branch from it:
```bash
$ git checkout -b a-descriptive-name-for-my-changes
```
**Do not** work on the `main` branch.
4. Set up a development environment by running the following command in a conda or a virtual environment you've created for working on this library:
```bash
$ pip install -e ".[dev]"
$ pip install -e ".[test]"
$ pre-commit install
```
(If nanotron was already installed in the virtual environment, remove
it with `pip uninstall nanotron` before reinstalling it in editable
mode with the `-e` flag.)
Alternatively, if you are using [Visual Studio Code](https://code.visualstudio.com/Download), the fastest way to get set up is by using
the provided Dev Container. Documentation on how to get started with dev containers is available [here](https://code.visualstudio.com/docs/remote/containers).
5. Develop the features on your branch.
As you work on the features, you should make sure that the test suite
passes. You should run the tests impacted by your changes like this (see
below an explanation regarding the environment variable):
```bash
$ pytest tests/<TEST_TO_RUN>.py
```
`nanotron` relies on `ruff` to format its source code
consistently. After you make changes, apply automatic style corrections and code verifications
that can't be automated in one go with:
This target is also optimized to only work with files modified by the PR you're working on.
If you prefer to run the checks one after the other, the following command apply the
style corrections:
```bash
$ pre-commit run --all-files
```
Once you're happy with your changes, add changed files using `git add` and
make a commit with `git commit` to record your changes locally:
```bash
$ git add modified_file.py
$ git commit
```
Please write [good commit messages](https://chris.beams.io/posts/git-commit/).
It is a good idea to sync your copy of the code with the original
repository regularly. This way you can quickly account for changes:
```bash
$ git fetch upstream
$ git rebase upstream/main
```
Push the changes to your account using:
```bash
$ git push -u origin a-descriptive-name-for-my-changes
```
6. Once you are satisfied (**and the checklist below is happy too**), go to the
webpage of your fork on GitHub. Click on 'Pull request' to send your changes
to the project maintainers for review.
7. It's ok if maintainers ask you for changes. It happens to core contributors
too! So everyone can see the changes in the Pull request, work in your local
branch and push the changes to your fork. They will automatically appear in
the pull request.
### Checklist
1. The title of your pull request should be a summary of its contribution;
2. If your pull request addresses an issue, please mention the issue number in
the pull request description to make sure they are linked (and people
consulting the issue know you are working on it);
3. To indicate a work in progress please prefix the title with `[WIP]`, or mark
the PR as a draft PR. These are useful to avoid duplicated work, and to differentiate
it from PRs ready to be merged;
4. Make sure existing tests pass;
5. Add high-coverage tests. No quality testing = no merge.
See an example of a good PR here: https://github.com/huggingface/nanotron/pull/155
### Tests
An extensive test suite is included to test the library behavior and several examples. Library tests can be found in
the [tests folder](https://github.com/huggingface/nanotron/tree/main/tests).
We use `pytest` in order to run the tests. From the root of the
repository, here's how to run tests with `pytest` for the library:
```bash
# Runs all tests (where 12 of which run in parallel)
$ pytest -n 12 tests
```
You can specify a smaller set of tests in order to test only the feature
you're working on.
---
library_name: transformers
datasets:
- HuggingFaceTB/cosmo2_training_data_subset_1M
---
# cosmo2-tokenizer
Tokenizer for the training of cosmo2. This tokenizer was trained on 1M samples from:
- FineWeb-Edu 70%
- Cosmopedia v2 15%
- StarCoderData 8%
- OpenWebMath 5%
- StackOverFlow 2%
\ No newline at end of file
This diff is collapsed.
{
"additional_special_tokens": [
"<|endoftext|>",
"<|im_start|>",
"<|im_end|>",
"<repo_name>",
"<reponame>",
"<file_sep>",
"<filename>",
"<gh_stars>",
"<issue_start>",
"<issue_comment>",
"<issue_closed>",
"<jupyter_start>",
"<jupyter_text>",
"<jupyter_code>",
"<jupyter_output>",
"<jupyter_script>",
"<empty_output>"
],
"bos_token": "<|endoftext|>",
"eos_token": "<|endoftext|>",
"unk_token": "<|endoftext|>"
}
This diff is collapsed.
{
"add_prefix_space": false,
"added_tokens_decoder": {
"0": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"3": {
"content": "<repo_name>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"4": {
"content": "<reponame>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"5": {
"content": "<file_sep>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"6": {
"content": "<filename>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"7": {
"content": "<gh_stars>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"8": {
"content": "<issue_start>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"9": {
"content": "<issue_comment>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"10": {
"content": "<issue_closed>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"11": {
"content": "<jupyter_start>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"12": {
"content": "<jupyter_text>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"13": {
"content": "<jupyter_code>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"14": {
"content": "<jupyter_output>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"15": {
"content": "<jupyter_script>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"16": {
"content": "<empty_output>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [
"<|endoftext|>",
"<|im_start|>",
"<|im_end|>",
"<repo_name>",
"<reponame>",
"<file_sep>",
"<filename>",
"<gh_stars>",
"<issue_start>",
"<issue_comment>",
"<issue_closed>",
"<jupyter_start>",
"<jupyter_text>",
"<jupyter_code>",
"<jupyter_output>",
"<jupyter_script>",
"<empty_output>"
],
"bos_token": "<|endoftext|>",
"chat_template": "{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
"clean_up_tokenization_spaces": false,
"eos_token": "<|endoftext|>",
"model_max_length": 1000000000000000019884624838656,
"tokenizer_class": "GPT2Tokenizer",
"unk_token": "<|endoftext|>",
"vocab_size": 49152
}
This diff is collapsed.
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
This diff is collapsed.
# Run nanotron's tests and examples's tests
test:
pytest \
--color=yes \
--durations=0 \
--ignore tests/fp8 \
--verbose \
tests/
pip install -r examples/doremi/requirements.txt
pytest \
--color=yes \
--durations=0 \
--ignore tests/fp8 \
--verbose \
examples/doremi/tests/
pip install -r examples/llama/requirements.txt
pytest \
--color=yes \
--verbose \
examples/llama/tests/
This diff is collapsed.
# Llama
彻底开源预训练大模型,本项目能够预训练出超出qwen2.5、llama3效果的大语言模型,为一些人工智能大厂的训练代码。
目前各种SOTA NLP大模型算法都与Llama高度相似,故Llama适合作为算法研发的蓝底,本项目首次从数据集、预训练到调优完全开源大模型算法代码,帮助全世界所有算法研究人员共同研究以促进人类文明进步。
<div align=center>
<img src="./doc/llm.png"/>
</div>
## 论文
`Open and Efficient Foundation Language Models`
- https://arxiv.org/pdf/2302.13971
## 模型结构
Llama系列采用极简Decoder-only结构,Llama源自基本的transformer结构,主体为attention(QKV自点积)+ffn(全连接),最后外加一个softmax进行概率转换输出即可,为了使数据分布归一化方便训练收敛,在attention、ffn、softmax前分别再加一个RMS Norm。
总体而言,Llama系列模型结构高度相似,llama1在GPT基础上引入旋转矩阵解决之前绝对位置编码复杂的问题,引入RMSNorm解决LayerNorm计算量大的问题,llama2在llama1的基础上引入GQA进一步减小计算量, llama3在llama2的基础上引入蒸馏、剪枝、量化等再进一步减小计算量,模型中其它模块(如flash-attn2、KV cache)只是增加训练推理效率的模块,本项目兼容Llama系列,以下分别为读者提供Llama的简图和详图帮助读者全方位理解。
<div align=center>
<img src="./doc/llama3.png"/>
</div>
<div align=center>
<img src="./doc/llama3_detail.png"/>
</div>
Facebook官网最原始的llama3请参考代码:[`Llama3`](https://github.com/meta-llama/llama3/blob/main/llama/model.py),本项目中的llama结构在ffn等层上略有修改,其它不同点只是模型规模参数和实现方式,读者若需要纯原版llama3可自行修改。
## 算法原理
整个Llama算法都体现出大道至简的思想。
原理采用极简的纯矩阵计算,llama将输入embedding(将语句根据词汇量和词的位置、属性转换成数字化矩阵)后放入attention+ffn等提取特征,最后利用Softmax将解码器最后一层产生的未经归一化的分数向量(logits)转换为概率分布,其中每个元素表示生成对应词汇的概率,这使得模型可以生成一个分布,并从中选择最可能的词作为预测结果,然后一个字一个预测出来就是咱们看到的对话生成效果。
损失函数采用最简单方便的CE(cross entropy) loss便可。
<div align=center>
<img src="./doc/algorithm.png"/>
</div>
## 环境配置
```
mv nanotron_pytorch nanotron # 去框架名后缀
```
### Docker(方法一)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-py3.10-dtk24.04.3-ubuntu20.04
# <your IMAGE ID>为以上拉取的docker的镜像ID替换,本镜像为:b272aae8ec72
docker run -it --shm-size=64G -v $PWD/nanotron:/home/nanotron -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name llama <your IMAGE ID> bash
cd /home/nanotron
pip install -r requirements.txt
pip install -e . #安装nanotron==0.4库
pip install whl/rotary_emb-0.1.0+das.opt2.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl # 安装rotary_emb==0.1.0
```
### Dockerfile(方法二)
```
cd cd /home/nanotron/docker
docker build --no-cache -t llama:latest .
docker run --shm-size=64G --name llama -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video -v $PWD/../../nanotron:/home/nanotron -it llama bash
# 若遇到Dockerfile启动的方式安装环境需要长时间等待,可注释掉里面的pip安装,启动容器后再安装python库:pip install -r requirements.txt。
cd /home/nanotron
pip install -e . #安装nanotron==0.4库
pip install whl/rotary_emb-0.1.0+das.opt2.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl # 安装rotary_emb==0.1.0
```
### Anaconda(方法三)
1、关于本项目DCU显卡所需的特殊深度学习库可从光合开发者社区下载安装:
- https://developer.hpccube.com/tool/
```
DTK驱动:dtk24.04.3
python:python3.10
torch:2.3.0
torchvision:0.18.1
torchaudio:2.1.2
triton:2.1.0
flash-attn:2.6.1
deepspeed:0.14.2
apex:1.3.0
xformers:0.0.25
```
`Tips:以上dtk驱动、python、torch等DCU相关工具版本需要严格一一对应。`
2、其它非特殊库参照requirements.txt安装
```
cd /home/nanotron
pip install -r requirements.txt
pip install -e . #安装nanotron==0.4库
pip install whl/rotary_emb-0.1.0+das.opt2.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl # 安装rotary_emb==0.1.0
```
## 数据集
实验性的迷你数据集[`openwebtext-10k`](./openwebtext-10k.tar.xz)源于openwebtext,仅供试验,实际训练中读者可在HF下载`*.parquet`开源数据集使用,或将自己的数据集按HF的官方说明制作成此类格式使用。
数据集在训练之前需要用tokenlizer处理成NLP模型的输入tokens,Facebook官方采用tiktoken库制作tockens便可训练出SOTA模型:[`llama3 tokenizer`](https://github.com/meta-llama/llama3/blob/main/llama/tokenizer.py),本项目可根据读者需求自由选择各种HF的开源tokenlizer,将其填写在`config``.yaml`中便可自动被项目调用。
`openwebtext-10k`用于tiny llama预训练示例,[`fineweb-edu-dedup`](http://113.200.138.88:18080/aidatasets/argilla-warehouse/fineweb-edu-dedup-filtered.git) 用于smollm预训练示例(HF公司自研人工智能模型),从SCNet快速下载通道下载后重命名即可,原始`fineweb-edu-dedup`数据(`*.parquet`)可通过以下命令转换成`fineweb-edu-dedup-ds`数据(`*.ds`),`datatrove`制作`*.ds`数据参考[`Nanosets`](./docs/nanoset.md):
```
sh convert_data_to_ds.sh
```
```
datatrove库 bug solve: Exception: Is a directory (os error 21)
vim /usr/local/lib/python3.10/site-packages/datatrove/utils/tokenization.py , line 19
modify:
# return Tokenizer.from_file(name_or_path)
return Tokenizer.from_file(name_or_path + "/tokenizer.json")
```
项目中已包含`tokenlizer`[`dummy`](./robot-test/dummy-tokenizer-wordlevel)[`cosmo2`](./HuggingFaceTB/cosmo2-tokenizer) ,其它tokenizer(如:llama3)根据读者需求可自行下载。
预训练数据的完整目录结构如下:
```
/home/nanotron
├── openwebtext-10k.tar.xz
├── stas/openwebtext-10k
├── dataset_infos.json
├── openwebtext-10k.py
├── process.txt
└── README.md
├── datasets/fineweb-edu-dedup
├── train-00000-of-00002.parquet
├── train-00001-of-00002.parquet
└── datasets/fineweb-edu-dedup-ds
├── 00000_unshuffled.ds
├── 00000_unshuffled.ds.index
├── 00000_unshuffled.ds.metadata
...
```
`备注:`本项目灵活度大,仅适于算法基础较好的研究人员使用,对算法基础和代码基础有一定的需求,其它人员可能存在一定的上手门槛,可参考光源上预训练项目[`allamo_pytorch`](http://developer.sourcefind.cn/codes/modelzoo/allamo_pytorch.git)中的简单预训练代码llama3__scratch进行上手学习。
## 训练
### 单机多卡
本项目的最大特点是完全开源、营造自由科研环境,项目中的算法、模型读者可自由修改、研发以提出自己的算法来为社会做贡献,在[`llama`](./src/nanotron/models/llama.py)修改模型文件,但为了方便介绍,本步骤说明以小规模模型tiny llama作为示例:
```
cd /home/nanotron
sh train.sh # 不同卡数的训练方式参照train.sh中的说明,完整规模llama3的训练方式可参考train.sh中的说明。
# 遇到Do you wish to run the custom code? [y/N],填y。
# 其它功能正在优化中,欢迎共同优化和拓展。
```
Facebook原版llama3的模型参数可参考[`Llama-3.1-8B`](./checkpoints/Nanotron-Llama-3.1-8B/model_config.json)[`Llama-3.2-3B`](./checkpoints/Nanotron-Llama-3.2-3B/model_config.json),这两个参数文件根据以下命令可获取:
```
sh convert_hf_to_nanotron.sh # Llama系列的基础模型皆支持转换
# 若已预训练完成某个模型,可转换成HF格式权重进行发布,以及用其它开源库继续微调,不同模型请读者根据具体参数修改此文件中的相应参数进行转换。
# sh convert_nanotron_to_hf.sh
```
为了方便读者借鉴HF官方的预训练方式,项目中还提供了`smollm`的预训练示例,参考文档[`pre-training`](https://github.com/huggingface/smollm/tree/main/pre-training)
```
sh train_smollm1_135M_demo.sh # Demo仅供试用,细节请自行研究,若读者具备超算集群,可参照launch.slurm编写自己具体的slurm脚本。
```
`Tips :`通过本项目获得自主研发模型的预训练权重后,后续微调模型放在[`LLaMA-Factory`](https://github.com/hiyouga/LLaMA-Factory.git)[`ollama`](https://github.com/ollama/ollama.git)、公开「后训练」一切的[`open-instruct`](https://github.com/allenai/open-instruct.git)等工具中进行更方便。
更多资料可参考源项目的[`README_origin`](./README_origin.md)
## 推理
```
sh infer.sh # 以checkpoints/10中的权重作为示例,其他权重请参照此示例修改权重路径。
```
更多资料可参考源项目的[`README_origin`](./README_origin.md)
## result
由于示例训练数据较少、模型为简化模型且训练时间短,此推理仅供参考显示效果,以方便读者了解项目使用方法,根据官方示例所得:
`输入: `
```
input: [CLS] the [UNK] [UNK] [UNK] is [SEP]
```
`输出:`
```
generation: [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP]
```
上述重复字符为示例Demo的tokenlizer过于简单导致的正常现象,更换一个复杂的tokenlizer便可输出正常结果,例如使用`cosmo2-tokenizer`预训练:
```
CUDA_DEVICE_MAX_CONNECTIONS=1 torchrun --nproc_per_node=4 run_train.py --config-file examples/config_tiny_llama_cosmo2tokenizer.yaml
```
由此可见,不同tokenlizer会对训练结果造成明显差别,故建议实际训练中选择设计更好的tokenlizer。
### 精度
DCU与GPU精度一致,推理框架:pytorch。
## 应用场景
### 算法类别
`对话问答`
### 热点应用行业
`制造,广媒,金融,能源,医疗,家居,教育`
## 预训练权重
预训练权重快速下载中心:[SCNet AIModels](http://113.200.138.88:18080/aimodels) ,项目中的预训练权重可从快速下载通道下载:[Llama-3.1-8B](http://113.200.138.88:18080/aimodels/meta-llama/Meta-Llama-3.1-8B.git)[Llama-3.2-3B](http://113.200.138.88:18080/aimodels/meta-llama/Llama-3.2-3B.git)
Hugging Face下载地址为:[meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B)[meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)
## 源码仓库及问题反馈
- http://developer.sourcefind.cn/codes/modelzoo/nanotron_pytorch.git
## 参考资料
- https://github.com/huggingface/nanotron.git
- https://github.com/meta-llama/llama3.git
- https://github.com/hiyouga/LLaMA-Factory.git
- https://github.com/ollama/ollama.git
- https://github.com/allenai/open-instruct.git
<h1 align="center">⚡️ Nanotron</h1>
<p align="center">
<a href="https://github.com/huggingface/nanotron/releases">
<img alt="GitHub release" src="https://img.shields.io/github/release/huggingface/nanotron.svg">
</a>
<a href="https://github.com/huggingface/nanotron/blob/master/LICENSE">
<img alt="License" src="https://img.shields.io/github/license/huggingface/nanotron.svg?color=green">
</a>
</p>
<h4 align="center">
<p>
<a href="#installation">Installation</a>
<a href="#quick-start">Quick Start</a>
<a href="#features">Features</a>
<a href="CONTRIBUTING.md">Contributing</a>
<p>
</h4>
<h3 align="center">
<a href="https://huggingface.co/nanotron"><img style="float: middle; padding: 10px 10px 10px 10px;" width="60" height="55" src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.png" /></a>
</h3>
<h3 align="center">
<p>Pretraining models made easy
</h3>
Nanotron is a library for pretraining transformer models. It provides a simple and flexible API to pretrain models on custom datasets. Nanotron is designed to be easy to use, fast, and scalable. It is built with the following principles in mind:
- **Simplicity**: Nanotron is designed to be easy to use. It provides a simple and flexible API to pretrain models on custom datasets.
- **Performance**: Optimized for speed and scalability, Nanotron uses the latest techniques to train models faster and more efficiently.
## Installation
```bash
# Requirements: Python>=3.10
git clone https://github.com/huggingface/nanotron
cd nanotron
pip install --upgrade pip
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
pip install -e .
# Install dependencies if you want to use the example scripts
pip install datasets transformers
pip install triton "flash-attn>=2.5.0" --no-build-isolation
```
> [!NOTE]
> If you get `undefined symbol: ncclCommRegister` error you should install torch 2.1.2 instead: `pip install torch==2.1.2 --index-url https://download.pytorch.org/whl/cu121`
> [!TIP]
> We log to wandb automatically if it's installed. For that you can use `pip install wandb`. If you don't want to use wandb, you can run `wandb disabled`.
## Quick Start
### Training a tiny Llama model
The following command will train a tiny Llama model on a single node with 8 GPUs. The model will be saved in the `checkpoints` directory as specified in the config file.
```bash
CUDA_DEVICE_MAX_CONNECTIONS=1 torchrun --nproc_per_node=8 run_train.py --config-file examples/config_tiny_llama.yaml
```
### Run generation from your checkpoint
```bash
torchrun --nproc_per_node=1 run_generate.py --ckpt-path checkpoints/10/ --tp 1 --pp 1
# We could set a larger TP for faster generation, and a larger PP in case of very large models.
```
### Custom examples
You can find more examples in the [`/examples`](/examples) directory:
<!-- Make a table of the examples we support -->
| Example | Description |
| --- | --- |
| `custom-dataloader` | Plug a custom dataloader to nanotron |
| `datatrove` | Use the datatrove library to load data |
| `doremi` | Use DoReMi to speed up training |
| `mamba` | Train an example Mamba model |
| `moe` | Train an example Mixture-of-Experts (MoE) model |
| `mup` | Use spectral µTransfer to scale up your model |
| `examples/config_tiny_llama_with_s3_upload.yaml` | For automatically uploading checkpoints to S3 |
We're working on adding more examples soon! Feel free to add a PR to add your own example. 🚀
## Features
We currently support the following features:
- [x] 3D parallelism (DP+TP+PP)
- [x] Expert parallelism for MoEs
- [x] AFAB and 1F1B schedules for PP
- [x] Explicit APIs for TP and PP which enables easy debugging
- [x] ZeRO-1 optimizer
- [x] FP32 gradient accumulation
- [x] Parameter tying/sharding
- [x] Custom module checkpointing for large models
- [x] Spectral µTransfer parametrization for scaling up neural networks
- [x] Mamba example
And we have on our roadmap:
- [ ] FP8 training
- [ ] ZeRO-3 optimizer (a.k.a FSDP)
- [ ] `torch.compile` support
- [ ] Ring attention
- [ ] Interleaved 1f1b schedule
## Credits
We would like to thank everyone working on LLMs, especially those sharing their work openly from which we took great inspiration: Nvidia for `Megatron-LM/apex`, Microsoft for `DeepSpeed`, HazyResearch for `flash-attn`..
python3 tools/preprocess_data.py \
--tokenizer-name-or-path HuggingFaceTB/cosmo2-tokenizer \
--output-folder datasets/fineweb-edu-dedup-ds \
--n-tasks 16 \
hf \
--dataset datasets/fineweb-edu-dedup \
torchrun --nproc-per-node 1 examples/llama/convert_hf_to_nanotron.py --checkpoint_path Meta-Llama-3.1-8B --save_path checkpoints/Nanotron-Llama-3.1-8B
# torchrun --nproc-per-node 1 examples/llama/convert_hf_to_nanotron.py --checkpoint_path Llama-3.2-3B --save_path checkpoints/Nanotron-Llama-3.2-3B
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment