Commit 955389a9 authored by A. Unique TensorFlower

Docstring fix: valid GitHub URLs for original ALBERT

The URLs now pin the revision from before the file was moved by https://github.com/google-research/google-research/commit/b05c90d1ce3f22445d23e536549b0ae123fdd81b

PiperOrigin-RevId: 332171466
parent 067e8ae3
@@ -421,7 +421,7 @@ def preprocess_text(inputs, remove_space=True, lower=False):
   """Preprocesses data by removing extra space and normalize data.
 
   This method is used together with sentence piece tokenizer and is forked from:
-  https://github.com/google-research/google-research/blob/master/albert/tokenization.py
+  https://github.com/google-research/google-research/blob/e1f6fa00/albert/tokenization.py
 
   Args:
     inputs: The input text.
@@ -454,7 +454,7 @@ def encode_pieces(sp_model, text, sample=False):
   """Segements text into pieces.
 
   This method is used together with sentence piece tokenizer and is forked from:
-  https://github.com/google-research/google-research/blob/master/albert/tokenization.py
+  https://github.com/google-research/google-research/blob/e1f6fa00/albert/tokenization.py
 
   Args:
@@ -496,7 +496,7 @@ def encode_ids(sp_model, text, sample=False):
   """Segments text and return token ids.
 
   This method is used together with sentence piece tokenizer and is forked from:
-  https://github.com/google-research/google-research/blob/master/albert/tokenization.py
+  https://github.com/google-research/google-research/blob/e1f6fa00/albert/tokenization.py
 
   Args:
     sp_model: A spm.SentencePieceProcessor object.
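For context, a minimal usage sketch of the three helpers whose docstrings this commit touches. It assumes the sentencepiece package is installed and that the helpers live in this repo's tokenization module (shown here as official.nlp.bert.tokenization); the SentencePiece model path is a placeholder, not a file shipped with the commit.

    # Sketch only: import path and model path are assumptions, not part of this commit.
    import sentencepiece as spm

    from official.nlp.bert import tokenization

    # Load a trained SentencePiece model (placeholder path).
    sp_model = spm.SentencePieceProcessor()
    sp_model.Load("/path/to/albert/30k-clean.model")

    # Normalize raw text, then segment it with the SentencePiece model.
    text = tokenization.preprocess_text("  Hello,  World! ", remove_space=True, lower=True)
    pieces = tokenization.encode_pieces(sp_model, text, sample=False)  # subword strings
    ids = tokenization.encode_ids(sp_model, text, sample=False)        # vocabulary ids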