"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "0234de8418b253e843dda0a18a3e18476de52781"
Commit 5e2f2d7d, authored by Sebastian Sosa, committed by GitHub

Better messaging and fix for incorrect shape when collating data. (#18119)

* More informative error message

* raise dynamic error

* remove_excess_nesting application

* incorrect shape assertion for collator & function to remove excess nesting from DatasetDict

* formatting

* eliminating datasets import

* removed and relocated remove_excess_nesting to the datasets library and updated docs accordingly

* independent assert instructions

* inform user of excess nesting
parent d23cf5b1
@@ -733,8 +733,10 @@ class BatchEncoding(UserDict):
 "Please see if a fast version of this tokenizer is available to have this feature available."
 )
 raise ValueError(
-"Unable to create tensor, you should probably activate truncation and/or padding "
-"with 'padding=True' 'truncation=True' to have batched tensors with the same length."
+"Unable to create tensor, you should probably activate truncation and/or padding with"
+" 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your"
+f" features (`{key}` in this case) have excessive nesting (inputs type `list` where type `int` is"
+" expected)."
 )
 return self
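
The new message points at features with excessive nesting, that is, a `list` where an `int` is expected. A minimal sketch of how such a batch could be detected and flattened before tensor conversion is given below. All helper names (`nesting_depth`, `find_excess_nesting`, `unwrap_singletons`, `flatten_batch`) are hypothetical and written for illustration only; they are not the `remove_excess_nesting` function mentioned in the commit message, which does not appear in this diff. The sketch also assumes a well-formed batch has depth 2 (`[example][token]`) and that the extra level is a single-element wrapper list.

```python
from typing import Any, Dict, List


def nesting_depth(value: Any) -> int:
    """Count how many list/tuple levels wrap the first leaf of `value`."""
    depth = 0
    while isinstance(value, (list, tuple)):
        depth += 1
        if not value:
            # An empty container still counts as one level; nothing to descend into.
            break
        value = value[0]
    return depth


def find_excess_nesting(batch: Dict[str, List[Any]], expected_depth: int = 2) -> List[str]:
    """Return the keys whose values are nested deeper than `expected_depth`."""
    return [key for key, value in batch.items() if nesting_depth(value) > expected_depth]


def unwrap_singletons(value: Any, expected_depth: int) -> Any:
    """Strip outer single-element lists until `value` reaches `expected_depth`."""
    while (
        nesting_depth(value) > expected_depth
        and isinstance(value, list)
        and len(value) == 1
    ):
        value = value[0]
    return value


def flatten_batch(batch: Dict[str, List[Any]], expected_depth: int = 2) -> Dict[str, List[Any]]:
    """Return a copy of `batch` where each example has its wrapper lists removed."""
    return {
        key: [unwrap_singletons(example, expected_depth - 1) for example in examples]
        for key, examples in batch.items()
    }


# Assumed example data: `input_ids` carries one extra list level per example.
batch = {
    "input_ids": [[[101, 7592, 102]], [[101, 2088, 102]]],
    "attention_mask": [[1, 1, 1], [1, 1, 1]],
}
offending_keys = find_excess_nesting(batch)
fixed = flatten_batch(batch)
```

Here `offending_keys` names the feature that the improved error message would report through `{key}`, and `fixed` holds the same data with each example reduced to a flat list of integers. Ragged lengths are a separate problem that padding and truncation address; this sketch only covers the nesting case.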