    [TokenClassification] Label realignment for subword aggregation (#11680) · b88e0e01
    Nicolas Patry authored
    * [TokenClassification] Label realignment for subword aggregation
    
    Attempt to replace https://github.com/huggingface/transformers/pull/11622/files
    
    - Added `AggregationStrategy`
    - The `ignore_subwords` and `grouped_entities` arguments are now fused
      into `aggregation_strategy`. This makes more sense because
      `ignore_subwords=True` combined with `grouped_entities=False` had no
      meaning.
    - Added two new aggregation strategies: MAX and AVERAGE.
    - AVERAGE requires a bit more information than the other strategies, so
      for now it is handled as a slightly special case; we should keep that
      in mind for future changes.
    - Tests have been updated to reflect the new argument and to check both
      the deprecation of the old arguments and the new
      `aggregation_strategy`.
    - Placed the test arguments and expected results for
      `aggregation_strategy` close together, so that readers can understand
      what is supposed to happen.
    - `aggregate` is now only tested on a small model, because testing it
      globally for all models is not meaningful.
    - Previous tests are unchanged in desired output.
    - Added a new test case that better showcases the difference between the
      FIRST, MAX and AVERAGE strategies.
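
To make the difference between FIRST, MAX and AVERAGE concrete, here is a minimal sketch in plain Python. It is not the pipeline's code: `aggregate_word`, the three-label set and the score values are assumptions made up for illustration, and the enum only lists the three strategies discussed here. Each row holds the per-label scores of one subword token of the same word.

```python
from enum import Enum
from typing import List, Sequence, Tuple


class AggregationStrategy(Enum):
    # Illustrative subset only; the string values are assumptions.
    FIRST = "first"
    MAX = "max"
    AVERAGE = "average"


def aggregate_word(
    scores: Sequence[Sequence[float]],
    labels: List[str],
    strategy: AggregationStrategy,
) -> Tuple[str, float]:
    """Collapse the per-subword score rows of ONE word into (label, score)."""
    if strategy is AggregationStrategy.FIRST:
        # Trust the first subword token only.
        row = list(scores[0])
    elif strategy is AggregationStrategy.MAX:
        # Trust the subword token holding the single highest score.
        row = list(max(scores, key=max))
    elif strategy is AggregationStrategy.AVERAGE:
        # Average each label's score over all subword tokens.
        row = [sum(col) / len(scores) for col in zip(*scores)]
    else:
        raise ValueError(f"Unsupported strategy: {strategy}")
    best = max(range(len(row)), key=row.__getitem__)
    return labels[best], row[best]


# Hypothetical word split into three subword tokens.
labels = ["O", "B-PER", "B-LOC"]
scores = [
    [0.35, 0.60, 0.05],
    [0.30, 0.00, 0.70],
    [0.65, 0.05, 0.30],
]

first = aggregate_word(scores, labels, AggregationStrategy.FIRST)      # label: B-PER
maximum = aggregate_word(scores, labels, AggregationStrategy.MAX)      # label: B-LOC
average = aggregate_word(scores, labels, AggregationStrategy.AVERAGE)  # label: O
```

With these made-up scores the three strategies disagree, which is the kind of case a dedicated test needs in order to tell them apart.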
    
    * Wrong framework.
    
    * Addressing three issues.
    
    1- Tags might not follow the B-/I- convention, so any tag should now work
    (a tag without a prefix is treated as B-TAG).
    2- Fixed an issue with AVERAGE; the fix required a substantial code change.
    3- The test suite was not checking for the "index" key for the "none"
    strategy. This is now fixed.
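
As an illustration of the first point, a tag can be split into its prefix and its entity type along these lines. This is a minimal sketch, not the code from this change: `split_tag` is a hypothetical helper name, and the only behavior taken from the description above is that a tag without a `B-` or `I-` prefix is treated as `B-TAG`.

```python
from typing import Tuple


def split_tag(entity_name: str) -> Tuple[str, str]:
    """Return (prefix, tag) for a label such as "B-PER", "I-LOC" or "PER"."""
    if entity_name.startswith("B-"):
        return "B", entity_name[2:]
    if entity_name.startswith("I-"):
        return "I", entity_name[2:]
    # No B-/I- prefix: treat the whole label as the beginning of an entity.
    return "B", entity_name


prefixed = split_tag("I-LOC")  # ("I", "LOC")
bare = split_tag("PER")        # ("B", "PER")
```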
    
    The issue with AVERAGE was that "O" could never be chosen, because those
    tokens were filtered out beforehand and their scores were therefore not
    counted in the average. Filtering on `ignore_labels` now happens at the
    very end of the pipeline, which fixes the issue.
    It is a bit hard to make sure this stays that way, because we do
    not have an end-to-end test for that behavior.
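
The effect of the filtering order can be reproduced with a small sketch. It is written under assumptions and is not the pipeline's code: the function names, the two-label set and the scores are hypothetical. The first function drops tokens predicted as an ignored label before averaging; the second averages over all tokens and applies `ignore_labels` last.

```python
from typing import List, Optional, Sequence

LABELS = ["O", "B-PER"]  # hypothetical label set


def best_label(row: Sequence[float]) -> str:
    """Label with the highest score in one row of per-label scores."""
    return LABELS[max(range(len(row)), key=row.__getitem__)]


def average_label(rows: Sequence[Sequence[float]]) -> str:
    """Label with the highest score after averaging over all rows."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return best_label(means)


def filter_then_average(rows, ignore_labels: List[str]) -> Optional[str]:
    # Old order: ignored tokens are removed first, so "O" can never win.
    kept = [r for r in rows if best_label(r) not in ignore_labels]
    return average_label(kept) if kept else None


def average_then_filter(rows, ignore_labels: List[str]) -> Optional[str]:
    # New order: every token counts in the average; filtering happens last.
    label = average_label(rows)
    return None if label in ignore_labels else label


# One word, three subword tokens; two of them are confidently "O".
rows = [[0.9, 0.1], [0.9, 0.1], [0.4, 0.6]]

early = filter_then_average(rows, ["O"])  # "B-PER": only the last token survives
late = average_then_filter(rows, ["O"])   # None: the word averages to "O" and is dropped
```

A check of this shape, run through the whole pipeline rather than on a helper, is the kind of end-to-end test the paragraph above says is missing.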
    
    * Formatting.
    
    * Adding formatting to code + cleaner handling of B-, I- tags.
    Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
    Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
    
    * Typo.
    Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
    Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
test_pipelines_token_classification.py 23 KB