    Add Audio Spectrogram Transformer (#19981) · 4973d2a0
    NielsRogge authored
    
    
    * First draft
    
    * Make conversion script work
    
    * Add id2label mapping, run code quality
    
    * Fix copies
    
    * Add first draft of feature extractor
    
    * Update conversion script to use feature extractor
    
    * Make more tests pass
    
    * Add docs
    
    * Update input_features to input_values and pad to max length by default
    
    * Fix doc tests
    
    * Add feature extractor tests
    
    * Add proper padding/truncation to feature extractor
    
    * Add support for conversion of all audioset checkpoints
    
    * Improve docs and extend conversion script
    
    * Fix README
    
    * Rename spectogram to spectrogram
    
    * Fix copies
    
    * Add integration test
    
    * Remove dummy conv
    
    * Update to ast
    
    * Update organization
    
    * Fix init
    
    * Rename model to AST
    
    * Add require_torchaudio decorator
    
    * Move import of ASTFeatureExtractor under an is_speech_available check
    
    * Fix rebase
    
    * Add pipeline config
    
    * Update name of classifier head
    
    * Rename time_dimension and frequency_dimension for clarity
    
    * Remove print statement
    
    * Fix pipeline test
    
    * Fix pipeline test
    
    * Fix index table
    
    * Fix init
    
    * Fix conversion script
    
    * Rename to ForAudioClassification
    
    * Fix index table
    Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
README_zh-hant.md 68.6 KB