    [`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998) · b88fef47
    Tolga Cangöz authored
    * Add initial template
    
    * Second template
    
    * feat: Add TextEmbeddingModule to AnyTextPipeline
    
    * feat: Add AuxiliaryLatentModule template to AnyTextPipeline
    
    * Add bert tokenizer from the anytext repo for now
    
    * feat: Update AnyTextPipeline's modify_prompt method
    
    This commit improves the `modify_prompt` method of the `AnyTextPipeline` class. The method now handles special characters and replaces the selected strings in the prompt with a placeholder. It also checks whether the prompt contains Chinese text and, if so, translates it using `trans_pipe`.
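
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code. It assumes the selected strings are wrapped in double quotes, that the placeholder is `*`, and that translation is supplied as a plain callable. The `translate` parameter and the `contains_chinese` helper are hypothetical stand-ins for `trans_pipe` and the pipeline's own check.

```python
import re
from typing import Callable, List, Optional, Tuple

PLACEHOLDER = "*"  # assumed placeholder token, not confirmed by the commit


def contains_chinese(text: str) -> bool:
    """Return True if any character is in the CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def modify_prompt(
    prompt: str,
    placeholder: str = PLACEHOLDER,
    translate: Optional[Callable[[str], str]] = None,  # hypothetical stand-in for trans_pipe
) -> Tuple[str, List[str]]:
    """Swap quoted strings for a placeholder and return (prompt, extracted strings)."""
    # Normalize curly quotes so that one pattern covers both quote styles.
    prompt = prompt.replace("\u201c", '"').replace("\u201d", '"')
    # Collect the quoted strings in order of appearance.
    texts = re.findall(r'"(.*?)"', prompt)
    # Replace each quoted span with the placeholder, padded with spaces.
    prompt = re.sub(r'"(.*?)"', lambda _m: f" {placeholder} ", prompt)
    # Collapse the extra whitespace introduced by the padding.
    prompt = " ".join(prompt.split())
    # Translate only when Chinese characters remain and a translator is given.
    if translate is not None and contains_chinese(prompt):
        prompt = translate(prompt)
    return prompt, texts
```

For instance, `modify_prompt('A sign that says "Hello"')` returns the prompt with `*` in place of the quoted string, together with the list `["Hello"]`.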
    
    * Fill in the `forward` pass of `AuxiliaryLatentModule`
    
    * `make style && make quality`
    
    * chore: Update `bert_tokenizer.py` with a TODO comment suggesting the use of the `transformers` library
    
    * Update error handling to raise exceptions and use logging
    
    * Add `create_glyph_lines` function into `TextEmbeddingModule`
    
    * make style
    
    * Up
    
    * Up
    
    * Up
    
    * Up
    
    * Remove several comments
    
    * refactor: Remove ControlNetConditioningEmbedding and update code accordingly
    
    * Up
    
    * Up
    
    * up
    
    * refactor: Update AnyTextPipeline to include new optional parameters
    
    * up
    
    * feat: Add OCR model and its components
    
    * chore: Update `TextEmbeddingModule` to include OCR model components and dependencies
    
    * chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task
    
    * `make style`
    
    * refactor: Update `AnyTextPipeline`'s docstring
    
    * Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once
    
    * simplify
    
    * `make style`
    
    * Convert `TextEmbeddingModule` to an ordinary `encode_prompt()` function
    
    * Simplify for now
    
    * `make style`
    
    * Up
    
    * feat: Add scripts to convert AnyText controlnet to diffusers
    
    * `make style`
    
    * Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule`
    
    * make style
    
    * Up
    
    * Simplify
    
    * Up
    
    * feat: Add safetensors module for loading model file
    
    * Fix device issues
    
    * Up
    
    * Up
    
    * refactor: Simplify
    
    * refactor: Simplify code for loading models and handling data types
    
    * `make style`
    
    * refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule
    
    * refactor: Update dtype in embedding_manager.py to match proj.weight
    
    * Up
    
    * Add attribution and adaptation information to pipeline_anytext.py
    
    * Update usage example
    
    * Will refactor `controlnet_cond_embedding` initialization
    
    * Add `AnyTextControlNetConditioningEmbedding` template
    
    * Refactor organization
    
    * style
    
    * style
    
    * Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding`
    
    * Follow one-file policy
    
    * style
    
    * [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel
    
    * [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py
    
    * [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py
    
    * Refactor AnyTextControlNet to use configurable conditioning embedding channels
    
    * Complete control net conditioning embedding in AnyTextControlNetModel
    
    * up
    
    * [FIX] Ensure embeddings use correct device in AnyTextControlNetModel
    
    * up
    
    * up
    
    * style
    
    * [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline
    
    * [UPDATE] Update example code in anytext.py to use correct font file and improve clarity
    
    * down
    
    * [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing
    
    * update pillow
    
    * [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity
    
    * [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file
    
    * [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency
    
    * 🆙
    
    
    
    * style
    
    * [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py
    
    * style
    
    * Update examples/research_projects/anytext/README.md
    Co-authored-by: Aryan <contact.aryanvs@gmail.com>
    
    * Remove commented-out image preparation code in AnyTextPipeline
    
    * Remove unnecessary blank line in README.md
README.md 2.04 KB