    Add TF implementation of GPT-J (#15623) · ed2ee373
    Daniel Stancl authored
    * Initial commit
    
    * Add TFGPTJModel
    
    * Fix a forward pass
    
    * Add TFGPTJCausalLM
    
    * Add TFGPTJForSequenceClassification
    
    * Add TFGPTJForQuestionAnswering
    
    * Fix docs
    
    * Deal with TF dynamic shapes
    
    * Add Loss parents to models
    
    * Adjust split and merge heads to handle 4 and 5-dim tensors
    
    * Update outputs for @tooslow tests
gptj.mdx 5.51 KB