    Update arguments checks. · 8044c7b4
    Jared Casper authored
    The hidden_size % attention_heads == 0 check is already handled above, where kv_channels is dealt with.
    
    Add a check for the decoder sequence length.
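
The two checks named in the commit message could look roughly like the sketch below. It is only an illustration, not the code in arguments.py: the helper name `validate_args`, the field names `num_attention_heads`, `decoder_seq_length`, and `max_position_embeddings`, and the choice to compare the decoder sequence length against the maximum position embeddings are all assumptions.

```python
from argparse import Namespace


def validate_args(args: Namespace) -> Namespace:
    """Hypothetical validation helper; all field names are assumed."""
    # Divisibility is checked once, at the point where kv_channels is
    # derived, so no separate hidden_size % heads check is needed later.
    if args.kv_channels is None:
        assert args.hidden_size % args.num_attention_heads == 0, (
            "hidden_size must be divisible by num_attention_heads"
        )
        args.kv_channels = args.hidden_size // args.num_attention_heads

    # Assumed form of the decoder check: the decoder sequence length
    # must fit within the available position embeddings.
    if args.decoder_seq_length is not None:
        assert args.max_position_embeddings >= args.decoder_seq_length, (
            "decoder_seq_length exceeds max_position_embeddings"
        )
    return args
```

Deriving kv_channels and checking divisibility in the same place keeps the two from drifting apart, which is presumably why a second divisibility check further down would be redundant.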
arguments.py 37.4 KB