"test/git@developer.sourcefind.cn:change/sglang.git" did not exist on "cba50273324e9770c587754a4abbea974a4124ba"
Unverified Commit 070dcf1c authored by Diganta Misra, committed by GitHub

Added Mish Activation Function

Mish is a new activation function proposed in https://arxiv.org/abs/1908.08681
It has seen some recent success and has been adopted in spaCy, Thinc, TensorFlow Addons, and fastai-dev.
All benchmarks recorded to date (including against ReLU, Swish, and GELU) are available in the repository - https://github.com/digantamisra98/Mish
It might be a good addition to experiment with, especially in the BERT model.
parent 1c542df7
@@ -138,7 +138,11 @@ def swish(x):
     return x * torch.sigmoid(x)
 
+def mish(x):
+    return x * torch.tanh(nn.functional.softplus(x))
+
-ACT2FN = {"gelu": gelu, "relu": torch.nn.functional.relu, "swish": swish, "gelu_new": gelu_new}
+ACT2FN = {"gelu": gelu, "relu": torch.nn.functional.relu, "swish": swish, "gelu_new": gelu_new, "mish": mish}
 
 BertLayerNorm = torch.nn.LayerNorm
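For readers unfamiliar with the formula, here is a minimal pure-Python sketch of what the patched `mish` computes, `mish(x) = x * tanh(softplus(x))`, without the torch dependency (the actual patch uses `torch.tanh` and `nn.functional.softplus` so it works element-wise on tensors):

```python
import math

def softplus(x):
    # softplus(x) = ln(1 + e^x), a smooth approximation of ReLU
    return math.log1p(math.exp(x))

def mish(x):
    # Mish activation: x * tanh(softplus(x))
    # Smooth and non-monotonic; near-identity for large positive x,
    # bounded below for negative x.
    return x * math.tanh(softplus(x))

# mish(0) = 0 * tanh(ln 2) = 0
print(mish(0.0))
# For large positive x, mish(x) ≈ x (ReLU-like behavior)
print(mish(10.0))
# For negative x, output is small and negative rather than clipped to 0
print(mish(-5.0))
```

Note that, unlike ReLU, Mish lets a small negative signal through, which is part of the motivation given in the paper.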