feat(model parallelism): moving the labels to the same device as the logits for gpt2 and bart (#22591)
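The change named in the title can be illustrated with a minimal sketch. It assumes a causal language-modeling loss in PyTorch; the helper name `causal_lm_loss` and the tensor shapes are hypothetical and are not taken from the commit itself. Under model parallelism the logits may end up on a different device than the labels passed in by the caller, so the labels are moved to the logits' device before the loss is computed.

```python
import torch
import torch.nn.functional as F


def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: next-token cross-entropy with device alignment."""
    # With model parallelism the final layer (and so the logits) may live on a
    # different device than the labels; align them before computing the loss.
    labels = labels.to(logits.device)

    # Shift so that the logits at position t are scored against the token at t + 1.
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()

    # Flatten (batch, seq) into one axis for cross_entropy.
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )


# Illustrative shapes only: batch of 2, sequence length 5, vocabulary of 11.
logits = torch.randn(2, 5, 11)
labels = torch.randint(0, 11, (2, 5))
loss = causal_lm_loss(logits, labels)
```

When both tensors already share a device, `Tensor.to` returns the same tensor, so the extra call costs nothing in the single-device case.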