Commit 18ff696d authored by chenzk

v1.0.5

from typing import List

import torch
from torch.utils.data import Dataset


@torch.jit.script
def masked_mean(loss: torch.Tensor, label_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    # Average the loss over positions where label_mask is set; masked-out positions contribute zero.
    return (loss * label_mask).sum(dtype=dtype) / label_mask.sum()


def compute_domain_weights_based_on_token_count(datasets: List[Dataset]) -> torch.Tensor:
    # Note: despite the name, this weights each domain by its sample count (len of the dataset), not its token count.
    num_samples_per_domain = [len(d) for d in datasets]
    total_samples = sum(num_samples_per_domain)
    weights = torch.tensor([num_sample / total_samples for num_sample in num_samples_per_domain])
    return weights
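
A small usage sketch may help show what the two helpers compute. It is not part of the commit: the tensors and the two `TensorDataset` objects are made-up inputs, and the functions are re-declared in plain eager mode (without `@torch.jit.script`) only so the snippet stands alone.

```python
from typing import List

import torch
from torch.utils.data import Dataset, TensorDataset


def masked_mean(loss: torch.Tensor, label_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    # Eager-mode copy of the helper above (decorator omitted for this sketch).
    return (loss * label_mask).sum(dtype=dtype) / label_mask.sum()


def compute_domain_weights_based_on_token_count(datasets: List[Dataset]) -> torch.Tensor:
    # Copy of the helper above: weights are proportional to len(dataset).
    num_samples_per_domain = [len(d) for d in datasets]
    total_samples = sum(num_samples_per_domain)
    return torch.tensor([n / total_samples for n in num_samples_per_domain])


# Hypothetical per-token losses for 2 sequences of 3 tokens each.
loss = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
# Only three positions carry labels; the rest are ignored.
label_mask = torch.tensor([[True, True, False], [True, False, False]])

# Kept positions hold 1.0, 2.0 and 4.0, so the mean is 7 / 3.
mean = masked_mean(loss, label_mask, torch.float32)

# Two hypothetical domains with 1 and 3 samples.
datasets = [TensorDataset(torch.zeros(1, 2)), TensorDataset(torch.zeros(3, 2))]
weights = compute_domain_weights_based_on_token_count(datasets)
```

With these inputs the weights come out as 0.25 and 0.75. If `label_mask` has no set positions, the division in `masked_mean` is 0 / 0 and yields NaN, so callers may want to guard against fully masked batches.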