ModelZoo / Paraformer_FunASR_pytorch
Commit 70a8a9e0, authored Oct 03, 2024 by wangwei990215
initial commit
Pipeline #1738 failed with stages in 0 seconds
Showing 20 changed files with 860 additions and 0 deletions
FunASR/examples/aishell/transformer/conf/transformer_12e_6d_2048_256.yaml  +104 -0
FunASR/examples/aishell/transformer/demo_infer.sh  +2 -0
FunASR/examples/aishell/transformer/demo_train_or_finetune.sh  +2 -0
FunASR/examples/aishell/transformer/local/aishell_data_prep.sh  +66 -0
FunASR/examples/aishell/transformer/local/download_and_untar.sh  +105 -0
FunASR/examples/aishell/transformer/run.sh  +204 -0
FunASR/examples/aishell/transformer/utils  +2 -0
FunASR/examples/common_voice/whisper_lid/demo_funasr.py  +19 -0
FunASR/examples/common_voice/whisper_lid/demo_modelscope.py  +22 -0
FunASR/examples/deepspeed_conf/ds_stage1.json  +33 -0
FunASR/examples/deepspeed_conf/ds_stage2.json  +33 -0
FunASR/examples/deepspeed_conf/ds_stage3.json  +41 -0
FunASR/examples/deepspeed_conf/ds_z0_config.json  +29 -0
FunASR/examples/deepspeed_conf/ds_z2_config.json  +29 -0
FunASR/examples/deepspeed_conf/ds_z2_offload_config.json  +33 -0
FunASR/examples/deepspeed_conf/ds_z3_config.json  +31 -0
FunASR/examples/deepspeed_conf/ds_z3_offload_config.json  +39 -0
FunASR/examples/industrial_data_pretraining/bicif_paraformer/demo.py  +20 -0
FunASR/examples/industrial_data_pretraining/bicif_paraformer/demo.sh  +18 -0
FunASR/examples/industrial_data_pretraining/bicif_paraformer/export.py  +28 -0
Too many changes to show. To preserve performance, only 827 of 827+ files are displayed.
FunASR/examples/aishell/transformer/conf/transformer_12e_6d_2048_256.yaml
0 → 100644
View file @
70a8a9e0
# This is an example that demonstrates how to configure a model file.
# You can modify the configuration according to your own requirements.

# to print the register_table:
# from funasr.register import tables
# tables.print()

# network architecture
model: Transformer
model_conf:
    ctc_weight: 0.3
    lsm_weight: 0.1     # label smoothing option
    length_normalized_loss: false

# encoder
encoder: TransformerEncoder
encoder_conf:
    output_size: 256    # dimension of attention
    attention_heads: 4
    linear_units: 2048  # the number of units of position-wise feed forward
    num_blocks: 12      # the number of encoder blocks
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.0
    input_layer: conv2d # encoder architecture type
    normalize_before: true

# decoder
decoder: TransformerDecoder
decoder_conf:
    attention_heads: 4
    linear_units: 2048
    num_blocks: 6
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    self_attention_dropout_rate: 0.0
    src_attention_dropout_rate: 0.0

# frontend related
frontend: WavFrontend
frontend_conf:
    fs: 16000
    window: hamming
    n_mels: 80
    frame_length: 25
    frame_shift: 10
    lfr_m: 1
    lfr_n: 1

specaug: SpecAug
specaug_conf:
    apply_time_warp: true
    time_warp_window: 5
    time_warp_mode: bicubic
    apply_freq_mask: true
    freq_mask_width_range:
    - 0
    - 30
    num_freq_mask: 2
    apply_time_mask: true
    time_mask_width_range:
    - 0
    - 40
    num_time_mask: 2

train_conf:
    accum_grad: 1
    grad_clip: 5
    max_epoch: 150
    keep_nbest_models: 10
    log_interval: 50

optim: adam
optim_conf:
    lr: 0.002
scheduler: warmuplr
scheduler_conf:
    warmup_steps: 30000

dataset: AudioDataset
dataset_conf:
    index_ds: IndexDSJsonl
    batch_sampler: EspnetStyleBatchSampler
    batch_type: length # example or length
    batch_size: 25000 # if batch_type is example, batch_size is the numbers of samples; if length, batch_size is source_token_len+target_token_len;
    max_token_length: 2048 # filter samples if source_token_len+target_token_len > max_token_length,
    buffer_size: 1024
    shuffle: True
    num_workers: 4
    preprocessor_speech: SpeechPreprocessSpeedPerturb
    preprocessor_speech_conf:
        speed_perturb: [0.9, 1.0, 1.1]

tokenizer: CharTokenizer
tokenizer_conf:
    unk_symbol: <unk>

ctc_conf:
    dropout_rate: 0.0
    ctc_type: builtin
    reduce: true
    ignore_nan_grad: true
normalize: null
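A note on the `optim_conf` / `scheduler_conf` pair in this config: the sketch below shows how the learning rate would evolve if `warmuplr` follows the ESPnet-style `WarmupLR` rule, lr(step) = lr * warmup_steps^0.5 * min(step^-0.5, step * warmup_steps^-1.5). That formula is an assumption (the scheduler implementation is not part of the files shown here), and the helper name `warmup_lr` is hypothetical.

```python
def warmup_lr(step: int, base_lr: float = 0.002, warmup_steps: int = 30000) -> float:
    """Learning rate at a 1-based optimizer step.

    Assumes an ESPnet-style WarmupLR schedule; defaults are copied from
    optim_conf.lr and scheduler_conf.warmup_steps in the YAML above.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # Linear ramp while step < warmup_steps, inverse-square-root decay afterwards.
    return base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)


if __name__ == "__main__":
    for step in (1, 15000, 30000, 120000):
        print(step, warmup_lr(step))
```

Under this assumption the rate peaks at exactly `lr` when `step == warmup_steps` and halves again once the step count reaches four times `warmup_steps`.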
FunASR/examples/aishell/transformer/demo_infer.sh
0 → 120000
View file @
70a8a9e0
../paraformer/demo_infer.sh
\ No newline at end of file
FunASR/examples/aishell/transformer/demo_train_or_finetune.sh
0 → 120000
View file @
70a8a9e0
../paraformer/demo_train_or_finetune.sh
\ No newline at end of file
FunASR/examples/aishell/transformer/local/aishell_data_prep.sh
0 → 100755
View file @
70a8a9e0
#!/bin/bash

# Copyright 2017 Xingyu Na
# Apache 2.0

#. ./path.sh || exit 1;

if [ $# != 3 ]; then
  echo "Usage: $0 <audio-path> <text-path> <output-path>"
  echo " $0 /export/a05/xna/data/data_aishell/wav /export/a05/xna/data/data_aishell/transcript data"
  exit 1;
fi

aishell_audio_dir=$1
aishell_text=$2/aishell_transcript_v0.8.txt
output_dir=$3

train_dir=$output_dir/data/local/train
dev_dir=$output_dir/data/local/dev
test_dir=$output_dir/data/local/test
tmp_dir=$output_dir/data/local/tmp

mkdir -p $train_dir
mkdir -p $dev_dir
mkdir -p $test_dir
mkdir -p $tmp_dir

# data directory check
if [ ! -d $aishell_audio_dir ] || [ ! -f $aishell_text ]; then
  echo "Error: $0 requires two directory arguments"
  exit 1;
fi

# find wav audio file for train, dev and test resp.
find $aishell_audio_dir -iname "*.wav" > $tmp_dir/wav.flist
n=`cat $tmp_dir/wav.flist | wc -l`
[ $n -ne 141925 ] && \
  echo Warning: expected 141925 data files, found $n

grep -i "wav/train" $tmp_dir/wav.flist > $train_dir/wav.flist || exit 1;
grep -i "wav/dev" $tmp_dir/wav.flist > $dev_dir/wav.flist || exit 1;
grep -i "wav/test" $tmp_dir/wav.flist > $test_dir/wav.flist || exit 1;

rm -r $tmp_dir

# Transcriptions preparation
for dir in $train_dir $dev_dir $test_dir; do
  echo Preparing $dir transcriptions
  sed -e 's/\.wav//' $dir/wav.flist | awk -F '/' '{print $NF}' > $dir/utt.list
  paste -d ' ' $dir/utt.list $dir/wav.flist > $dir/wav.scp_all
  utils/filter_scp.pl -f 1 $dir/utt.list $aishell_text > $dir/transcripts.txt
  awk '{print $1}' $dir/transcripts.txt > $dir/utt.list
  utils/filter_scp.pl -f 1 $dir/utt.list $dir/wav.scp_all | sort -u > $dir/wav.scp
  sort -u $dir/transcripts.txt > $dir/text
done

mkdir -p $output_dir/data/train $output_dir/data/dev $output_dir/data/test

for f in wav.scp text; do
  cp $train_dir/$f $output_dir/data/train/$f || exit 1;
  cp $dev_dir/$f $output_dir/data/dev/$f || exit 1;
  cp $test_dir/$f $output_dir/data/test/$f || exit 1;
done

echo "$0: AISHELL data preparation succeeded"
exit 0;
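The transcription loop in this script pairs each utterance ID (the wav file name without its extension) with its path and keeps only utterances that also have a transcript. A rough Python equivalent of that filtering step might look like the sketch below; the function name `build_wav_scp` and the in-memory inputs are illustrative assumptions, not part of the commit, and the real script does this with `sed`, `awk`, `paste` and `utils/filter_scp.pl` on files.

```python
from pathlib import PurePosixPath


def build_wav_scp(wav_paths, transcripts):
    """Return sorted (utt_id, wav_path) pairs for utterances that have a transcript.

    wav_paths:   iterable of wav file paths (as listed in wav.flist)
    transcripts: mapping of utt_id -> transcript text
    """
    entries = {}
    for path in wav_paths:
        name = PurePosixPath(path).name
        # Utterance ID is the file name with a trailing ".wav" removed.
        utt_id = name[:-4] if name.lower().endswith(".wav") else name
        if utt_id in transcripts:  # drop audio without a transcript
            entries[utt_id] = path
    return sorted(entries.items())  # plays the role of `sort -u`


if __name__ == "__main__":
    demo_paths = [
        "/data/wav/train/S0002/BAC009S0002W0123.wav",
        "/data/wav/train/S0002/BAC009S0002W0122.wav",
    ]
    demo_text = {"BAC009S0002W0122": "placeholder transcript"}
    for utt_id, path in build_wav_scp(demo_paths, demo_text):
        print(utt_id, path)
```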
FunASR/examples/aishell/transformer/local/download_and_untar.sh
0 → 100755
View file @
70a8a9e0
#!/usr/bin/env bash

# Copyright   2014  Johns Hopkins University (author: Daniel Povey)
#             2017  Xingyu Na
# Apache 2.0

remove_archive=false

if [ "$1" == --remove-archive ]; then
  remove_archive=true
  shift
fi

if [ $# -ne 3 ]; then
  echo "Usage: $0 [--remove-archive] <data-base> <url-base> <corpus-part>"
  echo "e.g.: $0 /export/a05/xna/data www.openslr.org/resources/33 data_aishell"
  echo "With --remove-archive it will remove the archive after successfully un-tarring it."
  echo "<corpus-part> can be one of: data_aishell, resource_aishell."
  # stop here instead of continuing with missing arguments
  exit 1;
fi

data=$1
url=$2
part=$3

if [ ! -d "$data" ]; then
  echo "$0: no such directory $data"
  exit 1;
fi

part_ok=false
list="data_aishell resource_aishell"
for x in $list; do
  if [ "$part" == $x ]; then part_ok=true; fi
done
if ! $part_ok; then
  echo "$0: expected <corpus-part> to be one of $list, but got '$part'"
  exit 1;
fi

if [ -z "$url" ]; then
  echo "$0: empty URL base."
  exit 1;
fi

if [ -f $data/$part/.complete ]; then
  echo "$0: data part $part was already successfully extracted, nothing to do."
  exit 0;
fi

# sizes of the archive files in bytes.
sizes="15582913665 1246920"

if [ -f $data/$part.tgz ]; then
  size=$(/bin/ls -l $data/$part.tgz | awk '{print $5}')
  size_ok=false
  for s in $sizes; do if [ $s == $size ]; then size_ok=true; fi; done
  if ! $size_ok; then
    echo "$0: removing existing file $data/$part.tgz because its size in bytes $size"
    echo "does not equal the size of one of the archives."
    rm $data/$part.tgz
  else
    echo "$data/$part.tgz exists and appears to be complete."
  fi
fi

if [ ! -f $data/$part.tgz ]; then
  if ! command -v wget >/dev/null; then
    echo "$0: wget is not installed."
    exit 1;
  fi
  full_url=$url/$part.tgz
  echo "$0: downloading data from $full_url. This may take some time, please be patient."

  cd $data || exit 1
  if ! wget --no-check-certificate $full_url; then
    echo "$0: error executing wget $full_url"
    exit 1;
  fi
fi

cd $data || exit 1

if ! tar -xvzf $part.tgz; then
  echo "$0: error un-tarring archive $data/$part.tgz"
  exit 1;
fi

touch $data/$part/.complete

if [ $part == "data_aishell" ]; then
  cd $data/$part/wav || exit 1
  for wav in ./*.tar.gz; do
    echo "Extracting wav from $wav"
    tar -zxf $wav && rm $wav
  done
fi

echo "$0: Successfully downloaded and un-tarred $data/$part.tgz"

if $remove_archive; then
  echo "$0: removing $data/$part.tgz file since --remove-archive option was supplied."
  rm $data/$part.tgz
fi

exit 0;
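The script decides whether an already downloaded archive is usable by comparing its byte size against a short list of known sizes. The same check could be sketched in Python as follows; the function name `archive_looks_complete` is a hypothetical helper, and the two sizes are copied from the script's `sizes` variable.

```python
import os

# Archive sizes in bytes, copied from the `sizes` variable in the script.
KNOWN_ARCHIVE_SIZES = frozenset({15582913665, 1246920})


def archive_looks_complete(path, known_sizes=KNOWN_ARCHIVE_SIZES):
    """True if `path` is an existing file whose size matches a known archive size.

    A size match is only a heuristic, as in the shell script: it does not
    verify the archive contents.
    """
    return os.path.isfile(path) and os.path.getsize(path) in known_sizes


if __name__ == "__main__":
    print(archive_looks_complete("data_aishell.tgz"))
```

A checksum would be a stronger test than a size comparison, but the script shown here only compares sizes, so the sketch does the same.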
FunASR/examples/aishell/transformer/run.sh
0 → 100755
View file @
70a8a9e0
#!/usr/bin/env bash

CUDA_VISIBLE_DEVICES="0,1"

# general configuration
feats_dir="../DATA" #feature output directory
exp_dir=`pwd`
lang=zh
token_type=char
stage=0
stop_stage=5

# feature configuration
nj=32

inference_device="cuda" #"cpu"
inference_checkpoint="model.pt.avg10"
inference_scp="wav.scp"
inference_batch_size=1

# data
raw_data=../raw_data
data_url=www.openslr.org/resources/33

# exp tag
tag="exp1"
workspace=`pwd`

master_port=12345

. utils/parse_options.sh || exit 1;

# Set bash to 'debug' mode, it will exit on :
# -e 'error', -u 'undefined variable', -o ... 'error in pipeline', -x 'print commands',
set -e
set -u
set -o pipefail

train_set=train
valid_set=dev
test_sets="dev test"

config=transformer_12e_6d_2048_256.yaml
model_dir="baseline_$(basename "${config}" .yaml)_${lang}_${token_type}_${tag}"

if [ ${stage} -le -1 ] && [ ${stop_stage} -ge -1 ]; then
    echo "stage -1: Data Download"
    mkdir -p ${raw_data}
    local/download_and_untar.sh ${raw_data} ${data_url} data_aishell
    local/download_and_untar.sh ${raw_data} ${data_url} resource_aishell
fi

if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
    echo "stage 0: Data preparation"
    # Data preparation
    local/aishell_data_prep.sh ${raw_data}/data_aishell/wav ${raw_data}/data_aishell/transcript ${feats_dir}
    for x in train dev test; do
        cp ${feats_dir}/data/${x}/text ${feats_dir}/data/${x}/text.org
        paste -d " " <(cut -f 1 -d " " ${feats_dir}/data/${x}/text.org) <(cut -f 2- -d " " ${feats_dir}/data/${x}/text.org | tr -d " ") \
            > ${feats_dir}/data/${x}/text
        utils/text2token.py -n 1 -s 1 ${feats_dir}/data/${x}/text > ${feats_dir}/data/${x}/text.org
        mv ${feats_dir}/data/${x}/text.org ${feats_dir}/data/${x}/text

        # convert wav.scp text to jsonl
        scp_file_list_arg="++scp_file_list='[\"${feats_dir}/data/${x}/wav.scp\",\"${feats_dir}/data/${x}/text\"]'"
        python ../../../funasr/datasets/audio_datasets/scp2jsonl.py \
        ++data_type_list='["source", "target"]' \
        ++jsonl_file_out=${feats_dir}/data/${x}/audio_datasets.jsonl \
        ${scp_file_list_arg}
    done
fi

if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
    echo "stage 1: Feature and CMVN Generation"
    python ../../../funasr/bin/compute_audio_cmvn.py \
    --config-path "${workspace}/conf" \
    --config-name "${config}" \
    ++train_data_set_list="${feats_dir}/data/${train_set}/audio_datasets.jsonl" \
    ++cmvn_file="${feats_dir}/data/${train_set}/cmvn.json"
fi

token_list=${feats_dir}/data/${lang}_token_list/$token_type/tokens.txt
echo "dictionary: ${token_list}"
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
    echo "stage 2: Dictionary Preparation"
    mkdir -p ${feats_dir}/data/${lang}_token_list/$token_type/

    echo "make a dictionary"
    echo "<blank>" > ${token_list}
    echo "<s>" >> ${token_list}
    echo "</s>" >> ${token_list}
    utils/text2token.py -s 1 -n 1 --space "" ${feats_dir}/data/$train_set/text | cut -f 2- -d " " | tr " " "\n" \
    | sort | uniq | grep -a -v -e '^\s*$' | awk '{print $0}' >> ${token_list}
    echo "<unk>" >> ${token_list}
fi

# LM Training Stage
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
    echo "stage 3: LM Training"
fi

# ASR Training Stage
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
    echo "stage 4: ASR Training"

    mkdir -p ${exp_dir}/exp/${model_dir}
    current_time=$(date "+%Y-%m-%d_%H-%M")
    log_file="${exp_dir}/exp/${model_dir}/train.log.txt.${current_time}"
    echo "log_file: ${log_file}"

    export CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES
    gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
    torchrun \
    --nnodes 1 \
    --nproc_per_node ${gpu_num} \
    --master_port ${master_port} \
    ../../../funasr/bin/train.py \
    --config-path "${workspace}/conf" \
    --config-name "${config}" \
    ++train_data_set_list="${feats_dir}/data/${train_set}/audio_datasets.jsonl" \
    ++valid_data_set_list="${feats_dir}/data/${valid_set}/audio_datasets.jsonl" \
    ++tokenizer_conf.token_list="${token_list}" \
    ++frontend_conf.cmvn_file="${feats_dir}/data/${train_set}/am.mvn" \
    ++output_dir="${exp_dir}/exp/${model_dir}" &> ${log_file}
fi

# Testing Stage
if [ ${stage} -le 5 ] && [ ${stop_stage} -ge 5 ]; then
    echo "stage 5: Inference"

    if [ ${inference_device} == "cuda" ]; then
        nj=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
    else
        inference_batch_size=1
        CUDA_VISIBLE_DEVICES=""
        for JOB in $(seq ${nj}); do
            CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"-1,"
        done
    fi

    for dset in ${test_sets}; do

        inference_dir="${exp_dir}/exp/${model_dir}/inference-${inference_checkpoint}/${dset}"
        _logdir="${inference_dir}/logdir"
        echo "inference_dir: ${inference_dir}"

        mkdir -p "${_logdir}"
        data_dir="${feats_dir}/data/${dset}"
        key_file=${data_dir}/${inference_scp}

        split_scps=
        for JOB in $(seq "${nj}"); do
            split_scps+=" ${_logdir}/keys.${JOB}.scp"
        done
        utils/split_scp.pl "${key_file}" ${split_scps}

        gpuid_list_array=(${CUDA_VISIBLE_DEVICES//,/ })
        for JOB in $(seq ${nj}); do
            {
                id=$((JOB-1))
                gpuid=${gpuid_list_array[$id]}

                export CUDA_VISIBLE_DEVICES=${gpuid}
                python ../../../funasr/bin/inference.py \
                --config-path="${exp_dir}/exp/${model_dir}" \
                --config-name="config.yaml" \
                ++init_param="${exp_dir}/exp/${model_dir}/${inference_checkpoint}" \
                ++tokenizer_conf.token_list="${token_list}" \
                ++frontend_conf.cmvn_file="${feats_dir}/data/${train_set}/am.mvn" \
                ++input="${_logdir}/keys.${JOB}.scp" \
                ++output_dir="${inference_dir}/${JOB}" \
                ++device="${inference_device}" \
                ++ncpu=1 \
                ++disable_log=true \
                ++batch_size="${inference_batch_size}" &> ${_logdir}/log.${JOB}.txt
            }&
        done
        wait

        mkdir -p ${inference_dir}/1best_recog
        for f in token score text; do
            if [ -f "${inference_dir}/${JOB}/1best_recog/${f}" ]; then
                for JOB in $(seq "${nj}"); do
                    cat "${inference_dir}/${JOB}/1best_recog/${f}"
                done | sort -k1 >"${inference_dir}/1best_recog/${f}"
            fi
        done

        echo "Computing WER ..."
        python utils/postprocess_text_zh.py ${inference_dir}/1best_recog/text ${inference_dir}/1best_recog/text.proc
        python utils/postprocess_text_zh.py ${data_dir}/text ${inference_dir}/1best_recog/text.ref
        python utils/compute_wer.py ${inference_dir}/1best_recog/text.ref ${inference_dir}/1best_recog/text.proc ${inference_dir}/1best_recog/text.cer
        tail -n 3 ${inference_dir}/1best_recog/text.cer
    done

fi
\ No newline at end of file
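Two small conventions carry most of the control flow in `run.sh`: a stage runs only when `stage <= n <= stop_stage`, and the list of GPU IDs is obtained by splitting `CUDA_VISIBLE_DEVICES` on commas. The Python sketch below restates both for readers less familiar with the shell idioms; the function names `stage_enabled` and `visible_gpu_ids` are hypothetical and do not appear in the commit.

```python
def stage_enabled(n: int, stage: int, stop_stage: int) -> bool:
    """Mirror of `[ ${stage} -le n ] && [ ${stop_stage} -ge n ]`."""
    return stage <= n <= stop_stage


def visible_gpu_ids(cuda_visible_devices: str) -> list:
    """Mirror of `(${CUDA_VISIBLE_DEVICES//,/ })`: split on commas, drop empty fields."""
    return [tok for tok in cuda_visible_devices.split(",") if tok]


if __name__ == "__main__":
    # With the script defaults (stage=0, stop_stage=5) the download stage -1 is skipped.
    print([n for n in range(-1, 6) if stage_enabled(n, stage=0, stop_stage=5)])
    print(visible_gpu_ids("0,1"))
    # The CPU branch builds a string like "-1,-1," with a trailing comma.
    print(visible_gpu_ids("-1,-1,"))
```

Note that the script counts jobs with `awk -F "," '{print NF}'`, which would also count an empty field after a trailing comma, whereas the array expansion (and this sketch) drops it.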
FunASR/examples/aishell/transformer/utils
0 → 120000
View file @
70a8a9e0
../paraformer/utils
\ No newline at end of file
FunASR/examples/common_voice/whisper_lid/demo_funasr.py
0 → 100644
View file @
70a8a9e0
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
# Copyright FunASR (https://github.com/alibaba-damo-academy/FunASR). All Rights Reserved.
# MIT License (https://opensource.org/licenses/MIT)

from funasr import AutoModel

multilingual_wavs = [
    "example_zh-CN.mp3",
    "example_en.mp3",
    "example_ja.mp3",
    "example_ko.mp3",
]

model = AutoModel(model="iic/speech_whisper-large_lid_multilingual_pytorch")
for wav_id in multilingual_wavs:
    wav_file = f"{model.model_path}/examples/{wav_id}"
    res = model.generate(input=wav_file, data_type="sound", inference_clip_length=250)
    print("detect sample {}: {}".format(wav_id, res))
FunASR/examples/common_voice/whisper_lid/demo_modelscope.py
0 → 100644
View file @
70a8a9e0
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
# Copyright FunASR (https://github.com/alibaba-damo-academy/FunASR). All Rights Reserved.
# MIT License (https://opensource.org/licenses/MIT)

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

multilingual_wavs = [
    "https://www.modelscope.cn/api/v1/models/iic/speech_whisper-large_lid_multilingual_pytorch/repo?Revision=master&FilePath=examples/example_zh-CN.mp3",
    "https://www.modelscope.cn/api/v1/models/iic/speech_whisper-large_lid_multilingual_pytorch/repo?Revision=master&FilePath=examples/example_en.mp3",
    "https://www.modelscope.cn/api/v1/models/iic/speech_whisper-large_lid_multilingual_pytorch/repo?Revision=master&FilePath=examples/example_ja.mp3",
    "https://www.modelscope.cn/api/v1/models/iic/speech_whisper-large_lid_multilingual_pytorch/repo?Revision=master&FilePath=examples/example_ko.mp3",
]

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition, model="iic/speech_whisper-large_lid_multilingual_pytorch"
)

for wav in multilingual_wavs:
    rec_result = inference_pipeline(input=wav, inference_clip_length=250)
    print(rec_result)
FunASR/examples/deepspeed_conf/ds_stage1.json
0 → 100644
View file @
70a8a9e0
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 1,
  "steps_per_print": 100,
  "gradient_clipping": 5,
  "fp16": {
    "enabled": false,
    "auto_cast": false,
    "loss_scale": 0,
    "initial_scale_power": 16,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "consecutive_hysteresis": false,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": true
  },
  "zero_force_ds_cpu_optimizer": false,
  "zero_optimization": {
    "stage": 1,
    "offload_optimizer": {
      "device": "none",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true
  }
}
FunASR/examples/deepspeed_conf/ds_stage2.json
0 → 100644
View file @
70a8a9e0
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 1,
  "steps_per_print": 100,
  "gradient_clipping": 5,
  "fp16": {
    "enabled": false,
    "auto_cast": false,
    "loss_scale": 0,
    "initial_scale_power": 16,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "consecutive_hysteresis": false,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": true
  },
  "zero_force_ds_cpu_optimizer": false,
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "none",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": false,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true
  }
}
FunASR/examples/deepspeed_conf/ds_stage3.json
0 → 100644
View file @
70a8a9e0
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 1,
  "steps_per_print": 100,
  "gradient_clipping": 5,
  "fp16": {
    "enabled": false,
    "auto_cast": false,
    "loss_scale": 0,
    "initial_scale_power": 16,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "consecutive_hysteresis": false,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": true
  },
  "zero_force_ds_cpu_optimizer": false,
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {
      "device": "none",
      "pin_memory": true
    },
    "offload_param": {
      "device": "none",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true,
    "stage3_max_live_parameters": 1e9,
    "stage3_max_reuse_distance": 1e9,
    "stage3_prefetch_bucket_size": 5e8,
    "stage3_param_persistence_threshold": 1e5
  }
}
FunASR/examples/deepspeed_conf/ds_z0_config.json
0 → 100644
View file @
70a8a9e0
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 0,
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true,
    "round_robin_gradients": true
  }
}
\ No newline at end of file
FunASR/examples/deepspeed_conf/ds_z2_config.json
0 → 100644
View file @
70a8a9e0
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 2,
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true,
    "round_robin_gradients": true
  }
}
\ No newline at end of file
FunASR/examples/deepspeed_conf/ds_z2_offload_config.json
0 → 100644
View file @
70a8a9e0
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients": true,
    "round_robin_gradients": true
  }
}
\ No newline at end of file
FunASR/examples/deepspeed_conf/ds_z3_config.json
0 → 100644
View file @
70a8a9e0
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "sub_group_size": 1e9,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_max_live_parameters": 1e9,
    "stage3_max_reuse_distance": 1e9,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
\ No newline at end of file
FunASR/examples/deepspeed_conf/ds_z3_offload_config.json
0 → 100644
View file @
70a8a9e0
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "offload_param": {
      "device": "cpu",
      "pin_memory": true
    },
    "overlap_comm": true,
    "contiguous_gradients": true,
    "sub_group_size": 1e9,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_max_live_parameters": 1e9,
    "stage3_max_reuse_distance": 1e9,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
\ No newline at end of file
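The DeepSpeed configs in this directory either pin `train_micro_batch_size_per_gpu` and `gradient_accumulation_steps` to 1 (the `ds_stage*.json` files) or leave the batch settings as `"auto"` (the `ds_z*_config.json` files). In both cases DeepSpeed expects the global batch size to equal the per-GPU micro batch size times the accumulation steps times the number of GPUs. The sketch below only illustrates that arithmetic; the helper name `effective_train_batch_size` is hypothetical, and how the `"auto"` values get resolved depends on the training wrapper that loads these files, which is not shown in this commit.

```python
def effective_train_batch_size(micro_batch_per_gpu: int,
                               grad_accum_steps: int,
                               world_size: int) -> int:
    """Global batch size implied by the three DeepSpeed batch settings.

    train_batch_size = train_micro_batch_size_per_gpu
                       * gradient_accumulation_steps
                       * number of GPUs (world size)
    """
    for name, value in (("micro_batch_per_gpu", micro_batch_per_gpu),
                        ("grad_accum_steps", grad_accum_steps),
                        ("world_size", world_size)):
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")
    return micro_batch_per_gpu * grad_accum_steps * world_size


if __name__ == "__main__":
    # ds_stage1.json values (1 and 1) on the two GPUs that run.sh exposes by default.
    print(effective_train_batch_size(1, 1, 2))
```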
FunASR/examples/industrial_data_pretraining/bicif_paraformer/demo.py
0 → 100644
View file @
70a8a9e0
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
# Copyright FunASR (https://github.com/alibaba-damo-academy/FunASR). All Rights Reserved.
# MIT License (https://opensource.org/licenses/MIT)

from funasr import AutoModel

model = AutoModel(
    model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    punc_model="iic/punc_ct-transformer_cn-en-common-vocab471067-large",
    # spk_model="iic/speech_campplus_sv_zh-cn_16k-common",
)

res = model.generate(
    input="https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_vad_punc_example.wav",
    batch_size_s=300,
    batch_size_threshold_s=60,
)
print(res)
FunASR/examples/industrial_data_pretraining/bicif_paraformer/demo.sh
0 → 100644
View file @
70a8a9e0
model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch"

vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-pytorch"
#punc_model="iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch"
punc_model="iic/punc_ct-transformer_cn-en-common-vocab471067-large"
spk_model="iic/speech_campplus_sv_zh-cn_16k-common"

python funasr/bin/inference.py \
+model=${model} \
+vad_model=${vad_model} \
+punc_model=${punc_model} \
+spk_model=${spk_model} \
+input="https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_vad_punc_example.wav" \
+output_dir="./outputs/debug" \
+device="cpu" \
+batch_size_s=300 \
+batch_size_threshold_s=60
FunASR/examples/industrial_data_pretraining/bicif_paraformer/export.py
0 → 100644
View file @
70a8a9e0
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
# Copyright FunASR (https://github.com/alibaba-damo-academy/FunASR). All Rights Reserved.
# MIT License (https://opensource.org/licenses/MIT)

# method1, inference from model hub

from funasr import AutoModel

model = AutoModel(
    model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    device="cpu",
)

res = model.export(type="torchscript", quantize=False)
print(res)


# # method2, inference from local path
# from funasr import AutoModel
#
# model = AutoModel(
#     model="/Users/zhifu/.cache/modelscope/hub/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
#     device="cpu",
# )
#
# res = model.export(type="onnx", quantize=False)
# print(res)