Commit 12496e5

LauraGPT, lyblsgo, and R1ckShi authored
streaming bugfix (#1271)
* funasr1.0 finetune
* funasr1.0 pbar
* update with main (#1260)
    * Update websocket_protocol_zh.md
    * update
* update with main (#1264)
    * Funasr1.0 (#1261): funasr1.0 finetune, funasr1.0 pbar, update with main (#1260)
* bug fix
* funasr1.0 sanm scama
* funasr1.0 infer_after_finetune
* funasr1.0 fsmn-vad bug fix

Co-authored-by: Yabin Li <[email protected]>
Co-authored-by: shixian.shi <[email protected]>
1 parent b28f3c9 · commit 12496e5

File tree

3 files changed: +5 −5 lines

  • examples/industrial_data_pretraining/paraformer_streaming/demo.py
  • funasr/models/fsmn_vad_streaming/model.py
  • funasr/models/paraformer_streaming/model.py


examples/industrial_data_pretraining/paraformer_streaming/demo.py

Lines changed: 0 additions & 1 deletion
@@ -10,7 +10,6 @@
 decoder_chunk_look_back = 1 #number of encoder chunks to lookback for decoder cross-attention

 model = AutoModel(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online", model_revision="v2.0.2")
-cache = {}
 res = model.generate(input="https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav",
                      chunk_size=chunk_size,
                      encoder_chunk_look_back=encoder_chunk_look_back,
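The removed `cache = {}` sat in front of a one-shot `generate(...)` call that decodes a full audio URL, where the dict appears unused. For contrast, here is a minimal sketch of the chunk-by-chunk streaming pattern this demo directory targets, where a single cache dict must persist across calls (following FunASR's documented streaming usage; `chunk_size = [0, 10, 5]` and the local wav filename are assumptions, not part of this diff):

# Sketch of FunASR streaming usage; chunk_size values and the wav path are assumptions.
import soundfile
from funasr import AutoModel

chunk_size = [0, 10, 5]       # 600 ms chunks (10 x 60 ms), per the streaming docs
encoder_chunk_look_back = 4   # encoder chunks of left context for self-attention
decoder_chunk_look_back = 1   # encoder chunks to look back for decoder cross-attention

model = AutoModel(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online",
                  model_revision="v2.0.2")

speech, sample_rate = soundfile.read("asr_example_zh.wav")  # hypothetical local file
chunk_stride = chunk_size[1] * 960  # samples per chunk at 16 kHz (10 x 60 ms x 16)

cache = {}  # one persistent dict, created once and shared across all chunk calls
total_chunk_num = (len(speech) - 1) // chunk_stride + 1
for i in range(total_chunk_num):
    speech_chunk = speech[i * chunk_stride:(i + 1) * chunk_stride]
    is_final = i == total_chunk_num - 1
    res = model.generate(input=speech_chunk,
                         cache=cache,
                         is_final=is_final,
                         chunk_size=chunk_size,
                         encoder_chunk_look_back=encoder_chunk_look_back,
                         decoder_chunk_look_back=decoder_chunk_look_back)
    print(res)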

funasr/models/fsmn_vad_streaming/model.py

Lines changed: 4 additions & 2 deletions
@@ -501,7 +501,9 @@ def forward(self, feats: torch.Tensor, waveform: torch.tensor, cache: dict = {},
         # self.AllResetDetection()
         return segments

+
     def init_cache(self, cache: dict = {}, **kwargs):
+
         cache["frontend"] = {}
         cache["prev_samples"] = torch.empty(0)
         cache["encoder"] = {}
@@ -528,7 +530,7 @@ def inference(self,
                   cache: dict = {},
                   **kwargs,
                   ):
-
+
         if len(cache) == 0:
             self.init_cache(cache, **kwargs)

@@ -583,7 +585,7 @@ def inference(self,

         cache["prev_samples"] = audio_sample[:-m]
         if _is_final:
-            cache = {}
+            self.init_cache(cache)

         ibest_writer = None
         if ibest_writer is None and kwargs.get("output_dir") is not None:
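The substantive fix is the last hunk: inside `inference`, the old `cache = {}` only rebound the local parameter name, so the caller's dict, passed in again for the next utterance, kept the finished stream's state. `self.init_cache(cache)` instead repopulates the same dict object in place. A standalone illustration of the difference (plain Python, not FunASR code; the `reset_*` names and keys are placeholders):

# Why `cache = {}` cannot reset the caller's state but in-place mutation can.

def reset_by_rebinding(cache: dict):
    cache = {}  # rebinds the local name only; the caller's dict is untouched

def reset_in_place(cache: dict):
    # stand-in for self.init_cache(cache): overwrite keys on the shared object
    cache.clear()
    cache["frontend"] = {}
    cache["encoder"] = {}

stream_cache = {"encoder": {"hidden": "stale state from the last utterance"}}

reset_by_rebinding(stream_cache)
print(stream_cache)  # {'encoder': {'hidden': ...}} -- the stale state survives

reset_in_place(stream_cache)
print(stream_cache)  # {'frontend': {}, 'encoder': {}} -- correctly reset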

funasr/models/paraformer_streaming/model.py

Lines changed: 1 addition & 2 deletions
@@ -502,8 +502,7 @@ def inference(self,
             logging.info("enable beam_search")
             self.init_beam_search(**kwargs)
             self.nbest = kwargs.get("nbest", 1)
-
-
+
         if len(cache) == 0:
             self.init_cache(cache, **kwargs)

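Taken together, both streaming models now follow the same cache lifecycle: lazily initialize the shared dict on the first chunk, then reset it in place when the final chunk arrives. A condensed sketch of that lifecycle (hypothetical class; only the `init_cache` keys shown in the fsmn_vad diff above are real):

import torch

class StreamingSketch:
    # Hypothetical condensation of the cache lifecycle, not FunASR's exact code.

    def init_cache(self, cache: dict, **kwargs):
        cache["frontend"] = {}
        cache["prev_samples"] = torch.empty(0)
        cache["encoder"] = {}

    def inference(self, chunk, cache: dict, is_final: bool = False, **kwargs):
        if len(cache) == 0:          # first chunk of a new stream
            self.init_cache(cache, **kwargs)

        # ... per-chunk feature extraction and decoding would happen here ...

        if is_final:
            self.init_cache(cache)   # reset in place (not `cache = {}`), so the
                                     # caller's dict is fresh for the next stream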
