What does it mean that these cache-aware streaming conformer methods simulate cache-aware streaming? #10388

maxeduc · 2024-09-08T00:40:35Z

See these sources:

Line 236 in cda2a63

    
           To simulate cache-aware streaming, you may use the script at ``<NeMo_git_root>/examples/asr/asr_cache_aware_streaming/speech_to_text_cache_aware_streaming_infer.py``. It can simulate streaming in single stream or multi-stream mode (in batches) for an ASR model.

NeMo/nemo/collections/asr/parts/mixins/mixins.py

Line 590 in cda2a63

It simulates a forward step with caching for streaming purposes.

NeMo/nemo/collections/asr/parts/mixins/mixins.py

Line 714 in cda2a63

def transcribe_simulate_cache_aware_streaming(

Do these methods merely "simulate" cache-aware streaming, or do they actually perform it? Models trained natively with cache-aware streaming are available, e.g. here. Does calling a function such as conformer_stream_step() repeatedly, as is done in the notebook here, actually perform the streaming step with the appropriate optimizations? Or does it only produce output that is logically identical to cache-aware streaming without the optimization, for example by still feeding large batches of context into the model and then discarding the extra output?
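To make the distinction I'm asking about concrete, here is a minimal toy sketch in plain Python. It is not NeMo code and says nothing about what conformer_stream_step() does internally; `LEFT_CONTEXT`, `causal_sum`, `stream_step`, and `run_streaming` are hypothetical names made up for illustration. It compares an offline pass over the full input with a chunked pass that only ever sees the new chunk plus a small cache of past frames, and shows that both can give the same output when the operation has limited left context.

```python
from typing import List, Tuple

LEFT_CONTEXT = 3  # hypothetical: number of past frames each output may look at


def causal_sum(frames: List[float], left_context: int = LEFT_CONTEXT) -> List[float]:
    """Offline reference: output t sums frame t and up to `left_context` past frames."""
    return [sum(frames[max(0, t - left_context): t + 1]) for t in range(len(frames))]


def stream_step(
    chunk: List[float], cache: List[float], left_context: int = LEFT_CONTEXT
) -> Tuple[List[float], List[float]]:
    """One streaming step: uses only the new chunk plus the cached past frames."""
    window = cache + chunk
    offset = len(cache)  # index of the first new frame inside `window`
    outputs = [
        sum(window[max(0, offset + i - left_context): offset + i + 1])
        for i in range(len(chunk))
    ]
    # Keep only the most recent `left_context` frames for the next step.
    new_cache = window[-left_context:] if left_context > 0 else []
    return outputs, new_cache


def run_streaming(
    frames: List[float], chunk_size: int, left_context: int = LEFT_CONTEXT
) -> List[float]:
    """Feed the input chunk by chunk, carrying the cache between steps."""
    cache: List[float] = []
    outputs: List[float] = []
    for start in range(0, len(frames), chunk_size):
        out, cache = stream_step(frames[start:start + chunk_size], cache, left_context)
        outputs.extend(out)
    return outputs


if __name__ == "__main__":
    frames = [float(v) for v in range(1, 11)]
    print(causal_sum(frames) == run_streaming(frames, chunk_size=4))
```

In this toy, the chunked pass never touches more than `LEFT_CONTEXT` old frames per step, which is what I would expect "performing" cache-aware streaming to mean. An unoptimized variant could instead re-run `causal_sum` on everything seen so far and keep only the newest outputs, producing identical results at a much higher cost. My question is which of these two the NeMo methods correspond to.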

Replies: 0 comments
