The numbers suggest that EncDecCTCModelBPE leaks RAM.
I'm running inference with batch_size=1 on a directory of 10k small audio files with a total duration of around 9.5 hours.
Does anybody know where this excessive RAM usage comes from?
Here is a sample script:
import time
from pathlib import Path

import psutil
import torch
from nemo.collections.asr.models import EncDecCTCModelBPE

torch.set_num_threads(1)
proc_usage = psutil.Process()
cur_usage_rss = proc_usage.memory_info().rss / 1024 / 1024
print(f"Initial rss usage:{cur_usage_rss}")

dir_name = "/mnt/data/subset_10k"
audio_paths = Path(dir_name)


def find_leaks():
    model = EncDecCTCModelBPE.from_pretrained(model_name="stt_en_conformer_ctc_medium")
    res_file_path = f"memory_usage3_start_{time.time()}.txt"
    fd = open(res_file_path, "w")
    files = audio_paths.glob("*.wav")
    for idx, a_path in enumerate(files):
        res = model.transcribe(str(a_path), verbose=False)
        if idx % 100 == 0:
            cur_usage_rss = proc_usage.memory_info().rss / 1024 / 1024
            msg = f"{idx: <10} rss usage:{cur_usage_rss}"
            print(msg)
            print(msg, file=fd)
    fd.close()


if __name__ == "__main__":
    find_leaks()
And a sample output:
Expected behavior
Constant memory consumption over time.
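One way to narrow down where the growth comes from is to compare `tracemalloc` snapshots taken before and after a batch of iterations. The sketch below is only a diagnostic aid, not a fix: `top_python_growth` and `step` are hypothetical names, and `step` stands in for one call such as `model.transcribe(...)`. Note that `tracemalloc` only sees allocations made through Python's allocator, so if RSS keeps growing while this reports little or nothing, the growth is more likely in native code (for example, inside PyTorch).

```python
import gc
import tracemalloc


def top_python_growth(step, n_steps, top=5):
    """Run step(i) n_steps times and return the source lines whose
    Python-level allocations grew the most, as (location, bytes) pairs.

    `step` is a hypothetical callable standing in for one unit of work,
    e.g. a single model.transcribe(...) call.
    """
    tracemalloc.start()
    try:
        step(0)  # warm-up, so one-time allocations are not counted as growth
        gc.collect()
        before = tracemalloc.take_snapshot()
        for i in range(1, n_steps):
            step(i)
        gc.collect()  # drop collectable garbage before measuring
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Group the difference between the two snapshots by source line.
    stats = after.compare_to(before, "lineno")
    return [(str(s.traceback[0]), s.size_diff) for s in stats[:top] if s.size_diff > 0]


if __name__ == "__main__":
    leaked = []

    def leaky_step(i):
        # Deliberately keeps a reference alive to imitate a leak.
        leaked.append(bytearray(100_000))

    for location, grown_bytes in top_python_growth(leaky_step, 50):
        print(f"{location}: +{grown_bytes / 1024:.0f} KiB")
```

In the script above, the same idea would wrap the `model.transcribe(...)` call in `step`; a small `n_steps` is enough, since tracing slows execution noticeably.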
Environment overview (please complete the following information)
python -m pip install git+https://github.com/NVIDIA/NeMo.git@{2.0.0rc0}#egg=nemo_toolkit[all]
Environment details