High Memory Usage when Iterating over a cursor with many documents #490

Open
myitinos opened this issue Jun 7, 2024 · 1 comment
Labels
bug Something isn't working

Comments


myitinos commented Jun 7, 2024

Bug

I have a use case where I need to run a query against a collection that returns about 1M objects. While my script is running, I see memory usage upward of 1 GB. I thought this shouldn't happen, since I am iterating over the cursor and not reading all objects into memory at once.

Current Behavior

async for data in ENGINE.find(Data):
    # do something with data

This ends up keeping every object in memory because of the __aiter__ method in AIOCursor, which appends each parsed instance to a list while yielding it:

async def __aiter__(self) -> AsyncGenerator[ModelType, None]:
    if self._results is not None:
        for res in self._results:
            yield res
        return
    results = []
    async for raw_doc in self._cursor:
        instance = self._parse_document(raw_doc)
        results.append(instance)
        yield instance
    self._results = results
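
To illustrate the effect without a database, here is a minimal, self-contained sketch of the same caching pattern. FakeCursor, CachingCursor and count_cached are hypothetical names made up for this example (they are not ODMantic APIs), and model parsing is left out. The point is only that the list built inside __aiter__ keeps a reference to every document, even though the caller keeps none.

```python
import asyncio
from typing import AsyncIterator, List, Optional


class FakeCursor:
    """Hypothetical stand-in for the driver cursor: yields n raw documents."""

    def __init__(self, n: int) -> None:
        self._n = n

    async def __aiter__(self) -> AsyncIterator[dict]:
        for i in range(self._n):
            yield {"_id": i}


class CachingCursor:
    """Simplified copy of the caching pattern quoted above (no model parsing)."""

    def __init__(self, cursor: FakeCursor) -> None:
        self._cursor = cursor
        self._results: Optional[List[dict]] = None

    async def __aiter__(self) -> AsyncIterator[dict]:
        if self._results is not None:
            for res in self._results:
                yield res
            return
        results = []
        async for raw_doc in self._cursor:
            results.append(raw_doc)  # every document stays referenced here
            yield raw_doc
        self._results = results


async def count_cached(n: int) -> int:
    """Iterate without keeping anything, then report how many items the cursor retained."""
    cursor = CachingCursor(FakeCursor(n))
    async for _ in cursor:
        pass  # the caller keeps nothing
    return len(cursor._results or [])


if __name__ == "__main__":
    print(asyncio.run(count_cached(1000)))
```

With this pattern, the number of retained objects grows with the size of the result set, which matches the memory growth described above.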

Expected behavior

Shouldn't __aiter__ yield each instance without caching it in memory, since I'm iterating over the cursor rather than reading all objects into memory?

For example:

async def __aiter__(self) -> AsyncGenerator[ModelType, None]:
    async for raw_doc in self._cursor:
        instance = self._parse_document(raw_doc)
        yield instance
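
The proposed streaming behavior can be checked in isolation with a small sketch. Doc, FakeCursor, StreamingCursor and count_alive are hypothetical names for this example only, not ODMantic APIs. The sketch tracks each yielded object through a weak reference and counts how many are still alive after the loop. It assumes CPython, where an object is freed as soon as its last reference goes away.

```python
import asyncio
import gc
import weakref
from typing import AsyncIterator, List


class Doc:
    """Hypothetical document type (plain dicts cannot be weakly referenced)."""

    def __init__(self, i: int) -> None:
        self.i = i


class FakeCursor:
    """Hypothetical stand-in for the driver cursor: yields n documents."""

    def __init__(self, n: int) -> None:
        self._n = n

    async def __aiter__(self) -> AsyncIterator[Doc]:
        for i in range(self._n):
            yield Doc(i)


class StreamingCursor:
    """Yields each document without storing it anywhere."""

    def __init__(self, cursor: FakeCursor) -> None:
        self._cursor = cursor

    async def __aiter__(self) -> AsyncIterator[Doc]:
        async for raw_doc in self._cursor:
            yield raw_doc


async def count_alive(n: int) -> int:
    """Iterate over n documents (n > 0) and count how many survive the loop."""
    refs: List[weakref.ReferenceType] = []
    async for doc in StreamingCursor(FakeCursor(n)):
        refs.append(weakref.ref(doc))  # a weak reference does not keep doc alive
    del doc  # drop the loop variable's reference to the last document
    gc.collect()
    return sum(1 for ref in refs if ref() is not None)


if __name__ == "__main__":
    print(asyncio.run(count_alive(1000)))
```

The trade-off in this sketch is that a streaming cursor keeps nothing to replay, so a second pass over the same results would need a new query.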

Environment

  • ODMantic version: 1.0.2
  • MongoDB version: 7.0.9
  • Pydantic infos (output of python -c "import pydantic.utils; print(pydantic.utils.version_info())"):
             pydantic version: 2.7.3
        pydantic-core version: 2.18.4
          pydantic-core build: profile=release pgo=true
                 install path: /home/<redacted>/.venv/lib/python3.12/site-packages/pydantic
               python version: 3.12.3 (main, Apr 27 2024, 19:00:26) [GCC 9.4.0]
                     platform: Linux-5.4.0-182-generic-x86_64-with-glibc2.31
             related packages: mypy-1.10.0 typing_extensions-4.12.1 pydantic-settings-2.3.1 fastapi-0.111.0
                       commit: unknown

Additional context

Curious why it needs to cache instances in a private variable when I'm iterating over the cursor.

myitinos added the bug label Jun 7, 2024

z0z0r4 commented Dec 31, 2024

I have the same question. It's easy to hit an OOM when any find operation fetches a large number of documents...

I am not sure this behavior is reasonable; maybe there is a mistake somewhere.
