BUG: too many open files #731
It's a system limit on the number of open files or connections. You can refer to https://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/ to increase it.
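For reference, a minimal sketch of raising the soft open-files limit from within Python before initializing Xorbits. This only lifts the soft limit up to the existing hard limit; raising the hard limit system-wide is covered by the linked article. The resource module is POSIX-only (Linux/macOS).

import resource

# Raise this process's soft RLIMIT_NOFILE up to its hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
print("open-files limit (soft, hard):", resource.getrlimit(resource.RLIMIT_NOFILE))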
I tried, but it's not working. Is it possible to limit the number of processes in Xorbits?
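(A sketch of one option, assuming your Xorbits version's local init exposes n_worker / n_cpu arguments; this is not a fix confirmed in this issue, so verify the signature of xorbits.init in your installed release.)

import xorbits

# Assumed knobs: capping workers/CPU slots should reduce the number of sub-pool
# processes (and hence sockets/file descriptors) the local cluster creates.
xorbits.init(n_worker=1, n_cpu=4)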
Hi, could you please provide the complete error stack and message for us to debug? Thanks.
Also, you could try to use the mmap backend to run your code; the snippet below shows how to enable it. Keep your code and just add this line when Xorbits is initialized.
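Laid out on separate lines, the suggested snippet is as follows ("<your dir>" is a placeholder for a writable directory used for mmap storage):

import xorbits

# Enable the mmap storage backend; replace "<your dir>" with a directory on disk.
xorbits.init(storage_config={"mmap": {"root_dirs": "<your dir>"}})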
Thanks, I'll have a try.
I met this issue again. Please help me solve it.
===================================
The running code is as below:

import xorbits.pandas as pd
from xorbits.experimental import dedup

# threshold, num_perm, min_length, ngrams and seed are set elsewhere in the script
data_lst = [{'query': 'xxx', 'ori_txt': 'xxx'}, {'query': 'xxx', 'ori_txt': 'xxx'}]
df = pd.DataFrame(data_lst)
res = dedup(df, col="query", method="minhash", threshold=threshold,
            num_perm=num_perm, min_length=min_length, ngrams=ngrams, seed=seed,
            verbose=True)
print('ori len: ', len(data_lst))
print('dedup len: ', len(res['query'].tolist()))

data_lst contains 10k+ records.
===================================
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/forkserver.py", line 258, in main
fds = reduction.recvfds(s, MAXFDS_TO_SEND + 1)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/reduction.py", line 159, in recvfds
raise EOFError
EOFError
Traceback (most recent call last):
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xorbits/_mars/deploy/oscar/session.py", line 1954, in get_default_or_create
session = new_session("127.0.0.1", init_local=True, **kwargs)
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xorbits/_mars/deploy/oscar/session.py", line 1924, in new_session
session = SyncSession.init(
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xorbits/_mars/deploy/oscar/session.py", line 1550, in init
isolated_session = fut.result()
File "/home//miniconda3/envs/train_py310/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/home//miniconda3/envs/train_py310/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xorbits/_mars/deploy/oscar/session.py", line 775, in init
await new_cluster_in_isolation(
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xorbits/_mars/deploy/oscar/local.py", line 101, in new_cluster_in_isolation
await cluster.start()
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xorbits/_mars/deploy/oscar/local.py", line 344, in start
await self._start_worker_pools()
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xorbits/_mars/deploy/oscar/local.py", line 386, in _start_worker_pools
worker_pool = await create_worker_actor_pool(
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xorbits/_mars/deploy/oscar/pool.py", line 310, in create_worker_actor_pool
return await create_actor_pool(
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/api.py", line 179, in create_actor_pool
return await get_backend(scheme).create_actor_pool(
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/indigen/backend.py", line 49, in create_actor_pool
return await create_actor_pool(
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/pool.py", line 1585, in create_actor_pool
pool: MainActorPoolType = await pool_cls.create(
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/pool.py", line 1282, in create
processes, ext_addresses = await cls.wait_sub_pools_ready(tasks)
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/indigen/pool.py", line 221, in wait_sub_pools_ready
process, status = await task
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/indigen/pool.py", line 213, in start_sub_pool
return await create_pool_task
File "/home//miniconda3/envs/train_py310/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/indigen/pool.py", line 203, in start_pool_in_process
process.start()
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/context.py", line 300, in _Popen
return Popen(process_obj)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/popen_forkserver.py", line 35, in __init__
super().__init__(process_obj)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/popen_forkserver.py", line 58, in _launch
f.write(buf.getbuffer())
BrokenPipeError: [Errno 32] Broken pipe
2024-04-12 13:59:09,343 asyncio 671836 ERROR Task was destroyed but it is pending!
task: <Task pending name='Task-10' coro=<MainActorPoolBase.monitor_sub_pools() running at /home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/pool.py:1458> wait_for=<Future pending cb=[Task.task_wakeup()]>>
2024-04-12 13:59:09,344 asyncio 671836 ERROR Task exception was never retrieved
future: <Task finished name='Task-402' coro=<MainActorPool.start_sub_pool() done, defined at /home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/indigen/pool.py:180> exception=OSError(24, 'Too many open files')>
Traceback (most recent call last):
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/indigen/pool.py", line 213, in start_sub_pool
return await create_pool_task
File "/home//miniconda3/envs/train_py310/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home//miniconda3/envs/train_py310/lib/python3.10/site-packages/xoscar/backends/indigen/pool.py", line 203, in start_pool_in_process
process.start()
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/context.py", line 300, in _Popen
return Popen(process_obj)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/popen_forkserver.py", line 35, in __init__
super().__init__(process_obj)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/popen_forkserver.py", line 51, in _launch
self.sentinel, w = forkserver.connect_to_new_process(self._fds)
File "/home//miniconda3/envs/train_py310/lib/python3.10/multiprocessing/forkserver.py", line 87, in connect_to_new_process
with socket.socket(socket.AF_UNIX) as client:
File "/home//miniconda3/envs/train_py310/lib/python3.10/socket.py", line 232, in __init__
_socket.socket.__init__(self, family, type, proto, fileno)
OSError: [Errno 24] Too many open files
===================================
I have already raised the limits with:
sudo sysctl -w fs.file-max=100000
ulimit -S -n 1048576
It still gives the error. The error comes from len(res['query'].tolist()), so how should I get the result out? Thanks for your reply.
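For completeness, a sketch that combines the earlier suggestions with this script. The mmap storage_config comes from the comment above; the raised rlimit, the n_cpu cap, and the concrete dedup parameter values are assumptions and placeholders, not settings confirmed in this issue.

import resource

import xorbits
import xorbits.pandas as pd
from xorbits.experimental import dedup

# Raise this process's soft open-files limit to its hard limit before workers start.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

# mmap backend as suggested above; n_cpu=4 is an assumed cap on worker CPU slots
# (verify both arguments against your Xorbits version's init signature).
xorbits.init(storage_config={"mmap": {"root_dirs": "/tmp/xorbits-mmap"}}, n_cpu=4)

data_lst = [{'query': 'xxx', 'ori_txt': 'xxx'}, {'query': 'xxx', 'ori_txt': 'xxx'}]
df = pd.DataFrame(data_lst)

# Placeholder parameter values, for illustration only.
res = dedup(df, col="query", method="minhash", threshold=0.7,
            num_perm=128, min_length=5, ngrams=5, seed=42,
            verbose=True)
print('dedup len:', len(res['query'].tolist()))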
Describe the bug
I'm processing a very large file (25 GB; each line is a string up to 100,000 characters long) with the dedup function, and it gives this error.
To Reproduce
To help us to reproduce this bug, please provide information below:
1. Your Python version: 3.10
2. The version of Xorbits you use: 0.6.3
3. Versions of crucial packages, such as numpy, scipy and pandas: numpy 1.26.0, scipy 1.11.3, pandas 2.1.1
4. Full stack of the error.
5. Minimized code to reproduce the error.
Expected behavior
A clear and concise description of what you expected to happen.
Additional context
Add any other context about the problem here.