Unbound repeatedly crashes due to Python mount issues in chroot environment #8439

Open
2 tasks done
Joel7homas opened this issue Mar 14, 2025 · 0 comments

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Describe the bug

Unbound DNS service repeatedly crashes due to Python mounting issues within the chroot environment. Service restarts are required multiple times per day to maintain DNS resolution.

Core symptoms:

  • Unbound service fails 1-2 times daily requiring manual restart
  • Python mount errors in logs related to the chroot environment
  • Issue persists even with DNS Blocklist functionality disabled in UI

Logs consistently show these errors:

  • mount_nullfs: /var/unbound/usr/local/lib/python3.11: Resource deadlock avoided
  • mount_nullfs: /var/unbound/usr/local/lib/python3.11: Device busy

The logs also show Unbound being stopped and started every minute, as follows. The crashes may be related to occasional failures to start after being stopped by this automation.

plugins_configure unbound_stop (1)
plugins_configure unbound_start (1)
The command '/bin/kill -'TERM' 'PID''(pid:/var/run/unbound.pid) returned exit code '1', the output was 'kill: PID: No such process'

Additionally, Python initialization errors occur:

[9897:0] fatal error: failed to init modules
[9897:0] error: module init for module python failed
[9897:0] error: python exception in Py_InitializeFromConfig: init_fs_encoding: failed to get the Python codec of the filesystem encoding

To Reproduce

The issue occurs spontaneously after several hours of operation, and I haven't found a specific trigger. The problem has been persistent across reboots and OPNsense updates and appears to happen regardless of network load.

Expected behavior

Unbound DNS service should run continuously without requiring manual intervention.

Describe alternatives you considered

  1. Disabled DNS Blocklist feature through the UI, but Python dependencies continue to load and cause failures
  2. Attempted to unmount Python directories manually, but they remain busy with Python processes that continually respawn when killed

Relevant environment information

  • OPNsense version: 25.1.3 (amd64)
  • FreeBSD: 14.2-RELEASE-p2 FreeBSD 14.2-RELEASE-p2 stable/25.1-n269701-7c59d89f8cd SMP amd64

Relevant configuration

From unbound.conf:

module-config: "python validator iterator"

Python module configuration:

python:
python-script: dnsbl_module.py

Relevant log entries

Repeated Python mount failures:

<11>1 2025-03-14T00:43:35-06:00 lavash.7homas.com opnsense 80116 - [meta sequenceId="3309"] /usr/local/sbin/pluginctl: The command '/sbin/mount -r -t nullfs '/usr/local/lib/python3.11' '/var/unbound/usr/local/lib/python3.11'' returned exit code '1', the output was 'mount_nullfs: /var/unbound/usr/local/lib/python3.11: Resource deadlock avoided'

<11>1 2025-03-14T03:03:55-06:00 lavash.7homas.com opnsense 11022 - [meta sequenceId="13335"] /usr/local/sbin/pluginctl: The command '/sbin/mount -r -t nullfs '/usr/local/lib/python3.11' '/var/unbound/usr/local/lib/python3.11'' returned exit code '1', the output was 'mount_nullfs: /var/unbound/usr/local/lib/python3.11: Device busy'
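To see which of the two mount errors dominates, the failures can be tallied from an exported log. The following is only a rough sketch: the regular expression is inferred from the two sample lines above (not from any documented OPNsense log format), and `tally_mount_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the two pluginctl lines quoted above (an assumption,
# not a documented format): "... the output was 'mount_nullfs: <target>: <reason>'"
MOUNT_ERROR = re.compile(
    r"the output was 'mount_nullfs: (?P<target>\S+): (?P<reason>[^']+)'"
)


def tally_mount_errors(lines):
    """Count mount_nullfs failures, keyed by (mount target, error reason)."""
    counts = Counter()
    for line in lines:
        match = MOUNT_ERROR.search(line)
        if match:
            counts[(match["target"], match["reason"])] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `tally_mount_errors` yields a `Counter`, so `most_common()` would show whether "Device busy" or "Resource deadlock avoided" occurs more often.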

Extensive service flapping (occurring every minute):

<13>1 2025-03-14T11:03:11-06:00 lavash.7homas.com opnsense 33021 - [meta sequenceId="437"] /usr/local/sbin/pluginctl: plugins_configure unbound_stop (execute task : unbound_service_stop(1))
<13>1 2025-03-14T11:03:12-06:00 lavash.7homas.com opnsense 35508 - [meta sequenceId="443"] /usr/local/sbin/pluginctl: plugins_configure unbound_start (1)
<11>1 2025-03-14T11:03:12-06:00 lavash.7homas.com opnsense 35508 - [meta sequenceId="445"] /usr/local/sbin/pluginctl: The command '/bin/kill -'TERM' '55249''(pid:/var/run/unbound.pid)  returned exit code '1', the output was 'kill: 55249: No such process'
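The one-minute cadence could be confirmed by measuring the gap between consecutive `unbound_start` entries. This is a minimal sketch that assumes every relevant line has the same shape as the samples above (a `<priority>1` prefix followed by an ISO 8601 timestamp); `start_intervals` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Assumes the syslog layout seen in the samples above:
# "<13>1 2025-03-14T11:03:12-06:00 host opnsense PID - [...] ...: plugins_configure unbound_start (1)"
START_EVENT = re.compile(r"^<\d+>1 (?P<ts>\S+) .*plugins_configure unbound_start")


def start_intervals(lines):
    """Return the seconds elapsed between consecutive unbound_start events."""
    stamps = []
    for line in lines:
        match = START_EVENT.match(line)
        if match:
            # fromisoformat accepts the "-06:00" style offset used in these logs
            stamps.append(datetime.fromisoformat(match["ts"]))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

If most of the returned values cluster around 60 seconds, that would support the idea that a scheduled task, rather than a crash, is restarting the service.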

Additional information

Python processes remain persistent and respawn immediately after being killed:

root    17256  12.0  1.2   174832   94116  -  S    11:18      0:00.49 /usr/local/bin/python3 /usr/local/opnsense/scripts/unbound/logger.py (python3.11)
root    16859   4.5  0.2    32980   17204  -  Ss   11:18      0:00.42 /usr/local/bin/python3 /usr/local/opnsense/scripts/dhcp/unbound_watcher.py --domain 7homas.com (python3.11)

Even when the service status reports that Unbound is not running, the Unbound-related Python processes keep running and the nullfs mount remains in place:

unbound is not running.

/usr/local/lib/python3.11 on /var/unbound/usr/local/lib/python3.11 (nullfs, local, noatime, read-only, nfsv4acls)
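A leftover mount like the one above might explain the "Device busy" error on the next start attempt. As a diagnostic aid only, here is a small sketch that checks captured `mount` output for an existing mount on a given target. It assumes the `source on target (fstype, options...)` layout shown above and paths without spaces; `parse_mount_line` and `is_mounted` are hypothetical helper names, not part of OPNsense.

```python
import re

# Assumes the "source on target (fstype, opt, ...)" layout shown above,
# with no spaces inside the paths.
MOUNT_LINE = re.compile(
    r"^(?P<source>\S+) on (?P<target>\S+) \((?P<options>[^)]*)\)$"
)


def parse_mount_line(line):
    """Parse one line of mount output into a dict, or return None."""
    match = MOUNT_LINE.match(line.strip())
    if match is None:
        return None
    fields = [field.strip() for field in match["options"].split(",")]
    return {
        "source": match["source"],
        "target": match["target"],
        "fstype": fields[0],
        "options": fields[1:],
    }


def is_mounted(mount_output, target):
    """Return True if any line of the mount output has the given target."""
    for line in mount_output.splitlines():
        entry = parse_mount_line(line)
        if entry is not None and entry["target"] == target:
            return True
    return False
```

Calling `is_mounted` with the captured output and `/var/unbound/usr/local/lib/python3.11` while Unbound is stopped would show whether the mount outlives the service.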

I think the issue first appeared after updating to OPNsense 25.1.

DNS resolution generally works despite the flapping, but the restarts do cause dropped connections for API sessions.
