T-X edited this page Jan 31, 2018 · 7 revisions

Introduction

On embedded devices we regularly have to deal with a very limited amount of flash and RAM. This page helps with debugging RAM issues and tracks their status.

The motivation of this page is ticket #1243 in particular.

Helpful knowledge, links and articles

Collect any external resources here that might help other people understand and debug OOM issues.

  • [Add some links here, explaining userspace vs. kernelspace allocations, kmalloc(), kmem_cache_alloc(), vmalloc(), /proc/slabinfo, /proc/vmallocinfo, /proc/vmstat, echo 'm' > /proc/sysrq-trigger,...]

How to Debug

  • Build Gluon with
  • On OOM and after reboot, get crash report from /sys/kernel/debug/crashlog
  • Try to find a reproducible, isolated setup!
  • Observe:
    • /proc/slabinfo
    • /proc/vmallocinfo
    • /proc/vmstat
    • echo 'm' > /proc/sysrq-trigger; dmesg
  • Helpful tools:
    • Traffic monitoring: tcpdump, wireshark, etc.
    • Traffic generators: mausezahn, iperf, tcpreplay, etc.
  • ...
  • Profit
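
The "Observe" step above can be scripted so that snapshots taken at different times are easy to compare. The following is only a minimal sketch in POSIX sh: the function name snapshot_mem and the PROC_ROOT variable are hypothetical helpers for illustration and are not part of Gluon. Files that are missing or unreadable (some are root-only or depend on the kernel configuration) are skipped.

```shell
#!/bin/sh
# Sketch: copy the kernel memory statistics listed above into one directory.
# PROC_ROOT and snapshot_mem are hypothetical names, not part of Gluon.
PROC_ROOT="${PROC_ROOT:-/proc}"

snapshot_mem() {
    # $1: base directory for snapshots (e.g. /tmp, which is itself RAM-backed on a node)
    out="$1/memsnap-$(date +%s)"
    mkdir -p "$out" || return 1
    for f in slabinfo vmallocinfo vmstat; do
        # Skip files that do not exist or cannot be read by the current user.
        if [ -r "$PROC_ROOT/$f" ]; then
            cat "$PROC_ROOT/$f" > "$out/$f" 2>/dev/null || rm -f "$out/$f"
        fi
    done
    # Print the snapshot directory so callers can diff two snapshots.
    echo "$out"
}
```

Calling snapshot_mem /tmp before and after generating traffic and then comparing the two directories is one way to spot counters that keep growing. Keep in mind that the snapshots themselves consume RAM when stored on a tmpfs.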

Current Issues, Observations and Status

Out-of-memory due to kernel allocations

Status: Unsolved

Issue: OOM due to allocations in kernelspace.

Related tickets: #1243, #1306

How to trigger: In networks with a high number of nodes?


Observations so far:

  • First observed after the switch to the first Gluon releases based on LEDE
  • Nothing suspicious in /proc/slabinfo on crash
    • Seems to rule out the Linux bridge or batman-adv as a potential cause
  • Setting 'echo fq_memory_limit 200 > /sys/kernel/debug/ieee80211/phy0/aqm' (seemingly?) had a positive effect
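
On nodes with more than one radio, the fq_memory_limit setting from the last observation would presumably have to be applied per phy. A minimal sketch, assuming every radio exposes an aqm file in the same debugfs location as phy0; set_fq_memory_limit and IEEE80211_DEBUGFS are hypothetical names, and the value 200 is simply the one from the observation above, not a tested recommendation.

```shell
#!/bin/sh
# Sketch: write the fq_memory_limit setting to the aqm file of every radio.
# IEEE80211_DEBUGFS and set_fq_memory_limit are hypothetical names.
IEEE80211_DEBUGFS="${IEEE80211_DEBUGFS:-/sys/kernel/debug/ieee80211}"

set_fq_memory_limit() {
    # $1: limit value, written in the same "fq_memory_limit <value>" form as above
    for aqm in "$IEEE80211_DEBUGFS"/phy*/aqm; do
        # Skip an unmatched glob pattern or a driver without an aqm file.
        [ -e "$aqm" ] || continue
        echo "fq_memory_limit $1" > "$aqm"
    done
}

# Example (run as root on the node):
# set_fq_memory_limit 200
```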

Tasks:

  • Finding a setup to reproduce the issue in an isolated configuration.

OOM on IP Fragments

Status: Unsolved

Issue: The IPv4 and IPv6 fragmentation buffers may hold up to 8MB of packets in total (4MB per address family)

Related tickets: -

How to trigger: An OOM was easily triggered by running iperf3 on a node with packets that get fragmented ($ iperf3 -l 1500). However, it should potentially also be triggerable by external traffic alone, without extra tools on the node?


Should be easily fixable by trimming /proc/sys/net/ipv6/ip6frag_{low,high}_thresh and /proc/sys/net/ipv4/ipfrag_{low,high}_thresh. Additional firewall rules might be considered, too.
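
Trimming the fragment thresholds could look roughly like the sketch below. The function name trim_frag_thresh, the SYSCTL_NET variable and the example byte values are assumptions for illustration; suitable limits for a given device have not been determined here. The low threshold is written first because, when shrinking, some kernels may reject a high threshold that is below the current low threshold.

```shell
#!/bin/sh
# Sketch: shrink the IPv4/IPv6 fragment buffer limits.
# SYSCTL_NET, trim_frag_thresh and the example values are assumptions.
SYSCTL_NET="${SYSCTL_NET:-/proc/sys/net}"

trim_frag_thresh() {
    # $1: new high threshold in bytes, $2: new low threshold in bytes
    high="$1"; low="$2"
    [ "$low" -le "$high" ] || { echo "low must not exceed high" >&2; return 1; }
    # Write low before high so that low <= high holds at every step.
    echo "$low"  > "$SYSCTL_NET/ipv4/ipfrag_low_thresh"   || return 1
    echo "$high" > "$SYSCTL_NET/ipv4/ipfrag_high_thresh"  || return 1
    echo "$low"  > "$SYSCTL_NET/ipv6/ip6frag_low_thresh"  || return 1
    echo "$high" > "$SYSCTL_NET/ipv6/ip6frag_high_thresh" || return 1
}

# Example (run as root on the node): 256 KiB high, 192 KiB low per address family
# trim_frag_thresh 262144 196608
```

Values written this way do not survive a reboot, so a permanent fix would need to set them at boot time as well.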
