asperous edited this page Oct 18, 2012 · 9 revisions

Troubleshoot CFEngine

Please follow the steps below before submitting a bug.

CFEngine appears to hang

First determine what is hanging: CFEngine itself, or something that CFEngine is interacting with. Run the program with the -v and -d flags to see whether it has entered an infinite loop or is waiting for something.

Common causes of hanging processes:

  • Berkeley DB database corruption: try deleting the *_db files in /var/cfengine.
  • Commands that do not properly close the file descriptors of their child processes. Try running the commands with a shell enabled (use_shell => "yes") and use </dev/null >/dev/null to close the descriptors.
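The effect of the redirection can be reproduced outside CFEngine. The following is a minimal sketch, not CFEngine's own code: start_leaky, start_clean and elapsed are hypothetical helper names, and sleep 3 stands in for a child process that keeps running after the command returns. A caller that captures the command's output has to wait until every holder of the output descriptor has exited.

```shell
#!/bin/sh
# Hypothetical stand-ins for a command that leaves a background child running.
# "sleep 3" plays the role of the lingering child process.

start_leaky() {
    # The child inherits stdout, so a caller capturing output waits for it to exit.
    sleep 3 &
}

start_clean() {
    # Redirecting stdin, stdout and stderr detaches the child from the caller's descriptors.
    sleep 3 </dev/null >/dev/null 2>&1 &
}

# Print how long a captured call takes, in whole seconds.
elapsed() {
    begin=$(date +%s)
    output=$("$@")
    end=$(date +%s)
    echo $((end - begin))
}

echo "leaky: $(elapsed start_leaky)s"
echo "clean: $(elapsed start_clean)s"
```

The leaky variant should block for roughly as long as the child lives, while the redirected variant should return almost immediately.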

CFEngine generates a segmentation fault

Segfaults in CFEngine may be caused by an incorrect build environment, or by bugs in CFEngine or in the libraries it uses.

  • Install gdb.
  • Download gdbscript and save it to /var/cfengine/bin/gdbscript.
  • Prepend /var/cfengine/bin/gdbscript to the command line that runs the faulting component:
      • for all components, add the --verbose option;
      • for daemons, add the --no-fork option.

For example: `

/var/cfengine/bin/gdbscript /var/cfengine/bin/cf-serverd --no-fork &

[1] 18790

outputting trace to '/root/cfgdb-cf-serverd.txt'

[1]+ Done /var/cfengine/bin/gdbscript cf-serverd --no-fork

grep SIGSEGV /root/cfgdb-cf-serverd.txt

Program received signal SIGSEGV, Segmentation fault. `

The trace is written to /root/cfgdb-*.txt.
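When several components have been traced, the grep step above can be repeated over all trace files at once. This is a small sketch under that assumption; traces_with_segfault is a hypothetical helper, and the default directory /root is taken from the example output above.

```shell
#!/bin/sh
# List gdbscript trace files in a directory that recorded a segmentation fault.
# The directory defaults to /root, where the example trace above was written.
traces_with_segfault() {
    dir=${1:-/root}
    for trace in "$dir"/cfgdb-*.txt; do
        # Skip the unexpanded pattern when no trace files exist.
        [ -f "$trace" ] || continue
        if grep -q 'SIGSEGV' "$trace"; then
            echo "$trace"
        fi
    done
}
```

Calling traces_with_segfault with no argument checks /root; pass another directory if the traces were saved elsewhere.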

Memory leak

  • Install Valgrind.
  • Run the leaking CFEngine component inside valgrind:
      • for all components, add the --verbose option;
      • for daemons, add the --no-fork option.

For example: valgrind --leak-check=full /var/cfengine/bin/cf-serverd --no-fork 2>/root/valgrind-cf-serverd.txt &

  • If you are debugging a daemon, let it run long enough that you are confident the memory growth is a leak rather than normal usage (remember that valgrind itself also consumes memory).
  • Send SIGINT to the valgrind process (its process name starts with memcheck) `

ps -e|grep mem

2194 pts/0 00:00:24 memcheck-x86-li

kill -SIGINT 2194

`

Once a memory trace has been written to /root/valgrind-*.txt, check the end of the trace to verify that at least 10-20 MB are reported as lost. Otherwise, rerun the trace for a longer period to gather enough data.
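The check at the end of the trace can be scripted. The sketch below is an illustration only: definitely_lost_bytes and enough_lost are hypothetical helpers, and it assumes that counting just the "definitely lost" line of valgrind's leak summary is good enough for a first look (the other summary categories are ignored).

```shell
#!/bin/sh
# Print the "definitely lost" byte count from the last leak summary in a valgrind log.
definitely_lost_bytes() {
    grep 'definitely lost:' "$1" | tail -n 1 |
        sed -e 's/.*definitely lost: *//' -e 's/ bytes.*//' -e 's/,//g'
}

# Succeed when the log reports at least the given number of megabytes lost (default 10).
enough_lost() {
    lost=$(definitely_lost_bytes "$1")
    [ -n "$lost" ] && [ "$lost" -ge $(( ${2:-10} * 1024 * 1024 )) ]
}
```

For example, enough_lost /root/valgrind-cf-serverd.txt 10 succeeds only if the summary reports at least 10 MB as definitely lost; if it fails, let the component run longer before interrupting valgrind.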

Lots of cf-agents are piling up in the process table

CFEngine is probably getting stuck on a long-running task. Kill the existing processes, then run cf-agent -v to see what is going on.
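To confirm that agents really are accumulating before killing them, the process table can be counted by name. This is a rough sketch: count_processes is a hypothetical helper, and it assumes a ps that supports the POSIX -o comm= option and reports the bare command name, as procps does on Linux.

```shell
#!/bin/sh
# Count the processes whose command name matches the argument exactly.
count_processes() {
    ps -e -o comm= | grep -c -x -- "$1"
}

echo "cf-agent processes running: $(count_processes cf-agent)"
```

A count that keeps growing between scheduled runs suggests that earlier agents never finished.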

Promises are not evaluated on second run

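This is a feature, not a bug. CFEngine locks each promise for a short interval after verifying it, so a second run started immediately afterwards skips the promises that were just checked. Running cf-agent with -K (--no-lock) ignores the locks. See the [["Locks" section|https://cfengine.com/manuals/cf3-reference#When-and-where-are-promises-made_003f]] in the Reference Manual.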
