
Debugging

Matthias Mayr edited this page Sep 27, 2017 · 22 revisions

In general, all classic debugging strategies remain valid for debugging MueLu, but some specific flags, functions, and debugger instructions can help speed up the process.

Debugging in serial

Run GDB with input arguments:
gdb --args ./MueLu_UnitTests.exe --test-name=MakeTentative

Break points

If you need to break somewhere try:
(gdb) b myFile.cpp:lineWhereToStop
Sometimes this does not work because the dynamic libraries have not been loaded yet. In that case, try putting a first breakpoint in main.cpp and then adding more specific breakpoints from there. You can also make a specific breakpoint conditional on an expression with the following syntax:
cond breakpoint# conditional-expression
If you are using multiple breakpoints in the same run, you can deliberately set a variable to skip later conditional breakpoints based on early program behavior. For a string called a, this can be done like this:
call a.assign("ok");
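The interplay between a conditional breakpoint and a debugger-set variable can be sketched with a small stand-alone example. Everything in it is hypothetical (the function names, the flag, and the breakpoint numbers, which assume these are the first two breakpoints of the session), and it uses an int flag instead of a string only because an integer is simpler to test in a breakpoint condition.

```cpp
#include <cstddef>
#include <vector>

// Debug-only flag. The program never changes it; it exists so that a
// debugger session can flip it:  (gdb) set var skipLaterBreaks = 1
// 'volatile' keeps the compiler from folding the flag into a constant.
volatile int skipLaterBreaks = 0;

// Early stage. A conditional breakpoint limits the stops to one entry:
//   (gdb) b scaleEntry
//   (gdb) cond 1 index == 3
double scaleEntry(std::size_t index, double value) {
  return value * static_cast<double>(index + 1);
}

// Later stage. This breakpoint only fires while the flag is still 0:
//   (gdb) b sumEntries
//   (gdb) cond 2 skipLaterBreaks == 0
double sumEntries(const std::vector<double>& values) {
  double sum = 0.0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    sum += scaleEntry(i, values[i]);
  }
  return sum;
}
```

If the first stop shows that everything is fine, setting the flag from the gdb prompt silences the second breakpoint for the rest of the run without deleting it.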

To catch exceptions use:
(gdb) catch throw
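A case where catch throw pays off is an exception that is caught and silently swallowed, so that the program output gives no hint of where it came from. The following is a minimal sketch with hypothetical function names; with catch throw active, gdb should stop at the throw statement inside checkedSize, and bt then shows the call chain.

```cpp
#include <stdexcept>

// Throws for invalid input. With 'catch throw' set, the debugger stops
// here, at the point where the exception is raised.
int checkedSize(int n) {
  if (n < 0) {
    throw std::invalid_argument("negative size");
  }
  return n;
}

// Swallows the exception and returns a fallback value, so nothing is
// printed and the origin of the problem is invisible without a debugger.
int sizeOrZero(int n) {
  try {
    return checkedSize(n);
  } catch (const std::exception&) {
    return 0;
  }
}
```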

Trilinos specific debugging

Teuchos has some advanced printing capabilities, and MueLu relies on them to print at various levels of detail; see src/MueCentral/MueLu_VerbosityLevel.hpp for more details. I personally like to use the following in my code:

    // Print from all processes
    RCP<Teuchos::FancyOStream> fancy = Teuchos::fancyOStream(Teuchos::rcpFromRef(std::cout));
    fancy->setShowAllFrontMatter(false).setShowProcRank(true);
    Teuchos::FancyOStream& out  = *fancy;
    // // Print from a particular rank
    // const int procRank = Teuchos::GlobalMPISession::getRank();
    // Teuchos::oblackholestream blackhole;
    // std::ostream& out = ( procRank == 0 ? std::cout : blackhole );
    // // Do not print anything
    // Teuchos::oblackholestream blackhole;
    // std::ostream& out = blackhole;

You can uncomment the appropriate lines to get the desired print behavior during run time. You could be fancier and use an environment variable to activate the right print level without recompiling.
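The environment-variable idea could look roughly like the sketch below. It is written against the standard library only so that it stands on its own: NullBuffer plays the role of Teuchos::oblackholestream, and the variable name MUELU_DEBUG_PRINT as well as the values "root" and "none" are made up for this illustration, not something MueLu or Teuchos defines.

```cpp
#include <cstdlib>
#include <iostream>
#include <ostream>
#include <streambuf>
#include <string>

// Stream buffer that discards everything written to it (a stand-in for
// Teuchos::oblackholestream in this self-contained sketch).
class NullBuffer : public std::streambuf {
 protected:
  int_type overflow(int_type c) override { return traits_type::not_eof(c); }
};

enum class PrintMode { All, RootOnly, None };

// Map the value of the hypothetical MUELU_DEBUG_PRINT variable to a mode.
// Unset or unrecognized values fall back to printing from every rank.
PrintMode parsePrintMode(const char* value) {
  if (value == nullptr) return PrintMode::All;
  const std::string v(value);
  if (v == "root") return PrintMode::RootOnly;
  if (v == "none") return PrintMode::None;
  return PrintMode::All;
}

// Return the stream a given rank should write to: 'real' or the black hole.
std::ostream& selectStream(PrintMode mode, int procRank,
                           std::ostream& real, std::ostream& blackhole) {
  switch (mode) {
    case PrintMode::None:     return blackhole;
    case PrintMode::RootOnly: return procRank == 0 ? real : blackhole;
    case PrintMode::All:      return real;
  }
  return real;
}

// Possible use inside a program (procRank obtained from MPI or Teuchos):
//   NullBuffer nullBuffer;
//   std::ostream blackhole(&nullBuffer);
//   std::ostream& out = selectStream(
//       parsePrintMode(std::getenv("MUELU_DEBUG_PRINT")),
//       procRank, std::cout, blackhole);
```

With this in place, running the executable with MUELU_DEBUG_PRINT=root would restrict output to rank 0, and MUELU_DEBUG_PRINT=none would silence it, without recompiling.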
Quite often, objects and data are wrapped in Teuchos::RCP for convenient memory management; however, this can lead to very large error messages that make debugging hard. A manual for Teuchos::RCP is currently available here, with information on debugging in section 5.11. Below I paste what is, in my opinion, the most useful excerpt regarding RCP debugging:

***
*** Warning! The following Teuchos::RCPNode objects were created but have
*** not been destroyed yet.  A memory checking tool may complain that these
*** objects are not destroyed correctly.
***
*** There can be many possible reasons that this might occur including:
***
***   a) The program called abort() or exit() before main() was finished.
***      All of the objects that would have been freed through destructors
***      are not freed but some compilers (e.g. GCC) will still call the
***      destructors on static objects (which is what causes this message
***      to be printed).
***
***   b) The program is using raw new/delete to manage some objects and
***      delete was not called correctly and the objects not deleted hold
***      other objects through reference-counted pointers.
***
***   c) This may be an indication that these objects may be involved in
***      a circular dependency of reference-counted managed objects.
***

0: RCPNode (map_key_void_ptr=0x4a3ff50)
Information = {T=A, ConcreteT=A, p=0x4a3ff50, has_ownership=1}
RCPNode address = 0x4a3ffa8
insertionNumber = 23
1: RCPNode (map_key_void_ptr=0x4a40548)
Information = {T=B, ConcreteT=B, p=0x4a40548, has_ownership=1}
RCPNode address = 0x4a405f0
insertionNumber = 24

NOTE: To debug issues, open a debugger, and set a break point in the function where
the RCPNode object is first created to determine the context where the object first
gets created.  Each RCPNode object is given a unique insertionNumber to allow setting
breakpoints in the code.  For example, in GDB one can perform:

1) Open the debugger (GDB) and run the program again to get updated object addresses

2) Set a breakpoint in the RCPNode insertion routine when the desired RCPNode is first
inserted.  In GDB, to break when the RCPNode with insertionNumber==3 is added, do:
(gdb) b 'Teuchos::RCPNodeTracer::addNewRCPNode( [TAB] [ENTER]
(gdb) cond 1 insertionNumber==3 [ENTER]

3) Run the program in the debugger.  In GDB, do:
(gdb) run [ENTER]

4) Examine the call stack when the program breaks in the function addNewRCPNode(...)
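Reason c) in the warning above, a circular dependency, is easy to reproduce in isolation. The sketch below uses std::shared_ptr and std::weak_ptr as a stand-in for Teuchos::RCP, so it shows the mechanism rather than Teuchos code; all type and function names are hypothetical. Two objects that own each other are never destroyed, while the same pair linked by non-owning references is cleaned up normally.

```cpp
#include <memory>

int liveCyclic = 0;   // number of CyclicNode objects currently alive
int liveAcyclic = 0;  // number of AcyclicNode objects currently alive

struct CyclicNode {
  std::shared_ptr<CyclicNode> other;  // owning reference: can form a cycle
  CyclicNode() { ++liveCyclic; }
  ~CyclicNode() { --liveCyclic; }
};

struct AcyclicNode {
  std::weak_ptr<AcyclicNode> other;  // non-owning reference: no cycle
  AcyclicNode() { ++liveAcyclic; }
  ~AcyclicNode() { --liveAcyclic; }
};

// Builds A <-> B with owning pointers. Each object keeps the other's
// reference count above zero, so neither destructor ever runs.
void makeCycle() {
  auto a = std::make_shared<CyclicNode>();
  auto b = std::make_shared<CyclicNode>();
  a->other = b;
  b->other = a;
}

// Same structure with weak back references. Both objects are destroyed
// when the local shared_ptrs go out of scope.
void makeWeakPair() {
  auto a = std::make_shared<AcyclicNode>();
  auto b = std::make_shared<AcyclicNode>();
  a->other = b;
  b->other = a;
}
```

After makeCycle() returns, liveCyclic is still 2, which is the situation the RCPNode report describes. The analogous fix with Teuchos::RCP would be to make one direction a non-owning reference; the RCP manual should be consulted for the exact calls.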

Using Eclipse

  1. Click 'Run->Debug Configurations...'
  2. Select 'C/C++ Application' and add a new application
  3. Configure the new application
  • Add a unique name
  • Tab 'Main':
    • select the project and the debug executable
    • select 'Disable auto build'
  • Tab 'Arguments':
    • Add all command line arguments to 'Program arguments'
    • Select working directory (i.e. the directory with the executable)
  4. Click 'Apply'
  5. Click 'Run'

To re-run the debugger, just execute the debug configuration again.

Debugging in parallel

Using gdb

Note: you must use bash. Make sure that the SHELL environment variable is set to /bin/bash.

mpirun -n 2 xterm -hold -e gdb --args ./MueLu_ScalingTestParamList.exe [test args]
Or, if you want to run some commands at startup:

echo -e "catch throw\nrun" > /tmp/gdb-cmd
mpirun -np 2 xterm -hold -e gdb -x /tmp/gdb-cmd --args ./MueLu_ScalingTestParamList.exe [test args]

Check your version of gdb. According to this post, for gcc 4.8.5 and higher, you must use gdb 7.5 or newer, or compile with the additional option "-gdwarf-2".

Using Valgrind

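To spawn a gdb session in a new xterm when valgrind detects an error (this relies on --db-attach, which was removed in Valgrind 3.11; newer releases provide --vgdb-error instead):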

mpirun -np 2 valgrind --leak-check=full --show-reachable=yes --db-attach=yes --db-command="xterm -geometry 120x50 -hold -e gdb -nw %f %p" ./MueLu_Driver.exe --nx=5 --ny=5 --notimings --xml=tiny.xml

If you use the gdb Text User Interface (TUI), the only way I know of setting a reasonably sized window is to specify it with the -geometry flag.

Using Eclipse

Configure Eclipse to allow attaching Eclipse's GUI for gdb to your parallel programme:

  1. Click 'Run->Debug Configurations...'
  2. Select 'C/C++ Attach to Application' and add a new application
  3. Configure the new application
  • Add a unique name
  • Tab 'Main':
    • select the project and the debug executable
    • select 'Disable auto build'
  • Tab 'Debugger':
    • check 'Non-stop mode'
    • check 'Automatically debug forked processes'
  4. Click 'Apply'

Run the debugger:

  1. Insert MueLu::Utilities<Scalar,LocalOrdinal,GlobalOrdinal,Node>::PauseForDebugger() in your main() routine to stop for attaching the debugger. Compile!
  2. Execute the programme from the command line as usual. It will pause so that you can attach the debugger.
  3. Start your debug configuration in Eclipse from 'Run->Debug Configurations...'. Wait for it to be fully set up.
  4. Hit any key in the terminal.
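To make the pause-and-attach workflow above more concrete, here is a rough single-process sketch of what such a pause helper does. It is not MueLu's implementation of PauseForDebugger(): the function names are hypothetical, there is no MPI involved, and getpid() assumes a POSIX system. The idea is simply to print the process ID a debugger can attach to and then block until a character is entered.

```cpp
#include <iostream>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <unistd.h>  // getpid (POSIX)

// Build the message that tells the user which process to attach to.
std::string attachHint(long pid) {
  std::ostringstream msg;
  msg << "PID " << pid << ": attach with 'gdb -p " << pid
      << "', then enter a character to continue > ";
  return msg.str();
}

// Print the attach hint and block until one character (or end of input)
// arrives on 'in'. Returns the PID that was printed.
long pauseForDebuggerSketch(std::istream& in = std::cin,
                            std::ostream& out = std::cout) {
  const long pid = static_cast<long>(getpid());
  out << attachHint(pid) << std::flush;
  in.get();  // wait here while the debugger is being attached
  return pid;
}
```

Calling pauseForDebuggerSketch() at the start of main() gives a serial program the same "wait, attach, continue" behaviour. In a parallel run each rank would need to report its own PID, and a barrier would typically keep the other ranks waiting until the user continues.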