Version 8 (modified by heinkonijn, 16 years ago)

Trouble-Shooting mindi

Launch mindi using the verbose option of the shell:

bash -x /usr/sbin/mindi 2>&1 | tee /tmp/mindi.log

If you want the same mindi traces when mindi is called from mondoarchive, edit the mindi script and add the following line at the beginning (after the #! line):

set -x
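
Since bash -x prefixes each traced command with "+" (the default PS4), the end of the trace usually shows where mindi stopped. A minimal sketch for pulling the last traced commands out of the log follows; the function name last_traced is made up for this example, and the default log path is the one used above.

```shell
# Print the last N traced commands from a "bash -x" log.
# Assumes the default PS4, i.e. traced commands start with "+".
last_traced() {
    lt_log=${1:-/tmp/mindi.log}
    lt_count=${2:-20}
    grep '^+' "$lt_log" | tail -n "$lt_count"
}

# Example: last_traced /tmp/mindi.log 30
```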

Trouble-Shooting mondo

mondo basically consists of two C programs, mondoarchive and mondorestore. Trouble-shooting mondo may therefore mean debugging. This sounds scarier than it is - just read on. ;-)

Creating Backtraces

Backtraces can be very helpful when trouble-shooting issues like segmentation faults. To create a useful backtrace, you need gdb (the GNU Debugger) installed and an application (and possibly libraries) with debugging symbols built in. The following will explain how to do this.

gdb

gdb should be part of your distribution; just use your favourite way to install the package, e.g.

apt-get install gdb

for Debian and friends (such as Ubuntu) or

yum install gdb

for Fedora/RedHat (on Mandriva the native tool is urpmi, i.e. urpmi gdb).

mondoarchive/mondorestore with debugging symbols

To get mondoarchive and mondorestore with debugging symbols built in, you need to build them from source.

Get the latest stable mondo source package from ftp://ftp.mondorescue.org/src/, e.g. mondo-2.0.9.tar.gz, and unpack it:

tar xvzf mondo-2.0.9.tar.gz

Change into the new directory and build using make:

cd mondo-2.0.9
./configure --prefix=/usr
make

You will end up with binaries in the following locations which are not stripped, i.e. they contain debugging symbols:

file mondo/mondoarchive/mondoarchive
mondo/mondoarchive/mondoarchive: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped

and

file mondo/mondorestore/mondorestore
mondo/mondorestore/mondorestore: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped

Make backups of the original mondoarchive and mondorestore binaries and copy the newly created ones over the original ones.
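
The backup-and-copy step can be sketched as a small shell function. This is only an illustration: the function name install_debug_binary and the .orig backup suffix are made up here, and /usr/sbin is assumed as the install location because that is the path used elsewhere on this page.

```shell
# Back up an installed binary once, then copy the freshly built one over it.
install_debug_binary() {
    idb_src=$1
    idb_dest=$2
    # Keep the first backup; a second run must not overwrite it
    # with an already replaced binary.
    if [ -e "$idb_dest" ] && [ ! -e "$idb_dest.orig" ]; then
        cp -p "$idb_dest" "$idb_dest.orig" || return 1
    fi
    cp -p "$idb_src" "$idb_dest"
}

# Example (as root, from the top of the build tree):
#   install_debug_binary mondo/mondoarchive/mondoarchive /usr/sbin/mondoarchive
#   install_debug_binary mondo/mondorestore/mondorestore /usr/sbin/mondorestore
```

To go back to the original binary, copy the .orig file over the installed one again.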

Trouble-Shooting mondoarchive

The best approach is to run the mondoarchive binary you just built with debugging symbols under gdb, from its location in the build directory. So:

cd mondo/mondoarchive
gdb ./mondoarchive
run <usual arguments you use>

Note: Running it from within its build directory means that more valuable information about source lines will be available in the backtrace.

When the segmentation fault happens, enter:

bt

and send the output to the list.
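
If typing the commands interactively is inconvenient, gdb can also be driven from the command line and told to log its own output to a file. The following is a sketch only: capture_backtrace and the GDB override variable are names made up for this example, while -batch, -ex and --args are standard gdb options. Newer gdb releases prefer "set logging enabled on" over "set logging on".

```shell
# Run a program under gdb non-interactively and log the backtrace to a file.
# GDB is a hypothetical override variable for using a different gdb binary.
capture_backtrace() {
    cb_log=$1
    shift
    "${GDB:-gdb}" -batch \
        -ex "set logging file $cb_log" \
        -ex "set logging on" \
        -ex run \
        -ex bt \
        --args "$@"
}

# Example (from mondo/mondoarchive in the build tree):
#   capture_backtrace /tmp/mondoarchive-bt.log ./mondoarchive <usual arguments you use>
```

If the program exits without crashing, the log will simply contain no stack.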

Another possibility is to run valgrind mondoarchive [params] (if you have valgrind installed on your system), as it gives even more information on potential memory-related issues.
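
A possible way to keep the valgrind report apart from the program's own screen output is valgrind's --log-file option. The wrapper below is a sketch under assumptions: the name run_under_valgrind and the VALGRIND override variable are invented for this example, and the choice of --leak-check=full and --track-origins=yes is a suggestion, not something the mondo developers asked for.

```shell
# Run a program under valgrind and write the valgrind report to a file.
# VALGRIND is a hypothetical override variable for a different valgrind binary.
run_under_valgrind() {
    vg_log=$1
    shift
    "${VALGRIND:-valgrind}" --leak-check=full --track-origins=yes \
        --log-file="$vg_log" "$@"
}

# Example:
#   run_under_valgrind /tmp/mondoarchive-valgrind.log ./mondoarchive [params]
```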

Trouble-Shooting mondorestore

If you do a partial restore onto a live system, the same approach as described for mondoarchive can be used.

However, a segmentation fault is more likely to occur during a restore from the restore media. To get a backtrace in that situation, proceed as follows:

First, you need a mondorestore binary with debugging symbols. This should already have been taken care of if you copied the newly compiled binaries as described above. Next, you need to make sure that gdb is available on your restore media. To achieve this, add the following to /etc/mindi/deplist.txt before doing a mondoarchive run:

gdb
libthread_db.so.1

Boot the restore media into expert mode. Then start mondorestore like this:

gdb /usr/sbin/mondorestore
run

As described previously, once the segmentation fault happens, do:

bt

and send the output to the list. (If you can't get the backtrace copied as text, you can use a photo of the screen as a last resort.)

Advanced Topics

Attaching to Running Processes

You can attach to a running process using:

gdb /usr/sbin/mondorestore <pid>

where <pid> is the process ID.

This can be particularly useful when running mondoarchive with the '-g' option or when running mondorestore.

Using libraries with debugging symbols

The libraries used by a binary can be determined using the ldd command, e.g.:

ldd /usr/sbin/mondoarchive
        libmondo.so.2 => /usr/lib/libmondo.so.2 (0xb7f8d000)
        libmondo-newt.so.1 => /usr/lib/libmondo-newt.so.1 (0xb7f82000)
        libnewt.so.0.51 => /usr/lib/libnewt.so.0.51 (0xb7f71000)
        libdl.so.2 => /lib/tls/libdl.so.2 (0xb7f6e000)
        libpthread.so.0 => /lib/tls/libpthread.so.0 (0xb7f5f000)
        libc.so.6 => /lib/tls/libc.so.6 (0xb7e2a000)
        libslang.so.1-UTF8 => /lib/libslang.so.1-UTF8 (0xb7db7000)
        libm.so.6 => /lib/tls/libm.so.6 (0xb7d94000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)

Some of those libraries may be available with debugging symbols as an alternative package; others can be built from source and installed with debugging symbols. Using your distribution's standard build process is probably a good idea for this.
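
To see which of the libraries are stripped, the ldd output can be fed to file, which reports "stripped" or "not stripped" as shown for the mondo binaries above. The helper name ldd_paths is made up for this sketch; it keeps only lines of the form "name => /path (address)".

```shell
# Read ldd output on stdin and print the resolved library paths, one per line.
ldd_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Example: report for each library whether it is stripped.
# file -L follows symbolic links to the real library file.
#   ldd /usr/sbin/mondoarchive | ldd_paths | xargs file -L
```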

Worthwhile gcc Flags

-Wextra -Wshadow -Wstack-protector -fstack-protector
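
One way to apply these flags is to pass them to configure as CFLAGS. This is a sketch under assumptions: it presumes the mondo configure script honours CFLAGS the way autoconf-generated scripts normally do, and the leading -g -O0 (debugging symbols, no optimisation) is an addition for debugging builds, not part of the list above.

```shell
# Flags from the list above, plus -g -O0 as an assumed addition for debugging.
DEBUG_CFLAGS="-g -O0 -Wextra -Wshadow -Wstack-protector -fstack-protector"

# Only run the build when started from the top of the unpacked source tree.
if [ -x ./configure ]; then
    ./configure --prefix=/usr CFLAGS="$DEBUG_CFLAGS" && make
fi
```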

Getting the entire kernel log on restore media

The kernel ring buffer that dmesg reads defaults to 32k on recent kernels. This is not enough to capture the entire sequence of kernel messages when Mondo Rescue boots off restore media.

To increase the kernel ring buffer to 128k at boot time (and without recompilation) add the following kernel boot parameter:

log_buf_len=128k

e.g.

expert log_buf_len=128k

dmesg needs to be told what buffer size to use to ensure that everything is displayed from the start. Use the -s parameter for this:

dmesg -s 131072 | less
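
The value given to dmesg -s is the log_buf_len size expressed in bytes (128 x 1024 = 131072). If you pick a different buffer size, the matching pair of values can be derived as in the sketch below; the variable names are made up for this example. As far as the kernel documentation goes, log_buf_len should be a power of two.

```shell
# Derive the dmesg buffer size in bytes from the log_buf_len value in KiB.
buf_kib=128
buf_bytes=$((buf_kib * 1024))

echo "kernel boot parameter: log_buf_len=${buf_kib}k"
echo "dmesg call:            dmesg -s ${buf_bytes}"
```

To send the log to the list, redirect the dmesg output into a file instead of piping it to less.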

Unable to boot restored server

When a restored server fails to boot to the grub menu (seen on SLES10), boot from the SLES10 DVD, choose the recovery option and let it boot. Then mount your boot partition, e.g.

mount /dev/sda1 /boot

Run

grub-install

and reboot.