Trouble-Shooting mindi
Launch mindi using the verbose option of the shell:
bash -x /usr/sbin/mindi 2>&1 | tee /tmp/mindi.log
If you want those mindi traces when mindi is launched from mondoarchive, edit the mindi script and add the following at the beginning:
set -x
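The raw trace can be hard to map back to the script. As a minimal sketch, assuming bash is the shell running the script, setting the standard bash variable PS4 just before set -x prefixes each traced command with its source file and line number. The demo script below is a throwaway temporary file used only for illustration; it is not part of mindi.

```shell
#!/bin/sh
# Sketch: make "set -x" traces show file and line number via PS4.
# The demo script is a throwaway temp file, not part of mindi.
demo=$(mktemp)
cat > "$demo" <<'EOF'
PS4='+ ${BASH_SOURCE}:${LINENO}: '
set -x
echo "hello from demo"
EOF

# bash expands PS4 each time it prints a traced command, so the
# ${LINENO} inside it refers to the line being executed.
trace=$(bash "$demo" 2>&1)
rm -f "$demo"
printf '%s\n' "$trace"
```

PS4 is set inside the script rather than passed through the environment, since recent bash versions may ignore an inherited PS4 when running as root.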
Trouble-Shooting mondo
mondo basically consists of two C programs, mondoarchive and mondorestore. Trouble-shooting mondo may therefore mean debugging. This sounds scarier than it is - just read on. ;-)
Creating Backtraces
Backtraces can be very helpful when trouble-shooting issues like segmentation faults. To create a useful backtrace, you need gdb (the GNU Debugger) installed and an application (and possibly libraries) with debugging symbols built in. The following will explain how to do this.
gdb
gdb should be part of your distribution. Just use your favourite way to install the package, e.g.
apt-get install gdb
for Debian and friends (such as Ubuntu), or, for distributions that use yum (such as Red Hat and Fedora):
yum install gdb
mondoarchive/mondorestore with debugging symbols
To get mondoarchive and mondorestore with debugging symbols built in, you need to build from the source.
Get the latest stable mondo source package from ftp://ftp.mondorescue.org/src/, e.g. mondo-2.0.9.tar.gz, unpack:
tar xvzf mondo-2.0.9.tar.gz
enter the new directory and build using make:
cd mondo-2.0.9
./configure --prefix=/usr
make
You will end up with binaries in the following locations which are not stripped, i.e. they contain debugging symbols:
file mondo/mondoarchive/mondoarchive
mondo/mondoarchive/mondoarchive: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped
and
file mondo/mondorestore/mondorestore
mondo/mondorestore/mondorestore: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped
Make backups of the original mondoarchive and mondorestore binaries and copy the newly created ones over the original ones.
On the ftp://ftp.mondorescue.org server, you'll also find rpm packages containing the debug symbols, which you can install alongside your normal packages to add debug support to your environment.
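Before copying binaries around, it can help to confirm that they really are not stripped. The following is a minimal sketch that only inspects the text printed by file(1), in the same format as the output shown above; has_symbols is a hypothetical helper name, not something shipped with mondo.

```shell
#!/bin/sh
# Sketch: decide from the text printed by file(1) whether a binary
# still carries its symbols. has_symbols is a hypothetical helper.
has_symbols() {
    # $1: one line of output from "file <binary>"
    case "$1" in
        *"not stripped"*) return 0 ;;
        *)                return 1 ;;
    esac
}

# Typical use (assumes file(1) is installed):
#   has_symbols "$(file mondo/mondoarchive/mondoarchive)" && echo "not stripped"
```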
Trouble-Shooting mondoarchive
The best approach is to run the mondoarchive binary you just built with debugging symbols under gdb, from its location in the build directory. So:
cd mondo/mondoarchive
gdb ./mondoarchive
set logging on # Which will generate a gdb.txt output file
run <usual arguments you use>
Note: Running it from within its build directory makes more source-line information available in the backtrace.
When the segmentation fault happens, enter:
bt
and send the output to the list.
Another possibility is to run valgrind mondoarchive [params] (if you have valgrind installed on your system), as it will give even more information on potentially related memory issues.
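The interactive gdb session above can also be run non-interactively, which makes the output easier to capture and mail. This is a sketch only: gdb_bt_command is a hypothetical helper that merely prints the command line to use, and it does no quoting of arguments containing spaces. The options -batch, -ex and --args are standard gdb options.

```shell
#!/bin/sh
# Sketch: print a non-interactive gdb command line that runs a program
# and dumps a backtrace once it stops (e.g. on a segmentation fault).
# gdb_bt_command is a hypothetical helper; it only prints, it runs nothing.
gdb_bt_command() {
    # $1: path to the binary; remaining arguments go to the program
    binary=$1
    shift
    printf 'gdb -batch -ex run -ex bt --args %s' "$binary"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Review the printed command, then run it with the output captured, e.g.
#   <printed command> 2>&1 | tee /tmp/gdb-bt.txt
gdb_bt_command ./mondoarchive "<usual arguments you use>"
```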
Trouble-Shooting partition issue
With newer Linux distributions, hard disk partitions are referenced in /etc/fstab by ID (by-id), e.g.:
/dev/disk/by-id/scsi-SATA_SAMSUNG_HM120JIS09GJ30LB01772-part2 / ext3 acl,user_xattr 1 1
When by-id is used, mindi doesn't seem to see all partitions and is thus unable to rescue the system afterwards.
The solution is to change the by-id to static mounts, for SLES10 and SLES10 this is described in:
Please also help fix this issue by contributing to ticket #406.
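To see whether a system is affected, the fstab can be scanned for by-id device fields. The following is a minimal sketch; list_by_id_entries is a hypothetical helper name, and the fstab path is passed in so that a copy can be checked as well.

```shell
#!/bin/sh
# Sketch: list fstab entries whose device field uses /dev/disk/by-id.
# list_by_id_entries is a hypothetical helper name.
list_by_id_entries() {
    # $1: path to an fstab file
    # Prints "device mountpoint" for each entry using by-id; comment
    # lines are skipped because their first field does not match.
    awk '$1 ~ /^\/dev\/disk\/by-id\// { print $1, $2 }' "$1"
}

# Typical use:
#   list_by_id_entries /etc/fstab
# On a live system, the kernel device behind a by-id link can be shown with
#   readlink -f /dev/disk/by-id/<name>
```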
Trouble-Shooting mondorestore
If you do a partial restore onto a live system, the same approach as described for mondoarchive can be used.
However, more likely you will experience a segmentation fault during restore. To get a backtrace in that situation, proceed as follows:
First, you need a mondorestore binary with debugging symbols. This should already have been taken care of if you copied the newly compiled binaries as described above. Next, you need to make sure that gdb is available on your restore media. To achieve this, add this to /etc/mindi/deplist.txt before doing a mondoarchive run:
gdb libthread_db.so.1
Boot the restore media into expert mode. Then start mondorestore like this:
gdb /usr/sbin/mondorestore
set logging on # Which will generate a gdb.txt output file
run
As described previously, once the segmentation fault happens, do:
bt
and send the output to the list. (If you can't get the backtrace copied as text, you can use a photo of the screen as a last resort.)
Advanced Topics
Troubleshoot mondorestore on RHEL via valgrind in NFS recipe
If you encounter a crash of mondorestore during restoration, you can help the dev team fix the issue by reporting information on the crash using the debug environment.
We assume that you're at the prompt after the crash. Prepare the content needed for debugging on your NFS server:
First download both the normal and the debug mondo packages:
# cd /dir/exported/to/mondo
# wget ftp://ftp.mondorescue.org/test/rhel/5/mondo-2.2.9-0.20090729004531.rhel5.x86_64.rpm
# wget ftp://ftp.mondorescue.org/test/rhel/5/mondo-debuginfo-2.2.9-0.20090729004531.rhel5.x86_64.rpm
Then, on the original platform, make the backup the way you're used to, using the downloaded mondo package (and mindi of course). On the same platform you also have to install the valgrind package. Then create tar files containing the mondo debug info and the valgrind content, and make them available on your NFS server:
# mkdir tmp
# cd tmp
# rpm2cpio ../mondo-debuginfo-2.2.9-0.20090729004531.rhel5.x86_64.rpm | cpio -idum
# tar czf ../mondo.tgz .
# cd ..
# rm -rf tmp
# tar czf valgrind.tgz /usr/bin/valgrind /usr/lib*/valgrind
Then on the restored client, at the prompt you can extract the content, and use it:
# cd /
# tar xzf /tmp/isodir/valgrind.tgz
# tar xzf /tmp/isodir/mondo.tgz
# valgrind --log-file=/tmp/valg.log --show-reachable=yes --track-origins=yes --leak-check=full mondorestore -K 99 -Z interactive
Then send those files to the dev team with a picture of the crash:
/var/log/mondorestore.log
/tmp/valg.log
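Before sending the files, a quick look at the valgrind log shows whether it captured anything. This is a minimal sketch that reads the "ERROR SUMMARY" line valgrind writes at the end of its log; valgrind_error_count is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the error count from the last "ERROR SUMMARY" line of a
# valgrind log. valgrind_error_count is a hypothetical helper name.
valgrind_error_count() {
    # $1: path to the valgrind log, e.g. /tmp/valg.log
    # Summary lines look like:
    #   ==1234== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)
    awk '/ERROR SUMMARY:/ { n = $4 } END { if (n != "") print n }' "$1"
}

# Typical use:
#   valgrind_error_count /tmp/valg.log
```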
Attaching to Running Processes
You can attach to a running process using:
gdb /usr/sbin/mondorestore <pid>
where <pid> is the process ID.
This can be particularly useful when running mondoarchive with the '-g' option or when running mondorestore.
Using libraries with debugging symbols
The libraries used by a binary can be determined using the ldd command, e.g.:
ldd /usr/sbin/mondoarchive
libmondo.so.2 => /usr/lib/libmondo.so.2 (0xb7f8d000)
libmondo-newt.so.1 => /usr/lib/libmondo-newt.so.1 (0xb7f82000)
libnewt.so.0.51 => /usr/lib/libnewt.so.0.51 (0xb7f71000)
libdl.so.2 => /lib/tls/libdl.so.2 (0xb7f6e000)
libpthread.so.0 => /lib/tls/libpthread.so.0 (0xb7f5f000)
libc.so.6 => /lib/tls/libc.so.6 (0xb7e2a000)
libslang.so.1-UTF8 => /lib/libslang.so.1-UTF8 (0xb7db7000)
libm.so.6 => /lib/tls/libm.so.6 (0xb7d94000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)
Some of those libraries may come with debugging symbols built in as an alternative package; others can be built from scratch and installed with debugging symbols. Using your distribution's standard build process is probably a good idea for this.
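To find out which packages to look for, it helps to reduce the ldd output to the resolved library paths. The following is a minimal sketch; ldd_paths is a hypothetical helper name that reads ldd output on standard input.

```shell
#!/bin/sh
# Sketch: extract the resolved library paths from ldd output on stdin.
# ldd_paths is a hypothetical helper name.
ldd_paths() {
    # Lines look like: "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)"
    # Entries without a resolved path after "=>" are skipped.
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Typical use:
#   ldd /usr/sbin/mondoarchive | ldd_paths
# The package owning each path can then be looked up with the
# distribution's tools, e.g. "dpkg -S <path>" or "rpm -qf <path>".
```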
Worthwhile gcc Flags
-Wextra -Wshadow -Wstack-protector -fstack-protector
Getting the mondorestore.log file on restored media
The file /var/log/mondorestore.log is on the restored machine.
During restore, Alt-F2 (Alt-F3... Alt-F6) lets you switch to busybox shells.
You can save /var/log/mondorestore.log on a mounted USB key. If you connected a USB key before making a choice in the "mondorestore choice menu" (interactive, compare, expert...), you should be able to switch to a busybox shell with Alt-F2, mount the vfat USB key there, and copy the /var/log/mondorestore.log file to the USB key.
Otherwise, you can copy /var/log/mondorestore.log to a tftp server using the tftp command, once the network has been set up correctly.
Alt-F1 returns you to the mondorestore GUI screen. (Tip documented by Victor Gattegno on MondoRescue ML)
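The USB key route can be sketched as follows. save_restore_log is a hypothetical helper, and the device name /dev/sdb1 and mount point /mnt/usb in the comments are assumptions that will differ per machine.

```shell
#!/bin/sh
# Sketch: copy the restore log onto an already mounted USB key.
# save_restore_log is a hypothetical helper name.
save_restore_log() {
    # $1: log file (normally /var/log/mondorestore.log)
    # $2: directory where the USB key is mounted
    # sync flushes the data so the key can be unmounted safely.
    cp "$1" "$2/" && sync
}

# From the busybox shell this could look like the following;
# /dev/sdb1 and /mnt/usb are assumptions, check dmesg for the real device:
#   mkdir -p /mnt/usb
#   mount -t vfat /dev/sdb1 /mnt/usb
#   save_restore_log /var/log/mondorestore.log /mnt/usb
#   umount /mnt/usb
```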
Getting the entire kernel log on restore media
The kernel ring buffer that dmesg reads defaults to 32k on recent kernels. This is not enough to capture the entire sequence of kernel messages when Mondo Rescue boots off restore media.
To increase the kernel ring buffer to 128k at boot time (and without recompilation) add the following kernel boot parameter:
log_buf_len=128k
e.g. at the boot prompt of the restore media:
expert log_buf_len=128k
dmesg needs to be told what buffer size to use so that everything is displayed from the start. Use the -s parameter for this:
dmesg -s 131072 | less
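The value given to dmesg -s is simply the log_buf_len size expressed in bytes. A minimal sketch of the arithmetic follows; kib_to_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: convert a log_buf_len value given in KiB into the byte count
# passed to "dmesg -s". kib_to_bytes is a hypothetical helper name.
kib_to_bytes() {
    # $1: buffer size in KiB, e.g. 128 for log_buf_len=128k
    echo $(( $1 * 1024 ))
}

kib_to_bytes 128    # prints 131072, the value used with dmesg -s above

# Whether the parameter was actually passed at boot can be checked with:
#   cat /proc/cmdline
```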
Unable to boot restored server
When a restored server fails to boot to the grub menu (seen on SLES10), boot from the SLES10 DVD, choose the recovery option and let it boot. Then mount your boot partition, e.g.
mount /dev/sda1 /boot
Run
grub-install
and reboot.
Note: before using grub-install, check that URL.
Note from mindi NEWS file: "try standard grub-install in grub-MR restore script before trying anything fancy (Andree Leidenfrost)".