= Trouble-Shooting mindi =

Launch mindi using the verbose option of the shell:
{{{
bash -x /usr/sbin/mindi 2>&1 | tee /tmp/mindi.log
}}}

If you want to get those mindi traces when mindi is called from mondoarchive, change the mindi script by adding the following at the beginning:
{{{
set -x
}}}

= Trouble-Shooting mondo =

mondo basically consists of two C programs, '''mondoarchive''' and '''mondorestore'''. Trouble-shooting mondo may therefore mean debugging. This sounds scarier than it is - just read on. ;-)

== Creating Backtraces ==

Backtraces can be very helpful when trouble-shooting issues like segmentation faults. To create a useful backtrace, you need gdb (the GNU Debugger) installed and an application (and possibly libraries) with debugging symbols built in. The following sections explain how to do this.

=== gdb ===

gdb should be part of your distribution; just use your favourite way to install the package, e.g.
{{{
apt-get install gdb
}}}
for Debian and friends (such as Ubuntu) or
{{{
yum install gdb
}}}
for Fedora/RedHat/Mandriva.

=== mondoarchive/mondorestore with debugging symbols ===

To get '''mondoarchive''' and '''mondorestore''' with debugging symbols built in, you need to build from source. Get the latest stable mondo source package from '''ftp://ftp.mondorescue.org/src/''', e.g. '''mondo-2.0.9.tar.gz''', and unpack it:
{{{
tar xvzf mondo-2.0.9.tar.gz
}}}
Change into the new directory and build using make:
{{{
cd mondo-2.0.9
./configure --prefix=/usr
make
}}}
You will end up with binaries in the following locations which are not stripped, i.e.
they contain debugging symbols:
{{{
file mondo/mondoarchive/mondoarchive
mondo/mondoarchive/mondoarchive: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped
}}}
and
{{{
file mondo/mondorestore/mondorestore
mondo/mondorestore/mondorestore: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped
}}}
Make backups of the original '''mondoarchive''' and '''mondorestore''' binaries and copy the newly created ones over the originals.

== Trouble-Shooting mondoarchive ==

The best approach is to run the '''mondoarchive''' binary you just built with debugging symbols under gdb, from its location in the build directory:
{{{
cd mondo/mondoarchive
gdb ./mondoarchive
# The next command generates a gdb.txt output file
set logging on
run
}}}
'''Note:''' Running it from within its build directory makes more valuable information about source lines available in the backtrace.

When the segmentation fault happens, enter:
{{{
bt
}}}
and send the output to the list.

Another possibility is to run valgrind mondoarchive [params] (if you have valgrind installed on your system), as it gives even more information on potentially related memory issues.

== Trouble-Shooting partition issue ==

With new Linux distributions, hard disk partitions are referenced in /etc/fstab by ID (by-id), e.g.:
{{{
/dev/disk/by-id/scsi-SATA_SAMSUNG_HM120JIS09GJ30LB01772-part2 / ext3 acl,user_xattr 1 1
}}}
When by-id is used, mindi doesn't seem to see all partitions and is thus unable to rescue the system afterwards.
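
To see whether a system is affected, the by-id entries in /etc/fstab can be listed together with the device nodes they currently resolve to. The following is only a minimal sketch and not part of mindi: the function names fstab_by_id and show_by_id_targets are hypothetical, and it assumes a POSIX shell with awk and a readlink that supports -f (as on GNU systems).

```shell
# Print "device mountpoint" for every fstab entry that uses /dev/disk/by-id.
# fstab_by_id is a hypothetical helper; the fstab path defaults to /etc/fstab.
fstab_by_id() {
    awk '$1 ~ /^\/dev\/disk\/by-id\// { print $1, $2 }' "${1:-/etc/fstab}"
}

# Show which device node each by-id link currently resolves to.
show_by_id_targets() {
    fstab_by_id "$@" | while read -r dev mnt; do
        printf '%s -> %s (%s)\n' "$dev" "$(readlink -f "$dev")" "$mnt"
    done
}
```

Calling show_by_id_targets without arguments reads /etc/fstab; the resolved device names show which partitions the by-id entries refer to.
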
The solution is to change the by-id entries to static mounts; for SLES10 this is described in: http://www.novell.com/support/search.do?cmd=displayKC&docType=kc&externalId=3580082&sliceId=SAL_Public&dialogID=54562329&stateId=0%200%2054564189

Please also help fix that issue by contributing to #234.

== Trouble-Shooting mondorestore ==

If you do a partial restore onto a live system, the same approach as described for '''mondoarchive''' can be used. More likely, however, you will experience a segmentation fault during restore. To get a backtrace in that situation, proceed as follows:

First, you need a '''mondorestore''' binary with debugging symbols. This should already have been taken care of if you copied the newly compiled binaries as described above.

Next, you need to make sure that gdb is available on your restore media. To achieve this, add this to '''/etc/mindi/deplist.txt''' before doing a mondoarchive run:
{{{
gdb
libthread_db.so.1
}}}
Boot the restore media into '''expert''' mode. Then start '''mondorestore''' like this:
{{{
gdb /usr/sbin/mondorestore
# The next command generates a gdb.txt output file
set logging on
run
}}}
As described previously, once the segmentation fault happens, do:
{{{
bt
}}}
and send the output to the list. (If you can't get the backtrace copied as text, you can use a photo of the screen as a last resort.)

== Advanced Topics ==

=== Attaching to Running Processes ===

You can attach to a running process using:
{{{
gdb /usr/sbin/mondorestore <pid>
}}}
where <pid> is the process ID. This can be particularly useful when running '''mondoarchive''' with the '-g' option or when running '''mondorestore'''.
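
Attaching can also be done non-interactively, which is handy when all you want is a backtrace of a process that appears to hang. The following is a minimal sketch under some assumptions: gdb_attach_bt is a hypothetical helper name, the /proc check is Linux-specific, and attaching to a process normally requires root or ownership of that process.

```shell
# gdb_attach_bt BINARY PID (hypothetical helper name):
# attach gdb to a running process, print a backtrace of all threads, detach.
gdb_attach_bt() {
    binary=$1
    pid=$2
    # /proc/<pid> exists only while the process is alive (Linux-specific).
    if [ -z "$pid" ] || [ ! -d "/proc/$pid" ]; then
        echo "no running process with PID $pid" >&2
        return 1
    fi
    # -batch makes gdb exit after the -ex commands, which detaches it again.
    gdb -batch -ex 'thread apply all bt' "$binary" "$pid"
}
```

For example, gdb_attach_bt /usr/sbin/mondorestore 1234 would print the backtraces for process 1234 (a made-up PID) and leave the process running; redirect the output to a file to send it to the list.
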

=== Using libraries with debugging symbols ===

The libraries used by a binary can be determined using the ldd command, e.g.:
{{{
ldd /usr/sbin/mondoarchive
        libmondo.so.2 => /usr/lib/libmondo.so.2 (0xb7f8d000)
        libmondo-newt.so.1 => /usr/lib/libmondo-newt.so.1 (0xb7f82000)
        libnewt.so.0.51 => /usr/lib/libnewt.so.0.51 (0xb7f71000)
        libdl.so.2 => /lib/tls/libdl.so.2 (0xb7f6e000)
        libpthread.so.0 => /lib/tls/libpthread.so.0 (0xb7f5f000)
        libc.so.6 => /lib/tls/libc.so.6 (0xb7e2a000)
        libslang.so.1-UTF8 => /lib/libslang.so.1-UTF8 (0xb7db7000)
        libm.so.6 => /lib/tls/libm.so.6 (0xb7d94000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)
}}}
Some of those libraries may come with debugging symbols built in as an alternative package; others can be built from scratch and installed with debugging symbols. Using your distribution's standard build process is probably a good idea for this.

=== Worthwhile gcc Flags ===

-Wextra -Wshadow -Wstack-protector -fstack-protector

= Getting the entire kernel log on restore media =

The kernel ring buffer that '''dmesg''' reads defaults to 32k on recent kernels. This is not enough to capture the entire sequence of kernel messages when Mondo Rescue boots off restore media. To increase the kernel ring buffer to 128k at boot time (without recompiling the kernel), add the following kernel boot parameter:
{{{
log_buf_len=128k
}}}
e.g.
{{{
expert log_buf_len=128k
}}}
'''dmesg''' needs to be told what buffer size to use to ensure that everything is displayed from the start. The '''-s''' parameter can be used for this (131072 bytes = 128k):
{{{
dmesg -s 131072 | less
}}}

= Unable to boot restored server =

When a restored server fails to boot to the grub menu (seen on SLES10), do the following:

Boot from the SLES10 DVD, choose the recovery option and let it boot.

Mount your boot partition, e.g.
{{{
mount /dev/sda1 /boot
}}}
Run
{{{
grub-install
}}}
and reboot.
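
Before running grub-install in the recovery steps above, it can be worth confirming that the boot partition really is mounted. The following is a minimal sketch, assuming a Linux system that provides /proc/mounts; is_mounted is a hypothetical helper name and not part of mondo or grub.

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE] (hypothetical helper name):
# succeed if MOUNTPOINT is listed as a mount point in /proc/mounts.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example:
#   is_mounted /boot || echo "/boot is not mounted" >&2
```

The optional second argument exists only so that the check can be tried against a copy of a mounts table.
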