Opened 18 years ago

Closed 17 years ago

#10 closed defect (wontfix)

Kernel bug at slab.c:815!

Reported by: difdif@…
Owned by: Bruno Cornec
Priority: normal
Milestone:
Component: mondo
Version: 2.0.8
Severity: normal
Keywords:
Cc:

Description

mondoarchive v2.06-266
mindi-1.0.7-r454
Running on Slackware 9.1

Backup works perfectly. Restoring from CD fails while booting the first disc, with an error message saying "Kernel bug at slab.c:815!". This is followed by several lines of numbers and codes, and the last line mentions uhci.c, the USB Universal Host Controller Interface driver.

I have a screenshot of all this if it helps. The bug looks like one mentioned on SourceForge, for which the solution was to restrict the modules being archived. The patch below was given as a fix, but I have not investigated further. The current code of version 1.0.7 appears to be the same here as the 1.04 code before the patch.

------------------------------------------------------------------------------
root@wdmtst:/usr/local/src/mondo/mindi-1.04_cvs_20050503# diff -u mindi.0 mindi
--- mindi.0 2005-08-26 11:20:12.000000000 -0400
+++ mindi 2005-08-26 11:21:38.000000000 -0400
@@ -1275,7 +1275,7 @@
else
infile="/etc/modules.conf"
fi
- for module in $list_to_echo $EXTRA_MODS ; do
+ for module in $list_to_echo ; do
params=`sed -n "s/^options \\+$module \\+//p" $infile`
modpaths=`FindSpecificModuleInPath $searchpath $module`
for i in $modpaths ; do
-------------------------------------------------------------------------------


Using the same hardware for backup and restore, I was able to do a full restore without errors using mindi-0.87 and mondo-1.67.

Change History (12)

comment:1 by difdif@…, 18 years ago

Screenshot of errors is here:

http://www.megapico.co.uk/dscf0002.jpg
It happens after a stage with messages about unpacking 4 archives and softlinking (I think).

comment:2 by Bruno Cornec, 18 years ago

This patch is really aggressive ;-)
It will only work for a small subset of configurations, as it removes nearly all modules from the init script execution.

I know it's a long task, but could you try booting your rescue media with init=/bin/sh and loading the same modules manually? That way we will know which one causes the problem, and we can either make a correct patch or document it.
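
A rough sketch of how the manual loading could be done from the init=/bin/sh shell. It assumes a hypothetical file with one module path per line (nothing mindi provides as such) and that insmod is available on the rescue image. The names INSMOD and load_one_by_one are made up for illustration.

```shell
# Sketch: load modules one at a time, printing each name before loading it,
# so the last name on screen identifies the module that triggers the crash.
# INSMOD and the list-file format (one module path per line) are assumptions.
INSMOD="${INSMOD:-insmod}"

load_one_by_one() {
    list_file="$1"
    while read -r mod; do
        [ -z "$mod" ] && continue          # skip blank lines
        echo "loading: $mod"
        sync                               # flush pending writes before a possible crash
        if $INSMOD "$mod" < /dev/null; then
            echo "ok:      $mod"
        else
            echo "failed:  $mod"
        fi
    done < "$list_file"
}
```

It would be called as `load_one_by_one /tmp/modules.list`, where the list path is again only an example. If the kernel bug fires, the last "loading:" line without a matching "ok:" or "failed:" line points at the suspect module.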

comment:3 by Bruno Cornec, 18 years ago

It seems related to a USB issue.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-05/0363.html
Could you give the result of lsmod on the running system?
Could you try using -k FAILSAFE with mondoarchive?

This really seems to be a bug coming from busybox (insmoding a module which creates the issue). As far as I can tell, it has more to do with the kernel than with mondo.
Could you also try a newer version of mondo/mindi?

comment:4 by difdif@…, 18 years ago

Thanks for looking at this - sorry for my delay in coming back with more information. Here is the lsmod output from the normal running system, hopefully still readable after pasting into this form:

Module Size Used by Not tainted
fat 31480 0
snd-pcm-oss 37092 0 (unused)
snd-mixer-oss 12016 0 [snd-pcm-oss]
ipt_REJECT 3192 4 (autoclean)
ipt_pkttype 472 4 (autoclean)
ipt_LOG 3448 9 (autoclean)
ipt_state 504 14 (autoclean)
ipt_TOS 952 12 (autoclean)
iptable_mangle 2072 1 (autoclean)
ip_nat_irc 2128 0 (unused)
ip_nat_tftp 1616 0 (unused)
ip_nat_ftp 2704 0 (unused)
iptable_nat 15800 3 [ip_nat_irc ip_nat_tftp ip_nat_ftp]
ip_conntrack_irc 3024 1
ip_conntrack_tftp 1776 1
ip_conntrack_ftp 3856 1
ip_conntrack 19944 3 [ipt_state ip_nat_irc ip_nat_tftp ip_nat_ftp iptable_nat ip_conntrack_irc ip_conntrack_tftp ip_conntrack_ftp]
iptable_filter 1644 1
ip_tables 12480 10 [ipt_REJECT ipt_pkttype ipt_LOG ipt_state ipt_TOS iptable_mangle iptable_nat iptable_filter]
uhci 24528 0 (unused)
usbcore 58976 1 [uhci]
snd-via82xx 12032 0
gameport 1452 0 [snd-via82xx]
snd-pcm 56064 0 [snd-pcm-oss snd-via82xx]
snd-timer 13444 0 [snd-pcm]
snd-ac97-codec 38264 0 [snd-via82xx]
snd-page-alloc 6004 0 [snd-via82xx snd-pcm]
snd-mpu401-uart 3136 0 [snd-via82xx]
snd-rawmidi 12672 0 [snd-mpu401-uart]
snd-seq-device 3920 0 [snd-rawmidi]
snd 29956 0 [snd-pcm-oss snd-mixer-oss snd-via82xx snd-pcm snd-timer snd-ac97-codec snd-mpu401-uart snd-rawmidi snd-seq-device]
soundcore 3332 3 [snd]
via-rhine 12560 1
mii 2304 0 [via-rhine]
crc32 2880 0 [via-rhine]
ide-scsi 9424 0
agpgart 44100 0 (unused)
softdog 1884 1

I'm running the FAILSAFE backup now, and will then look at the very latest versions of the software.

comment:5 by difdif@…, 18 years ago

The backup using the FAILSAFE kernel finished ok and I burned the CDs. Booting from the CD started ok, and the "Kernel bug at slab.c:815!" did not occur.

However, the boot stopped at some different error messages. These asked me to start copying all the archive files onto floppies and inserting them one after another. I tried different start-up options (compare, expert, textonly) and all of them did the same thing. A (literal) screenshot is here:

http://www.megapico.co.uk/dscf0051.jpg
The backup might be on the CD, but restore is not working.

I'll try a more recent version of Mondo next.

comment:6 by difdif@…, 18 years ago

With the latest versions of Mondo and Mindi
mindi v1.0.8-r650
mondoarchive v2.0.8-650

the same "Kernel bug at slab.c:815!" occurs when trying to restore the archive. I've tried the CD on two different computers to see if there is some hardware effect, but both behave the same.

A screenshot of the crash is here:

http://www.megapico.co.uk/dscf0055.jpg
It looks the same as the original one I posted. The Linux kernel on the system to be backed up is 2.4.26, which is the current version for Slackware 9.1.

comment:7 by Bruno Cornec, 18 years ago

First, coming back to the original lsmod result: could you try doing the backup after first removing usbcore and uhci with rmmod?

That way we could see if these are the modules causing the slab error.
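
A small sketch of the unload step, assuming the 2.4-style lsmod output posted above. uhci is removed first because usbcore is listed as used by it. The function name unload_usb is made up for illustration.

```shell
# Sketch: unload uhci, then usbcore, before running the backup.
# The order matters: lsmod shows usbcore in use by uhci.
unload_usb() {
    for mod in uhci usbcore; do
        if lsmod | grep -q "^$mod "; then
            rmmod "$mod" || echo "could not unload $mod" >&2
        else
            echo "$mod is not loaded"
        fi
    done
}
```

Run `unload_usb` as root and check lsmod again before starting mondoarchive.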

TIA, Bruno.

comment:8 by Bruno Cornec, 18 years ago

Also, could you try the latest version of mondo/mindi you have with the FAILSAFE kernel and report again?
Which mindi-kernel do you use?

comment:9 by Bruno Cornec, 18 years ago

Status: new → assigned

comment:10 by david@…, 18 years ago

I've investigated as described below and found that module jbd.o is causing this problem.

Start up with

interactive init=/bin/sh

Use vi to edit /sbin/init so that you can see what is going on with each module load

At line 600, change

insert-all-my-modules > $LOGFILE 2> $LOGFILE

to

insert-all-my-modules

edit /sbin/insert-all-my-modules

Remove this line at the bottom of the file

MyInsmod /lib/modules/2.4.26/kernel/fs/jbd/jbd.o > /dev/null 2> /dev/null

Type init to initialise as normal, and the mondo rescue console will start as expected.
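
The edit to insert-all-my-modules could also be scripted instead of done in vi. This is only a sketch: it assumes sed is present on the rescue image and that the ramdisk is writable, and drop_jbd is a made-up helper name.

```shell
# Sketch: remove the jbd.o line from insert-all-my-modules.
# The default path follows the steps above; pass another path to try it on a copy.
drop_jbd() {
    script="${1:-/sbin/insert-all-my-modules}"
    sed '/jbd\/jbd\.o/d' "$script" > "$script.new" &&
        cat "$script.new" > "$script" &&   # overwrite in place so the mode bits are kept
        rm -f "$script.new"
}
```

Calling `drop_jbd` with no argument edits /sbin/insert-all-my-modules. Writing back with cat rather than mv keeps the script executable.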

This is Journal Block Device support, currently used by the ext3 filesystem. Will it safely drop back to an ext2 system if this module is not loaded?

comment:11 by Bruno Cornec, 18 years ago

So it definitely seems to be a problem caused by the combination of the Slackware kernel used, busybox, and this module.

Could you try with the latest version + FAILSAFE kernel from ftp://ftp.mondorescue.org/src?

I hope it will be a good solution.

Another possibility is to search for a kernel upgrade for your Slackware.

comment:12 by Bruno Cornec, 17 years ago

Resolution: wontfix
Status: assigned → closed