#600 closed defect (fixed)
At boot: occur "no space left on device" errors
Reported by: | victor gattegno | Owned by: | Bruno Cornec |
---|---|---|---|
Priority: | highest | Milestone: | 3.0.2 |
Component: | mindi | Version: | 3.0.1 |
Severity: | blocker | Keywords: | |
Cc: |
Description
With a RHEL/CentOS 5 mondo backup, the mindi boot fails with many "no space left on device" errors.
I cannot reproduce the problem with RHEL 6.
Five users reported this problem with mindi 2.2.1 and mondo 3.0.1 on the mondo-devel mailing list in March 2012.
The errors are in fact generated by the RHEL 5 /sbin/start_udev shell script, which the mindi rcS starts at boot.
If start_udev is replaced by the RHEL 6 version, the "no space left on device" error messages disappear.
Nevertheless, using the RHEL 6 version is not a solution, because RHEL 5 lacks some items that it needs.
For more details, see the 27 March 2012 discussion in the mailing-list archive with the subject: [Mondo-devel] mondorescue: no space left on device
I attach to this ticket two screenshots that a user has taken.
Attachments (11)
Change History (27)
by , 13 years ago
Attachment: | mindi-1.png added |
---|
by , 13 years ago
Attachment: | mindi-2.png added |
---|
the rest of error messages, "no space left on device"
comment:1 by , 13 years ago
Summary: | At boot: occur "no space left of device" errors → At boot: occur "no space left on device" errors |
---|
comment:2 by , 13 years ago
The error is "no space left on device", and not "no space left of device".
comment:3 by , 13 years ago
Today a user reported on the mailing list that, after he downgraded from mindi-2.1.1 to mindi-2.1.0 on RHEL 5.6 and RHEL 5.8, the "cp write error no space left on device" messages no longer appeared.
comment:4 by , 13 years ago
Priority: | normal → high |
---|---|
Severity: | major → blocker |
Status: | new → assigned |
comment:5 by , 13 years ago
I modified the tmpfs mount section of the RHEL 5 start_udev script so that it now matches the tmpfs mount section of the RHEL 6 start_udev.
I attach the modified start_udev here. A user tested my modified start_udev on RHEL 5 and it worked well: mindi.iso now boots fine with the mindi-2.1.1 package.
Diff between the modified and the original /sbin/start_udev:
# diff /sbin/start_udev /sbin/start_udev_ori
136c136
< LANG=C awk "\$2 == \"${udev_root%/}\" && ( \$3 == \"devtmpfs\" || \$3 == \"tmpfs\" ) { exit 1 }" /proc/mounts && {
---
> LANG=C awk "\$2 == \"${udev_root%/}\" && \$3 == \"tmpfs\" { exit 1 }" /proc/mounts && {
145,147c145
< # First try to mount a devtmpfs on $udev_root
< mount -n -o mode=0755 -t devtmpfs none "$udev_root" 2>/dev/null \
< || mount -n -o mode=0755 -t tmpfs none "$udev_root"
---
> mount -n -o mode=0755 -t tmpfs none "$udev_root"
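The awk test in this diff decides whether $udev_root is already mounted as tmpfs (or devtmpfs) before mounting it again. Below is a minimal sketch of the same check, written so it can be tried on sample data. The helper name dev_already_mounted and the sample mounts file are assumptions for illustration, and unlike start_udev (where awk exits 1 on a match) the helper returns success when a match is found.

```shell
# Hypothetical helper, not part of start_udev: succeeds (returns 0) when
# the udev root $1 is listed in mounts file $2 as tmpfs or devtmpfs.
dev_already_mounted() {
    root="${1%/}"    # strip one trailing slash, as ${udev_root%/} does
    LANG=C awk -v root="$root" \
        '$2 == root && ($3 == "devtmpfs" || $3 == "tmpfs") { found = 1 }
         END { exit !found }' "$2"
}

# Sample data standing in for /proc/mounts.
sample=$(mktemp)
cat > "$sample" <<'EOF'
rootfs / rootfs rw 0 0
devtmpfs /dev devtmpfs rw,mode=755 0 0
proc /proc proc rw 0 0
EOF

if dev_already_mounted /dev/ "$sample"; then
    echo "/dev is already a tmpfs or devtmpfs mount"
fi
rm -f "$sample"
```

On a live system the second argument would be /proc/mounts.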
by , 13 years ago
Attachment: | start_udev-RHEL-5.7-modified added |
---|
RHEL 5 start_udev (that I modified) working well with mindi 2.1.1
by , 13 years ago
Attachment: | start_udev-RHEL-6.1 added |
---|
RHEL 6 start_udev (just for info), working well with mindi 2.1.1
comment:6 by , 13 years ago
Two users said that they still get "cp write error no space left on device" with mindi 2.1.1 and the RHEL 5 start_udev that I modified.
I didn't try it on RHEL 5 myself, because I have already upgraded to RHEL 6.
comment:7 by , 13 years ago
The major difference I see between 2.1.0 and 2.1.1 for mindi is that more files are copied onto the boot media. So we may now fill the ramdrive whereas before we didn't.
It could be worth changing the ramdrive_size at boot time, or the EXTRA_SPACE variable in mindi, to see whether that improves things.
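To check whether the ramdrive is really filling up, the free space can be printed from the rescue shell before and after the copy step. This is only a debugging sketch: the helper name free_kib is made up for illustration, and it relies on the POSIX output format of df -P.

```shell
# Hypothetical debugging helper: print the free space, in KiB, of the
# filesystem that holds directory $1.
free_kib() {
    # df -P guarantees one line per filesystem; column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

echo "free space on /: $(free_kib /) KiB"
```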
comment:8 by , 13 years ago
I recommended that Alan add "set -xv" to the start_udev script.
Thanks for the -xv results, they shed some light on the bug.
In the "debux-xv make_extra_nodes.doc" file I see:
+ pushd /lib/udev/devices
+ set README THIS-IS-A-RAMDISK ataraid.tgz bin cciss.tgz dev dev.static
+ dm.tgz etc i20.tgz ida,tgz init lib lib64 linuxrc lost+found mnt
+ nst.tgz proc raw.tgz rd.tgz root sbin symlinks.tgz sys tmp usr var
+ vc.tgz [ read != * ]
That is why the start_udev script tries to copy all of that (README, etc.) to /dev through the "cp -ar "$@" $udev_root/" command, and why users then get the "cp: write error: no space left on device" messages.
Normally, since Alan's /lib/udev/devices directory was empty, he should have got:
+ pushd /etc/udev/devices
/etc/udev/devices ~/test5
+ set '*'
+ '[' '*' '!=' '*' ']'
Here I get "~/test5" too because I ran a test shell script from the ~/test5 directory.
In Alan's case, it seems that the "pushd /lib/udev/devices" was not successful, so /lib/udev/devices was not added to the list of currently remembered directories.
That is strange because Alan's "ls -al /lib/udev/devices" shows that the directory exists, so the "pushd /lib/udev/devices" should succeed.
If the start_udev script had:
pushd $devdir
instead of:
pushd $devdir &> "$udev_root/null"
we might see:
+ pushd /etc/udev/devices
..... pushd: dir: No such file or directory
You'll find my tests on RHEL 4 attached.
In testfordevdir-result.txt you'll see what you should get:
+ pushd /etc/udev/devices
/etc/udev/devices ~/test5
+ set '*'
+ '[' '*' '!=' '*' ']'
In test2-result.txt you'll see that:
- I used "dir" instead of /etc/udev/devices
- and I added a "pushd /"
so I got:
+ pushd /
/ ~/test5
+ pushd dir
./test2.sh: line 8: pushd: dir: No such file or directory
+ set audit bin boot dev dir1 etc home initrd lib lost+found media misc mnt opt proc root sbin selinux srv sys test2.sh test4popd.sh testhrea.410 tftpboot tmp usr var
+ '[' audit '!=' '*' ']'
This is not a bug, because my server has no "dir" directory under /. But it looks similar to the bug.
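The two traces can be reproduced in a throwaway directory. The sketch below uses plain cd in a subshell rather than pushd so that it runs in any POSIX shell. The function name glob_after_cd and every path are made up for illustration. It prints the words that "set *" would produce, which are the words start_udev later hands to cp.

```shell
# Reproduction sketch, not taken from start_udev.
# $1: starting directory (plays the role of / at boot)
# $2: directory to enter (plays the role of $devdir)
glob_after_cd() {
    (
        cd "$1" || exit 1
        cd "$2" 2>/dev/null    # may fail silently, like the hidden pushd
        set -- *               # expands in whatever directory we ended up in
        echo "$@"
    )
}

top=$(mktemp -d)
mkdir "$top/devices"           # empty directory, like Alan's devices directory
touch "$top/bin" "$top/etc"    # stand-ins for the ramdisk's top-level entries

glob_after_cd "$top" devices   # change succeeds: prints the literal *
glob_after_cd "$top" missing   # change fails: prints bin devices etc
rm -rf "$top"
```

When the directory change works and the directory is empty, the glob stays a literal *, so the '[' '*' '!=' '*' ']' guard skips the copy. When the change fails, the glob expands in the starting directory and the guard lets the whole list through.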
I asked Alan to check whether he is sure that, after the line:
+ pushd /lib/udev/devices
he saw no other line before this one:
+ set README THIS-IS-A-RAMDISK .......
If there was nothing, maybe the list of currently remembered directories is empty...
In that case a "pushd /" could be added to the start_udev script just before the line:
pushd $devdir &> "$udev_root/null"
You'll find attached the start_udev script modified that way (I added a "pushd /", and I added a popd at the end of the for loop).
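A different way to guard the loop, shown here only as a sketch and not as the change made in the attached script, is to list a directory's entries only when entering it actually succeeded. It uses cd in a subshell, which needs neither pushd nor popd. The function name list_static_nodes is hypothetical, and it prints the names instead of copying them.

```shell
# Hypothetical variant of the loop over device directories: print the
# entries that would be copied, skipping any directory that is missing,
# cannot be entered, or is empty.
list_static_nodes() {
    for devdir in "$@"; do
        [ -d "$devdir" ] || continue
        (
            cd "$devdir" || exit 0    # never glob in the wrong directory
            set -- *
            if [ "$1" != "*" ]; then
                printf '%s\n' "$@"
            fi
        )
    done
}

list_static_nodes /lib/udev/devices /etc/udev/devices
```

In the real script the printf would be the cp command, still run inside the subshell so that the working directory of the caller never changes.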
by , 13 years ago
Attachment: | debux-xv make_extra_nodes.doc added |
---|
Alan start_udev "set -xv" results
by , 13 years ago
Attachment: | testfordevdir.sh added |
---|
by , 13 years ago
Attachment: | testfordevdir-result.txt added |
---|
by , 13 years ago
by , 13 years ago
Attachment: | test2-result.txt added |
---|
by , 13 years ago
Attachment: | start_udev-RHEL-5.7-added-pushd added |
---|
start_udev script modified (I added a "pushd /", and I added a popd at the end of the for loop)
comment:9 by , 13 years ago
Priority: | high → highest |
---|
On the second image provided, there is a mention of recursion in cp. Maybe that's an area we need to explore more. A recursive link could be a problem here.
comment:10 by , 13 years ago
I suggested that this could be a shell problem, because pushd didn't work in the original RHEL 5 start_udev script.
Moreover, in RHEL 6 "#!/bin/sh" is replaced by "#!/bin/bash" in the start_udev script.
So, in the original RHEL 5 start_udev script, Alan replaced "#!/bin/sh" with "#!/bin/bash".
Then everything worked fine; he successfully created a new archive DVD and restored the server.
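The shebang change can be scripted. The sketch below works on a copy rather than on /sbin/start_udev itself. The helper name use_bash_shebang is made up, and it assumes the first line is exactly "#!/bin/sh" with no trailing characters.

```shell
# Hypothetical helper: copy script $1 to $2, turning a "#!/bin/sh" first
# line into "#!/bin/bash" and leaving every other line untouched.
# Note: the copy does not inherit the execute permission of the original.
use_bash_shebang() {
    sed '1s|^#!/bin/sh$|#!/bin/bash|' "$1" > "$2"
}

orig=$(mktemp)
fixed=$(mktemp)
printf '#!/bin/sh\npushd / > /dev/null\npopd > /dev/null\n' > "$orig"
use_bash_shebang "$orig" "$fixed"
head -n 1 "$fixed"
rm -f "$orig" "$fixed"
```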
comment:11 by , 13 years ago
I found the problem:
- in RHEL 5, 6, SLES 10, etc., /bin/sh is a soft link to /bin/bash, so there is no problem.
- in the mondo boot environment, /bin/sh is a soft link to busybox, so it calls the tiny shell embedded in busybox.
And busybox sh has no built-in pushd (nor popd): if I start /bin/sh under busybox and type "pushd /", I get "pushd: not found"; the same goes for popd.
If I start /bin/bash instead and type "pushd /", it works; popd works too.
That is why I didn't get the errors with the RHEL 6 start_udev, which uses the /bin/bash shell instead of the /bin/sh shell used by the RHEL 5 start_udev.
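A quick way to see which shells provide pushd is to ask each one with command -v. This is a small sketch with a hypothetical helper name, has_pushd. It takes the shell invocation as its arguments, so that "busybox sh" can be tested as well where busybox is installed.

```shell
# Hypothetical helper: succeeds when the shell given as arguments
# (for example "bash", or "busybox sh") knows a pushd command.
has_pushd() {
    "$@" -c 'command -v pushd >/dev/null 2>&1'
}

for shell in sh bash; do
    if ! command -v "$shell" >/dev/null 2>&1; then
        echo "$shell: not installed"
    elif has_pushd "$shell"; then
        echo "$shell: pushd available"
    else
        echo "$shell: pushd missing"
    fi
done
```

The result for sh depends on what /bin/sh points to on the machine where this runs, which is exactly the difference between the installed system and the mondo boot environment.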
comment:12 by , 13 years ago
Some good feedback provided by Stefan Heijmans:
I noticed that in mindi 2.1.0 /sbin/MAKEDEV is not there and in mindi 2.1.1 it is. /sbin/MAKEDEV is also used in /sbin/start_udev -> line 180 -> make_extra_nodes
This made me wonder why this happens, so I did a diff of the mindi script between 2.1.0 and 2.1.1. It shows this (first part is 2.1.0 and second part is 2.1.1):
2488c2477,2493
<
---
>
> # Handle the case where busybox and mount are dynamically linked
> file $MINDI_LIB/rootfs/bin/busybox 2>&1 | grep -q "dynamically"
> if [ $? -eq 0 ]; then
> # We want to use the real mount and all the supported variants (nfs, cifs, ...)
> rm -f bin/mount $MINDI_TMP/busy.lis
> mountlis=`grep -E "mount|fuse|ssh" $DEPLIST_FILE $DEPLIST_DIR/* | grep -v " *#.*" | cut -d: -f2 | sort -u`
> LocateDeps $MINDI_LIB/rootfs/bin/busybox $mountlis >> $MINDI_TMP/busy.lis
> # Special for libs
> for f in `grep -E "libnss" $DEPLIST_FILE $DEPLIST_DIR/* | grep -v " *#.*" | cut -d: -f2`; do
> echo "`ReadAllLink $f`" >> $MINDI_TMP/busy.lis
> done
> # Initial / are trucated by tar
> tar cf - $mountlis `sort -u $MINDI_TMP/busy.lis` 2>> $MINDI_TMP/$$.log | tar xf - || LogIt "Problem in mount analysis" $MINDI_TMP/$$.log
> rm -f $MINDI_TMP/busy.lis
> fi
>
2500,2521c2505
< # Handle the case where busybox and mount are dynamically linked
< file $MINDI_LIB/rootfs/bin/busybox 2>&1 | grep -q "dynamically"
< if [ $? -eq 0 ]; then
< # We want to use the real mount and all the supported variants (nfs, cifs, ...)
< rm -f bin/mount
< fi
<
< # Copy of files from the minimal env needed as per the deplist.d/minimal.conf file (which includes all busybox deps)
< minimallis=`grep -Ev '^#' $DEPLIST_DIR/minimal.conf`
< rm -f $MINDI_TMP/minimal.lis
< for f in $MINDI_LIB/rootfs/bin/busybox $minimallis; do
< echo $f >> $MINDI_TMP/minimal.lis
< done
< LocateDeps $MINDI_LIB/rootfs/bin/busybox $minimallis >> $MINDI_TMP/minimal.lis
< for f in `cat $MINDI_TMP/minimal.lis`; do
< echo "`ReadAllLink $f`" >> $MINDI_TMP/minimal.lis
< done
< # Initial / are trucated by tar
< tar cf - `sort -u $MINDI_TMP/minimal.lis` 2>> $MINDI_TMP/$$.log | tar xf - || LogIt "Problem in minimal analysis" $MINDI_TMP/$$.log
< rm -f $MINDI_TMP/minimal.lis
<
< # Avoids an issue on some distro (RHEL5)
---
In mindi 2.1.1 the $DEPLIST_DIR/minimal.conf file is processed, whereas in mindi 2.1.0 only the binaries for "mount|fuse|ssh" are. So I put this back into mindi 2.1.1, like this:
# Handle the case where busybox and mount are dynamically linked
file $MINDI_LIB/rootfs/bin/busybox 2>&1 | grep -q "dynamically"
if [ $? -eq 0 ]; then
    # We want to use the real mount and all the supported variants (nfs, cifs, ...)
    rm -f bin/mount
fi

# Copy of files from the minimal env needed as per the deplist.d/minimal.conf file (which includes all busybox deps)
minimallis=`grep -Ev '^#' $DEPLIST_DIR/minimal.conf`
mountlis=`grep -E "mount|fuse|ssh" $DEPLIST_FILE $DEPLIST_DIR/* | grep -v " *#.*" | cut -d: -f2 | sort -u`   # <== extra line
rm -f $MINDI_TMP/minimal.lis
for f in $MINDI_LIB/rootfs/bin/busybox $mountlis; do   # <== edited line
    echo $f >> $MINDI_TMP/minimal.lis
done
LocateDeps $MINDI_LIB/rootfs/bin/busybox $mountlis >> $MINDI_TMP/minimal.lis   # <== edited line
for f in `cat $MINDI_TMP/minimal.lis`; do
    echo "`ReadAllLink $f`" >> $MINDI_TMP/minimal.lis
done
# Initial / are trucated by tar
tar cf - `sort -u $MINDI_TMP/minimal.lis` 2>> $MINDI_TMP/$$.log | tar xf - || LogIt "Problem in minimal analysis" $MINDI_TMP/$$.log
rm -f $MINDI_TMP/minimal.lis
I created the mindi iso and it booted fine into the prompt.
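To see what the mountlis pipeline above selects, it can be run on a couple of sample deplist files. This is a sketch only: the helper name collect_mount_deps, the file names and their contents are invented for illustration, and LC_ALL=C is added to make the sort order predictable.

```shell
# Hypothetical helper mirroring the mountlis pipeline: keep entries that
# match mount|fuse|ssh in the given files, drop every line containing a
# '#', strip the "file:" prefix that grep adds, and de-duplicate.
collect_mount_deps() {
    grep -E "mount|fuse|ssh" "$@" | grep -v " *#.*" | cut -d: -f2 | LC_ALL=C sort -u
}

deps=$(mktemp -d)
cat > "$deps/a.conf" <<'EOF'
/bin/mount
# /sbin/mount.nfs is optional
/usr/bin/ssh
/bin/ls
EOF
cat > "$deps/b.conf" <<'EOF'
/bin/mount
/usr/bin/fusermount
EOF

collect_mount_deps "$deps"/*.conf
rm -rf "$deps"
```

With these sample files the helper prints /bin/mount, /usr/bin/fusermount and /usr/bin/ssh, one per line: the comment line and /bin/ls are filtered out, and the duplicate /bin/mount is collapsed.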
comment:13 by , 13 years ago
I think that the better solution is to have /bin/sh be a soft link to /bin/bash at boot, instead of a soft link to the busybox binary.
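As an illustration of that idea, and not of how mindi actually builds its image, the link could be switched in a staging tree like this. The function name link_sh_to_bash and the tree layout are assumptions, and bash with its libraries would have to be present in the image already.

```shell
# Hypothetical staging step: inside image tree $1, replace bin/sh with a
# relative symlink to bash. Refuses to act when bin/bash is absent.
link_sh_to_bash() {
    [ -e "$1/bin/bash" ] || { echo "no bash in $1/bin" >&2; return 1; }
    ln -sf bash "$1/bin/sh"
}

tree=$(mktemp -d)
mkdir "$tree/bin"
: > "$tree/bin/bash"            # empty stand-in for the real bash binary
ln -s busybox "$tree/bin/sh"    # the situation described in this ticket
link_sh_to_bash "$tree" && ls -l "$tree/bin/sh"
rm -rf "$tree"
```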
comment:14 by , 13 years ago
That's indeed a solution. But I'd like to know why this is the right solution in 5.7 when it was not needed in 5.2, for example. I think that the fact that /sbin/MAKEDEV is now included, whereas in 2.1.0 it wasn't, is the cause of the problem. I haven't had time to look at its content to be sure.
When it is there, it is called to create some devices, which seems to make the subsequent cp fail. That is not the case when we just skip that step in start_udev.
It still needs some digging so that we can document why this is happening. But like you, I'm tempted to systematically use bash as the main shell because, as we use more and more distribution scripts, we will probably hit this type of issue again in the future.
comment:15 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:16 by , 13 years ago
This can be tested with the new beta version ftp://ftp.mondorescue.org/test/rhel/5/x86_64/mindi-2.1.220120421020707-0.rhel5.x86_64.rpm
On mindi-1.png the beginning of the problem, see mindi-2.png for the rest