Opened 9 years ago

Closed 9 years ago

#425 closed defect (fixed)

Mindi unmounts OTHER filesystems

Reported by: dxh Owned by: bruno
Priority: highest Milestone: 2.2.9.4
Component: mindi Version: 2.2.9.3
Severity: critical Keywords:
Cc:

Description

FYI, wanted to let you guys know about a VERY nasty bug we ran into with mindi, where it actually was unmounting our production ORACLE file systems.

We had a couple of cases where some Oracle filesystems were unmounted out of the blue while mondo was running.

Well, it turns out that this is caused by THIS line in the mindi code:

my_partitions=`mount | grep -F $$ | cut -f1 -d' '`
[ "$my_partitions" != "" ] && umount $my_partitions

In moderately plain English, this is what the developer was thinking:

If I mounted any filesystems for temporary use during this run, I embedded my current process ID ($$) in the name so they would be easy to find. All I have to do is look through the list of mounts for any that contain my PID, extract the filesystem names, and then unmount them.

But there's a horrible problem with this approach. mindi will attempt to unmount ANY and ALL filesystems that just happen to contain the PID of the running process anywhere in their mount entry. That PID is essentially random, but on Linux it is typically a number roughly between 300 and 32767.

Since our Oracle filesystems are all named something like /u001/oradata/p657, /a001/oradata/p657, /b001/oradata/p657, then if by sheer bad luck mindi runs with the PID of 657... well then we're hosed because mindi will try to unmount them thinking they match its own pid.
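The false match can be reproduced without touching any real mounts. The following is only a sketch: fake_mount is a hypothetical stand-in for the real mount command, the device names and the /tmp/mindi.example path are made up for illustration, and the PID is fixed at 657 instead of using $$.

```shell
# Hypothetical stand-in for `mount`; devices and the temp path are invented.
fake_mount() {
    cat <<'EOF'
/dev/sda1 on / type ext3 (rw)
/dev/sdb1 on /u001/oradata/p657 type ext3 (rw)
/dev/sdc1 on /a001/oradata/p657 type ext3 (rw)
/dev/loop0 on /tmp/mindi.example/mountpoint.657 type ext2 (rw)
EOF
}

pid=657   # stand-in for $$ so the collision is reproducible

# Same pipeline as the mindi line, with the PID fixed: matches any line containing 657.
broad=$(fake_mount | grep -F "$pid" | cut -f1 -d' ')

# Narrower pattern: only lines containing mountpoint.<pid>.
narrow=$(fake_mount | grep -F "mountpoint.$pid" | cut -f1 -d' ')

printf 'broad match:\n%s\n' "$broad"
printf 'narrow match:\n%s\n' "$narrow"
```

With this sample input the broad pattern selects both Oracle devices as well as the loop device, while the narrow pattern selects only the loop device.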

Oh, and it turns out that this same fragment of code exists in previous versions of mindi as well, so this must be a pretty rare occurrence on a box. But we have run into it twice in the last week.

Looking through the code for mindi, I am not totally sure why these lines are even in there. The temporary $mountpoint is already unmounted further up in the code, and at best this is some sort of last-ditch effort that someone put into the MindiExit() exit/cleanup function.

I believe it should either be taken out altogether (since the temp mount point is already unmounted elsewhere in mindi), or at the very LEAST this patch should be added to mindi. It makes sure mindi only looks for the temp mountpoints that it actually creates, which are all in the format of $MINDI_TMP/mountpoint.$$:

255c255
< my_partitions=`mount | grep -F $$ | cut -f1 -d' '`
---
> my_partitions=`mount | grep -F mountpoint.$$ | cut -f1 -d' '`
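A stricter variant, which is not what the ticket proposes and not what went into mindi as far as this report shows, would compare the mount point field exactly instead of doing a substring search. The sketch below assumes the usual Linux mount output layout ("device on mountpoint type fstype (options)") and mount points without spaces. The helper name devices_mounted_on and the sample lines are hypothetical.

```shell
# Hypothetical helper: read `mount`-style lines on stdin and print the device
# (field 1) of every entry whose mount point (field 3) equals the argument exactly.
devices_mounted_on() {
    awk -v mp="$1" '$3 == mp { print $1 }'
}

# Demo on made-up input; in real use the input would come from `mount` and the
# argument would be "$MINDI_TMP/mountpoint.$$".
sample='/dev/sdb1 on /u001/oradata/p657 type ext3 (rw)
/dev/loop0 on /tmp/mindi.example/mountpoint.657 type ext2 (rw)'

my_partitions=$(printf '%s\n' "$sample" | devices_mounted_on /tmp/mindi.example/mountpoint.657)
printf 'would unmount: %s\n' "$my_partitions"
```

An exact comparison on the mount point avoids relying on the PID being unique within the whole mount listing, at the cost of depending on the column layout of the mount output.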

This should be treated as a CRITICAL, high-priority fix to put into the next version. While it does require having other mountpoints on the system that just happen to have the same number in their name as the mindi PID, it can happen (and did, to us) and cause major problems with other non-mondo related stuff on the system.

Change History (1)

comment:1 Changed 9 years ago by bruno

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in rev [2640]. Releasing 2.2.9.4 today due to that.

Thanks for the report and analysis.
