Opened 15 years ago

Closed 12 years ago

Last modified 12 years ago

#358 closed defect (fixed)

mondo never takes HP LTO SAS drive out of OBDR mode during restore

Reported by: tastle73 Owned by: Bruno Cornec
Priority: normal Milestone: 3.0.0
Component: mondo Version: 2.2.9
Severity: normal Keywords:
Cc:

Description

I was testing OBDR restore in mondo/mindi with my new LTO-2 drive attached via SAS, and it boots OK right up until the point where I think it has to switch the drive out of OBDR mode. It panics, saying that /dev/nst0 is not an extended data drive.

If I powercycle the drive at that point and then let it get rediscovered and type in the /dev/nst0 device, it proceeds and finishes.

Centos 5.3 x86_64 HP SAS LTO-2 drive

Attachments (3)

mondorestore.log (382.5 KB ) - added by tastle73 15 years ago.
mondorestore.log
hpsa_obdr_mode.c (15.1 KB ) - added by Bruno Cornec 12 years ago.
Source code for the hpsa_obdr_mode program
Makefile (571 bytes ) - added by Bruno Cornec 12 years ago.
Makefile to build the hpsa_obdr_mode program

Change History (17)

comment:1 by Bruno Cornec, 15 years ago

Status: new → assigned

Can you get the /var/log/mondorestore.log file generated during that restore? Even better if you can manually launch mondorestore -Z 99 to generate a more verbose one.

by tastle73, 15 years ago

Attachment: mondorestore.log added

mondorestore.log

comment:2 by tastle73, 15 years ago

Version: 2.2.8 → 2.2.9

comment:3 by Bruno Cornec, 14 years ago

Ok, I see where the problem arises.

It's not a panic. It's just that some commands do not succeed, and the init process asks you to give the correct device instead. Of course, in your case that's the name that needs to be given.

So to fix this issue, we need to know why the following commands are failing at that point (after the OBDR boot):

mt -f /dev/nst0 rewind
mt -f /dev/nst0 fsf 2
dd if=/dev/nst0 bs=32k count=1024 | tar -zx
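
To see which of these steps fails first, they could be run one at a time with something like the sketch below. run_step and diagnose_tape are hypothetical helper names, not part of mondo or mindi, and the last step only reads one block to /dev/null instead of unpacking it with tar.

```shell
#!/bin/sh
# Sketch only: run each tape command separately and report its exit status.
# run_step and diagnose_tape are illustrative helpers, not part of mondo/mindi.

run_step() {
    # $1 = label, remaining arguments = command to run
    label="$1"
    shift
    if "$@"; then
        echo "OK:   $label"
    else
        rc=$?
        echo "FAIL: $label (exit code $rc)"
        return "$rc"
    fi
}

diagnose_tape() {
    # $1 = tape device, defaults to the one from this ticket
    tape="${1:-/dev/nst0}"
    run_step "rewind" mt -f "$tape" rewind &&
    run_step "skip 2 filemarks" mt -f "$tape" fsf 2 &&
    run_step "read one 32k block" dd if="$tape" of=/dev/null bs=32k count=1
}
```

Calling diagnose_tape /dev/nst0 from the restore shell stops at the first failing step and prints its exit code.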

I know another person who did the following to his tape drive in order to make it work (but in a completely different context):

I have reconfigured the tape drive in non OBDR mode, in this way:

1) power cycle tape drive
2) from another shell (Alt+F2)
   - echo "scsi remove-single-device 1 0 3 0" > /proc/scsi/scsi
   - modprobe st
   - echo "scsi add-single-device 1 0 3 0" > /proc/scsi/scsi
   to reconfigure the tape drive in Sequential-Access mode

Does it also work in your case?
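
For reference, the four numbers passed to remove-single-device and add-single-device are the host, channel, id and lun of the drive, as shown on its Host: line in /proc/scsi/scsi, so they will probably differ on your machine. A minimal sketch to extract them, assuming the usual /proc/scsi/scsi layout (scsi_address is a hypothetical helper name, and the model pattern is whatever identifies your drive):

```shell
#!/bin/sh
# Sketch only: print "host channel id lun" for the first device whose
# Model: line matches a pattern. scsi_address is an illustrative helper.

scsi_address() {
    # $1 = pattern to look for on the Model: line (e.g. Ultrium)
    # $2 = file in /proc/scsi/scsi format (defaults to the real one)
    awk -v pat="$1" '
        # Remember the address of the device currently being described
        /^Host:/ { h = $2; sub(/^scsi/, "", h); c = $4 + 0; i = $6 + 0; l = $8 + 0 }
        # First matching model wins
        /Model:/ && $0 ~ pat { print h, c, i, l; exit }
    ' "${2:-/proc/scsi/scsi}"
}
```

For a drive listed under "Host: scsi1 Channel: 00 Id: 03 Lun: 00", scsi_address Ultrium prints 1 0 3 0, which is the form used in the echo commands above.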

comment:4 by Bruno Cornec, 13 years ago

Milestone: 2.2.10 → 2.2.9.8

I now have access to a similar HW configuration, and will be doing tests next week to try to reproduce it.

comment:5 by Bruno Cornec, 12 years ago

Milestone: 3.0.0 → 3.0.1

I can confirm I see the same problem both with Firmware WS92 and WS95 on my HP DAT 160 SAS connected to a Smart Array P812 with FW 3.66 and 5.12.

In your case, you still see the drive after the boot:

  Vendor: HP        Model: Ultrium 2-SCSI    Rev: T61D
  Type:   CD-ROM                             ANSI SCSI revision: 05

which could allow us to detect this case and try to do something about it.
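
A rough sketch of such a detection, assuming the usual /proc/scsi/scsi layout and that matching "Ultrium" or "DAT" in the model string is enough to recognise the tape drive (tape_in_cdrom_mode is a hypothetical helper name, not something that exists in mindi):

```shell
#!/bin/sh
# Sketch only: succeed if a tape drive is still presented as a CD-ROM.
# The model patterns (Ultrium, DAT) are an assumption based on the drives
# mentioned in this ticket; tape_in_cdrom_mode is an illustrative helper.

tape_in_cdrom_mode() {
    # $1 = file in /proc/scsi/scsi format (defaults to the real one)
    awk '
        # The Vendor: line carries the model; remember if it looks like a tape
        /Vendor:/ { is_tape = ($0 ~ /Model: *(Ultrium|DAT)/) }
        # The following Type: line tells how the device is presented
        /Type:/   { if (is_tape && $0 ~ /Type: *CD-ROM/) found = 1 }
        END       { exit(found ? 0 : 1) }
    ' "${1:-/proc/scsi/scsi}"
}
```

It returns success only when a matching device reports Type: CD-ROM, so an init script could test it before asking the user for a device name.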

In my case, there is no device available to talk to the hardware. Nothing at all in /proc/scsi/scsi, nor in the dmesg output. As I have an external drive connected to a Smart Array controller, if I turn it off, then on, and do

rmmod cciss
rmmod hpsa
rmmod st
modprobe hpsa
modprobe st

then I can talk to my tape drives, and it loads the rest from the tape.

However, if you have an internal drive, there is no way to do that!

It remains to be seen whether I can find a software way to reset the tape from the CLI; I have not been able to find one so far.

comment:6 by Bruno Cornec, 12 years ago

In another case I'm working on, I find:

scsi2 : cciss
  Vendor: HP        Model: DAT160            Rev: WS95
  Type:   CD-ROM                             ANSI SCSI revision: 03
sr1: scsi-1 drive

This could be due to a driver difference between SLES 10 (2.6.16.60-0.77.1-smp) with cciss and RHEL 6 (2.6.32-131.17.1.el6) with hpsa which doesn't show the device in CD-ROM mode at all.

comment:7 by Bruno Cornec, 12 years ago

Booting the RHEL 6.1 server with hpsa, with the drive in boot mode but without a tape, makes it boot the native RHEL 6.1, where I can check that the behaviour is similar (nothing in /proc/scsi/scsi, no message detecting the tape when hpsa loads).

Using hpacucli doesn't seem to help with resetting the tape to sequential mode either (needs more research). The next step is to use the USB drive with the same OBDR tape to check what happens, and to test another distro (SLES 10 SP3) to see whether a different driver (cciss in that case) does better.

comment:8 by Bruno Cornec, 12 years ago

Booting the RHEL 6.1 server with usb_storage, with the tape in boot mode, boots the native RHEL 6.1; the tape is then put back into sequential mode and the rest of the data can be accessed in that configuration (with the exact same tape that doesn't work with the SAS drive).

comment:9 by Bruno Cornec, 12 years ago

Milestone: 3.0.1 → 3.0.0

comment:10 by Bruno Cornec, 12 years ago

I'm working with an HP colleague to get a piece of software that will solve this issue. It can be called from the mindi init script to put the tape drive back into Sequential mode if it is still in CD-ROM mode (and the reverse as well, which will allow fully automated DR with OBDR in that version).

This will be handled in 3.0.0, and that additional program should be available soon as well.

comment:11 by Bruno Cornec, 12 years ago

This is now fixed with rev [2915] and [2913] at least for SLES 10 (cciss driver). Will check next week for RHEL 6 as well (hpsa driver). It fixes this issue by using an external program (hpsa_obdr_mode) which can set the mode of the tape to CD-ROM or Sequential at will.

That program will have to be downloaded from http://cciss.sf.net

comment:12 by Bruno Cornec, 12 years ago

Here is an example script showing how to set up the tape correctly, completely from the CLI:

modprobe st
hpsa_obdr_mode -m tape /dev/cciss/c1d0
mondoarchive -d /dev/st0 -o -O -t -N -g
hpsa_obdr_mode -m cd /dev/cciss/c1d0
reboot

The machine is backed up and then rebooted in the OBDR mode ... without using the button ;-)
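
A slightly more careful variant, as a sketch only: it stops at the first failing step, so the drive is not switched to CD-ROM mode after a failed backup. It assumes that modprobe, hpsa_obdr_mode and mondoarchive all return a non-zero exit status on failure (not verified for hpsa_obdr_mode); obdr_backup is a hypothetical wrapper name.

```shell
#!/bin/sh
# Sketch only: same sequence as the example script, but each step must
# succeed before the next one runs. obdr_backup is an illustrative wrapper;
# non-zero exit codes on failure are assumed for all three tools.

obdr_backup() {
    # $1 = controller device (e.g. /dev/cciss/c1d0)
    # $2 = tape device (e.g. /dev/st0)
    ctrl="$1"
    tape="$2"
    modprobe st || return 1
    # Present the drive as a tape (Sequential mode) for the backup
    hpsa_obdr_mode -m tape "$ctrl" || return 1
    mondoarchive -d "$tape" -o -O -t -N -g || return 1
    # Backup succeeded: switch the drive to CD-ROM (OBDR) mode for the next boot
    hpsa_obdr_mode -m cd "$ctrl"
}
```

It could then be used as: obdr_backup /dev/cciss/c1d0 /dev/st0 && reboot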

comment:13 by Bruno Cornec, 12 years ago

Resolution: fixed
Status: assigned → closed

I can now confirm that on SLES 10 with rev [2918] and with the additional hpsa_obdr_mode command, the problem is fixed.

by Bruno Cornec, 12 years ago

Attachment: hpsa_obdr_mode.c added

Source code for the hpsa_obdr_mode program

by Bruno Cornec, 12 years ago

Attachment: Makefile added

Makefile to build the hpsa_obdr_mode program

comment:14 by Bruno Cornec, 12 years ago

Pending the availability of the official source code from the upstream SourceForge project mentioned above, I attach a copy of the source and the Makefile to allow building it.
