[Bugme-new] [Bug 13594] New: SMART responses for SATA disks on SAS get interpreted as errors

bugzilla-daemon at bugzilla.kernel.org
Sun Jun 21 10:26:29 PDT 2009


http://bugzilla.kernel.org/show_bug.cgi?id=13594

           Summary: SMART responses for SATA disks on SAS get interpreted
                    as errors
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 2.6.30-rc6
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: SCSI
        AssignedTo: linux-scsi at vger.kernel.org
        ReportedBy: sgunderson at bigfoot.com
        Regression: No


Hi,

I just bought an LSI SAS3081E-R which I use against a Supermicro backplane to
drive ten Seagate SATA disks (7200.11, 750GB and 1.5TB). I'm using the
standard Linux Fusion MPT device driver (CONFIG_FUSION_SAS) under Linux
2.6.30-rc6. Everything seems to work pretty well, with one exception: When I
use SMART against the drives (say, smartctl -a /dev/sda) the kernel complains
with:

  [  811.091916] sd 0:0:0:0: [sda] Sense Key : Recovered Error [current] [descriptor]
  [  811.099807] Descriptor sense data with sense descriptors (in hex):
  [  811.106175]         72 01 00 1d 00 00 00 0e 09 0c 00 00 00 00 00 00
  [  811.113262]         00 4f 00 c2 00 50
  [  811.117379] sd 0:0:0:0: [sda] Add. Sense: ATA pass through information available
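
For reference, the 22 bytes in the dump above can be decoded by hand. The
following is a minimal Python sketch, assuming the descriptor-format sense
layout and the ATA Status Return descriptor (type 0x09) as defined in the
SCSI/ATA Translation (SAT) standard. The function name decode_ata_sense is
made up for illustration and is not part of any kernel or smartmontools API.

```python
# Sense buffer copied from the kernel log above.
SENSE_HEX = "72 01 00 1d 00 00 00 0e 09 0c 00 00 00 00 00 00 00 4f 00 c2 00 50"


def decode_ata_sense(hex_dump):
    """Decode descriptor-format sense data carrying one ATA return descriptor.

    Hypothetical helper; byte offsets are assumed from the SAT standard.
    """
    buf = bytes.fromhex(hex_dump)
    if buf[0] & 0x7F != 0x72:
        raise ValueError("not current descriptor-format sense data")
    desc = buf[8:8 + buf[7]]          # byte 7 = additional sense length
    if len(desc) < 14 or desc[0] != 0x09 or desc[1] != 0x0C:
        raise ValueError("no ATA return descriptor found")
    status = desc[13]                 # ATA status register
    return {
        "sense_key": buf[1] & 0x0F,   # 0x1 = Recovered Error
        "asc": buf[2],
        "ascq": buf[3],               # 00/1d = ATA pass through info available
        "error": desc[3],             # ATA error register
        "lba_mid": desc[9],
        "lba_high": desc[11],
        "status": status,
        "bsy": bool(status & 0x80),
        "drdy": bool(status & 0x40),
        "df": bool(status & 0x20),
        "err": bool(status & 0x01),
    }


if __name__ == "__main__":
    for name, value in decode_ata_sense(SENSE_HEX).items():
        print(name, hex(value) if not isinstance(value, bool) else value)
```

For this dump the status register is 0x50 (DRDY and seek complete set; BSY,
DF and ERR clear) and the error register is 0x00, so the drive itself reported
success. LBA mid/high of 0x4f/0xc2 match the signature that SMART RETURN STATUS
leaves when no threshold has been exceeded.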

I've tried upgrading to the newest firmware (1.28.02.00, 05-MAY-2009), but
all that changed is that the hex dump was added to the error message.

Whenever this happens, all the disks appear to “hiccup” and the kernel
loses contact with the controller for a short while. If too many of these
happen at once, disks eventually start dropping out of their RAID arrays, and the entire
machine goes down. It looks to me as if these messages should simply not be
treated as errors by the kernel -- smartctl explicitly asks for a response even
if the command doesn't fail (by setting CK_COND), so the response probably
shouldn't be taken as an error.
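
To illustrate where CK_COND sits, here is a rough Python sketch that builds an
ATA PASS-THROUGH (16) CDB for SMART RETURN STATUS, which is probably one of the
commands smartctl -a issues. The byte layout is assumed from the SAT standard,
and the function and constant names are invented for this example. This is not
smartctl's actual code, and nothing here sends the CDB to a device.

```python
ATA_PASS_THROUGH_16 = 0x85   # SCSI opcode for ATA PASS-THROUGH (16)
PROTOCOL_NON_DATA = 3        # SAT protocol value for non-data commands
CK_COND = 0x20               # byte 2, bit 5: always return ATA registers

ATA_CMD_SMART = 0xB0         # ATA SMART command
SMART_RETURN_STATUS = 0xDA   # SMART subcommand, placed in the features field


def smart_return_status_cdb(ck_cond=True):
    """Build a 16-byte CDB for SMART RETURN STATUS (hypothetical helper)."""
    cdb = bytearray(16)
    cdb[0] = ATA_PASS_THROUGH_16
    cdb[1] = PROTOCOL_NON_DATA << 1      # protocol in bits 4:1, EXTEND = 0
    cdb[2] = CK_COND if ck_cond else 0   # no data transfer, so other bits 0
    cdb[4] = SMART_RETURN_STATUS         # features (7:0)
    cdb[10] = 0x4F                       # LBA mid (7:0), SMART signature
    cdb[12] = 0xC2                       # LBA high (7:0), SMART signature
    cdb[14] = ATA_CMD_SMART              # command register
    return bytes(cdb)


if __name__ == "__main__":
    print(smart_return_status_cdb().hex(" "))
```

With CK_COND set, a SAT layer is expected to return the ATA registers as sense
data with CHECK CONDITION status even when the command succeeds, which is why a
"Recovered Error" sense key shows up for a command that did not fail.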
