failed command: READ FPDMA QUEUED FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
So my last Seagate SATA drive in my RAID 6 array died spectacularly, taking out my 4.8.4 kernel and locking up my storage to the point where the only way I can get to it is via the kernel boot parameter init=/bin/bash. The disk lasted about 5.762 years (50479 power-on hours):
[root@rfc1178-01 log]# smartctl -A /dev/sdd
smartctl 6.1 2013-03-16 r3800 [i686-linux-3.10.5-201.fc19.i686] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   082   082   006    Pre-fail  Always        -      49816764
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always        -      0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always        -      358
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always        -      0
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always        -      199979728
  9 Power_On_Hours          0x0032   043   043   000    Old_age   Always        -      50479
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always        -      0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always        -      173
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always        -      0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always        -      0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always        -      665
188 Command_Timeout         0x0032   099   099   000    Old_age   Always        -      65540
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always        -      0
190 Airflow_Temperature_Cel 0x0022   069   059   045    Old_age   Always        -      31 (Min/Max 23/31)
194 Temperature_Celsius     0x0022   031   041   000    Old_age   Always        -      31 (0 20 0 0 0)
195 Hardware_ECC_Recovered  0x001a   039   018   000    Old_age   Always        -      49816764
197 Current_Pending_Sector  0x0012   099   098   000    Old_age   Always        -      42
198 Offline_Uncorrectable   0x0010   099   098   000    Old_age   Offline       -      42
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always        -      0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline       -      266288022969
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline       -      1037691197
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline       -      1219786117
[root@rfc1178-01 log]# hdparm -i /dev/sdd
/dev/sdd:
Model=ST31000520AS, FwRev=CC32, SerialNo=9VX0WJKA
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=off
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=1953525168
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=yes: unknown setting WriteCache=enabled
Drive conforms to: unknown: ATA/ATAPI-4,5,6,7
* signifies the current active mode
[root@rfc1178-01 log]#
[root@rfc1178-01 log]#
[root@rfc1178-01 log]# fdisk -l /dev/sdd
Disk /dev/sdd: 1000.2 GB, 1000204886016 bytes, 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
[root@rfc1178-01 log]#
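For anyone who wants to pull the same numbers out of smartctl -A output with a script, here is a minimal Python sketch. The rows are copied from the output above. The helper name parse_rows and the set of attribute IDs to watch are assumptions made for illustration, not anything smartctl itself defines, and a year is taken as 8760 hours.

```python
# Rows copied from the smartctl -A output above (subset only).
SMART_ROWS = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
9 Power_On_Hours 0x0032 043 043 000 Old_age Always - 50479
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 665
197 Current_Pending_Sector 0x0012 099 098 000 Old_age Always - 42
198 Offline_Uncorrectable 0x0010 099 098 000 Old_age Offline - 42
"""

HOURS_PER_YEAR = 24 * 365  # 8760, leap days ignored (assumption)

# Assumption: counters commonly expected to stay at zero on a healthy disk.
WATCHED_IDS = {5, 187, 197, 198}


def parse_rows(text):
    """Return {attribute_id: (name, raw_value)} from smartctl -A style rows."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has at least 10 columns and starts with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[int(fields[0])] = (fields[1], int(fields[9]))
    return attrs


attrs = parse_rows(SMART_ROWS)
power_on_years = attrs[9][1] / HOURS_PER_YEAR
warnings = {
    name: raw
    for attr_id, (name, raw) in attrs.items()
    if attr_id in WATCHED_IDS and raw > 0
}

print(f"power-on time: {power_on_years:.3f} years")
for name, raw in warnings.items():
    print(f"nonzero counter: {name} = {raw}")
```

On this disk the sketch reports Reported_Uncorrect, Current_Pending_Sector and Offline_Uncorrectable as nonzero, while Reallocated_Sector_Ct is still 0.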
The failure showed up with these errors in /var/log/messages (/root/spectacular-failure-messages):
Mar 19 15:49:09 mbpc-pc kernel: ata4.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
Mar 19 15:49:09 mbpc-pc kernel: ata4.00: irq_stat 0x40000008
Mar 19 15:49:09 mbpc-pc kernel: ata4.00: failed command: READ FPDMA QUEUED
Mar 19 15:49:09 mbpc-pc kernel: ata4.00: cmd 60/40:70:40:2b:b2/05:00:3b:00:00/40 tag 14 ncq dma 688128 in
Mar 19 15:49:09 mbpc-pc kernel: res 41/40:40:ff:2b:b2/00:05:3b:00:00/00 Emask 0x409 (media error) <F>
Mar 19 15:49:09 mbpc-pc kernel: ata4.00: status: { DRDY ERR }
Mar 19 15:49:09 mbpc-pc kernel: ata4.00: error: { UNC }
Mar 19 15:49:09 mbpc-pc kernel: qla2xxx [0000:04:00.0]-680a:20: Loop down - seconds remaining 160.
Mar 19 15:49:09 mbpc-pc kernel: ata4.00: configured for UDMA/133
Mar 19 15:49:09 mbpc-pc kernel: sd 7:0:0:0: [sdd] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mar 19 15:49:09 mbpc-pc kernel: sd 7:0:0:0: [sdd] tag#14 Sense Key : Medium Error [current]
Mar 19 15:49:09 mbpc-pc kernel: sd 7:0:0:0: [sdd] tag#14 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 19 15:49:09 mbpc-pc kernel: sd 7:0:0:0: [sdd] tag#14 CDB: Read(10) 28 00 3b b2 2b 40 00 05 40 00
Mar 19 15:49:09 mbpc-pc kernel: blk_update_request: I/O error, dev sdd, sector 1001532415
Mar 19 15:49:09 mbpc-pc kernel: ata4: EH complete
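The CDB line in the log above can be decoded by hand to see which read actually failed. The following is a small Python sketch assuming the standard READ(10) layout (opcode 0x28, 4-byte big-endian LBA in bytes 2-5, 2-byte transfer length in bytes 7-8). The function name decode_read10 is made up for this example, and the 512-byte sector size is taken from the fdisk output earlier in the post.

```python
SECTOR_SIZE = 512  # logical sector size reported by fdisk for /dev/sdd


def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Returns (starting LBA, transfer length in blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


# CDB and failing sector copied from the 15:49:09 log entries.
lba, blocks = decode_read10("28 00 3b b2 2b 40 00 05 40 00")
failed_sector = 1001532415

print(f"read starts at LBA {lba}, {blocks} blocks, {blocks * SECTOR_SIZE} bytes")
print(f"failing sector is {failed_sector - lba} blocks into the request")
```

The decoded length works out to 688128 bytes, which matches the "ncq dma 688128 in" on the cmd line, and the sector from blk_update_request falls inside the requested range.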
These followed about five minutes later:
Mar 19 15:54:20 mbpc-pc kernel: blk_update_request: I/O error, dev sdd, sector 1001534264
Mar 19 15:54:20 mbpc-pc kernel: ata4: EH complete
Mar 19 15:54:24 mbpc-pc kernel: ata4.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
Mar 19 15:54:24 mbpc-pc kernel: ata4.00: irq_stat 0x40000008
Mar 19 15:54:24 mbpc-pc kernel: ata4.00: failed command: READ FPDMA QUEUED
Mar 19 15:54:24 mbpc-pc kernel: ata4.00: cmd 60/08:28:48:34:b2/00:00:3b:00:00/40 tag 5 ncq dma 4096 in
Mar 19 15:54:24 mbpc-pc kernel: res 41/40:08:48:34:b2/00:00:3b:00:00/00 Emask 0x409 (media error) <F>
Mar 19 15:54:24 mbpc-pc kernel: ata4.00: status: { DRDY ERR }
Mar 19 15:54:24 mbpc-pc kernel: ata4.00: error: { UNC }
Mar 19 15:54:24 mbpc-pc kernel: ata4.00: configured for UDMA/133
Mar 19 15:54:24 mbpc-pc kernel: sd 7:0:0:0: [sdd] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mar 19 15:54:24 mbpc-pc kernel: sd 7:0:0:0: [sdd] tag#5 Sense Key : Medium Error [current]
Mar 19 15:54:24 mbpc-pc kernel: sd 7:0:0:0: [sdd] tag#5 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 19 15:54:24 mbpc-pc kernel: sd 7:0:0:0: [sdd] tag#5 CDB: Read(10) 28 00 3b b2 34 48 00 00 08 00
Mar 19 15:54:24 mbpc-pc kernel: blk_update_request: I/O error, dev sdd, sector 1001534536
Mar 19 15:54:24 mbpc-pc kernel: ata4: EH complete
Mar 19 15:54:24 mbpc-pc kernel: qla2xxx [0000:04:00.0]-e872:20: qlt_24xx_atio_pkt_all_vps: qla_target(0): type d ox_id 0000
Mar 19 15:54:24 mbpc-pc kernel: qla2xxx [0000:04:00.0]-e82e:20: IMMED_NOTIFY ATIO
Mar 19 15:54:24 mbpc-pc kernel: qla2xxx [0000:04:00.0]-f826:20: qla_target(0): Port ID: 0x00:00:01 ELS opcode: 0x03
Mar 19 15:54:24 mbpc-pc kernel: qla2xxx [0000:04:00.0]-e81c:20: Sending TERM ELS CTIO (ha=ffff88010ef90000)
Mar 19 15:54:24 mbpc-pc kernel: qla2xxx [0000:04:00.0]-f897:20: Linking sess ffff8800c3f84b40 [0] wwn 50:01:43:80:16:77:99:38 with PLOGI ACK to wwn 50:01:43:80:16:77:99:38 s_id 01:00:00, ref=1
Mar 19 15:54:24 mbpc-pc kernel: qla2xxx [0000:04:00.0]-e862:20: qla_target(0): Unexpected NOTIFY_ACK received
Mar 19 15:54:26 mbpc-pc kernel: INFO: task kworker/1:2:96 blocked for more than 120 seconds.
Mar 19 15:54:26 mbpc-pc kernel: Not tainted 4.8.4 #2
Mar 19 15:54:26 mbpc-pc kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 19 15:54:26 mbpc-pc kernel: kworker/1:2 D ffff8801115db718 0 96 2 0x00000000
Mar 19 15:54:26 mbpc-pc kernel: Workqueue: qla_tgt_wq qlt_do_work [qla2xxx]
Mar 19 15:54:26 mbpc-pc kernel: ffff8801115db718 ffff8801115db688 ffff88011a83a300 ffff88011fc17a80
Mar 19 15:54:26 mbpc-pc kernel: ffff8801115d20c0 ffff880100000001 ffffffff8109075d 0000000000000000
Mar 19 15:54:26 mbpc-pc kernel: ffff88011ffdc5c0 ffff880100000000 0000000000000011 ffff880100000000
Mar 19 15:54:26 mbpc-pc kernel: Call Trace:
…..
Mar 19 15:54:26 mbpc-pc kernel: INFO: task kworker/u16:4:262 blocked for more than 120 seconds.
Mar 19 15:54:26 mbpc-pc kernel: Not tainted 4.8.4 #2
Mar 19 15:54:26 mbpc-pc kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 19 15:54:26 mbpc-pc kernel: kworker/u16:4 D ffff88011086fa18 0 262 2 0x00000000
Mar 19 15:54:26 mbpc-pc kernel: Workqueue: tmr-fileio target_tmr_work [target_core_mod]
Mar 19 15:54:26 mbpc-pc kernel: ffff88011086fa18 0000000000000400 ffff8800bfe1a500 ffff88011086f998
Mar 19 15:54:26 mbpc-pc kernel: ffff880110862000 ffffffff81f99ca0 ffffffff81f998ef ffff880100000000
Mar 19 15:54:26 mbpc-pc kernel: ffffffff812f27d9 ffff880100000000 ffffffff8109a2f8 0000000000000000
Mar 19 15:54:26 mbpc-pc kernel: Call Trace:
…..
Mar 19 15:54:26 mbpc-pc kernel: INFO: task kworker/u16:8:294 blocked for more than 120 seconds.
Mar 19 15:54:26 mbpc-pc kernel: Not tainted 4.8.4 #2
Mar 19 15:54:26 mbpc-pc kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 19 15:54:26 mbpc-pc kernel: kworker/u16:8 D ffff8801109d7a18 0 294 2 0x0000000
Mar 19 15:54:26 mbpc-pc kernel: Workqueue: tmr-fileio target_tmr_work [target_core_mod]
Mar 19 15:54:26 mbpc-pc kernel: ffff8801109d7a18 0000000000000400 ffff88011a84c380 ffff8801109d7998
Mar 19 15:54:26 mbpc-pc kernel: ffff88011090a240 ffffffff81f99ca0 ffffffff81f998ef ffff880100000000
Mar 19 15:54:26 mbpc-pc kernel: ffffffff812f27d9 0000000000000000 0000000000000000 0000000000000000
Mar 19 15:54:26 mbpc-pc kernel: Call Trace:
…..
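The "res" lines in the ata4.00 entries carry the same information as the status, error and sector lines around them, just packed into hex. Below is a rough Python sketch for unpacking one. It assumes the field order libata uses when printing the result taskfile (status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and the usual ATA status and error bit values; the function name decode_res is invented for the example.

```python
import re

# ATA status and error register bits (only the commonly printed ones).
ATA_STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ATA_ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT", 0x01: "AMNF"}

# Twelve hex bytes separated by "/" or ":" after the word "res".
RES_PATTERN = re.compile(r"res ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_res(line):
    """Decode the result taskfile from a libata 'res ...' log line."""
    match = RES_PATTERN.search(line)
    if match is None:
        raise ValueError("no result taskfile found in line")
    values = [int(x, 16) for x in re.split(r"[/:]", match.group(1))]
    (status, error, nsect, lbal, lbam, lbah,
     hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah, device) = values
    # 48-bit LBA: the "hob" (high order byte) registers hold bits 24-47.
    lba = (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24) \
        | (lbah << 16) | (lbam << 8) | lbal
    return {
        "status": [name for bit, name in ATA_STATUS_BITS.items() if status & bit],
        "error": [name for bit, name in ATA_ERROR_BITS.items() if error & bit],
        "lba": lba,
    }


# Line copied from the 15:49:09 log entries.
first = decode_res("res 41/40:40:ff:2b:b2/00:05:3b:00:00/00 Emask 0x409 (media error) <F>")
print(first)
```

For both failures in this post, the decoded LBA agrees with the sector number printed by blk_update_request, and the flags agree with the "status: { DRDY ERR }" and "error: { UNC }" lines.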
So with that goes the last disk of its kind in this array, with NO data loss to the array itself over the last 8 years.
Cheers,
TK