ddrescue recovers 0 bytes from HDD

bobby_yan

New member
One of the three HDDs (Western Digital WD3003FZEX Black 3TB SATA 6Gb/s 7200RPM 64MB Cache 3.5in hard drive) in a RAID 0 array has failed. At first it disappeared from the OS, then it reappeared after I re-plugged the cables. However, it is not stable; sometimes even the serial number cannot be read (see below).

I first tried to clone it with ddrescue, but it could not read even a single byte.

Code:
sudo ddrescue -f  /dev/sdc Port1.img Port1.log
GNU ddrescue 1.22
     ipos:    3000 GB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:    3000 GB, non-scraped:        0 B,  average rate:       0 B/s
non-tried:        0 B,  bad-sector:    3000 GB,    error rate:  23256 kB/s
  rescued:        0 B,   bad areas:        1,        run time: 14h 18m 13s
pct rescued:    0.00%, read errors:5906318682,  remaining time:         n/a
                              time since last successful read:         n/a
Finished

sudo ddrescue -f -d /dev/sdc Port1.img Port1.log
GNU ddrescue 1.22
     ipos:   38402 MB, non-trimmed:   38408 MB,  current rate:       0 B/s
     opos:   38402 MB, non-scraped:        0 B,  average rate:       0 B/s
non-tried:    2962 GB,  bad-sector:        0 B,    error rate:    875 MB/s
  rescued:        0 B,   bad areas:        0,        run time:         17s
pct rescued:    0.00%, read errors:   586172,  remaining time:         n/a
                              time since last successful read:         n/a
Copying non-tried blocks... Pass 5 (forwards)^C
  Interrupted by user

sudo ddrescue -f -d -R /dev/sdc Port1.img Port1.log
GNU ddrescue 1.22
     ipos:    2895 GB, non-trimmed:  105045 MB,  current rate:       0 B/s
     opos:    2895 GB, non-scraped:        0 B,  average rate:       0 B/s
non-tried:    2895 GB,  bad-sector:        0 B,    error rate:   2350 MB/s
  rescued:        0 B,   bad areas:        0,        run time:         46s
pct rescued:    0.00%, read errors:  1602973,  remaining time:         n/a
                              time since last successful read:         n/a
Copying non-tried blocks... Pass 5 (backwards)

After ddrescue, I tried dd as well, but it had copied out more than 16 TB of data and was still going, which is obviously wrong for a 3 TB drive.

Code:
sudo dd if=/dev/sdc conv=sync,noerror bs=64K | gzip -c  > ./Port1.img.gz
dd: error reading '/dev/sdc': Input/output error
0+248815925 records in
248815925+0 records out
16306400460800 bytes (16 TB, 15 TiB) copied, 91157.6 s, 179 MB/s
dd: error reading '/dev/sdc': Input/output error
0+248815926 records in
248815926+0 records out
16306400526336 bytes (16 TB, 15 TiB) copied, 91157.6 s, 179 MB/s
dd: error reading '/dev/sdc': Input/output error
0+248815927 records in
248815927+0 records out
16306400591872 bytes (16 TB, 15 TiB) copied, 91157.6 s, 179 MB/s
^C0+248815928 records in
248815927+0 records out
16306400591872 bytes (16 TB, 15 TiB) copied, 91157.6 s, 179 MB/s
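A quick sanity check on those numbers, as a sketch rather than a diagnosis. In dd's "records in" line the format is full+partial, so "0+248815927" means not a single full 64K block was read. With conv=sync,noerror each failed or short read is padded with NUL bytes to the full block size, which would explain why the output keeps growing past the drive's capacity: the image is most likely just zeros. The nominal 3 TB size below is an assumption taken from the drive's label, not a measured value.

```shell
#!/bin/sh
# Values copied from the last dd status lines above.
records_out=248815927
block_size=$((64 * 1024))            # bs=64K

# ASSUMPTION: nominal capacity of the 3 TB drive (decimal terabytes).
nominal_bytes=3000000000000

# Each output record is one padded block, so bytes written = records * bs.
bytes_out=$((records_out * block_size))
times_over=$((bytes_out / nominal_bytes))   # integer division

echo "dd wrote $bytes_out bytes"
echo "that is at least ${times_over}x the nominal drive size"
```

The product matches the byte count dd itself reported, so every one of those records was a padded 64K block rather than real data.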

Here is the HDD information; the drive listed without a serial number is the broken one. After a reboot, the serial number can sometimes be read successfully, and it is WD-WMC5D0D61CYD.

Code:
sudo mdadm --detail-platform
mdadm: imsm capabilities not found for controller: /sys/devices/pci0000:00/0000:00:11.4 (type SATA)
       Platform : Intel(R) Rapid Storage Technology
        Version : 14.8.0.2377
    RAID Levels : raid0 raid1 raid10 raid5
    Chunk Sizes : 4k 8k 16k 32k 64k 128k
    2TB volumes : supported
      2TB disks : supported
      Max Disks : 7
    Max Volumes : 2 per array, 4 per controller
 I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)
          Port2 : /dev/sdd (WD-WMC5D0D9X6D8)
          Port3 : /dev/sde (WD-WMC1F0EARDW0)
          Port1 : /dev/sdc ()
          Port0 : - no device attached -
          Port4 : - no device attached -
          Port5 : - no device attached -

The motherboard is an ASUS X99-E USB 3.1, the RAID 0 was set up with Intel RST, and the I/O controller is still in RAID mode in the BIOS. However, the BIOS now shows the RAID array as consisting only of WD-WMC5D0D9X6D8 and WD-WMC1F0EARDW0; the broken drive is listed as a non-member drive. The system must have somehow detected the drive failure and removed it from the array automatically.

Is there anything else I should try before giving up on cloning the failed drive? Replace the PCB? Thank you.

Edit Jan. 21, 2021: I found that the HDD works for a few hours after a reboot. ddrescue read about 90 GB of data, at an average speed of 2 MB/s, before the 0-byte problem described above returned. Moreover, mdadm can read the serial number just after a reboot, but no longer once the 0-byte problem reappears. Should I keep rebooting the machine? Is there a command that resets just the HDD rather than the whole system?
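Since the drive only reads for a while after each restart, one approach is to run ddrescue in resumable passes. This is only a sketch under stated assumptions: it relies on ddrescue's mapfile (the third argument) to continue where the previous pass stopped, and on the --timeout option to give up once no read has succeeded for a while, instead of marking the rest of the drive as bad. The 5-minute timeout is an arbitrary choice, and Port1.log is assumed to be the mapfile from the run that rescued data, not the one from the run that marked the whole drive as bad sectors.

```shell
#!/bin/sh
# Names match the commands in the post; TIMEOUT is an assumed value.
DEVICE=/dev/sdc
IMAGE=Port1.img
MAPFILE=Port1.log
TIMEOUT=5m

# Build the command line for one resumable pass.
#   -d         direct disc access, as used in the post
#   --timeout  exit when no read has succeeded for this long
rescue_cmd() {
    printf 'ddrescue -d --timeout=%s %s %s %s\n' \
        "$TIMEOUT" "$DEVICE" "$IMAGE" "$MAPFILE"
}

# Hypothetical helper: repeat passes, pausing so the drive can be
# power-cycled by hand in between. Defined here but not called.
rescue_loop() {
    while :; do
        sudo $(rescue_cmd)
        printf 'Power-cycle the drive, then press Enter (Ctrl-C to stop): '
        read -r reply || break
    done
}

# Print the command for review instead of running it.
rescue_cmd
```

Calling rescue_loop would then repeat the pass after each manual power cycle, with the mapfile carrying the progress from one pass to the next.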
 

Jared

Administrator
Staff member
Not likely a bad PCB with that behavior. More likely it's a weak or failed read/write head.
 

bobby_yan

New member
Thanks. After ddrescue, I tried dd as well, but it had copied out more than 16 TB of data and was still going, which is obviously wrong for a 3 TB drive.

Code:
sudo dd if=/dev/sdc conv=sync,noerror bs=64K | gzip -c  > ./Port1.img.gz
dd: error reading '/dev/sdc': Input/output error
0+248815925 records in
248815925+0 records out
16306400460800 bytes (16 TB, 15 TiB) copied, 91157.6 s, 179 MB/s
dd: error reading '/dev/sdc': Input/output error
0+248815926 records in
248815926+0 records out
16306400526336 bytes (16 TB, 15 TiB) copied, 91157.6 s, 179 MB/s
dd: error reading '/dev/sdc': Input/output error
0+248815927 records in
248815927+0 records out
16306400591872 bytes (16 TB, 15 TiB) copied, 91157.6 s, 179 MB/s
^C0+248815928 records in
248815927+0 records out
16306400591872 bytes (16 TB, 15 TiB) copied, 91157.6 s, 179 MB/s
 

Jared

Administrator
Staff member
If ddrescue doesn't work, dd is guaranteed to fail. ddrescue is specifically designed to overcome the limitations of dd in cloning from unstable drives.

I think you are at the point where it's time to either consider the data lost or opt for professional service.
 

bobby_yan

New member
I found that the HDD works for a few hours after a reboot. ddrescue read about 90 GB of data, at an average speed of 2 MB/s, before the 0-byte problem described above returned. Moreover, mdadm can read the serial number just after a reboot, but no longer once the 0-byte problem reappears.

Should I keep rebooting the machine? Is there a command that resets just the HDD rather than the whole system?

Thank you.
 

Jared

Administrator
Staff member
Yeah, unplug the power cable from the HDD and plug it back in.

I have a suspicion you could be reading just gibberish, but I could be wrong.
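On the software side, Linux can drop a disk and rediscover it through sysfs, which may be worth trying after re-plugging the power. This is a rough sketch, not a substitute for the physical power cycle: it only makes the kernel re-probe the port and will not help if the drive's firmware has hung. The disk name sdc comes from the posts above; host2 is a hypothetical SCSI host name that would need to be looked up under /sys/class/scsi_host on the actual machine.

```shell
#!/bin/sh
# ASSUMPTIONS: DISK is the failing drive from the thread;
# HOST is a placeholder for the SCSI host the drive hangs off.
DISK=sdc
HOST=host2

# Command that tells the kernel to forget the disk.
detach_cmd() {
    printf "echo 1 > /sys/block/%s/device/delete\n" "$DISK"
}

# Command that asks the host adapter to scan all channels, targets and LUNs.
rescan_cmd() {
    printf "echo '- - -' > /sys/class/scsi_host/%s/scan\n" "$HOST"
}

# Print both commands for review; they must be run from a root shell,
# because the redirections write to sysfs.
detach_cmd
rescan_cmd
```

The script only prints the two commands, so nothing is detached until they are pasted into a root shell.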
 