Data recovery of BTRFS Soft RAID

Data Recovery of Software RAID5 with BTRFS

  • At Computer Assistance we are used to doing advanced data recovery jobs, but this one, brought to us by an anxious client, deserves writing about.
  • The client brought us a Netgear NAS with 4 disks configured in RAID 5. The Netgear was showing the RAID as ‘Failed’ - very helpful.
  • Read on to see how we fixed the issue and ended up with a very happy client and a complete data recovery. And if you have any data that needs to be recovered, please contact us - we are happy to help!
  1. First we pulled the drives out of the NAS after carefully labelling them to preserve their order (e.g. 0, 1, 2, 3)
  2. Then we scanned the drives one by one and checked for any SMART errors/bad blocks
  3. You have to be careful running scans: a read/write scan will wipe the drives, so stick to read-only checks
  4. The next step was to take an image of each drive before continuing with any work (we used dd; example commands after this list)
  5. We booted an Ubuntu 16.04 live CD to run some scans on the file system
  6. Once booted, we had to install the mdadm tool in order to assemble the array. You might also want to install the openssh-server package if you want to do the recovery remotely, or check the progress once you go home.
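
  • We won’t reproduce our exact commands, but a minimal sketch of the read-only health checks and imaging looks like this (the device name /dev/sdX and the image path are placeholders - adjust them for your setup and make sure the destination has enough free space):
sudo smartctl -a /dev/sdX                 # read-only SMART health report
sudo badblocks -sv /dev/sdX               # read-only bad block scan (never use -w here, it is destructive)
sudo dd if=/dev/sdX of=/images/diskX.img bs=1M conv=noerror,sync status=progress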

  • Once we have booted the live CD, we need to open a terminal

  • Install openssh-server and mdadm. If you are not going to use SSH to log in remotely and do the recovery, you can skip installing OpenSSH and skip the next step of changing the password
ubuntu@ubuntu:~$ sudo apt install mdadm openssh-server
Reading package lists... Done
Building dependency tree       
Reading state information... Done
mdadm is already the newest version (3.3-2ubuntu7.2).
The following additional packages will be installed:
  openssh-client openssh-sftp-server mdadm
Suggested packages:
  ssh-askpass libpam-ssh keychain monkeysphere rssh molly-guard
3 installed, 0 newly installed, 0 to remove and 389 not upgraded.
Need to get 963 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
  • By default the ubuntu user does not have a password, so we need to set one if we are going to use SSH
ubuntu@ubuntu:~$ passwd
Changing password for ubuntu.
(current) UNIX password: 
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully

ubuntu@ubuntu:~$
  • For the current password, just hit Enter, as we don’t have one

  • Once that is set up, we can go ahead and examine the drives (in my case it’s sdd3; this might be different with your setup)

root@ubuntu:~# mdadm --examine /dev/sdd3 
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9aa649f2:4035e044:10f2f9a3:ed428533
           Name : 2fe64b0c:data-0
  Creation Time : Mon Oct 26 22:45:07 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7804333680 (3721.40 GiB 3995.82 GB)
     Array Size : 11706500352 (11164.19 GiB 11987.46 GB)
  Used Dev Size : 7804333568 (3721.40 GiB 3995.82 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262064 sectors, after=112 sectors
          State : clean
    Device UUID : 999b4ed7:d226695b:7019db2e:3f4ece02

    Update Time : Wed Jul 26 08:19:46 2017
       Checksum : 7c27429f - correct
         Events : 970

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
  • We can see from the first block of output that it’s RAID5 with 4 drives. You can do this for all of your drives to make sure you assemble the MD device with the correct members.
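  • A quick way to repeat that check across every member partition (assuming, as on this NAS, the data members are the third partition on sda-sdd) is a small loop like this:
for d in /dev/sd[a-d]3; do echo "== $d =="; sudo mdadm --examine "$d" | grep -E 'Array UUID|Raid Level|Raid Devices|Device Role'; done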

  • Next we will try to assemble the array automatically, without specifying the drives (think of it as auto-discovery of the RAIDed drives)

ubuntu@ubuntu:~# sudo mdadm --assemble --scan
mdadm: /dev/md/1 has been started with 4 drives.
mdadm: /dev/md/data-0 has been started with 4 drives.
  • In our case /dev/md/1 is the “/” partition where the OS was stored, and /dev/md/data-0 is where the actual data lives
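  • You can also get a quick overview of everything mdadm has assembled from /proc/mdstat (just a sanity check; the detailed queries follow below):
cat /proc/mdstat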

  • Next we are going to identify which device holds our data. As far as we know, it will be a RAID5 device with a total size of ~12 TB (4x4 TB in RAID5)

ubuntu@ubuntu:/home/ubuntu# sudo ls -las /dev/md*
0 brw-rw---- 1 root disk 9,   0 Aug 24 13:03 /dev/md0
0 brw-rw---- 1 root disk 9,   1 Aug 24 13:04 /dev/md1
0 brw-rw---- 1 root disk 9, 127 Aug 24 13:05 /dev/md127
ubuntu@ubuntu:/home/ubuntu# sudo mdadm -D /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Tue Jul 18 12:46:02 2017
     Raid Level : raid10
     Array Size : 1046528 (1022.17 MiB 1071.64 MB)
  Used Dev Size : 523264 (511.09 MiB 535.82 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Fri Jul 21 14:43:35 2017
          State : clean 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : 2fe64b0c:1
           UUID : a2af1e47:6366d2bc:20a7d517:cc866a6c
         Events : 19

    Number   Major   Minor   RaidDevice State
       0       8       34        0      active sync set-A   /dev/sdc2
       1       8       50        1      active sync set-B   /dev/sdd2
       2       8        2        2      active sync set-A   /dev/sda2
       3       8       18        3      active sync set-B   /dev/sdb2
ubuntu@ubuntu:/home/ubuntu# sudo mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Mon Oct 26 22:45:07 2015
     Raid Level : raid1
     Array Size : 4190208 (4.00 GiB 4.29 GB)
  Used Dev Size : 4190208 (4.00 GiB 4.29 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Thu Aug 24 13:04:48 2017
          State : clean 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

           Name : 2fe64b0c:0
           UUID : 4ce0ad55:1f2eba63:39e8b91d:3fd60b15
         Events : 1194

    Number   Major   Minor   RaidDevice State
       4       8       49        0      active sync   /dev/sdd1
       5       8       33        1      active sync   /dev/sdc1
       6       8        1        2      active sync   /dev/sda1
       7       8       17        3      active sync   /dev/sdb1
ubuntu@ubuntu:/home/ubuntu# sudo mdadm -D /dev/md127 
/dev/md127:
        Version : 1.2
  Creation Time : Mon Oct 26 22:45:07 2015
     Raid Level : raid5
     Array Size : 11706500352 (11164.19 GiB 11987.46 GB)
  Used Dev Size : 3902166784 (3721.40 GiB 3995.82 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Wed Jul 26 08:19:46 2017
          State : clean 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : 2fe64b0c:data-0
           UUID : 9aa649f2:4035e044:10f2f9a3:ed428533
         Events : 970

    Number   Major   Minor   RaidDevice State
       4       8       35        0      active sync   /dev/sdc3
       1       8       51        1      active sync   /dev/sdd3
       2       8        3        2      active sync   /dev/sda3
       3       8       19        3      active sync   /dev/sdb3
ubuntu@ubuntu:/home/ubuntu# 
  • /dev/md127 seems about right - RAID5 with a total size of 11164.19 GiB
  • As we already know that the file system is corrupted, the next step is a bit of a long shot: we’ll try to mount it.
ubuntu@ubuntu:/home/ubuntu# sudo mount /dev/md127 /data/
mount: wrong fs type, bad option, bad superblock on /dev/md127,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
  • It doesn’t mount, so we can run dmesg for a better understanding of the message above
[  752.433528] md127: detected capacity change from 0 to 11987456360448
[  783.713605] Btrfs loaded, crc32c=crc32c-generic
[  783.714350] BTRFS: device label 2fe64b0c:data devid 1 transid 47594 /dev/md127
[  783.714759] BTRFS info (device md127): disk space caching is enabled
[  783.788214] BTRFS error (device md127): bad tree block start 8591023754258257750 18261714698240
[  783.791346] BTRFS error (device md127): bad tree block start 7476525050543419467 18261714698240
[  783.791357] BTRFS warning (device md127): failed to read tree root
[  783.808104] BTRFS: open_ctree failed
[ 9149.368917] perf: interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
[15037.840623] BTRFS info (device md127): disk space caching is enabled
[15038.911241] BTRFS error (device md127): bad tree block start 8591023754258257750 18261714698240
[15038.911517] BTRFS error (device md127): bad tree block start 7476525050543419467 18261714698240
[15038.911530] BTRFS warning (device md127): failed to read tree root
[15038.932070] BTRFS: open_ctree failed
  • Now let’s try running restore in dry-run mode. We are going to use -i to ignore errors, -D to run in dry-run mode and -v to increase the verbosity
ubuntu@ubuntu:~# sudo btrfs restore -i -D -v /dev/md127 /dev/null 
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
checksum verify failed on 18261714698240 found EBF0F3FF wanted B29195E1
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
bytenr mismatch, want=18261714698240, have=8591023754258257750
Couldn't read tree root
Could not open root, trying backup super
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
checksum verify failed on 18261714698240 found EBF0F3FF wanted B29195E1
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
bytenr mismatch, want=18261714698240, have=8591023754258257750
Couldn't read tree root
Could not open root, trying backup super
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
checksum verify failed on 18261714698240 found EBF0F3FF wanted B29195E1
checksum verify failed on 18261714698240 found FCAF424E wanted 5A4DE202
bytenr mismatch, want=18261714698240, have=8591023754258257750
Couldn't read tree root
Could not open root, trying backup super
  • If this had worked, we could have continued with a real restore run - the same command without the -D dry-run flag, pointed at a destination with enough free space. As our file system is corrupted, we need the steps below instead.
sudo btrfs restore -i -v /dev/md127 /Data-Recovery

  • We will try to restore from another tree location. In order to find one, we need to run btrfs-find-root

  • To find healthy tree locations, a.k.a. well blocks, you need to run the following command. We will start it as a background process, as it might take a while and the output won’t be easy to read if it just scrolls on the screen.

nohup btrfs-find-root /dev/md127 &> /root/btrfs-find-root &
  • You can monitor the progress with tail -f /root/btrfs-find-root
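  • If you are curious how many candidate tree locations the scan has found so far, a quick count of the ‘Well block’ lines works as a sanity check:
grep -c 'Well block' /root/btrfs-find-root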

  • Next, we can create a simple loop that goes through the “well blocks” and does a dry-run restore for each one, so we can see how many files and folders each would bring back.

vim /root/btrfs-restore-from-tree.sh

#!/bin/bash
# Dry-run restore against every "Well block" found by btrfs-find-root,
# logging each run to its own file so the results can be compared.
for i in $(tac /root/btrfs-find-root | grep 'Well block' | awk '{print $3}' | sed 's/(.*$//'); do echo "--- Well block $i ---"; btrfs restore -F -D -i -v -t "$i" /dev/md127 /dev/null 2>&1 | tee "/root/rest-btrfs-restore-wb-$i.1"; done
  • Make sure the script is executable
chmod +x /root/btrfs-restore-from-tree.sh
  • Now you can run it as a background process in order to find out which well block produces the most output
nohup /root/btrfs-restore-from-tree.sh &> /root/restored-from-tree &
  • Use any of the commands below to find out which well block / tree location has the most restored files and folders (usually the biggest rest* file has the most)
ls -lisahSr /root/rest*
for i in /root/rest*; do echo -n "$i : "; cat $i | grep "^Restoring" | wc -l; done;
for i in /root/rest*; do echo -n "$i : "; cat $i | grep "^Restoring" | wc -l; done | sort -nk3
  • The bigger the well block number, the more recent the data it references, so you can decide which state you want to recover. Usually you would recover the most recent one - the biggest well block number.
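  • To list the candidate well block numbers from newest to oldest, you can reuse the same parsing as in the script above, for example:
grep 'Well block' /root/btrfs-find-root | awk '{print $3}' | sed 's/(.*$//' | sort -rn | head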

  • Once you know your well block number, you can restore its contents like so:

btrfs restore -i -o -v -t 123456789123 /dev/md127 /USB
  • This will take some time. On some systems, like ReadyNAS, there is a -F option which answers yes automatically if the recovery goes into a loop or needs confirmation to restore certain files/folders; I did the recovery on Ubuntu and this option doesn’t exist there.
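  • On builds of btrfs-progs without -F, one possible workaround (not something we needed or tested in this recovery) is to pipe confirmations into the restore so it keeps going whenever it asks whether to continue:
yes | sudo btrfs restore -i -o -v -t 123456789123 /dev/md127 /USB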