tech notes

Raw ext2 Recovery (Part 1)

Recovering an ext2 partition that has been partly overwritten

What a stupid mistake that was. My ext2 drive was attached to my Windows box. While trying out a few things, I wanted to mount a drive via iSCSI and accidentally formatted the local drive instead of the iSCSI drive (which explains why I got a whopping 80Mb/s when writing to it ;-) ). So sh** happens: I formatted the drive and copied a file of about 2GB onto it before realizing what was happening.

My first step was to add a second drive of similar size so that I could clone the damaged one. This gave me the freedom to try out a few recovery options without messing things up any further. The original drive was /dev/sdc, the added one was /dev/sdd (the first primary partition being the one I wanted to recover):

dd_rescue -b 4M /dev/sdc /dev/sdd

This took about 3 hours for the 1.5TB.
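
Before experimenting on the clone, it may be worth confirming that it really matches the source. The following is a minimal sketch, not something from the original recovery: verify_clone is a hypothetical helper, it assumes GNU cmp (for the -n option), and the demo runs on throwaway files rather than on the real drives.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original recovery): compare the
# first $3 bytes of a source and its clone. Works on files and devices.
verify_clone() {
    src=$1
    dst=$2
    bytes=$3
    # cmp -s is silent and exits 0 only if the compared bytes are identical;
    # -n limits the comparison to the given byte count (GNU cmp).
    if cmp -s -n "$bytes" "$src" "$dst"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

# Demo on throwaway files instead of real drives: the "clone" is larger
# than the source, so only the first 9 bytes are compared.
tmp=$(mktemp -d)
printf 'ext2 data' > "$tmp/src.img"
printf 'ext2 data plus trailing space on a larger drive' > "$tmp/dst.img"
verify_clone "$tmp/src.img" "$tmp/dst.img" 9
rm -r "$tmp"
```

On real devices the byte count could come from something like blockdev --getsize64 /dev/sdc. Expect such a comparison to take at least as long as the cloning itself, and read errors on a failing source drive would show up as differences.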

After that, my first thought was to let e2fsck try to fix things, specifying an alternate superblock since the original had been wiped out:

e2fsck -v -y -b 20480000 /dev/sdd1

Finding an alternative superblock was easy:

mkfs.ext2 -n /dev/sdd1

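“-n” tells mkfs.ext2 not to actually create the filesystem, but to print what it would do if it were creating one. The listed backup locations are only valid if the filesystem was originally created with the same parameters (block size in particular):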

1 root@grml ~ # mkfs.ext3 -n /dev/sdc1
mke2fs 1.41.6 (30-May-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
91578368 inodes, 366284000 blocks
18314200 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
11179 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848
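
The backup locations in this listing follow a simple pattern: with the sparse_super feature, backups live in block group 1 and in groups whose numbers are powers of 3, 5 and 7. The sketch below reproduces the list from the geometry shown above. backup_superblocks is a hypothetical helper, and it assumes "First data block=0" (true for 4096-byte blocks, not for 1024-byte blocks).

```shell
#!/bin/sh
# Hypothetical helper: print the backup superblock locations for a
# sparse_super layout, assuming the first data block is 0.
backup_superblocks() {
    blocks_per_group=$1
    group_count=$2
    {
        # Group 1 holds a backup whenever it exists.
        if [ "$group_count" -gt 1 ]; then
            echo 1
        fi
        # All further backups sit in groups that are powers of 3, 5 or 7.
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$group_count" ]; do
                echo "$g"
                g=$(( g * base ))
            done
        done
    } | sort -n | while read -r g; do
        # A group's superblock copy is the first block of that group.
        echo $(( g * blocks_per_group ))
    done
}

# Geometry from the mkfs output above: 32768 blocks per group, 11179 groups.
backup_superblocks 32768 11179
```

With the values above, the output should match the mkfs listing, including the 20480000 and 229376 used in this post.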

Back to e2fsck. It gave me a lot of “dtime” errors, which do not seem to be too dangerous. What was quite worrying, though, was that it showed a lot of messages like this:

Multiply-claimed block(s) in inode 40198333: 163480071

I reran all the previous steps, this time making sure that the output of e2fsck was saved for further analysis. It contained a list of files that looked like this:

File /Images/_SP/2009-01.iso (Inode #27640092, mod time Wed Mar 18 18:38:07 2009)
  has 7 Multiply-claimed block(s), shared with 8 file(s):

	... (Inode #11720837, mod time Mon May 27 04:08:51 1912)
	... (Inode #11687481, mod time Mon Mar 31 03:23:00 1924)
	... (Inode #11743113, mod time Sat Aug 26 17:41:05 2000)
	... (Inode #11700725, mod time Thu Nov 17 04:16:01 1921)
	... (Inode #11807290, mod time Mon Jan  9 08:12:45 1989)
	... (Inode #11781593, mod time Tue Jun  1 19:26:38 2004)
	... (Inode #11711346, mod time Fri Jan 16 00:54:26 1925)
Multiply-claimed blocks already reassigned or cloned.

So I had a list of files that, according to e2fsck, were sharing some blocks. At first I thought I could get away with this and that it was just some kind of bug. Further analysis showed that almost all of those files were indeed corrupted (I checked the md5 hashes). :-(
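
A check like this can be scripted if a checksum list from before the accident is available. The following is only a sketch under that assumption: known-good.md5 is a hypothetical file in md5sum's own output format, list_corrupted is a hypothetical helper, and the demo runs on throwaway files.

```shell
#!/bin/sh
# Sketch: list the files whose current md5 hash no longer matches a
# previously saved checksum list (path given relative to the directory).
list_corrupted() {
    dir=$1
    sums=$2
    # md5sum -c prints "<file>: OK" or "<file>: FAILED" per entry;
    # keep only the failures and strip the status suffix.
    # LC_ALL=C keeps the status words untranslated.
    ( cd "$dir" && LC_ALL=C md5sum -c "$sums" 2>/dev/null ) \
        | sed -n 's/: FAILED$//p'
}

# Demo on throwaway files.
tmp=$(mktemp -d)
printf 'intact'  > "$tmp/a.iso"
printf 'payload' > "$tmp/b.iso"
( cd "$tmp" && md5sum a.iso b.iso > known-good.md5 )
printf 'garbage' > "$tmp/b.iso"     # simulate a corrupted file
list_corrupted "$tmp" known-good.md5
rm -r "$tmp"
```

Files that are missing altogether are reported differently by md5sum and are not listed by this sketch.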

The next step was to try e2salvage. I could not compile it (the source has not been maintained for years), but I found a rescue CD called “PLD Rescue CD” which had a binary of e2salvage. Unfortunately, e2salvage refused to run, complaining about the missing superblock. Even supplying an alternative superblock did not help. It started a few things but then got stuck (I copied over the superblock manually, which did not have any real effect), so I scrapped the idea of using e2salvage.

Then I tried a Windows ext recovery tool called FIXME. It was able to recover a lot of files while maintaining file integrity. So there must be some hope of achieving this in Linux as well!

Back to e2fsck again. Maybe e2fsck was confused by all the (random) data now found in the inodes, so maybe wiping the areas that were overwritten in my first mishap would make things easier for it?

Next step: identify the blocks to which the faulty inodes are mapped.
First, filter out the inode numbers:

# e2fsck prints inode numbers as "Inode #<number>", so match the "#" as well
grep "Inode" e2fsck.output.log > inodes
sed -n 's/.*Inode #\([0-9][0-9]*\).*/\1/p' inodes > inodes.filtered
sort -n -u inodes.filtered > inode.nums

Then ask debugfs to find the corresponding blocks:

while read -r l; do echo "imap <$l>"; done < inode.nums > debugfs.todo
debugfs -c -b 4096 -s 229376 /dev/sdd1 -f debugfs.todo > debugfs.out
grep located debugfs.out | sed "s#.*block \([0-9]*\).*#\1#g" | sort -n | uniq > blocks
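
To get a feeling for what imap reports, the block group of an inode can be worked out by hand from the geometry in the mkfs output (8192 inodes per group, 32768 blocks per group). This is only a sketch of that arithmetic, with inode_location as a hypothetical helper. The exact inode table block comes from the group descriptors, which debugfs reads from the disk and which this sketch does not attempt to reproduce.

```shell
#!/bin/sh
# Sketch of the arithmetic behind "imap", using the geometry printed by
# mkfs -n earlier in this post.
INODES_PER_GROUP=8192
BLOCKS_PER_GROUP=32768

inode_location() {
    inode=$1
    # Inode numbers start at 1, so subtract 1 before dividing.
    group=$(( (inode - 1) / INODES_PER_GROUP ))
    index=$(( (inode - 1) % INODES_PER_GROUP ))
    # With "First data block=0", group N starts at block N * blocks per group.
    group_start=$(( group * BLOCKS_PER_GROUP ))
    echo "inode $inode: group $group, index $index, group starts at block $group_start"
}

# The ISO file from the e2fsck output above.
inode_location 27640092
```

Blocks reported by imap for an inode should fall somewhere inside the group computed this way, which makes for a quick plausibility check of the blocks list.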

Voilà: a nice list of the blocks holding all affected inodes. I then had a look at the list and decided which areas had been overwritten in the first place. It was more or less one big chunk of blocks located close to each other. I then zeroed out those blocks:

cat blocks | while read l; do dd_rescue -m 1024 -s $(( $l * 4096 )) /dev/zero /dev/sdd1; done
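
Since this step is destructive, it can be rehearsed on an image file first. The sketch below swaps in plain dd for dd_rescue and zeroes one whole 4096-byte block per call. zero_block is a hypothetical helper, and the demo only touches a temporary file, never /dev/sdd1.

```shell
#!/bin/sh
# Sketch: zero one filesystem block inside an image file with plain dd.
BLOCK_SIZE=4096

zero_block() {
    image=$1
    block=$2
    # seek= is counted in units of bs, so dd starts writing at byte
    # block * 4096; conv=notrunc leaves the rest of the image intact.
    dd if=/dev/zero of="$image" bs="$BLOCK_SIZE" seek="$block" count=1 \
       conv=notrunc 2>/dev/null
}

# Demo: a three-block image filled with "y\n", then block 1 is wiped.
img=$(mktemp)
yes | head -c $(( 3 * BLOCK_SIZE )) > "$img"
zero_block "$img" 1
# Count the non-zero bytes left in block 1.
dd if="$img" bs="$BLOCK_SIZE" skip=1 count=1 2>/dev/null | tr -d '\000' | wc -c
rm "$img"
```

Leaving out conv=notrunc would truncate the image right after the written block, so that option matters when the target is a regular file.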

Then I started e2fsck again…
