If you want to log all file accesses, open a terminal, switch to the root user and type
fs_usage -w -f filesys > cap.txt
Every file access will be logged to cap.txt. You can grep, view or edit the file for later analysis.
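A minimal sketch of such a post-analysis, assuming the capture sits in cap.txt as above. The helper names are made up for illustration, and the idea that the last whitespace-separated field of each line is the process name is an assumption you should check against your own capture:

```shell
#!/bin/sh
# Hypothetical helpers for digging through a fs_usage capture file.

# Print every logged line that mentions the given path fragment.
accesses_to() {
    # $1 = path fragment, $2 = capture file (defaults to cap.txt)
    grep -F -- "$1" "${2:-cap.txt}"
}

# Count logged lines per value of the last field
# (assumed here to be the process name), busiest first.
count_by_last_field() {
    # $1 = capture file (defaults to cap.txt)
    awk 'NF { n[$NF]++ } END { for (k in n) print n[k], k }' "${1:-cap.txt}" | sort -rn
}

# Usage, once cap.txt exists:
#   accesses_to /Library/Preferences
#   count_by_last_field | head
```

The path in the usage comment is only an example.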
tech notes
I had the issue that a few minutes after suspending a certain VM the disk activity would go up. Quite annoying, because the system was almost unusable in the meantime, kind of like it was doing some defragmentation. After a few minutes of researching (a.k.a. googling) I came across this thread on the VMware user forums, which explained that you have to edit the .vmx file and add the following three lines:
mainMem.useNamedFile = "FALSE"
mainMem.partialLazySave = "FALSE"
mainMem.partialLazyRestore = "FALSE"
This did the trick for me.
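If you have several VMs to fix, the edit can be scripted. This is only a sketch: set_vmx_option and disable_lazy_mem are hypothetical helper names (not VMware tools), and I am assuming the VM is powered off, the .vmx file has been backed up and it ends with a newline:

```shell
#!/bin/sh
# Append key = "value" to a .vmx file unless the key is already set.
set_vmx_option() {
    # $1 = .vmx file, $2 = key, $3 = value
    if awk -F= -v k="$2" '
            { gsub(/^[ \t]+|[ \t]+$/, "", $1); if ($1 == k) found = 1 }
            END { exit !found }' "$1"; then
        echo "$2 already present in $1, leaving it untouched" >&2
    else
        printf '%s = "%s"\n' "$2" "$3" >> "$1"
    fi
}

# Add the three settings from above to the given .vmx file.
disable_lazy_mem() {
    for key in mainMem.useNamedFile mainMem.partialLazySave mainMem.partialLazyRestore; do
        set_vmx_option "$1" "$key" "FALSE"
    done
}

# Usage (the path is a placeholder):
#   disable_lazy_mem /path/to/your.vmx
```

Running it twice is harmless, since keys that already exist are skipped rather than duplicated.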
From the client you have to build an SSH tunnel forwarding ports 443, 902 and 903. Make sure you are *not* forwarding from 127.0.0.1, since according to this posting on the VMware forums that address seems to have a special meaning for the VMware client; use 127.0.0.2 instead. In this example the server’s IP address is 192.168.2.57, and the SSH machine is at 172.20.0.1, listening on port 3333:
ssh -L 127.0.0.2:443:192.168.2.57:443 -L 127.0.0.2:902:192.168.2.57:902 -L 127.0.0.2:903:192.168.2.57:903 root@172.20.0.1 -p 3333
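Since the three -L options only differ in the port number, the command can also be generated. A small sketch, where build_tunnel_cmd is a hypothetical helper that only prints the command so you can check it before running it:

```shell
#!/bin/sh
# Print an ssh command that forwards each given port from a local bind
# address to the same port on the target host.
build_tunnel_cmd() {
    # $1 = local bind address, $2 = target host, $3 = ssh login (user@host),
    # $4 = ssh port, remaining arguments = ports to forward
    bind=$1
    target=$2
    login=$3
    sshport=$4
    shift 4
    cmd="ssh"
    for port in "$@"; do
        cmd="$cmd -L $bind:$port:$target:$port"
    done
    echo "$cmd $login -p $sshport"
}

# Reproduces the command shown above.
build_tunnel_cmd 127.0.0.2 192.168.2.57 root@172.20.0.1 3333 443 902 903
```

Keep in mind that binding local ports below 1024 usually needs root on the client, and that on some systems 127.0.0.2 may first have to be configured as a loopback alias.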
After reading about the most recent faux pas from McAfee, where a couple of system files were identified as infected, I found this script which restores the quarantined files:
Option Explicit
On Error Resume Next
Dim fso, f, str, commando, df, fdf, r, ff, WshShell, fx, strfx, rr, arrParm
Const LOG_PATH = "C:\quarantine\infected.log"
Set fso = CreateObject("Scripting.FileSystemObject")
If fso.FileExists(LOG_PATH) Then
    ' Open the same file that was just checked (1 = ForReading)
    Set f = fso.OpenTextFile(LOG_PATH, 1)
    Set WshShell = WScript.CreateObject("WScript.Shell")
    Do While Not f.AtEndOfStream
        str = f.ReadLine
        ' Each line is split at "=>": arrParm(0) is the copy target, arrParm(1) the source
        arrParm = Split(str, "=>")
        WScript.Echo arrParm(0)
        WScript.Echo arrParm(1)
        fso.CopyFile Trim(arrParm(1)), Trim(arrParm(0))
    Loop
    ' Close the log only if it was actually opened
    f.Close
End If
(Taken from the aforementioned link)
So this one friend was complaining about his Windows XP acting sluggish after he logged in. It took almost 3 minutes until he was able to open Windows Explorer; during that time the computer reacted very slowly, but the CPU load never really jumped up. After installing all updates and such I was playing around with the services, only to find out that the following services from Roxio caused the delay:
After disabling all three of those services everything was working fine. Now I really wonder why Dell decided to install a LiveShare P2P Server on a business laptop. If it was not Dell, then most likely this was installed with the BlackBerry MediaManager software, which is made by Roxio. It might be a good idea to get rid of this software completely; it sounds very dodgy to me.
This motivated me to check why one of my personal laptops was showing similar symptoms. It took about 30 seconds after login until I was able to open the Explorer, and during that time I was not able to do anything on the desktop. Trial and error showed that the culprit was one DLL loaded upon startup:
C:\Programme\Intel\Wireless\Bin\LgNotify.dll
You can find it in this registry key:
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\Notify
Alternatively, use Autoruns from Sysinternals to disable loading of this DLL.
Invoking apt-get update led to the following error message:
# apt-get update
[...]
Fetched 254kB in 7s (35.8kB/s)
Reading package lists... Done
W: There is no public key available for the following key IDs:
9AA38DCD55BE302B
W: GPG error: http://ftp2.de.debian.org unstable Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 9AA38DCD55BE302B
W: GPG error: http://ftp.uni-stuttgart.de testing Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 9AA38DCD55BE302B
W: GPG error: http://ftp.uni-stuttgart.de stable Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 9AA38DCD55BE302B
W: You may want to run apt-get update to correct these problems
apt-key update did not fix the issue, but manually importing the key worked fine:
:~# apt-key update
gpg: key 2D230C5F: "Debian Archive Automatic Signing Key (2006) <ftpmaster@debian.org>" not changed
gpg: key 6070D3A1: "Debian Archive Automatic Signing Key (4.0/etch) <ftpmaster@debian.org>" not changed
gpg: key ADB11277: "Etch Stable Release Key <debian-release@lists.debian.org>" not changed
gpg: Total number processed: 3
gpg: unchanged: 3
:~# gpg --keyserver wwwkeys.eu.pgp.net --recv-keys 9AA38DCD55BE302B
gpg: key 55BE302B: public key "Debian Archive Automatic Signing Key (5.0/lenny) <ftpmaster@debian.org>" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
:~# apt-key add ~/.gnupg/pubring.gpg
OK
:~# apt-get update
[...]
Fetched 2702B in 2s (991B/s)
Reading package lists... Done
Alternatively do this
:~# gpg --keyserver wwwkeys.eu.pgp.net --recv-keys 9AA38DCD55BE302B
gpg: key 55BE302B: public key "Debian Archive Automatic Signing Key (5.0/lenny) <ftpmaster@debian.org>" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
:~# gpg --export --armor 9AA38DCD55BE302B | apt-key add -
OK
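If more than one key is missing, the IDs can be pulled out of the apt-get output instead of copying them by hand. A minimal sketch, where missing_key_ids is a hypothetical helper that simply looks for the NO_PUBKEY marker seen in the warnings above:

```shell
#!/bin/sh
# Read apt-get output on stdin and print each missing key ID once.
missing_key_ids() {
    sed -n 's/.*NO_PUBKEY \([0-9A-Fa-f]\{1,\}\).*/\1/p' | sort -u
}

# Usage (needs root and network access, so it is only shown as a comment):
#   apt-get update 2>&1 | missing_key_ids
# Each printed ID can then be imported with the gpg and apt-key commands above.
```

It only reports the last NO_PUBKEY entry per line, which is enough for warnings shaped like the ones shown here.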
You want to send a newsletter to a bunch of people who have sent you an email? Easy…
Now you should have a text file with all the email addresses.
An alternative way is to use a macro to add those addresses as contacts. Details here (German only).
Recently a colleague called us and asked whether we could help him. His Gentoo system suffered from a boot problem: it would not start up and only showed "BOOT DRIVE FAILURE". Since he had not set up the system himself (it was his IMAP server, fax server, phone-management server and, on top of that, the web server handling the frontend for his customers), he was pretty much clueless about what to do.

When I arrived at his house he had already taken the computer apart and removed the drive which seemed to have taken a crap. Connecting this drive to my laptop via an external USB adapter showed exactly nothing: the drive did not show up, although I could hear it physically spinning up. I decided not to waste any time on this drive and asked him about the backups. He didn’t know the details but gave me the phone number of the guy who had set up the system. After a call it was clear that there were indeed backups, but only of the application data and none of the system.

Gasping at the thought of setting up his fax server (I hate doing this in Linux) and his mail server within 3 hours (he needed his system running a.s.a.p.), I decided to have a quick look at the other drive built into the machine, hoping to find at least a few config files. So I took out the other drive, connected it to my laptop and was happy to see a working filesystem with /bin, /boot etc., which gave me a good feeling about getting the system running quickly. However, I found out that the original maintainer of the system had decided to take the backups … onto the same drive. Doh!

After I put the drive back into the server, hoping that the system would start up, that second drive was reported as failing in the POST messages while booting. Oh boy! Disconnecting the drive and reconnecting it to my laptop showed that this drive was indeed crapping out as well and that I could not access the data anymore. What a great day. Additionally I realized that the drive was making very loud and weird noises, indicating that it was about to say goodbye completely. Gasping even more… I started the download of an Ubuntu 8.04 server CD and went to lunch, since I was behind a 2 Mbps DSL line and downloading a CD would take about 45 minutes.

When I came back from lunch and started to burn the CD, I decided to give the first drive another chance. I connected it and – damn! – it started and I could access the data on the drive. To make sure not to waste any time I immediately started imaging that drive (who knew how many minutes it would keep working?).
In a VM I had Knoppix (a live distro) running. I had connected a share from my Windows laptop to the Knoppix:
mount -t cifs -o username=administrator //192.168.222.1/laptopShare /mnt/smb/
Then I started dd_rescue, writing into an image file:
dd_rescue -b 4M /dev/sdb /mnt/smb/backup/sdb_image
A quick analysis showed that the disk had a capacity of 80GB, split into two partitions: partition 1 had 50GB, partition 2 had 30GB. While it was imaging the drive I started setting up the bare system from scratch, just in case I could not restore any of the system or the drive crapped out again.
After two hours the imaging process slowed down severely and the drive was making funny noises, but I had been able to read about 57GB, so chances were good that at least the first partition was rescued. The next step was to check what kind of data I had just rescued by mounting the image file (as mentioned, I didn’t want to take any chances, so I did not interrupt or delay the imaging process). Since I had not copied each partition separately I had to find out where the partitions began, using fdisk:
debian:~# fdisk -u -l /mnt/smb/backup/sdb
You must set cylinders.
You can do this from the extra functions menu.
Disk /mnt/smb/backup/sdb: 0 MB, 0 bytes
255 heads, 63 sectors/track, 0 cylinders, total 0 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/mnt/smb/backup/sdb1 63 100020689 50010313+ 83 Linux
Partition 1 has different physical/logical endings:
phys=(1023, 254, 63) logical=(6225, 254, 63)
/mnt/smb/backup/sdb2 100020690 160826714 30403012+ 83 Linux
Partition 2 has different physical/logical beginnings (non-Linux?):
phys=(1023, 254, 63) logical=(6226, 0, 1)
Partition 2 has different physical/logical endings:
phys=(1023, 254, 63) logical=(10010, 254, 63)
In this case I wanted to know the offset of partition 1, which starts at sector 63. Multiply this by the sector size (512 bytes/sector) and you get an offset of 32256 bytes.
debian:~# mount -o loop,offset=32256 /mnt/smb/backup/sdb /mnt/loop
debian:~# cd /mnt/loop
debian:/mnt/loop# ls
bin dev home lost+found opt root sys usr
boot etc lib mnt proc sbin tmp var
Excellent. How about partition 2? Let’s have a look:
debian:/mnt/loop# mount -o loop,offset=51210593280 /mnt/smb/backup/sdb /mnt/loop2
debian:/mnt/loop# ls -al /mnt/loop2
total 637964
drwxr-xr-x 6 root root 4096 2009-06-13 14:35 .
drwxr-xr-x 16 root root 4096 2009-06-18 15:51 ..
-rw-r--r-- 1 root root 2937770 2009-06-13 14:06 bin.tar.bz2
-rw-r--r-- 1 root root 525861 2009-06-13 14:06 etc.tar.bz2
-rw-r--r-- 1 root root 6510707 2009-06-13 14:07 home.tar.bz2
-rw-r--r-- 1 root root 8406356 2009-06-13 14:07 lib.tar.bz2
drwx------ 2 root root 16384 2009-05-01 20:59 lost+found
?--------- ? ? ? ? ? /mnt/loop2/mailing
?--------- ? ? ? ? ? /mnt/loop2/mails
?--------- ? ? ? ? ? /mnt/loop2/test
-rw-r--r-- 1 root root 182 2009-06-13 14:07 opt.tar.bz2
-rw-r--r-- 1 root root 13524123 2009-06-13 14:08 root.tar.bz2
-rw-r--r-- 1 root root 1883672 2009-06-13 14:08 sbin.tar.bz2
-rw-r--r-- 1 root root 447615119 2009-06-13 14:35 usr.tar.bz2
-rw-r--r-- 1 root root 171154246 2009-06-13 14:53 var.tar.bz2
Ah! There are the backups… on the same drive.
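Both offsets used in the mount commands above come from the same multiplication of a start sector by the sector size. As a tiny sketch (partition_offset is a hypothetical helper name; the start sectors are the ones from the fdisk listing):

```shell
#!/bin/sh
# Byte offset of a partition inside a whole-disk image:
# start sector (as printed by "fdisk -u -l") times the sector size.
partition_offset() {
    # $1 = start sector, $2 = sector size in bytes (defaults to 512)
    echo $(( $1 * ${2:-512} ))
}

partition_offset 63          # partition 1, prints 32256
partition_offset 100020690   # partition 2, prints 51210593280
```

The result goes straight into mount -o loop,offset=... as shown above.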
Since I had the images on my laptop and all I had was a rather slow USB adapter, I decided to partly install Ubuntu on the server, add a new drive to that machine and copy the image back via the network (both my laptop and the server had a gigabit NIC) to save some time. There are nice ways to do this with netcat (described here, if you prefer ssh), but I decided to use the simpler approach of mapping the share from my laptop on the server.
The next issue I ran into was that the dd’ed drive did not work properly. The filesystem showed all kinds of errors, and while mounting the device I got some “attempt to access beyond end of device” errors (or similar). cfdisk showed the proper sizes, but it seems something else was screwed up. Even when I tried to create the filesystem again on that partition it reported a far too small partition size. I decided to create a larger partition manually (>50GB) and then copy back only the first partition (the second partition did not have any value to me anyway since it was incomplete). I calculated the offset on the target drive (see above) and started another dd, supplying the source and target offsets. That worked fine, and the data was consistent and accessible afterwards.
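I did not keep the exact dd command line, so the following is only a sketch of how a copy with source and target offsets can be done using dd's skip, seek and count options. copy_sectors and sector_count are hypothetical helper names, and the target device and start sector in the usage comment are placeholders:

```shell
#!/bin/sh
# Copy "count" 512-byte sectors, starting at sector "src_start" of the
# source, to sector "dst_start" of the target.
copy_sectors() {
    # $1 = source, $2 = target, $3 = src_start, $4 = dst_start, $5 = count
    # conv=notrunc keeps the rest of the target intact if it is a regular file.
    dd if="$1" of="$2" bs=512 skip="$3" seek="$4" count="$5" conv=notrunc 2>/dev/null
}

# Number of sectors in a partition, from the Start and End columns of fdisk.
sector_count() {
    # $1 = start sector, $2 = end sector
    echo $(( $2 - $1 + 1 ))
}

sector_count 63 100020689    # partition 1 from the listing above

# Usage (placeholders, double-check the target before running anything like it):
#   copy_sectors /mnt/smb/backup/sdb /dev/sdX 63 <target start sector> 100020627
```

With bs=512 the skip and seek values are plain sector numbers, which keeps the arithmetic simple at the cost of speed.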
Now let’s get this drive booted. grub! It’s been ages since I last used grub, so I had to get it done by reading, trial and error. Basically these were my steps:
/usr/lib/grub/i386pc/stage1 to /boot/grub/
/dev/null: Permission Denied-error.
chroot into the path where you have mounted the partition you want to boot from (more details here)
grub-install command
boot
title Ubuntu, kernel 2.6.15-25-k7 (recovery mode)
root (hd0,0)
kernel /boot/vmlinuz-2.6.15-25-k7 root=/dev/sda1 ro single
initrd /boot/initrd.img-2.6.15-25-k7
After that I could boot into the system… and encountered a freeze, because I had forgotten to edit the fstab. Correcting it made the system boot up properly, and all of the services were accessible afterwards.
Recovering an ext2 partition which has been partly overwritten
What a stupid mistake that was. I had my ext2 drive in my Windows box. Since I was trying out a few things, I was trying to mount a drive via iSCSI and accidentally formatted the local drive instead of the iSCSI drive (that explains why I got a whopping 80Mb/s when writing to that drive). So sh** happens: I formatted that drive and copied a file of about 2GB onto it before realizing what was happening.
My first step was to add a second drive of similar size so I could clone the drive. This would give me the freedom to try out a few recovery options without messing things up any further. The original drive was /dev/sdc, the added one was /dev/sdd (the first primary partition being the one I wanted to recover):
dd_rescue -b 4M /dev/sdc /dev/sdd
This took about 3 hours for the 1.5TB.
After that my first thought was to use e2fsck to try to fix things, specifying an alternate superblock since the original had been wiped out:
e2fsck -v -y -b 20480000 /dev/sdd1
Finding an alternative superblock was easy:
mkfs.ext2 -n /dev/sdd1
“-n” tells mkfs.ext2 not to actually create the filesystem, but to print out what it would do if it were creating one:
1 root@grml ~ # mkfs.ext3 -n /dev/sdc1
mke2fs 1.41.6 (30-May-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
91578368 inodes, 366284000 blocks
18314200 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
11179 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848
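To avoid copying the block numbers by hand, the list can be cut out of that output. A small sketch, where backup_superblocks is a hypothetical helper which assumes the output looks like the listing above (a "Superblock backups stored on blocks:" line followed by comma-separated numbers):

```shell
#!/bin/sh
# Read "mkfs.ext2 -n" output on stdin and print the backup superblock
# locations, one per line.
backup_superblocks() {
    awk '
        /Superblock backups stored on blocks/ { found = 1; next }
        found {
            gsub(/,/, " ")
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) print $i
        }
    '
}

# Usage (needs root and the real device, so it is only shown as a comment):
#   mkfs.ext2 -n /dev/sdd1 | backup_superblocks
# Any printed number can then be passed to e2fsck via -b, as done above.
```

Make sure the -n really is there before pointing mkfs at a drive you still want to recover.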
Back to e2fsck. It gave me a lot of “dtime” errors, which do not seem to be too dangerous. What was quite worrying, though, was that it showed a lot of messages like this:
Multiply-claimed block(s) in inode 40198333: 163480071
After rerunning all previous steps I made sure that the output of e2fsck was saved for further analysis. It showed me a list of files which looked like this:
File /Images/_SP/2009-01.iso (Inode #27640092, mod time Wed Mar 18 18:38:07 2009)
has 7 Multiply-claimed block(s), shared with 8 file(s):
... (Inode #11720837, mod time Mon May 27 04:08:51 1912)
... (Inode #11687481, mod time Mon Mar 31 03:23:00 1924)
... (Inode #11743113, mod time Sat Aug 26 17:41:05 2000)
... (Inode #11700725, mod time Thu Nov 17 04:16:01 1921)
... (Inode #11807290, mod time Mon Jan 9 08:12:45 1989)
... (Inode #11781593, mod time Tue Jun 1 19:26:38 2004)
... (Inode #11711346, mod time Fri Jan 16 00:54:26 1925)
Multiply-claimed blocks already reassigned or cloned.
So I had a list of files which, according to e2fsck, were sharing some blocks. At first I thought I could get away with this and that it was just some kind of bug. But further analysis showed that almost all of those files were indeed corrupted (I checked the md5 hashes).
Next step was to try e2salvage. I could not compile it (the source has not been maintained for years), but I found a rescue CD called “PLD Rescue CD” which had a binary of e2salvage. Unfortunately e2salvage refused to run, complaining about the missing superblock. Even supplying an alternative superblock did not help. It started a few things but then got stuck (I also copied over the superblock manually, which did not have any real effect), so I scrapped the idea of using e2salvage.
Then I tried a Windows ext recovery tool, called FIXME
It was able to recover a lot of files while maintaining file integrity. So there must be some hope of achieving this in Linux as well!
Again back to e2fsck. Maybe e2fsck was confused with all the (random) data found now in the inodes, so maybe wiping those areas which were overwritten in my first mishap would make things easier for e2fsck?
Next step: identify the blocks to which the faulty inodes are mapped.
First filter out the inode numbers:
grep "Inode" e2fsck.output.log > inodes
cat inodes | sed "s#.*node \([0-9]*\).*#\1#g" > inodes.filtered
egrep "^[0-9]*$" inodes.filtered | egrep "^[0-9]*$" | sort | uniq > inode.nums
Then ask debugfs to find the corresponding blocks:
cat inode.nums | while read l ; do echo "imap <$l>" >> debugfs.todo; done
debugfs -c -b 4096 -s 229376 /dev/sdd1 -f debugfs.todo > debugfs.out
grep located debugfs.out | sed "s#.*block \([0-9]*\).*#\1#g" | sort -n | uniq > blocks
Voilà – a nice list of blocks covering all affected inodes. I then had a look at the list and decided which areas had been overwritten in the first place. It was more or less one big chunk of blocks close to each other. I then zeroed out those blocks:
cat blocks | while read l; do dd_rescue -m 1024 -s $(( $l * 4096 )) /dev/zero /dev/sdd1; done
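Spotting the big chunk in a long list of block numbers is easier when consecutive blocks are collapsed into ranges. I did this by eye, so the following is only a sketch of how it could be automated. block_ranges is a hypothetical helper, and it expects the numerically sorted, duplicate-free list from the blocks file:

```shell
#!/bin/sh
# Collapse a sorted list of block numbers (stdin, one per line) into
# lines of the form "first-last count".
block_ranges() {
    awk '
        NR == 1 { first = last = $1; next }
        $1 == last + 1 { last = $1; next }
        { print first "-" last, last - first + 1; first = last = $1 }
        END { if (NR) print first "-" last, last - first + 1 }
    '
}

# Usage:
#   block_ranges < blocks
```

The range with the largest count is then the obvious candidate for the overwritten area.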
Then I started e2fsck again…