Dell XPS15 SSD Crash Recovery

by kacang bawang

Fucking Dell! An expensive new laptop: within one month the HDD died, and within six months the SSD died.

In this post I will write down what I did to recover from an SSD crash that happened while the drive was active in an SSD-cache setup under the 'enhanced' option (i.e., both read and write buffering).

When an SSD dies while it is part of the write cache, it is usually bad news. You don’t just lose a few files: Windows refuses to boot, and possibly worse. This is exactly what happened to me. The sequence of events went something like this:

1. Enabled SSD cache, switched to enhanced mode.
2. Computer went to sleep; upon wakeup got a BSOD (the SSD died).
3. Restarted, Windows refused to boot.
4. Confirmed SSD drive death via Dell’s BIOS test suite.

Here are the steps I took to repair the situation:

1. Booted up “Parted Magic”. GParted showed the drives as unmountable/unreadable because their filesystem type was 'isw_raid_participant' (check with: 'lsblk -o name,size,fstype'). Indeed, SSD cache is a form of software RAID (RAID0/striping), and in this setup if one of the RAID members goes, the whole thing goes. This particular form of software RAID (Intel’s SSD cache) is sometimes called ‘fakeraid’.

I spent a lot of effort here trying to mount the mechanical drive (the one I cared about restoring), but had no luck. My ultimate conclusion was that Linux (Parted Magic is Linux) could not assemble this 'fakeraid' cache setup, at least not with the tools I tried. A sector-by-sector file search did not give usable results either.
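The 'lsblk' check from step 1 can also be scripted. Below is a minimal Python sketch, not something I ran during the recovery: it assumes a util-linux version whose 'lsblk' supports JSON output ('-J'), and the device names and sizes in SAMPLE are made up for illustration. The function names are my own.

```python
import json
import subprocess

RAID_MEMBER_FSTYPE = "isw_raid_participant"


def read_lsblk_json():
    """Run lsblk on a live Linux system and return its JSON output as text."""
    result = subprocess.run(
        ["lsblk", "-J", "-o", "NAME,SIZE,FSTYPE"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_raid_members(lsblk_json):
    """Return (name, size) for every device tagged as an Intel fakeraid member."""
    members = []

    def walk(devices):
        for dev in devices:
            if dev.get("fstype") == RAID_MEMBER_FSTYPE:
                members.append((dev["name"], dev.get("size")))
            # Partitions are nested under their parent disk as "children".
            walk(dev.get("children", []))

    walk(json.loads(lsblk_json).get("blockdevices", []))
    return members


# Made-up sample output so the sketch runs without the real hardware.
SAMPLE = """
{
  "blockdevices": [
    {"name": "sda", "size": "931.5G", "fstype": "isw_raid_participant"},
    {"name": "sdb", "size": "29.8G", "fstype": "isw_raid_participant"},
    {"name": "sdc", "size": "14.9G", "fstype": null,
     "children": [{"name": "sdc1", "size": "14.9G", "fstype": "vfat"}]}
  ]
}
"""

if __name__ == "__main__":
    for name, size in find_raid_members(SAMPLE):
        print(f"{name} ({size}) is marked {RAID_MEMBER_FSTYPE}")
```

On a real machine you would pass read_lsblk_json() instead of SAMPLE. Any disk it lists is one that GParted will refuse to mount directly.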

2. In the BIOS, switched “Intel Smart Storage” to “AHCI”. This took some bravery, as a nasty warning about data loss comes up when you do this, so I wanted to exhaust all other options before “pulling the switch”. However, my worries proved unjustified: the logical volumes now reappeared in GParted. But all traces of the 'isw_raid_participant' stuff were gone for good, so you can’t even check the status/version/etc. of anything 'fakeraid' related. Miraculously, the logical volume still contained all my files!!!

3. I tried booting Windows, now that my logical volume was back, but it would BSOD somewhere along the way. Without pushing my luck any further, I went out, bought a large external drive, and backed up everything I cared about from within Parted Magic. (You could also use the Windows “Recovery Console” for this.) After saving my files, I tried the Windows “Recovery CD” (actually it was a USB stick), but it also BSODed.
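If you want more assurance than a plain file copy gives, the backup in step 3 can be verified with checksums. This is a hedged Python sketch rather than what I actually did (I just copied the files in Parted Magic). It assumes Python 3.8 or newer, and the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(src, dst):
    """Copy the tree at src into dst, then return files that fail to match."""
    src, dst = Path(src), Path(dst)
    shutil.copytree(src, dst, dirs_exist_ok=True)

    mismatched = []
    for source_file in sorted(src.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(src)
        copied_file = dst / relative
        # A file counts as bad if it is missing or its checksum differs.
        if (not copied_file.is_file()
                or sha256_of(source_file) != sha256_of(copied_file)):
            mismatched.append(str(relative))
    return mismatched
```

Call backup_and_verify() with the mounted source folder and a folder on the external drive. An empty list means every file on the source has an identical copy on the backup.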

4. I opened up the laptop, removed the failed mSATA SSD, and booted to the Windows repair menu once again. From “Advanced Options”, choose “fix startup problems” and it will do a disk repair (fine by us, since we already have a copy of our files). I was able to boot properly once that completed, and the BIOS reverted itself back to “Intel Smart Storage”. I don’t have an explanation for that (if I changed it to AHCI again, it would again revert to Smart Storage).

Conclusion: I was really lucky to be able to recover all of my data. For some time I wasn’t sure I was going to get anything out, and that is not a good feeling, let me tell you. However, this shows that recovery is possible, at least in some circumstances. I only have one piece of advice: don’t use enhanced SSD-cache mode, it isn’t worth it. Stability over speed, always.