Debunking btrfs
btrfs means no harm! btrfs is awesome!
A few weeks ago I listened to one of the Linux Unplugged podcasts about btrfs. It was a reminder that btrfs is great. I have been using btrfs for a few months now, so here is my take.
For those who aren't familiar with btrfs, it is an advanced filesystem. Its main features are snapshots, compression, deduplication / CoW and volume management.
The bad publicity
I installed btrfs and I lost all of my data! Now that I'm on btrfs, my computer has been super slow! btrfs said my drive was faulty!
We have all come across one of these statements on the internet. It is important to remember that what people experienced may be outdated: btrfs has changed a lot over the past few years.
Every kernel ships with a different version of btrfs, and it gets more stable with time, testing, and feedback. A valid concern from a few years ago may therefore no longer hold today. That doesn't mean these problems never happened, but they shouldn't happen anymore. If you want to give btrfs a try, make sure to use the latest kernel you can.
Avoid raid 5/6
I used to think that until I witnessed a RAID 6 fail. Man was I pissed that day, I remember waking up and checking my phone out of habit and seeing a shitload of idrac's alerts from one of our client's server... 3 disks died in one night... (source)
This is a reminder that RAID 5 and RAID 6 are more fragile than you may think. In a typical RAID 5 with three disks, all three disks were bought together and wear out at about the same rate. When one drive fails, odds are another one is going to die soon, possibly before the array has finished rebuilding, and at that point the data cannot be recovered. This applies to every filesystem, not only btrfs.
There are some workarounds, like mixing different HDD/SSD brands and sizes, but you are better off using something other than RAID 5/6.
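To put rough numbers on the argument above, here is a minimal Python sketch. It assumes drives fail independently at a constant rate, which is a simplification, and every figure in it (failure rates, rebuild time) is a made-up illustration, not a measurement. The function name p_failure_during_rebuild is mine.

```python
def p_failure_during_rebuild(remaining_drives, annual_failure_rate, rebuild_hours):
    """Chance that at least one remaining drive fails before the rebuild ends.

    Assumes independent drives with a constant failure rate (a simplification).
    """
    # Fraction of a year covered by the rebuild window.
    window = rebuild_hours / (365 * 24)
    # Probability that a single drive fails inside that window.
    p_single = 1 - (1 - annual_failure_rate) ** window
    # Probability that at least one of the remaining drives fails.
    return 1 - (1 - p_single) ** remaining_drives


# Hypothetical inputs: a 3-disk RAID 5 has 2 drives left after the first failure.
fresh = p_failure_during_rebuild(2, annual_failure_rate=0.02, rebuild_hours=24)
worn = p_failure_during_rebuild(2, annual_failure_rate=0.30, rebuild_hours=24)

print(f"fresh drives: {fresh:.6f}")
print(f"worn drives:  {worn:.6f}")
```

The point is not the exact values but the gap between them: drives of the same age and batch sit in the "worn" case together, which is exactly when a RAID 5 has no redundancy left.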
The "problem" with df
As you use df, you will notice that its output is sometimes inaccurate. My first thought was to blame btrfs, as the documentation recommends using btrfs filesystem df / instead of df.
It turns out the same problem happens with zfs: snapshots are not taken into account, so df can report free space when in reality your drive is full. Bear in mind that df is reliable most of the time, but if you want a more accurate picture, btrfs filesystem df / is for you.
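It helps to see where df gets its numbers: one statvfs() call per mount point, which returns a single size/free/available triple. Here is a minimal Python sketch of that view; df_view is a hypothetical helper name, and it only runs on Unix-like systems.

```python
import os


def df_view(path="/"):
    """Return the size/used/avail figures that df derives from statvfs()."""
    st = os.statvfs(path)
    size = st.f_blocks * st.f_frsize   # total size of the filesystem
    free = st.f_bfree * st.f_frsize    # free space, including root-reserved blocks
    avail = st.f_bavail * st.f_frsize  # free space available to unprivileged users
    return {"size": size, "used": size - free, "avail": avail}


view = df_view("/")
for name, value in view.items():
    print(f"{name:>5}: {value / 2**30:.2f} GiB")
```

A single triple like this has no room for the way btrfs allocates data and metadata separately, or for space pinned by snapshots, which is why btrfs filesystem df / (or btrfs filesystem usage /) gives the better picture on a btrfs volume.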
btrbk
If you want to use btrfs (or any other filesystem) you must have a good backup strategy. Since this post focuses on btrfs, my pick is btrbk: I haven't found anything better!
btrbk relies on btrfs snapshots and copy-on-write. You can have hundreds of gigabytes stored and snapshotted in seconds, that's right, seconds. Backups are incremental, and you can keep a retention window of 14 days, for example, and prune the older snapshots/backups.
This is great software for your volumes: it can handle anything on btrfs, even your virtual machines or LXD containers. I wish such a tool existed for other filesystems.
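The 14-day window mentioned above boils down to a simple rule: keep what is newer than the cutoff, prune the rest. Here is a minimal Python sketch of that idea. It is not btrbk's actual retention logic, which is driven by its configuration file; split_by_retention and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def split_by_retention(snapshots, now, keep_days=14):
    """Split snapshot timestamps into (keep, prune) around a retention window."""
    cutoff = now - timedelta(days=keep_days)
    keep = [s for s in snapshots if s >= cutoff]   # inside the window
    prune = [s for s in snapshots if s < cutoff]   # older than the window
    return keep, prune


# Hypothetical data: one snapshot per day for the last 30 days.
now = datetime(2024, 1, 31)
snapshots = [now - timedelta(days=d) for d in range(30)]

keep, prune = split_by_retention(snapshots, now, keep_days=14)
print(f"keeping {len(keep)} snapshots, pruning {len(prune)}")
```

Because snapshots share unchanged data through copy-on-write, keeping two weeks of daily snapshots costs far less space than fourteen full copies would.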
Conclusion
My personal experience with running btrfs on servers is very positive. btrbk made things easier for me, as the data on my servers is backed up to a remote server every day. Between syncing data easily and shrinking or expanding volumes, managing my data became so much easier!
btrfs keeps on improving, so if you're eyeing it, don't hesitate: give it a try.