An Intro to MergerFS and SnapRAID

A great first step toward making sure your data is protected. We look at MergerFS and SnapRAID and discuss the benefits of running this software.

So your data collection has grown to fill multiple hard drives, or you have a pile of mismatched drives that you want to combine and use? The solution is MergerFS. Once all your drives are combined, the next question is how to protect the data should a drive fail. The answer to that is a program called SnapRAID.

💡
Please note that the solutions in this article are based on my experience running them on Ubuntu.

MergerFS

MergerFS is a union file system that allows you to join multiple hard drives and present them as one drive to the operating system. This is highly beneficial when you only want to manage one drive for all your needs or to use all those smaller drives you might have lying around.

In my opinion, what sets MergerFS apart from other systems is that you can set it up to your liking. For example, you can have it fill the first drive and then roll over to the second, keep all the files of a folder together on one drive, or balance your files across all your drives. All of this is doable, depending on how you configure it.
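To make those three behaviours concrete, here is a small Python sketch of how a pooling layer might pick a drive for a new file. This is a toy model, not MergerFS code: the `Branch` class, the function names, and the 4 GB minimum free space value are all assumptions made for illustration. The three functions loosely mirror what the MergerFS documentation calls the `ff` (first found), `mfs` (most free space), and `epmfs` (existing path, most free space) create policies.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """One underlying drive in the pool (hypothetical model, not MergerFS code)."""
    path: str
    free_gb: float
    dirs: frozenset = frozenset()  # top-level folders already present on this drive


def first_found(branches, min_free_gb=4.0):
    """Fill drives in order: pick the first branch with enough free space."""
    for branch in branches:
        if branch.free_gb >= min_free_gb:
            return branch
    raise OSError("no branch has enough free space")


def most_free_space(branches):
    """Balance files: pick the branch with the most free space."""
    return max(branches, key=lambda b: b.free_gb)


def existing_path_most_free(branches, parent_dir):
    """Keep a folder together: only consider branches that already hold parent_dir."""
    candidates = [b for b in branches if parent_dir in b.dirs]
    if not candidates:
        raise FileNotFoundError(parent_dir)
    return most_free_space(candidates)


# Example pool with made-up mount points and sizes.
pool = [
    Branch("/mnt/disk1", 10, frozenset({"photos"})),
    Branch("/mnt/disk2", 500, frozenset({"movies"})),
    Branch("/mnt/disk3", 200, frozenset({"photos"})),
]
print(first_found(pool).path)
print(most_free_space(pool).path)
print(existing_path_most_free(pool, "photos").path)
```

In the real thing, you choose the behaviour with a mount option rather than by writing code; the sketch only shows why the same set of drives can fill up very differently depending on the policy.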

For more information, check out the GitHub repository.

SnapRAID

SnapRAID isn't your typical RAID solution. It is closer to a backup tool: it computes parity for your data and stores it on one or more dedicated parity drives, so the contents of a failed drive can be rebuilt. As a rule of thumb, one parity drive is enough for up to four data drives; beyond that, plan on one parity drive for each group of seven data drives. What makes SnapRAID unique, in my opinion, is that each parity drive, as they are called in the documentation, only has to be as big as your biggest data drive.

So, for example, you could have a 250 GB drive, a 500 GB drive, and a 2 TB drive, and to protect all of them, you would only need one more 2 TB drive.
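If it seems odd that one drive can cover several drives of different sizes, the idea behind single parity can be sketched in a few lines of Python. This is a conceptual illustration only, not SnapRAID's actual on-disk format or algorithm: the "drives" are short byte strings, and the function names `xor_parity` and `recover` are made up for this example. Smaller drives are simply treated as if they were padded with zeros, which is why the parity never needs to be larger than the largest drive.

```python
def xor_parity(disks):
    """Byte-wise XOR across all disks; shorter disks count as zero-padded."""
    size = max(len(d) for d in disks)
    parity = bytearray(size)
    for disk in disks:
        for i, byte in enumerate(disk):
            parity[i] ^= byte
    return bytes(parity)


def recover(surviving_disks, parity, lost_len):
    """Rebuild one lost disk by XOR-ing the survivors with the parity."""
    rebuilt = xor_parity(list(surviving_disks) + [parity])
    return rebuilt[:lost_len]


# Three "drives" of different sizes, standing in for 250 GB, 500 GB, and 2 TB.
small, medium, large = b"abc", b"hello", b"largest drive"
parity = xor_parity([small, medium, large])

# The parity is exactly as long as the largest drive, never longer.
print(len(parity) == len(large))

# Pretend the medium drive died and rebuild it from the other two plus parity.
print(recover([small, large], parity, len(medium)) == medium)
```

Losing two drives at once would defeat this single-parity scheme, which is where additional parity drives come in.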

For more information on SnapRAID, please check out the website.

Bonus: SnapRAID Runner

SnapRAID Runner is a Python companion script that can be scheduled as a cron job to run your parity check and backup for you. Besides just doing the backup, what makes this a fantastic companion app is that it first checks for differences, and if more files have been deleted than the threshold you set, the script will not make a backup. This keeps an accidental mass deletion, or a drive that failed to mount, from being written into your parity. It can also be configured to email you the results each time it runs.
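The delete-threshold check described above boils down to a very small decision. The sketch below is my own simplified illustration of that logic, not code taken from the script: the `DiffSummary` class, the `decide` function, and the example threshold of 40 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiffSummary:
    """Counts of changed files since the last backup (hypothetical structure)."""
    added: int
    removed: int
    updated: int


def decide(diff, delete_threshold):
    """Return 'skip', 'abort', or 'sync' for a given set of changes."""
    if diff.added == 0 and diff.removed == 0 and diff.updated == 0:
        return "skip"   # nothing changed, so there is nothing to back up
    if diff.removed > delete_threshold:
        return "abort"  # too many deletions: leave the parity untouched
    return "sync"       # changes look normal: update the parity


# Example: 120 files vanished, but only up to 40 deletions are tolerated.
print(decide(DiffSummary(added=3, removed=120, updated=0), delete_threshold=40))
```

The important design choice is that the check runs before the parity is touched, so the old parity is still available to restore the missing files.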

For more details, check out the GitHub repository.

Conclusion

While this is nowhere near a perfect backup solution and does not follow the 3-2-1 backup strategy, it does bring peace of mind when you have several terabytes of data you want to protect from a random drive failure.

If you enjoyed this article, stay tuned for the next article, where I will show you how to set everything up.

If you enjoy this content, or would like to help create content like this, please reach out or join us on the Discord server.