feat: new post
parent 073d2b4c1e
commit 2fd5368635
2 changed files with 243 additions and 1 deletion

content/glusterfs.md (new file, 242 lines added)

@@ -0,0 +1,242 @@
+++
title = "Glusterfs"
date = "2023-12-13 16:27:07-08:00"
[taxonomies]
tags = ["linux", "nix", "ramble"]
+++

# What's a Glusterfs?

Glusterfs is a network filesystem with many features, but the important ones
here are its ability to live on top of another filesystem and to offer high
availability. If you have used SSHFS, it's quite similar in concept, giving you
a "fake" filesystem from a remote machine that you can use just like normal,
without caring about the details of where the files are actually stored beyond
"over there I guess". Unlike SSHFS, Glusterfs can be spread across multiple
machines, similar to network RAID. If one machine goes down, the data is still
all there and well.

# Why even bother?

A few years ago I decided that I was tired of managing docker services per
machine and wanted them in a swarm. No more thinking! If a machine goes down,
the service is either still up (already replicated across servers, like this
blog), or it will come up on another server once the swarm sees the service
isn't alive. This is all well and good until the SAN needs to go down. Now all
of the data is missing, the servers don't know it, and you basically have to
kick the entire cluster over to get it back alive. Not exactly ideal, to say
the least.

## Side rant. Feel free to skip if you only care about the tech bits.

While ZFS has kept my data very secure over the ages, it can't always prevent
machine oddity. I have had strange issues such as Ryzen bugs that could lock up
machines at idle, a still-not-figured-out random hang on networking (despite
changing 80% of the machine, including all disks, the operating system, and
the network cards) before it comes back 10 seconds later, and so on. As much as
I always want to have a reliable machine, updates will require service
restarts, reboots need to be done, and honestly, I'm tired of having to babysit
computers. Docker swarm and NixOS are in my life because I don't want to
babysit; I want to solve problems once and be done with it. Storage stability
was the next nail to hit. Despite being arguably a small problem, it still
reminded me that computers exist when I wasn't in the mood for them to exist.

# Why Glusterfs as opposed to Ceph or anything else?

Glusterfs sits on top of a filesystem. This is the feature that took me to it
over anything else. I have trusted my data to ZFS for many years, and have done
countless things that should have cost me data, including "oops, I deleted 2TB
of data on the wrong machine" and having to force power off machines (usually
SystemD reasons), and all of my data is safe. For the very few things it
couldn't save me from, it happily told me where the corruption was so I could
replace that limited data from a backup. With all of that said, Glusterfs
happily lives on top of ZFS, even letting me use datasets just as I have been
for ages, while also letting me expand across several machines. There are a ton
of modes to Glusterfs, much like any "RAID software", but I'm sticking to what
is effectively a mirror (RAID 1). Let's look at the hardware setup to explain
this a bit better.

# The hardware

planex

- Ryzen 5700
- 32GB RAM
- 2x16TB Seagate Exos
- 2x1TB Crucial MX500

```
pool
--------------------------
exos
  mirror-0
    wwn-0x5000c500db2f91e8
    wwn-0x5000c500db2f6413
special
  mirror-1
    wwn-0x500a0751e5b141ca
    wwn-0x500a0751e5aff797
--------------------------
```

morbo

- Ryzen 2700
- 32GB RAM
- 5x3TB Western Digital Red
- 1x10TB Western Digital (replaced a red when it died)
- 2x500GB Crucial MX500

```
red
  raidz2-0
    ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N3EVYXPT
    ata-WDC_WD100EMAZ-00WJTA0_1EG9UBBN
    ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6ARC4SV
    ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6ARCZ43
    ata-WDC_WD30EFRX-68N32N0_WD-WCC7K2KU0FUR
    ata-WDC_WD30EFRX-68N32N0_WD-WCC7K7FD8T6K
special
  mirror-2
    ata-CT500MX500SSD1_1904E1E57733-part2
    ata-CT500MX500SSD1_2005E286AD8B-part2
logs
  mirror-1
    ata-CT500MX500SSD1_1904E1E57733-part1
    ata-CT500MX500SSD1_2005E286AD8B-part1
--------------------------------------------
```

kif

- Intel i3 4170
- 8GB RAM
- 2x256GB Inland SSD

```
pool
-------------------------------
inland
  mirror-0
    ata-SATA_SSD_22082224000061
    ata-SATA_SSD_22082224000174
-------------------------------
```

### Notes

These machines are a bit different in terms of storage layout. Morbo and Planex
both actually store decent amounts of data, and kif is there just to help
validate things, so it doesn't get a lot of anything. We'll see why later.
Would giving Morbo and Planex identical disk layouts increase performance? Yes,
but so would SSDs for all of the data. Tradeoffs.

# ZFS setup

I decided to make my setup simpler on all of my systems and just keep the mount
points for glusterfs the same. On each system, I created a dataset named
`gluster` and set its mountpoint to `/mnt/gluster`. This means I don't have to
remember which machine has data where, and it keeps things streamlined. It may
look something like this.

```bash
zfs create pool/gluster
zfs set mountpoint=/mnt/gluster pool/gluster
```
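
Since the whole point is that every machine looks the same, a quick sanity
check that the dataset really landed at `/mnt/gluster` doesn't hurt. A minimal
check, assuming the pool is named `pool` as in the example above:

```bash
# Should report /mnt/gluster for mountpoint and yes for mounted
zfs get mountpoint,mounted pool/gluster
```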

If you have one disk, or just want everything on gluster, you could simply
mount the entire drive/pool somewhere you'll remember, but I find it simplest
to use datasets, and I still have to migrate data from outside of gluster on
the same array to inside of gluster. That's it for ZFS-specific things.
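
That migration itself isn't ZFS-specific; once the volume is mounted (further
down), it's just a copy into the client mount so every replica gets written. A
rough sketch, where `/mnt/tank/media` stands in for a hypothetical old location
and `/mnt/media` is wherever the gluster volume ends up mounted:

```bash
# Copy through the gluster client mount, never directly into the brick
# directories under /mnt/gluster
rsync -aHAX --info=progress2 /mnt/tank/media/ /mnt/media/
```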
# Creating a gluster storage pool
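
One prerequisite that's easy to gloss over: the machines have to be in a
trusted storage pool before a volume can span them. If you haven't already
peered them, it looks roughly like this from any one node (hostnames are mine
from above):

```bash
# Run from planex; each node only needs to be probed once
gluster peer probe morbo
gluster peer probe kif

# Verify that everyone sees everyone
gluster peer status
```

With the peers connected, the volume itself is one command.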

```bash
gluster volume create media replica 2 arbiter 1 planex:/mnt/gluster/media morbo:/mnt/gluster/media kif:/mnt/gluster/media force
```

This may look like a blob of text that means nothing, so let's look at what it
does.

```bash
# Tells gluster that we want to make a volume named "media"
gluster volume create media

# "replica 2 arbiter 1" tells gluster to use the first 2 servers to store the
# full data in a mirror (replica) and to set the last one as an arbiter. The
# arbiter acts as a tie breaker for the case that anything ever disagrees and
# you need a source of truth. It costs VERY little space to store this.
replica 2 arbiter 1

# The server names, and the path on each that we are using to store data
planex:/mnt/gluster/media
morbo:/mnt/gluster/media
kif:/mnt/gluster/media

# Normally you want gluster to create its own directory. When we use datasets,
# the folder will already exist. Understand that this can cause issues if you
# point it at the wrong place, so check first.
force
```

If all goes well, you can start the volume with

```bash
gluster volume start media
```

You'll want to check the status once it's started, and it should look something
like this.
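
For reference, that check is just the status subcommand pointed at the volume
we created above; the output below is what it prints on my cluster:

```bash
gluster volume status media
```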

```bash
Status of volume: media
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick planex:/mnt/gluster/media             57715     0          Y       1009102
Brick morbo:/mnt/gluster/media              57485     0          Y       1530585
Brick kif:/mnt/gluster/media                54466     0          Y       1015000
Self-heal Daemon on localhost               N/A       N/A        Y       1009134
Self-heal Daemon on kif                     N/A       N/A        Y       1015144
Self-heal Daemon on morbo                   N/A       N/A        Y       1854760

Task Status of Volume media
------------------------------------------------------------------------------
```

With that taken care of, you can now mount your Gluster volume on any machine
that needs it! Just follow the normal instructions for your platform to install
Gluster, as they will be different for each one. On NixOS at the time of
writing, I'm using this to manage Glusterfs for my docker swarm on any machine
hosting storage:
<https://git.kdb424.xyz/kdb424/nixFlake/src/commit/5a1c902d0233af2302f28ba30de4fec23ddaaac9/common/networking/gluster.nix>
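
For machines that aren't on NixOS, the stock packages are usually enough. A
hedged example for Debian/Ubuntu (package and service names as they exist
there; other distros will differ):

```bash
# Storage nodes need the full daemon
sudo apt install glusterfs-server
sudo systemctl enable --now glusterd

# Machines that only mount the volume just need the client bits
sudo apt install glusterfs-client
```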

# Using gluster volumes

Once a volume is started, you can mount it by pointing at any machine that has
data in the volume. In my case I can mount from planex/morbo/kif, and even if
one goes down, the data is still served. You can treat this mount identically
to storing files locally or over NFS/SSHFS, and any data stored on it will be
replicated and stay highly available if a server needs to go down for
maintenance or has issues. This provides a bit of a backup (in the same way
that a RAID mirror does; never rely on online machines for a full backup), so
not only can it give you higher uptime on data, but if you currently replicate
data on a schedule as a backup to a machine that's always on, this does that in
real time, which is a nice side effect.
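
As a concrete sketch of what that mount looks like: the `/mnt/media` target is
just an example path, and `backup-volfile-servers` tells the client which other
nodes can hand out the volume layout if the named one happens to be down.

```bash
# One-off mount, pointing at any node that serves the volume
sudo mkdir -p /mnt/media
sudo mount -t glusterfs planex:/media /mnt/media

# Roughly equivalent /etc/fstab line, with fallback servers
# planex:/media  /mnt/media  glusterfs  defaults,_netdev,backup-volfile-servers=morbo:kif  0 0
```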

# Now what?

With my docker swarm able to be served without interruption from odd quirks,
and with gluster replacing my need for ZFS send/recv backups between live
machines (please have a cold storage backup in a fire box if you care about
your data, along with an off-site backup), I can continue to forget that
computers exist and focus on things I want to work on, like eventually setting
up email alerts for ZFS scrubs or
[S.M.A.R.T.](https://en.wikipedia.org/wiki/Self-Monitoring,_Analysis_and_Reporting_Technology)
scans with any drive warnings. I can continue to mostly forget about the
details and stay focused on the problems that are fun to solve. Yes, I could
host my data elsewhere, but even ignoring the insane cost that I won't pay, I
get to actually own my data and not have a company creeping on things. Just
because I have nothing to hide doesn't mean I leave my door unlocked.

### Obligatory "things I say I won't do, but probably will later"

- Dual network paths. A network switch or cable can knock machines offline.
- Dual routers! Router upgrades always take too long. 5 minutes offline isn't
  acceptable these days!
- Discover the true power of TempleOS.

@@ -59,7 +59,7 @@ for security, so you can run `direnv allow .` in the directory once and it will
 be allowed to load from that point on when you `cd` into the directory, and
 unload when you leave.

-## After throughts
+## After thoughts

 Not only does this allow you to keep your system cleaner by keeping env vars and
 packages out of the system and user's packages, it allows you to keep that