Infrastructure

Revision as of 21:15, 16 September 2021 by Amolith (minor wording modifications)

This is a summary of NixNet's infrastructure, written so that others can follow it as well.

Goals

Isolation & flexibility

One of the main objectives of this series is using industry-standard tools and practices to set up servers. Docker is fine but I prefer the VM workflow, so we chose to use LXC instead; the project's goal is to offer an environment as close as possible to the one you'd get from a VM, but without the [virtualisation] overhead. All of the services we run and all of their dependencies are installed within each container. This affords us a high degree of flexibility when it comes to downtime, backups, and migrating services to different hosts, as these system containers can simply be copied wherever necessary without worrying about making sure the application, database, and web server containers, along with all the data, move together.

Documentation & automation

A lot of people are moving to an infrastructure as code model and, while I appreciate repeatability, I believe some processes should remain manual. In three years, that Bash script you wrote for your company to set up new servers might break and you'll have to go back through it to figure out what went wrong so you can fix it. If you have to stand up a new server every month or so, you're quite likely to notice the instant some API or application changes and correct it. Ansible playbooks are much easier to work with than scripts but the same principles apply; if you end up leaving the company, what do they do when there's a breaking update? Have someone else spend weeks wrapping their head around the custom code and complex playbooks you wrote?

My solution for this problem is copious documentation. One of the podcasts I listen to, 2.5 Admins, had an episode where they talk about this at length and it was my main inspiration for starting all of this. It is Episode 04 and the timestamp is 18:46. In case the site ever goes down, I'll try to briefly summarize it below but I really recommend giving it a listen.

Someone sent them an email and asked if they could "dive a little deeper into how to move from an admin who treats their servers like pets to an admin who treats their servers like cattle." The answer basically had to do with how you name them; an admin might have two or three servers called jupiter, aphrodite, and mars or whatever. Each of them will do many things and have its own quirks that only you know about. All of this serves to humanize your servers and is really the last thing we want here. We want one server to do one thing and one thing only, and its setup needs to be 100% repeatable. As one of the hosts said (I don't know their names well enough yet), "when it's misbehaving, I can just shoot it in the head and get a new one … It's not the server I spent the last five years cobbling together with duct tape." A little blunt but it illustrates exactly what we're going for. This will become apparent as you read through the rest of these docs as well; everything is named very intentionally and with purpose. A physical host in Luxembourg is lu1, a Nextcloud VM on your second host in the US is nextcloud.us2, and so on.
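The naming scheme described above can be sketched in Python. This is only an illustration: the `HostName` class, the `parse_name` function, and the rule that a host is a two-letter country code followed by a number are assumptions inferred from the two examples lu1 and nextcloud.us2, not something these docs define.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed pattern: two lowercase letters (country code) plus a host number.
HOST_RE = re.compile(r"^([a-z]{2})([1-9][0-9]*)$")


@dataclass(frozen=True)
class HostName:
    service: Optional[str]  # None for a physical host such as "lu1"
    country: str            # e.g. "lu" or "us"
    index: int              # e.g. 2 for the second host in that country

    def __str__(self) -> str:
        host = f"{self.country}{self.index}"
        return f"{self.service}.{host}" if self.service else host


def parse_name(name: str) -> HostName:
    """Split a name like 'nextcloud.us2' into service, country, and index."""
    service, sep, host = name.rpartition(".")
    match = HOST_RE.match(host)
    # Reject names that do not end in a host part, or that have an empty service.
    if match is None or (sep and not service):
        raise ValueError(f"not a valid name under the assumed scheme: {name!r}")
    return HostName(service if sep else None, match.group(1), int(match.group(2)))


print(parse_name("lu1"))                    # lu1
print(parse_name("nextcloud.us2").country)  # us
```

A "pet" name such as jupiter fails to parse under this assumed scheme, which is the point: the name alone should say what the machine does and where it lives.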

For my setup process with all of this, the guides and pages here are my docs. After I have written incredibly detailed setup information such that I can follow each one with my brain turned off, only then will I automate portions, like hardening SSH and locking down the network.
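As an example of a step that could be automated once it has been documented by hand, here is a minimal Python sketch that checks the text of an sshd_config file against a small baseline. `PasswordAuthentication` and `PermitRootLogin` are real OpenSSH options, but choosing them is my assumption about what "hardening SSH" might involve; these docs do not list the settings, and `EXPECTED` and `audit_sshd_config` are hypothetical names. The sketch ignores `Match` blocks and `Include` directives.

```python
from typing import Dict

# Hypothetical baseline; the source does not say which options it sets.
EXPECTED = {
    "passwordauthentication": "no",
    "permitrootlogin": "no",
}


def audit_sshd_config(text: str) -> Dict[str, str]:
    """Return {option: problem} for each expected option that is missing or wrong."""
    seen: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # keyword without a value; nothing to compare
        key, value = parts[0].lower(), parts[1].strip()
        # sshd uses the first value it obtains for each keyword.
        seen.setdefault(key, value)

    problems: Dict[str, str] = {}
    for key, want in EXPECTED.items():
        got = seen.get(key)
        if got is None:
            problems[key] = "not set"
        elif got != want:
            problems[key] = f"is {got!r}, expected {want!r}"
    return problems
```

Calling `audit_sshd_config` on the contents of a config file returns an empty dictionary when every baseline option is set as expected, so it only reports and never changes anything, which keeps the manual guide as the source of truth.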

Education

Another goal, one of my original ones, is learning as much as I can. I said before that I dislike downloading random container images and that's largely from a security standpoint but also because I want to learn about how these applications interact, how to configure a database server, how to have this application in a container over here store its files on a Ceph instance over there, and so on. I got started with "sysadmin" in the summer of 2018 while I was messing with Nextcloud on a Raspberry Pi my father had given me. Since then, I have learned so much by simply doing things and refusing to take the easy route. If you're reading this, I really encourage you to do the same. I'm going to try to make these guides into a teaching tool as well, not just something to blindly follow and copy/paste commands from.

Guides

Steps

Unfinished Guides

These are pages I'm currently in the process of writing.