author | doc <doc@filenotfound.org> | 2025-06-30 20:06:28 +0000
committer | doc <doc@filenotfound.org> | 2025-06-30 20:06:28 +0000
commit | 717fcb9c81d2bc3cc7a84a3ebea6572d7ff0f5cf (patch)
tree | 7cbd6a8d5046409a82b22d34b01aac93b3e24818
parent | 8368ff389ec596dee6212ebeb85e01c638364fb3 (diff)
72 files changed, 3983 insertions, 0 deletions
diff --git a/blog/deathtominio.md b/blog/deathtominio.md new file mode 100644 index 0000000..cf423be --- /dev/null +++ b/blog/deathtominio.md @@ -0,0 +1,39 @@ +# Death to Object Storage: A Love Letter to Flat Files + +Once upon a time, I believed in MinIO. + +I really did. The idea was beautiful: S3-compatible object storage, self-hosted, redundant, robust — all those wonderful buzzwords they slap on the side of a Docker image and call “enterprise.” I bought into it. I built around it. I dreamed in buckets. + +And then, reality set in. + +What reality, you ask? + +- Media uploads timing out. +- Phantom 403s from ghosts of CORS configs past. +- Uploader works on Tuesday, breaks on Wednesday. +- “Why are all the thumbnails gone?” +- “Why does the backup contain *literally nothing*?” + +MinIO became that coworker who talks a big game but never shows up to help move the server rack. Sure, he says he's “highly available” — but when you need him? Boom. 503. + +So I did what any burned-out, overcaffeinated sysadmin would do. I tore it all down. + +Flat files. ZFS. Snapshots. The old gods. + +Now my media lives on Shredder. It’s fast. It’s simple. It scrubs itself weekly and never lies to me. Want to know if something's backed up? I check with my own eyes — not by playing 20 questions with a broken object path and a timestamp from the Nixon administration. + +I don’t have to `mc alias` anything. +I don’t need to care about ACLs. +I don’t need to learn how to spell “presigned URLs” ever again. + +It. Just. Works. + +So, farewell MinIO. You tried. You failed. You’re off my network. + +Long live `chmod -R`, long live ZFS, and long live sysadmins who know when to throw the whole stack in the trash and start over. + +--- + +📌 PS: If you’re still on object storage for your Mastodon instance… I’m sorry. I really am. + + diff --git a/blog/docker.md b/blog/docker.md new file mode 100644 index 0000000..f4b4e5f --- /dev/null +++ b/blog/docker.md @@ -0,0 +1,118 @@ +Fuck Docker +It works, but it gaslights you about everything. + +Docker is amazing when it works. And when it doesn’t? +It’s a smug little daemon that eats your RAM, forgets your volumes, lies about its health, and restarts things for reasons it refuses to explain. + +Scene 1: Everything Is Fine™ + +You run: +docker ps + +It tells you: +azuracast Up 30 seconds +db Up 31 seconds +nginx Up 30 seconds + +Everything is up. +Except the site is down. +The UI is dead. +curl gives you nothing. +The logs? Empty. + +Docker: “Everything’s running fine 👍” + +Scene 2: Logs Are a Lie + +docker logs azuracast + +Returns: + + Just enough output to give you hope + + Then nothing + + Then silence + +You tail it. +You restart it. +You exec into it. +It’s just a tomb with a PID. + +Scene 3: It Forgets Everything + +You reboot the host. + +Suddenly: + + Your containers forget their volumes + + Your docker-compose.override.yml is ignored + + Your networks vanish + + And the bridge interface is now possessed + +Scene 4: Volumes Are Haunted + +docker volume rm azuracast_station_data + +Error: volume is in use + +By what? +You stopped all containers. You nuked the services. +It’s still in use — by ghosts. + +Eventually you just: + +rm -rf /var/lib/docker + +Because therapy is cheaper than debugging this. + +Scene 5: docker-compose Is a Trick + +docker-compose down +docker-compose up -d + +Now: + + Some things are gone + + Some things are doubled + + Your stations/ folder is missing + + And your database container is holding a grudge + +You try to roll back. 
+There is no roll back. Only sadness. + +Scene 6: It’s Not Even Docker Anymore + +Modern Docker is: + + Docker + + Which is actually Moby + + Which uses containerd + + Which is managed by nerdctl + + Which builds with buildkit + + Which logs via journald + + Which stores data in an OCI-conforming mess of layers + +None of it can be managed with just docker. + +Final Thought + +Docker is powerful. +Docker is everywhere. +Docker changed the world. + +But once you run real infrastructure on it? + +Fuck Docker. diff --git a/blog/minio.md b/blog/minio.md new file mode 100644 index 0000000..b08df44 --- /dev/null +++ b/blog/minio.md @@ -0,0 +1,77 @@ +# MinIO: It Works, But It Hates Me + +*By someone who survived a 150,000-file sync and lived to tell the tale.* + +--- + +MinIO is fast. It's lightweight. It's compatible with Amazon S3. It’s everything you want in a self-hosted object storage system. + +Until you try to **use it like a filesystem**. + +Then it becomes the most temperamental, moody, selectively mute piece of software you've ever met. + +--- + +## What I Was Trying to Do + +All I wanted was to migrate ~40GB of Mastodon media from local disk into a MinIO bucket. Nothing fancy. Just a clean `rclone sync` and a pat on the back. + +--- + +## What Actually Happened + +- **Load average spiked to 33** +- `find` froze +- `rclone size` hung +- `zfs snapshot` stalled so long I thought the server died +- The MinIO **UI lied to my face** about how much data was present (5GB when `rclone` said 22GB) +- Directory paths that looked like files. Files that were secretly directories. I saw `.meta` and `.part.1` in my dreams. + +--- + +## The Root Problem + +MinIO is **not** a filesystem. + +It's a flat key-value object store that's just *pretending* to be a folder tree. And when you throw 150,000+ nested objects at it — especially from a tool like `rclone` — all the lies unravel. + +It keeps going, but only if: +- You feed it one file at a time +- You don’t ask it questions (`rclone ls`, `rclone size`, `find`, etc.) +- You don’t use the UI expecting it to reflect reality + +--- + +## The Fixes That Kept Me Sane + +- Switched from `rclone ls` to `rclone size` with `--json` (when it worked) +- Cleaned up thousands of broken `.meta`/`.part.*` directories using a targeted script +- Paused `rclone` mid-sync with `kill -STOP` to get snapshots to complete +- Used `du -sh` instead of `find` to track usage +- Lowered `rclone` concurrency with `--transfers=4 --checkers=4` +- Drank water. A lot of it. + +--- + +## The Moral of the Story + +If you're going to use MinIO for massive sync jobs, treat it like: + +- A **delicate black box** with fast internals but fragile mood +- Something that **prefers to be written to, not inspected** +- An S3 clone with boundary issues + +--- + +## Final Thought + +MinIO *does* work. It's powerful. It’s fast. But it also absolutely hates being watched while it works. + +And you won't realize how much until you're 100,000 files deep, snapshot frozen, and `rclone` is telling you you're doing great — while the UI smirks and says you're at 5 gigs. + +MinIO: It works. +But it hates you. 
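For anyone walking the same 150,000-file road, the "keep it calm" pattern above boils down to something like this. It's a rough sketch with placeholder paths and dataset names, not the exact production script:

```bash
# Sync gently: low concurrency so MinIO never sees a stampede
rclone sync /mnt/mastodon-media localminio:assets-mastodon \
  --transfers=4 --checkers=4 --log-file=sync.log -v &
SYNC_PID=$!

# When a ZFS snapshot has to happen, pause the sync instead of racing it
kill -STOP "$SYNC_PID"
zfs snapshot tank/mastodon@pre-sync-$(date +%Y%m%d)
kill -CONT "$SYNC_PID"

# Track usage from the local side instead of interrogating the object tree
du -sh /mnt/mastodon-media
```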
+ +--- + +**Filed under:** `disaster recovery`, `object storage`, `sync trauma`, `zfs`, `rclone`, `why me` diff --git a/blog/toolkit.md b/blog/toolkit.md new file mode 100644 index 0000000..6b86cb1 --- /dev/null +++ b/blog/toolkit.md @@ -0,0 +1,69 @@ +# Building Your Own CLI Toolkit: Introducing `genesisctl` + +After weeks of refining backup scripts, documenting resilience routines, and shoveling thousands of lines of shell logic into shape, we now have something more powerful than the sum of its parts: a unified, self-documenting, command-line interface for managing infrastructure scripts — `genesisctl`. + +## What is `genesisctl`? + +`genesisctl` is a simple but powerful Bash tool designed to manage and interact with the full suite of sysadmin scripts under the Genesis infrastructure umbrella. It pulls together documentation, logging, and execution into one cohesive interface. + +## Features + +* 🔍 `describe <tool.sh>` — Pretty-prints documentation and metadata from Markdown files auto-generated by our toolchain. +* 📋 `list` — Displays all installed tools based on your setup logs. +* 🚀 `run <tool.sh>` — Executes a script from your `bin/` folder like a command-line native. + +## Why This Matters + +When you're running dozens of bash scripts across multiple machines — backups, verifications, restores, syncs, DR drills — things get messy. With `genesisctl`, every script: + +* Has structured metadata +* Lives in a clean hierarchy +* Comes with a Markdown doc +* Can be queried or executed with a single, consistent command + +## Example Usage + +```bash +# Describe a script and its purpose +$ ./genesisctl describe backup.sh + +# List everything you've got installed +$ ./genesisctl list + +# Run a ZFS bootstrap script +$ ./genesisctl run zfs_bootstrap.sh +``` + +## Behind the Scenes + +Every time we run our scaffold script (`setup_genesis_tools.sh`), it: + +* Reorganizes the toolchain into folders (`bin/`, `docs/`, `archive/`) +* Generates Markdown from script headers (with frontmatter) +* Logs every install to a central file + +Then `genesisctl` reads that metadata in real-time — no guesswork, no rot. + +## What's Next + +This framework is rock solid for CLI use. Next steps may include: + +* `genesisctl doctor` — to validate all tools have docs and correct structure +* `genesisctl docgen` — regenerate Markdown docs on demand +* Static site export of docs with category tags + +But for now? It's stable, extensible, and battle-tested. + +## Final Thoughts + +If you’ve ever tried to manage 50+ bash scripts without structure, `genesisctl` is the toolkit you wish you had. With it, documentation isn't an afterthought — it's baked in. + +Stay tuned. This thing’s just getting started. + +--- + +📁 Repo: Coming soon to Gitea. + +📬 Ping @doc if you want help wiring this into your own ops stack. + +💀 FailZero approved. diff --git a/casestudies/chaosmonkey.md b/casestudies/chaosmonkey.md new file mode 100644 index 0000000..4b64906 --- /dev/null +++ b/casestudies/chaosmonkey.md @@ -0,0 +1,90 @@ +# 🛡️ Case Study: Bulletproofing Genesis Infrastructure with ChaosMonkey DR Drills + +**Date:** May 10, 2025 +**Organization:** Genesis Hosting Technologies +**Lead Engineer:** Doc (Genesis Radio, Infrastructure Director) + +--- + +## 🎯 Objective + +Design and validate a robust, automated disaster recovery (DR) system for Genesis infrastructure — including PostgreSQL, MinIO object storage, and ZFS-backed media — with an external testbed (Linode-hosted) named **ChaosMonkey**. 
+ +--- + +## 🧩 Infrastructure Overview + +| Component | Role | Location | +|------------------|--------------------------------------|-----------------------------| +| PostgreSQL | Primary/replica database nodes | zcluster.technodrome1/2 | +| MinIO | S3-compatible object storage | shredder | +| ZFS | Primary media storage backend | minioraid5, thevault | +| GenesisSync | Hybrid mirroring and integrity check | Deployed to all asset nodes | +| ChaosMonkey | DR simulation and restore target | Linode | + +--- + +## 🧰 Tools Developed + +### `genesis_sync.sh` +- Mirrors local ZFS to MinIO and vice versa +- Supports verification, dry-run, and audit mode +- Alerts via KrangBot on error or drift + +### `run_dr_failover.sh` & `run_dr_failback.sh` +- Safely fail over and restore PostgreSQL + GenesisSync +- Auto-promotes DB nodes +- Sends alerts via Telegram + +### `genesis_clone_manager_multihost.sh` +- Clones live systems (DB, ZFS, MinIO) from prod to ChaosMonkey +- Runs with dry-run preview mode +- Multi-host orchestration via SSH + +### `genesis_clone_validator.sh` +- Runs on ChaosMonkey +- Verifies PostgreSQL snapshot, ZFS datasets, and MinIO content +- Can optionally trigger a GenesisSync `--verify` + +--- + +## 🧪 DR Drill Process (Stage 3 - Controlled Live Test) + +1. 🔒 Freeze writes on production nodes +2. 📤 Snapshot and clone entire stack to ChaosMonkey +3. 🔁 Promote standby PostgreSQL and redirect test traffic +4. 🧪 Validate application behavior and data consistency +5. 📩 Alert via KrangBot with sync/report logs +6. ✅ Trigger safe failback using snapshot + delta sync + +--- + +## 🚨 Results + +- **Recovery time (RTO)**: PostgreSQL in 3 min, full app < 10 min +- **Zero data loss** using basebackups and WAL +- **GenesisSync** completed with verified parity between ZFS and MinIO +- **Repeatable**: Same scripts reused weekly for validation + +--- + +## 💡 Key Takeaways + +- **Scripts are smarter than sleepy admins** — guardrails matter +- **ZFS + WAL + GitOps-style orchestration = rock solid DR** +- **Testing DR live on ChaosMonkey builds real confidence** +- **Failure Friday is not a risk — it’s a training ground** + +--- + +## 🌟 Final Thoughts + +By taking DR out of theory and into action, Genesis Hosting Technologies ensures that not only is data safe — it’s recoverable, testable, and fully verified on demand. With ChaosMonkey in the mix, Genesis now embraces disaster… on its own terms. + + + +--- + +## 📝 A Note on Naming + +"ChaosMonkey" is inspired by the original [Chaos Monkey](https://github.com/Netflix/chaosmonkey) tool created by Netflix, designed to test the resilience of their infrastructure by randomly terminating instances. Our use of the name pays homage to the same principles of reliability, failover testing, and engineering with failure in mind. No affiliation or endorsement by Netflix is implied. diff --git a/casestudies/genesissynccs.md b/casestudies/genesissynccs.md new file mode 100644 index 0000000..0eeb23e --- /dev/null +++ b/casestudies/genesissynccs.md @@ -0,0 +1,99 @@ +# GenesisSync: Hybrid Object–Block Media Architecture for Broadcast Reliability and Scalable Archiving + +## Executive Summary + +GenesisSync is a hybrid storage architecture developed by Genesis Hosting Technologies to solve a persistent challenge in modern broadcast environments: enabling fast, local access for traditional DJ software while simultaneously ensuring secure, scalable, and redundant storage using object-based infrastructure. 
+ +The system has been implemented in a live production environment, integrating StationPlaylist (SPL), AzuraCast, Mastodon, and MinIO object storage with ZFS-backed block storage. GenesisSync enables near-real-time file synchronization, integrity checking, and disaster recovery with no vendor lock-in or reliance on fragile mount hacks. + +--- + +## The Problem + +- **SPL and similar DJ automation systems** require low-latency, POSIX-style file access for real-time media playback and cue-point accuracy. +- **Web-native applications** (like Mastodon and AzuraCast) operate more efficiently using scalable object storage (e.g., S3, MinIO). +- Legacy systems often can't interface directly with object storage without middleware or fragile FUSE mounts. +- Previous attempts to unify object and block storage often led to file locking issues, broken workflows, or manual copy loops. + +--- + +## The GenesisSync Architecture + +### Components + +- **Primary Storage**: ZFS-backed local block volumes (ext4 or ZFS) +- **Backup Target**: MinIO object storage with S3-compatible APIs +- **Apps**: StationPlaylist (Windows via SMB), AzuraCast (Docker), Mastodon +- **Sync Tooling**: `rsync` for local, `mc mirror` for object sync + +### Sync Strategy + +- Local paths like `/mnt/azuracast` and `/mnt/stations` serve as the source of truth +- Hourly cronjob or systemd timer mirrors data to MinIO using: + ```bash + mc mirror --overwrite --remove /mnt/azuracast localminio/azuracast-backup + ``` +- Optionally, `rsync` is used for internal ZFS → block migrations + +### Benefits + +- 🎧 Local-first for performance-sensitive apps +- ☁️ Cloud-capable for redundancy and long-term archiving +- 🔁 Resilient to network blips, container restarts, or media sync delays + +--- + +## Real-World Implementation + +| Component | Role | +|------------------|--------------------------------------------------| +| SPL | Reads from ZFS mirror via SMB | +| AzuraCast | Writes directly to MinIO via S3 API | +| MinIO | Remote object store for backups | +| ZFS | Local resilience, snapshots, and fast access | +| `mc` | Handles object sync from local storage | +| `rsync` | Handles safe internal migration and deduplication | + +### Recovery Drill + +- Snapshot-based rollback with ZFS for quick recovery +- Verified `mc mirror` restore from MinIO to cold boot new environment + +--- + +## Results + +| Metric | Value | +|-------------------------------|----------------------------------------| +| Playback latency (SPL) | <10ms via local ZFS | +| Average mirror time (100MB) | ~12 seconds | +| Recovery time (5GB) | <2 minutes | +| Deployment size | ~4.8TB usable | +| Interruption events | 0 file-level issues since deployment | + +--- + +## Lessons Learned + +- Object storage is powerful, but it's not a filesystem — don't pretend it is. +- Legacy apps need real disk paths — even if the data lives in the cloud. +- Syncing on your terms (with tools like `rsync` and `mc`) beats fighting with FUSE. +- Snapshot + mirror = peace of mind. + +--- + +## Future Roadmap + +- 📦 Add bidirectional sync detection for selective restores +- ✅ Build in sync integrity verification (hash/diff-based) +- 🔔 Hook Telegram alerts for failed syncs or staleness +- 🌐 Publish GenesisSync as an open-source utility +- 📄 Full documentation for third-party station adoption + +--- + +## About Genesis Hosting Technologies + +Genesis Hosting Technologies operates media infrastructure for Genesis Radio and affiliated stations. 
With a focus on low-latency access, hybrid cloud flexibility, and disaster resilience, GenesisSync represents a foundational step toward a smarter, mirrored media future. + +_"Fast on the air, safe on the backend."_ diff --git a/cheatsheets/rclone_cheat_sheet.md b/cheatsheets/rclone_cheat_sheet.md new file mode 100644 index 0000000..4637fcd --- /dev/null +++ b/cheatsheets/rclone_cheat_sheet.md @@ -0,0 +1,133 @@ +# 📘 Rclone Command Cheat Sheet + +## ⚙️ Configuration + +### Launch Configuration Wizard +```bash +rclone config +``` + +### Show Current Config +```bash +rclone config show +``` + +### List Remotes +```bash +rclone listremotes +``` + +## 📁 Basic File Operations + +### Copy Files +```bash +rclone copy source:path dest:path +``` + +### Sync Files +```bash +rclone sync source:path dest:path +``` + +### Move Files +```bash +rclone move source:path dest:path +``` + +### Delete Files or Dirs +```bash +rclone delete remote:path +rclone purge remote:path # Delete entire path +``` + +### Check Differences +```bash +rclone check source:path dest:path +``` + +## 🔍 Listing and Info + +### List Directory +```bash +rclone ls remote:path +rclone lsd remote:path # List only directories +rclone lsl remote:path # Long list with size and modification time +``` + +### Tree View +```bash +rclone tree remote:path +``` + +### File Size and Count +```bash +rclone size remote:path +``` + +## 📦 Mounting + +### Mount Remote (Linux/macOS) +```bash +rclone mount remote:path /mnt/mountpoint +``` + +### Mount with Aggressive Caching (Windows) +```bash +rclone mount remote:path X: \ + --vfs-cache-mode full \ + --cache-dir C:\path\to\cache \ + --vfs-cache-max-size 100G \ + --vfs-read-chunk-size 512M \ + --vfs-read-ahead 1G +``` + +## 🔁 Sync with Filtering + +### Include / Exclude Files +```bash +rclone sync source:path dest:path --exclude "*.tmp" +rclone sync source:path dest:path --include "*.jpg" +``` + +## 📄 Logging and Dry Runs + +### Verbose and Dry Run +```bash +rclone sync source:path dest:path -v --dry-run +``` + +### Log to File +```bash +rclone sync source:path dest:path --log-file=rclone.log -v +``` + +## 📡 Remote Control (RC) + +### Start RC Server +```bash +rclone rcd --rc-web-gui +``` + +### Use RC Command +```bash +rclone rc core/stats +rclone rc vfs/stats +``` + +## 🛠️ Miscellaneous + +### Serve Over HTTP/WebDAV/SFTP +```bash +rclone serve http remote:path +rclone serve webdav remote:path +rclone serve sftp remote:path +``` + +### Crypt Operations +```bash +rclone config create secure crypt remote:path +``` + +--- + +> ✅ **Tip**: Always use `--dry-run` when testing `sync`, `move`, or `delete` to prevent accidental data loss. 
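Putting a few of the above together, here is a cautious end-to-end example (remote name and paths are placeholders):

```bash
# Preview the sync first, then run it for real with logging, then verify
rclone sync /mnt/media remote:media-backup --exclude "*.tmp" -v --dry-run
rclone sync /mnt/media remote:media-backup --exclude "*.tmp" -v --log-file=rclone.log
rclone check /mnt/media remote:media-backup
```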
diff --git a/cheatsheets/server_hardening_disaster_recovery.md b/cheatsheets/server_hardening_disaster_recovery.md new file mode 100644 index 0000000..fd23c40 --- /dev/null +++ b/cheatsheets/server_hardening_disaster_recovery.md @@ -0,0 +1,87 @@ +# 🛡️ Server Hardening & Disaster Recovery Cheat Sheet + +## 🔐 Server Hardening Checklist + +### 🔒 OS & User Security +- ✅ Use **key-based SSH authentication** (`~/.ssh/authorized_keys`) +- ✅ Disable root login: + ```bash + sudo sed -i 's/^PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config + sudo systemctl restart sshd + ``` +- ✅ Change default SSH port and rate-limit with Fail2Ban or UFW +- ✅ Set strong password policies: + ```bash + sudo apt install libpam-pwquality + sudo nano /etc/security/pwquality.conf + ``` +- ✅ Lock down `/etc/sudoers`, remove unnecessary sudo privileges + +### 🔧 Kernel & System Hardening +- ✅ Install and configure `ufw` or `iptables`: + ```bash + sudo ufw default deny incoming + sudo ufw allow ssh + sudo ufw enable + ``` +- ✅ Disable unused filesystems: + ```bash + echo "install cramfs /bin/true" >> /etc/modprobe.d/disable-filesystems.conf + ``` +- ✅ Set kernel parameters: + ```bash + sudo nano /etc/sysctl.d/99-sysctl.conf + # Example: net.ipv4.ip_forward = 0 + sudo sysctl -p + ``` + +### 🧾 Logging & Monitoring +- ✅ Enable and configure `auditd`: + ```bash + sudo apt install auditd audispd-plugins + sudo systemctl enable auditd + ``` +- ✅ Centralize logs using `rsyslog`, `logrotate`, or Fluentbit +- ✅ Use `fail2ban`, `CrowdSec`, or `Wazuh` for intrusion detection + +## 💾 Disaster Recovery Checklist + +### 📦 Backups +- ✅ Automate **daily database dumps** (e.g., `pg_dump`, `mysqldump`) +- ✅ Use **ZFS snapshots** for versioned backups +- ✅ Sync offsite via `rclone`, `rsync`, or cloud storage +- ✅ Encrypt backups using `gpg` or `age` + +### 🔁 Testing & Recovery +- ✅ **Verify backup integrity** regularly: + ```bash + gpg --verify backup.sql.gpg + pg_restore --list backup.dump + ``` +- ✅ Practice **bare-metal restores** in a test environment +- ✅ Use **PITR** (Point-In-Time Recovery) for PostgreSQL + +### 🛑 Emergency Scripts +- ✅ Create service restart scripts: + ```bash + systemctl restart mastodon + docker restart azuracast + ``` +- ✅ Pre-stage `rescue.sh` to rebuild key systems +- ✅ Include Mastodon/Gitea/etc. reconfig tools + +### 🗂️ Documentation +- ✅ Maintain a **runbook** with: + - Service recovery steps + - IPs, ports, login methods + - Admin contacts and escalation + +### 🧪 Chaos Testing +- ✅ Simulate failure of: + - A disk or volume (use `zpool offline`) + - A network link (`iptables -A OUTPUT ...`) + - A database node (use Patroni/pg_auto_failover tools) + +--- + +> ✅ **Pro Tip**: Integrate all hardening and backup tasks into your Ansible playbooks for consistency and redeployability. diff --git a/cheatsheets/zfs_cheat_sheet.md b/cheatsheets/zfs_cheat_sheet.md new file mode 100644 index 0000000..760aeb1 --- /dev/null +++ b/cheatsheets/zfs_cheat_sheet.md @@ -0,0 +1,153 @@ +# 📘 ZFS Command Cheat Sheet + +## 🛠️ Pool Management + +### Create a Pool +```bash +zpool create <poolname> <device> +zpool create <poolname> mirror <dev1> <dev2> +zpool create <poolname> raidz1 <dev1> <dev2> <dev3> ... 
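# Example with hypothetical pool/device names: mirrored pool, 4K sectors, lz4 compression
zpool create -o ashift=12 -O compression=lz4 tank mirror /dev/sdb /dev/sdc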
+``` + +### List Pools +```bash +zpool list +``` + +### Destroy a Pool +```bash +zpool destroy <poolname> +``` + +### Add Devices to a Pool +```bash +zpool add <poolname> <device> +``` + +### Export / Import Pool +```bash +zpool export <poolname> +zpool import <poolname> +zpool import -d /dev/disk/by-id <poolname> +``` + +## 🔍 Pool Status and Health + +### Check Pool Status +```bash +zpool status +zpool status -v +``` + +### Scrub a Pool +```bash +zpool scrub <poolname> +``` + +### Clear Errors +```bash +zpool clear <poolname> +``` + +## 🧱 Dataset Management + +### Create a Dataset +```bash +zfs create <poolname>/<dataset> +``` + +### List Datasets +```bash +zfs list +zfs list -t all +``` + +### Destroy a Dataset +```bash +zfs destroy <poolname>/<dataset> +``` + +## 📦 Mounting and Properties + +### Set Mount Point +```bash +zfs set mountpoint=/your/path <poolname>/<dataset> +``` + +### Mount / Unmount +```bash +zfs mount <dataset> +zfs unmount <dataset> +``` + +### Auto Mount +```bash +zfs set canmount=on|off|noauto <dataset> +``` + +## 📝 Snapshots & Clones + +### Create a Snapshot +```bash +zfs snapshot <poolname>/<dataset>@<snapshotname> +``` + +### List Snapshots +```bash +zfs list -t snapshot +``` + +### Roll Back to Snapshot +```bash +zfs rollback <poolname>/<dataset>@<snapshotname> +``` + +### Destroy a Snapshot +```bash +zfs destroy <poolname>/<dataset>@<snapshotname> +``` + +### Clone a Snapshot +```bash +zfs clone <poolname>/<dataset>@<snapshot> <poolname>/<new-dataset> +``` + +## 🔁 Sending & Receiving + +### Send Snapshot to File or Pipe +```bash +zfs send <snapshot> > file +zfs send -R <snapshot> | zfs receive <pool>/<dataset> +``` + +### Receive Snapshot +```bash +zfs receive <pool>/<dataset> +``` + +## 🧮 Useful Info & Tuning + +### Check Available Space +```bash +zfs list +``` + +### Set Quota or Reservation +```bash +zfs set quota=10G <dataset> +zfs set reservation=5G <dataset> +``` + +### Enable Compression +```bash +zfs set compression=lz4 <dataset> +``` + +### Enable Deduplication (use cautiously) +```bash +zfs set dedup=on <dataset> +``` + +--- + +> ✅ **Tip**: Always test ZFS commands in a safe environment before using them on production systems! diff --git a/disklabels/baboon.md b/disklabels/baboon.md new file mode 100644 index 0000000..5f450b8 --- /dev/null +++ b/disklabels/baboon.md @@ -0,0 +1,10 @@ +🟩 **Detailed /dev/disk/by-id/ Partitions (Live Ones)** +====================================================== + +```shell +lrwxrwxrwx 1 root root 10 May 29 08:53 ata-Timetec_30TT253X2-256GB_PL211014YSA256G0116-part1 -> ../../sda1 +lrwxrwxrwx 1 root root 10 May 29 08:53 ata-Timetec_30TT253X2-256GB_PL211014YSA256G0116-part2 -> ../../sda2 +lrwxrwxrwx 1 root root 10 May 29 08:53 ata-Timetec_30TT253X2-256GB_PL211014YSA256G0116-part3 -> ../../sda3 +lrwxrwxrwx 1 root root 10 May 29 08:53 ata-Timetec_30TT253X2-256GB_PL220302YSA256G0877-part1 -> ../../sdb1 +lrwxrwxrwx 1 root root 10 May 29 08:53 scsi-35000cca291d4b870-part1 -> ../../sdd1 +lrwxrwxrwx 1 root root 10 May 29 08:53 scsi-35000cca2b093888c-part1 -> ../../sdc1 diff --git a/disklabels/infernode.md b/disklabels/infernode.md new file mode 100644 index 0000000..18e6baf --- /dev/null +++ b/disklabels/infernode.md @@ -0,0 +1,52 @@ +🟩 **Disk-to-Serial Mapping on ZFS VM (FreeBSD)** +================================================= + +```shell + 1. Name: cd0 + Mediasize: 1310040064 (1.2G) + descr: QEMU QEMU DVD-ROM + ident: (null) + + 1. Name: da0 + Mediasize: 268435456000 (250G) + descr: QEMU QEMU HARDDISK + ident: (null) + + 1. 
Name: da1 + Mediasize: 10737418240000 (9.8T) + descr: QEMU QEMU HARDDISK + ident: (null) + + 1. Name: da2 + Mediasize: 10737418240000 (9.8T) + descr: QEMU QEMU HARDDISK + ident: (null) + + 1. Name: da3 + Mediasize: 34359738368 (32G) + descr: QEMU QEMU HARDDISK + ident: (null) + + pool: inferno + state: ONLINE +config: + + NAME STATE READ WRITE CKSUM + inferno ONLINE 0 0 0 + mirror-0 ONLINE 0 0 0 + da1 ONLINE 0 0 0 + da2 ONLINE 0 0 0 + logs + da3 ONLINE 0 0 0 + +errors: No known data errors + + pool: zroot + state: ONLINE +config: + + NAME STATE READ WRITE CKSUM + zroot ONLINE 0 0 0 + da0p3 ONLINE 0 0 0 + +errors: No known data errors diff --git a/drills/zfs_drill.log b/drills/zfs_drill.log new file mode 100644 index 0000000..da09586 --- /dev/null +++ b/drills/zfs_drill.log @@ -0,0 +1,513 @@ +💥 Starting ZFS drill on dataset: inferno/drilltest +🟩 Creating dataset: inferno/drilltest +🟩 Generating dummy data (500x 1GB files)... +Created file_1.bin +Created file_2.bin +Created file_3.bin +Created file_4.bin +Created file_5.bin +Created file_6.bin +Created file_7.bin +Created file_8.bin +Created file_9.bin +Created file_10.bin +Created file_11.bin +Created file_12.bin +Created file_13.bin +Created file_14.bin +Created file_15.bin +Created file_16.bin +Created file_17.bin +Created file_18.bin +Created file_19.bin +Created file_20.bin +Created file_21.bin +Created file_22.bin +Created file_23.bin +Created file_24.bin +Created file_25.bin +Created file_26.bin +Created file_27.bin +Created file_28.bin +Created file_29.bin +Created file_30.bin +Created file_31.bin +Created file_32.bin +Created file_33.bin +Created file_34.bin +Created file_35.bin +Created file_36.bin +Created file_37.bin +Created file_38.bin +Created file_39.bin +Created file_40.bin +Created file_41.bin +Created file_42.bin +Created file_43.bin +Created file_44.bin +Created file_45.bin +Created file_46.bin +Created file_47.bin +Created file_48.bin +Created file_49.bin +Created file_50.bin +Created file_51.bin +Created file_52.bin +Created file_53.bin +Created file_54.bin +Created file_55.bin +Created file_56.bin +Created file_57.bin +Created file_58.bin +Created file_59.bin +Created file_60.bin +Created file_61.bin +Created file_62.bin +Created file_63.bin +Created file_64.bin +Created file_65.bin +Created file_66.bin +Created file_67.bin +Created file_68.bin +Created file_69.bin +Created file_70.bin +Created file_71.bin +Created file_72.bin +Created file_73.bin +Created file_74.bin +Created file_75.bin +Created file_76.bin +Created file_77.bin +Created file_78.bin +Created file_79.bin +Created file_80.bin +Created file_81.bin +Created file_82.bin +Created file_83.bin +Created file_84.bin +Created file_85.bin +Created file_86.bin +Created file_87.bin +Created file_88.bin +Created file_89.bin +Created file_90.bin +Created file_91.bin +Created file_92.bin +Created file_93.bin +Created file_94.bin +Created file_95.bin +Created file_96.bin +Created file_97.bin +Created file_98.bin +Created file_99.bin +Created file_100.bin +Created file_101.bin +Created file_102.bin +Created file_103.bin +Created file_104.bin +Created file_105.bin +Created file_106.bin +Created file_107.bin +Created file_108.bin +Created file_109.bin +Created file_110.bin +Created file_111.bin +Created file_112.bin +Created file_113.bin +Created file_114.bin +Created file_115.bin +Created file_116.bin +Created file_117.bin +Created file_118.bin +Created file_119.bin +Created file_120.bin +Created file_121.bin +Created file_122.bin +Created file_123.bin +Created 
file_124.bin +Created file_125.bin +Created file_126.bin +Created file_127.bin +Created file_128.bin +Created file_129.bin +Created file_130.bin +Created file_131.bin +Created file_132.bin +Created file_133.bin +Created file_134.bin +Created file_135.bin +Created file_136.bin +Created file_137.bin +Created file_138.bin +Created file_139.bin +Created file_140.bin +Created file_141.bin +Created file_142.bin +Created file_143.bin +Created file_144.bin +Created file_145.bin +Created file_146.bin +Created file_147.bin +Created file_148.bin +Created file_149.bin +Created file_150.bin +Created file_151.bin +Created file_152.bin +Created file_153.bin +Created file_154.bin +Created file_155.bin +Created file_156.bin +Created file_157.bin +Created file_158.bin +Created file_159.bin +Created file_160.bin +Created file_161.bin +Created file_162.bin +Created file_163.bin +Created file_164.bin +Created file_165.bin +Created file_166.bin +Created file_167.bin +Created file_168.bin +Created file_169.bin +Created file_170.bin +Created file_171.bin +Created file_172.bin +Created file_173.bin +Created file_174.bin +Created file_175.bin +Created file_176.bin +Created file_177.bin +Created file_178.bin +Created file_179.bin +Created file_180.bin +Created file_181.bin +Created file_182.bin +Created file_183.bin +Created file_184.bin +Created file_185.bin +Created file_186.bin +Created file_187.bin +Created file_188.bin +Created file_189.bin +Created file_190.bin +Created file_191.bin +Created file_192.bin +Created file_193.bin +Created file_194.bin +Created file_195.bin +Created file_196.bin +Created file_197.bin +Created file_198.bin +Created file_199.bin +Created file_200.bin +Created file_201.bin +Created file_202.bin +Created file_203.bin +Created file_204.bin +Created file_205.bin +Created file_206.bin +Created file_207.bin +Created file_208.bin +Created file_209.bin +Created file_210.bin +Created file_211.bin +Created file_212.bin +Created file_213.bin +Created file_214.bin +Created file_215.bin +Created file_216.bin +Created file_217.bin +Created file_218.bin +Created file_219.bin +Created file_220.bin +Created file_221.bin +Created file_222.bin +Created file_223.bin +Created file_224.bin +Created file_225.bin +Created file_226.bin +Created file_227.bin +Created file_228.bin +Created file_229.bin +Created file_230.bin +Created file_231.bin +Created file_232.bin +Created file_233.bin +Created file_234.bin +Created file_235.bin +Created file_236.bin +Created file_237.bin +Created file_238.bin +Created file_239.bin +Created file_240.bin +Created file_241.bin +Created file_242.bin +Created file_243.bin +Created file_244.bin +Created file_245.bin +Created file_246.bin +Created file_247.bin +Created file_248.bin +Created file_249.bin +Created file_250.bin +Created file_251.bin +Created file_252.bin +Created file_253.bin +Created file_254.bin +Created file_255.bin +Created file_256.bin +Created file_257.bin +Created file_258.bin +Created file_259.bin +Created file_260.bin +Created file_261.bin +Created file_262.bin +Created file_263.bin +Created file_264.bin +Created file_265.bin +Created file_266.bin +Created file_267.bin +Created file_268.bin +Created file_269.bin +Created file_270.bin +Created file_271.bin +Created file_272.bin +Created file_273.bin +Created file_274.bin +Created file_275.bin +Created file_276.bin +Created file_277.bin +Created file_278.bin +Created file_279.bin +Created file_280.bin +Created file_281.bin +Created file_282.bin +Created file_283.bin +Created file_284.bin +Created file_285.bin 
+Created file_286.bin +Created file_287.bin +Created file_288.bin +Created file_289.bin +Created file_290.bin +Created file_291.bin +Created file_292.bin +Created file_293.bin +Created file_294.bin +Created file_295.bin +Created file_296.bin +Created file_297.bin +Created file_298.bin +Created file_299.bin +Created file_300.bin +Created file_301.bin +Created file_302.bin +Created file_303.bin +Created file_304.bin +Created file_305.bin +Created file_306.bin +Created file_307.bin +Created file_308.bin +Created file_309.bin +Created file_310.bin +Created file_311.bin +Created file_312.bin +Created file_313.bin +Created file_314.bin +Created file_315.bin +Created file_316.bin +Created file_317.bin +Created file_318.bin +Created file_319.bin +Created file_320.bin +Created file_321.bin +Created file_322.bin +Created file_323.bin +Created file_324.bin +Created file_325.bin +Created file_326.bin +Created file_327.bin +Created file_328.bin +Created file_329.bin +Created file_330.bin +Created file_331.bin +Created file_332.bin +Created file_333.bin +Created file_334.bin +Created file_335.bin +Created file_336.bin +Created file_337.bin +Created file_338.bin +Created file_339.bin +Created file_340.bin +Created file_341.bin +Created file_342.bin +Created file_343.bin +Created file_344.bin +Created file_345.bin +Created file_346.bin +Created file_347.bin +Created file_348.bin +Created file_349.bin +Created file_350.bin +Created file_351.bin +Created file_352.bin +Created file_353.bin +Created file_354.bin +Created file_355.bin +Created file_356.bin +Created file_357.bin +Created file_358.bin +Created file_359.bin +Created file_360.bin +Created file_361.bin +Created file_362.bin +Created file_363.bin +Created file_364.bin +Created file_365.bin +Created file_366.bin +Created file_367.bin +Created file_368.bin +Created file_369.bin +Created file_370.bin +Created file_371.bin +Created file_372.bin +Created file_373.bin +Created file_374.bin +Created file_375.bin +Created file_376.bin +Created file_377.bin +Created file_378.bin +Created file_379.bin +Created file_380.bin +Created file_381.bin +Created file_382.bin +Created file_383.bin +Created file_384.bin +Created file_385.bin +Created file_386.bin +Created file_387.bin +Created file_388.bin +Created file_389.bin +Created file_390.bin +Created file_391.bin +Created file_392.bin +Created file_393.bin +Created file_394.bin +Created file_395.bin +Created file_396.bin +Created file_397.bin +Created file_398.bin +Created file_399.bin +Created file_400.bin +Created file_401.bin +Created file_402.bin +Created file_403.bin +Created file_404.bin +Created file_405.bin +Created file_406.bin +Created file_407.bin +Created file_408.bin +Created file_409.bin +Created file_410.bin +Created file_411.bin +Created file_412.bin +Created file_413.bin +Created file_414.bin +Created file_415.bin +Created file_416.bin +Created file_417.bin +Created file_418.bin +Created file_419.bin +Created file_420.bin +Created file_421.bin +Created file_422.bin +Created file_423.bin +Created file_424.bin +Created file_425.bin +Created file_426.bin +Created file_427.bin +Created file_428.bin +Created file_429.bin +Created file_430.bin +Created file_431.bin +Created file_432.bin +Created file_433.bin +Created file_434.bin +Created file_435.bin +Created file_436.bin +Created file_437.bin +Created file_438.bin +Created file_439.bin +Created file_440.bin +Created file_441.bin +Created file_442.bin +Created file_443.bin +Created file_444.bin +Created file_445.bin +Created file_446.bin +Created 
file_447.bin +Created file_448.bin +Created file_449.bin +Created file_450.bin +Created file_451.bin +Created file_452.bin +Created file_453.bin +Created file_454.bin +Created file_455.bin +Created file_456.bin +Created file_457.bin +Created file_458.bin +Created file_459.bin +Created file_460.bin +Created file_461.bin +Created file_462.bin +Created file_463.bin +Created file_464.bin +Created file_465.bin +Created file_466.bin +Created file_467.bin +Created file_468.bin +Created file_469.bin +Created file_470.bin +Created file_471.bin +Created file_472.bin +Created file_473.bin +Created file_474.bin +Created file_475.bin +Created file_476.bin +Created file_477.bin +Created file_478.bin +Created file_479.bin +Created file_480.bin +Created file_481.bin +Created file_482.bin +Created file_483.bin +Created file_484.bin +Created file_485.bin +Created file_486.bin +Created file_487.bin +Created file_488.bin +Created file_489.bin +Created file_490.bin +Created file_491.bin +Created file_492.bin +Created file_493.bin +Created file_494.bin +Created file_495.bin +Created file_496.bin +Created file_497.bin +Created file_498.bin +Created file_499.bin +Created file_500.bin +🟩 Dummy data generation complete. +🟩 Creating snapshot inferno/drilltest@drill +🟩 Calculating checksums before restore... +🟩 Simulating deletion of some files... +🟩 Deleted: file_1.bin and file_2.bin +🟩 Rolling back to snapshot drill +🟩 Calculating checksums after restore... +🟩 Verifying checksums... +✅ Checksums match! Restore verified. +🟩 Cleaning up busy processes before destroy... diff --git a/fztodo.md b/fztodo.md new file mode 100644 index 0000000..1a1a8aa --- /dev/null +++ b/fztodo.md @@ -0,0 +1,61 @@ +FailZero TODO List +✅ Completed + +fz_ip_validator.py runs on Krang with systemd and venv + +Logging to /var/log/failzero/ip_validator.log + +IP abuse detection via /validate endpoint + +PayPal billing form with terminal-style UI + +Telegram alerts on order + +Abuse watcher with threshold-based disable + +genesisctl disable --ip blocks outbound traffic + + Screen-based background runner (genesisctl watch-abuse) + +🧠 Next Steps (Active TODO List) +🔒 Abuse Management + +Build /api/report endpoint to manually flag IPs from Krang or external tools + +Switch abuse_list in fz_ip_validator.py to file-based or Redis-backed source + + Log confirmed abuse incidents to /var/log/genesis-abuse-confirmed.log + +🌐 Frontend Integration + +Modify billing HTML to call /validate before starting PayPal process + +Display an error if IP is flagged (valid === false) and block purchase + + Show dynamic pricing and risk flags in the form using the validator output + +💳 Billing + Provision + +Hook PayPal IPN or success return URL to trigger VPS creation + +Match PayPal TXID to IP + label and log it + +Generate reverse DNS automatically on provision (e.g., nighthawk01.failzero.net) + + Add /privacy and /terms static pages to keep things legally clean + +⚙️ Tooling & UX + +Add genesisctl enable --ip to unblock previously flagged IPs + +Add genesisctl status --ip to query abuse hits / log activity + + Optionally hash or sign each VPS order for non-repudiation audit trail + +🧪 Optional / Nice-to-Have + +Build a minimal dashboard or log viewer for flagged IPs + +Rate-limit /validate via nginx or Flask limiter + +Replace all external IP tools with internal validator diff --git a/genesishosting/access/account-creation.md b/genesishosting/access/account-creation.md new file mode 100644 index 0000000..12fd857 --- /dev/null +++ 
b/genesishosting/access/account-creation.md @@ -0,0 +1,20 @@ +# Account Creation Policy + +## Customer Accounts + +- Created automatically via WHMCS upon signup +- Email verification is required before service activation +- Strong passwords (minimum 10 characters) are enforced +- 2FA is recommended and required for admin-facing services + +## Staff/Admin Accounts + +- Created manually by Super Admin only +- Must use SSH keys for server access +- Access logs are enabled and monitored +- Each staff account must be linked to an internal email + +## Account Naming Convention + +- Customers: `client_{username}` +- Admins: `admin.{firstname}` diff --git a/genesishosting/access/account-deletion.md b/genesishosting/access/account-deletion.md new file mode 100644 index 0000000..71fd0df --- /dev/null +++ b/genesishosting/access/account-deletion.md @@ -0,0 +1,13 @@ +# Account Deletion Policy + +## Customer Accounts + +- Users may request account deletion via WHMCS support ticket +- Data is retained for 30 days post-termination (unless legally required) +- Backups including user data are purged after 30 days + +## Internal Accounts + +- Deactivated immediately upon staff departure or role change +- SSH keys, DirectAdmin access, and database credentials revoked +- Logs associated with the account are retained for audit purposes diff --git a/genesishosting/access/least-priv.md b/genesishosting/access/least-priv.md new file mode 100644 index 0000000..00f85ac --- /dev/null +++ b/genesishosting/access/least-priv.md @@ -0,0 +1,20 @@ +# Least Privilege Policy + +Genesis Hosting enforces least privilege access for all systems. + +## Principles + +- Users are given the minimum level of access necessary to perform their work +- Admin tools are isolated by function (e.g., billing vs. system access) +- Escalation of privileges must be requested, documented, and time-bound + +## Tools in Use + +- WHMCS permissions are restricted by group +- SSH access is limited using `AllowUsers` and firewalled IPs +- TeamTalk server admins are rotated and audited monthly + +## Review Cycle + +- Access roles are reviewed quarterly +- Logs of access changes are stored and rotated every 90 days diff --git a/genesishosting/access/user-roles.md b/genesishosting/access/user-roles.md new file mode 100644 index 0000000..0f485f3 --- /dev/null +++ b/genesishosting/access/user-roles.md @@ -0,0 +1,18 @@ +# User Roles + +Genesis Hosting Technologies uses Role-Based Access Control (RBAC) to ensure that users only have access to what they need. + +## Role Definitions + +| Role | Description | Examples | +|----------------|----------------------------------------------------------|----------------------------------| +| Customer | End users with access to services they’ve purchased | DirectAdmin clients, Streamers | +| Support Staff | Limited admin functions for resolving client issues | Helpdesk, WHMCS support agents | +| Administrator | Full access to provision, maintain, and modify services | Infrastructure admins | +| Super Admin | Root-level access to all systems | Owner/Lead Engineer | + +## Guidelines + +- Roles are assigned during onboarding. +- Access levels are reviewed quarterly. +- No one should hold higher access than required for their duties. 
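As a concrete illustration of the SSH restriction mentioned in the least-privilege policy, here is a minimal sketch; the usernames and address range are hypothetical examples, not production values:

```bash
# Only named admin accounts may connect, and only from the management network
echo "AllowUsers admin.jane admin.doc" | sudo tee -a /etc/ssh/sshd_config
sudo ufw allow from 203.0.113.0/24 to any port 22 proto tcp
sudo systemctl restart sshd
```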
diff --git a/genesishosting/backups/backup-disaster-recovery.md b/genesishosting/backups/backup-disaster-recovery.md new file mode 100644 index 0000000..18b8d67 --- /dev/null +++ b/genesishosting/backups/backup-disaster-recovery.md @@ -0,0 +1,26 @@ +# Disaster Recovery Plan + +Genesis Hosting is prepared to recover core systems from catastrophic failure. + +## Recovery Objectives + +- **RPO (Recovery Point Objective)**: 24 hours +- **RTO (Recovery Time Objective)**: 4 hours for customer services + +## Full Recovery Flow + +1. Triage the affected systems +2. Identify last successful backup or snapshot +3. Restore individual services: + - DNS + - WHMCS + - DirectAdmin + - AzuraCast + - TeamTalk +4. Run post-restore validation scripts +5. Notify customers of incident and resolution + +## DR Testing + +- Simulated quarterly +- Logs retained in `/var/log/genesisdr.log` diff --git a/genesishosting/backups/backup-integrity.md b/genesishosting/backups/backup-integrity.md new file mode 100644 index 0000000..ced96f2 --- /dev/null +++ b/genesishosting/backups/backup-integrity.md @@ -0,0 +1,23 @@ +# Backup Integrity + +We verify all backups regularly to ensure they are complete, uncorrupted, and restorable. + +## Weekly Tasks + +- ZFS scrubs for all pools +- Hash checks (SHA-256) for tarballs and dumps +- rsync `--checksum` verification for remote mirrors + +## Alerts + +- Email/Mastodon alert if: + - ZFS reports checksum errors + - Scheduled backup is missing + - Remote sync fails or lags > 24h + +## Tools Used + +- `zfs scrub` +- `sha256sum` + custom validation script +- rclone sync logs +- Telegram bot and Genesis Shield notifications diff --git a/genesishosting/backups/backup-policy.md b/genesishosting/backups/backup-policy.md new file mode 100644 index 0000000..6bd0de0 --- /dev/null +++ b/genesishosting/backups/backup-policy.md @@ -0,0 +1,29 @@ +# Backup Policy + +Genesis Hosting Technologies maintains regular backups to ensure customer data and internal infrastructure are recoverable in the event of failure, corruption, or disaster. + +## Backup Schedule + +| System | Frequency | Retention | Method | +|----------------|-----------|-----------|------------------| +| DirectAdmin | Daily | 7 Days | rsync + tarball | +| WHMCS | Daily | 14 Days | Encrypted dump | +| AzuraCast | Daily | 7 Days | Docker volume snapshot + config export | +| TeamTalk | Daily | 7 Days | XML + config archive | +| Full VMs | Weekly | 4 Weeks | ZFS snapshots or Proxmox backups | +| Offsite Backups| Weekly | 4 Weeks | Rsync to remote ZFS or object storage | + +## Retention Policy + +- Daily: 7 days +- Weekly: 4 weeks +- Monthly: Optional, for specific business data + +## Encryption + +- Backups are encrypted at rest (AES-256) +- Transfers to remote locations use SSH or TLS + +## Notes + +- No backup occurs on client plans marked "opt-out" diff --git a/genesishosting/backups/dr/assets-mastodon-bucket.md b/genesishosting/backups/dr/assets-mastodon-bucket.md new file mode 100644 index 0000000..6a36a15 --- /dev/null +++ b/genesishosting/backups/dr/assets-mastodon-bucket.md @@ -0,0 +1,45 @@ +## 2025-05-02 22:24:25 – MinIO Bucket Access Configuration for Mastodon + +**Bucket**: `assets-mastodon` +**Server**: `shredderv2` +**User**: `genesisuser` +**Permissions**: Read / Write / Delete +**Policy Name**: `assets-mastodon-rw-policy` + +### Commands Executed: + +```bash +mc alias set localminio http://localhost:9000 genesisadmin MutationXv3! 
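# Note (assumption): newer mc releases rename the policy commands used below to
# "mc admin policy create" and "mc admin policy attach <alias> <policy> --user <user>";
# the add/set forms further down reflect the client version in use at the time.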
+ +cat > assets_mastodon_rw_policy.json <<EOF +{ + "Version": "2012-10-17", + "Statement": [ + { + "Action": [ + "s3:GetBucketLocation", + "s3:ListBucket" + ], + "Effect": "Allow", + "Resource": "arn:aws:s3:::assets-mastodon" + }, + { + "Action": [ + "s3:PutObject", + "s3:GetObject", + "s3:DeleteObject" + ], + "Effect": "Allow", + "Resource": "arn:aws:s3:::assets-mastodon/*" + } + ] +} +EOF + +mc admin policy add localminio assets-mastodon-rw-policy assets_mastodon_rw_policy.json +mc admin policy set localminio assets-mastodon-rw-policy user=genesisuser +``` + +### Outcome: + +User `genesisuser` now has full authenticated access to `assets-mastodon` on `shredderv2`'s MinIO. diff --git a/genesishosting/backups/dr/assets_azuracast.md b/genesishosting/backups/dr/assets_azuracast.md new file mode 100644 index 0000000..ad687ed --- /dev/null +++ b/genesishosting/backups/dr/assets_azuracast.md @@ -0,0 +1,93 @@ +## 2025-05-02 22:24:25 – MinIO Bucket Access Configuration for Mastodon + +**Bucket**: `assets-mastodon` +**Server**: `shredderv2` +**User**: `genesisuser` +**Permissions**: Read / Write / Delete +**Policy Name**: `assets-mastodon-rw-policy` + +### Commands Executed: + +```bash +mc alias set localminio http://localhost:9000 genesisadmin MutationXv3! + +cat > assets_mastodon_rw_policy.json <<EOF +{ + "Version": "2012-10-17", + "Statement": [ + { + "Action": [ + "s3:GetBucketLocation", + "s3:ListBucket" + ], + "Effect": "Allow", + "Resource": "arn:aws:s3:::assets-mastodon" + }, + { + "Action": [ + "s3:PutObject", + "s3:GetObject", + "s3:DeleteObject" + ], + "Effect": "Allow", + "Resource": "arn:aws:s3:::assets-mastodon/*" + } + ] +} +EOF + +mc admin policy add localminio assets-mastodon-rw-policy assets_mastodon_rw_policy.json +mc admin policy set localminio assets-mastodon-rw-policy user=genesisuser +``` + +### Outcome: + +User `genesisuser` now has full authenticated access to `assets-mastodon` on `shredderv2`'s MinIO. + +--- + +## 2025-05-02 22:43:00 – MinIO Transfer Log: AzuraCast Assets + +**Source**: `thevault:/nexus/miniodata/assets_azuracast` +**Destination**: `shredderv2 MinIO` bucket `assets-azuracast` + +### Transfer Method: + +```bash +rclone sync thevault:/nexus/miniodata/assets_azuracast localminio:assets-azuracast \ + --progress \ + --transfers=8 \ + --checkers=8 \ + --s3-chunk-size=64M \ + --s3-upload-concurrency=4 \ + --s3-acl=private \ + --s3-storage-class=STANDARD +``` + +### Outcome: + +Data from AzuraCast backup (`assets_azuracast`) successfully synchronized to MinIO bucket `assets-azuracast` on `shredderv2`. + +--- + +## 2025-05-02 23:05:00 – MinIO Transfer Log: Mastodon Assets + +**Source**: `thevault:/nexus/miniodata/assets_mastodon` +**Destination**: `shredderv2 MinIO` bucket `assets-mastodon` + +### Transfer Method: + +```bash +rclone sync thevault:/nexus/miniodata/assets_mastodon localminio:assets-mastodon \ + --progress \ + --transfers=8 \ + --checkers=8 \ + --s3-chunk-size=64M \ + --s3-upload-concurrency=4 \ + --s3-acl=private \ + --s3-storage-class=STANDARD +``` + +### Outcome: + +Assets from `assets_mastodon` replicated to `assets-mastodon` bucket on `shredderv2`. No impact to production (`shredderv1`) during sync. diff --git a/genesishosting/backups/restore-instructions.md b/genesishosting/backups/restore-instructions.md new file mode 100644 index 0000000..7738466 --- /dev/null +++ b/genesishosting/backups/restore-instructions.md @@ -0,0 +1,32 @@ +# Restore Instructions + +The following steps outline how to restore data for each supported service. 
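For object-store assets (such as the MinIO buckets mirrored in the DR transfer logs above), a restore is simply the sync run in reverse. A rough sketch, with bucket and path names assumed from those logs:

```bash
# Pull the Mastodon assets bucket back to local disk; preview before writing
rclone sync localminio:assets-mastodon /nexus/miniodata/assets_mastodon --dry-run -v
rclone sync localminio:assets-mastodon /nexus/miniodata/assets_mastodon --log-file=restore.log -v
```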
+ +## DirectAdmin + +1. Access DA panel as admin +2. Go to Admin Backup/Transfer +3. Select user and backup date +4. Click "Restore" + +## WHMCS + +1. SSH into WHMCS server +2. Restore from encrypted MySQL dump +3. Restart `php-fpm` and `nginx` + +## AzuraCast + +1. Stop all Docker containers +2. Replace `station_data` and `config` volumes +3. Restart stack via `docker-compose up -d` + +## TeamTalk + +1. Replace configuration file (`tt5srv.xml`) +2. Restart TeamTalk server + +## VM-Level Restore (ZFS) + +1. `zfs rollback poolname/dataset@snapshotname` +2. Verify service health and logs diff --git a/genesishosting/clients/abuse.md b/genesishosting/clients/abuse.md new file mode 100644 index 0000000..b23e16e --- /dev/null +++ b/genesishosting/clients/abuse.md @@ -0,0 +1,27 @@ +# Abuse Handling Policy + +We take reports of abuse seriously and aim to resolve them quickly. + +## How to Report Abuse + +Send an email to abuse@genesishostingtechnologies.com with: + +- Description of the abuse +- IP or domain involved +- Any relevant logs or screenshots + +## Internal Response Process + +1. Triage within 12 hours +2. Investigate logs and usage +3. Contact the client with 24h to respond +4. Temporary suspension may be issued to prevent further harm + +## DMCA Takedowns + +- We comply with valid DMCA requests +- The client will be notified and given 48h to address or refute + +## Escalation + +Repeat offenders may be permanently banned. diff --git a/genesishosting/clients/account-suspension.md b/genesishosting/clients/account-suspension.md new file mode 100644 index 0000000..59ebb04 --- /dev/null +++ b/genesishosting/clients/account-suspension.md @@ -0,0 +1,22 @@ +# Account Suspension Policy + +Accounts may be suspended for violations of our Acceptable Use Policy, overdue invoices, or abuse complaints. + +## Common Reasons + +- Non-payment (after 5-day grace period) +- Resource abuse or denial-of-service behavior +- Hosting prohibited content +- Violating community guidelines on TeamTalk + +## Suspension Procedure + +- Warning issued via WHMCS ticket and email +- If no resolution within 24–48h, service is suspended +- Admin note added to client profile for audit tracking + +## Reinstatement + +- Suspension is lifted upon payment or resolution +- $5 reactivation fee may apply (for non-payment suspensions) +- Services are not reinstated if terminated due to serious AUP violation diff --git a/genesishosting/clients/aup.md b/genesishosting/clients/aup.md new file mode 100644 index 0000000..0f3a263 --- /dev/null +++ b/genesishosting/clients/aup.md @@ -0,0 +1,27 @@ +# Acceptable Use Policy (AUP) + +This policy outlines the acceptable use of services provided by Genesis Hosting Technologies. 
+ +## Prohibited Activities + +Clients may not use our services to: + +- Host or distribute malware, phishing sites, or spyware +- Send unsolicited email (spam), whether direct or relayed +- Host copyrighted content without permission (DMCA applies) +- Promote hate speech, harassment, or targeted abuse +- Overuse system resources in a way that affects others + +## Special Notes + +- Streaming via AzuraCast must comply with DMCA and public broadcast standards +- TeamTalk users must not harass, dox, or spam other users +- VPNs, proxies, and anonymizing services are not allowed without prior approval + +## Enforcement + +Violations will result in one or more of the following: + +- Warning via email or WHMCS ticket +- Service suspension +- Permanent termination without refund (in egregious cases) diff --git a/genesishosting/clients/refunds-cancellations.md b/genesishosting/clients/refunds-cancellations.md new file mode 100644 index 0000000..016d8e4 --- /dev/null +++ b/genesishosting/clients/refunds-cancellations.md @@ -0,0 +1,24 @@ +# Refunds & Cancellations + +Genesis Hosting Technologies offers a clear refund and cancellation policy. + +## Cancellation + +- Clients may cancel via WHMCS at any time +- Cancellation before next billing date avoids future charges +- No prorated refunds for unused time unless due to service failure + +## Refunds + +- Full refund within 7 days of initial purchase (DirectAdmin, AzuraCast, TeamTalk) +- Domain registrations, SSL certificates, and add-ons are non-refundable +- No refunds issued for abuse-related suspensions or policy violations + +## Exceptions + +- If we fail to deliver a service or suffer extended downtime (>24h), credit may be issued +- All refund requests are reviewed manually by support + +## How to Request + +Submit a WHMCS ticket with reason for refund diff --git a/genesishosting/company/company-code-of-conduct.md b/genesishosting/company/company-code-of-conduct.md new file mode 100644 index 0000000..5c2ca65 --- /dev/null +++ b/genesishosting/company/company-code-of-conduct.md @@ -0,0 +1,20 @@ +# Code of Conduct + +We maintain a respectful, safe, and inclusive environment for both staff and clients. + +## Expectations + +- Treat all clients and team members with professionalism and courtesy +- Communicate clearly and constructively — even during escalations +- Uphold privacy, security, and transparency at every level +- Follow internal and customer-facing policies at all times + +## Zero Tolerance + +We do not tolerate: + +- Harassment or abuse (verbal, written, or otherwise) +- Discrimination based on identity, ability, or belief +- Intentional sabotage of infrastructure or service integrity + +Violations may result in immediate termination of access or service. diff --git a/genesishosting/company/company-mission-statement.md b/genesishosting/company/company-mission-statement.md new file mode 100644 index 0000000..8ad1643 --- /dev/null +++ b/genesishosting/company/company-mission-statement.md @@ -0,0 +1,12 @@ +# Mission Statement + +At Genesis Hosting Technologies, our mission is to provide secure, reliable, and transparent hosting services with a personal touch. + +We believe that even the smallest teams deserve enterprise-grade infrastructure — without enterprise-grade headaches. 
+ +Our goal is to deliver: + +- Fast, stable hosting environments +- Fair pricing with no upsell games +- Transparent policies and proactive support +- A commitment to data ownership and user privacy diff --git a/genesishosting/company/company-tos.md b/genesishosting/company/company-tos.md new file mode 100644 index 0000000..2cddbcb --- /dev/null +++ b/genesishosting/company/company-tos.md @@ -0,0 +1,25 @@ +# Terms of Service (TOS) + +By using services from Genesis Hosting Technologies, you agree to the following terms: + +## Service Provision + +- Services are delivered as-is, with best-effort uptime and technical support +- Users must abide by our Acceptable Use Policy (AUP) +- Access may be suspended for abuse, non-payment, or security issues + +## Billing & Renewals + +- All services are billed monthly or annually +- Automatic renewal is enabled by default +- Invoices are due within 5 days of issue unless otherwise agreed + +## Termination + +- You may cancel at any time via WHMCS +- We reserve the right to suspend or terminate accounts that violate our policies + +## Liability + +- We are not liable for data loss, service interruptions, or indirect damages +- Backups are provided as a best-effort courtesy unless contractually guaranteed diff --git a/genesishosting/company/dmca.md b/genesishosting/company/dmca.md new file mode 100644 index 0000000..b6a4097 --- /dev/null +++ b/genesishosting/company/dmca.md @@ -0,0 +1,25 @@ +# DMCA Policy + +Genesis Hosting Technologies complies with the Digital Millennium Copyright Act (DMCA). + +## Filing a Takedown Notice + +Email dmca@genesishostingtechnologies.com with: + +- Your contact information +- Description of the copyrighted work +- URL or IP address of the infringing content +- A statement of good faith belief +- A statement of accuracy and authority + +## What Happens Next + +- We review and respond within 48 hours +- The client is notified and given a chance to respond +- If no valid counter-notice is filed, content may be removed or suspended + +## Filing a Counter Notice + +Clients who believe their content was wrongly removed may submit a counter notice with similar contact and justification information. + +We will not tolerate repeated infringement and may terminate accounts accordingly. diff --git a/genesishosting/company/privacy-policy.md b/genesishosting/company/privacy-policy.md new file mode 100644 index 0000000..380aee0 --- /dev/null +++ b/genesishosting/company/privacy-policy.md @@ -0,0 +1,26 @@ +# Privacy Policy + +We respect your privacy and protect your data. 
+ +## What We Collect + +- Account information: name, email, billing address +- Service usage data: IPs, access logs, system metrics +- Communications: support tickets and emails + +## How We Use It + +- Service provisioning and support +- Abuse prevention and system integrity +- Internal analytics (not shared or sold) + +## Data Sharing + +- We do not sell user data +- We may share limited data with trusted providers (e.g., payment processors) +- Law enforcement requests must include valid legal process + +## Data Retention + +- User data is retained as long as the account is active +- Backups are purged per the Backup Policy diff --git a/genesishosting/disrec/zfsdestroycasestudy.md b/genesishosting/disrec/zfsdestroycasestudy.md new file mode 100644 index 0000000..aa330ec --- /dev/null +++ b/genesishosting/disrec/zfsdestroycasestudy.md @@ -0,0 +1,64 @@ +# 📛 Case Study: Why RAID Is Not a Backup + +## Overview + +On May 4, 2025, we experienced a production data loss incident involving the `nexus` dataset on `shredderv1`, a Linux RAID5 server. Despite no hardware failure, critical files were lost due to an unintended command affecting live data. + +This incident serves as a clear, real-world illustration of the maxim: + +> **RAID protects against hardware failure — not human error, data corruption, or bad automation.** + +--- + +## 🔍 What Happened + +- `shredderv1` uses RAID5 for media storage. +- The dataset `nexus/miniodata` (housing `genesisassets`, `genesislibrary`, etc.) was accidentally destroyed. +- **No disks failed.** The failure was logical, not physical. + +--- + +## 🔥 Impact + +- StationPlaylist (SPL) lost access to the Genesis media library. +- MinIO bucket data was instantly inaccessible. +- Temporary outage and scrambling to reconfigure mounts, media, and streaming. + +--- + +## ✅ Recovery + +Thanks to our disaster recovery stack: + +- Nightly **rsync backups** were synced to **The Vault** (backup server). +- **ZFS snapshots** existed on The Vault for the affected datasets. +- We restored the latest snapshot **from The Vault back to Shredder**, effectively reversing the loss. +- No data corruption occurred; sync validation showed dataset integrity. + +--- + +## 🎓 Takeaway + +This is a live demonstration of why: + +- **RAID is not a backup** +- **Snapshots without off-host replication** are not enough +- **Real backups must be off-server and regularly tested** + +--- + +## 🔐 Current Protection Measures + +- Production data (`genesisassets`, `genesislibrary`) now replicated nightly to The Vault via `rsync`. +- ZFS snapshots are validated daily via a **dry-run restore validator**. +- Telegram alerts notify success/failure of backup verification jobs. +- Future goal: full ZFS storage on all production servers for native snapshot support. + +--- + +## 🧠 Lessons Learned + +- Always assume you'll delete the wrong thing eventually. +- Snapshots are amazing — **if** they're somewhere else. +- Automated restore testing should be part of every backup pipeline. + diff --git a/genesishosting/dns-check b/genesishosting/dns-check new file mode 100644 index 0000000..14f5537 --- /dev/null +++ b/genesishosting/dns-check @@ -0,0 +1,50 @@ +# 🌐 DNS Access Issues – Troubleshooting Guide + +If you're having trouble reaching **Genesis Radio** or the stream won't load, the issue may be with your DNS provider (the service that turns domain names into IP addresses). + +This happens more often than you'd think — and it's easy to fix. 
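Not sure DNS is actually the culprit? A quick check is to look the site up with your current resolver and then with a public one — if only the public resolver answers, DNS is your problem. (The hostname below is a placeholder; use the Genesis Radio address you're trying to reach.)

```bash
# 'genesisradio.example' is a placeholder hostname — substitute the real address

# Ask your current DNS resolver
nslookup genesisradio.example

# Ask Cloudflare's public resolver directly, for comparison
nslookup genesisradio.example 1.1.1.1
```

If the second lookup succeeds and the first one fails, switch resolvers using the steps below.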
+ +--- + +## ✅ Quick Fix: Change Your DNS + +We recommend switching to one of these trusted, fast, privacy-respecting DNS providers: + +| Provider | DNS Servers | +|--------------|-----------------------------| +| **Google** | `8.8.8.8` and `8.8.4.4` | +| **Cloudflare** | `1.1.1.1` and `1.0.0.1` | +| **Quad9** | `9.9.9.9` | + +--- + +## 💻 How to Change Your DNS + +### Windows 10/11 +1. Open **Settings → Network & Internet** +2. Click **Change adapter options** +3. Right-click your active connection → **Properties** +4. Select **Internet Protocol Version 4 (TCP/IPv4)** → Click **Properties** +5. Choose **"Use the following DNS server addresses"** +6. Enter: + - Preferred: `1.1.1.1` + - Alternate: `8.8.8.8` +7. Save and reconnect + +--- + +### macOS +1. Go to **System Preferences → Network** +2. Select your active network → Click **Advanced** +3. Go to the **DNS** tab +4. Click `+` and add: + - `1.1.1.1` + - `8.8.8.8` +5. Apply changes and reconnect + +--- + +### Linux (CLI) +For a quick test: +```bash +sudo resolvectl dns eth0 1.1.1.1 8.8.8.8 diff --git a/genesishosting/genesisctl.md b/genesishosting/genesisctl.md new file mode 100644 index 0000000..1a7b01b --- /dev/null +++ b/genesishosting/genesisctl.md @@ -0,0 +1,72 @@ +# Genesis Hosting Transparency Statement + +At Genesis Hosting, we believe that trust is earned — not assumed. +That’s why we’re upfront about what powers our platform and how we protect your services. + +--- + +## 🧱 Infrastructure You Can Count On + +We provision VPS environments on battle-tested cloud infrastructure from providers like **Linode**. These data centers offer: + +- ✅ Reliable hardware +- 🌐 Global network reach +- 🕒 99.9%+ hardware uptime SLAs + +But we don’t stop there — we don’t just hand you a cloud box and walk away. + +--- + +## 🛡 We Add What Cloud Alone Can’t + +Every Genesis VPS is: + +- 🔐 Hardened at the OS level (SSH keys, firewalls, fail2ban) +- 💾 Integrated into our backup and disaster recovery platform +- 📦 Monitored and optionally snapshotted offsite +- 💬 Supported by real humans — not ticket purgatory + +--- + +## 💡 What You’re Really Paying For + +We don’t resell “raw” VPS slices. We provide: + +- Sysadmin-grade support +- Verified DR and offsite backup protection +- Snapshot + restore capabilities +- Service monitoring and uptime validation + +We use trusted cloud metal as a foundation — then build **Genesis resilience** on top of it. + +--- + +## 🤝 Want a Bare VPS? We’re Not Your Host. +### Want a Resilient One? Welcome to Genesis. + +We’re here to make sure your stuff *stays online*, *comes back up*, and *has a human watching over it*. + +--- + +# 💵 Genesis VPS Pricing + +| Package | Specs | Genesis Price | Ideal For | +|------------|-----------------------------------------|--------------------|------------------------------------------| +| `micro` | 1 CPU / 1GB RAM / 25GB SSD | **$9/month** | Relays, bots, cron jobs, light apps | +| `safe` | 2 CPU / 4GB RAM / 80GB SSD | **$19/month** | General-purpose, DR-ready | +| `mastodon` | 4 CPU / 8GB RAM / 160GB SSD | **$34–39/month** | Mastodon, SPL, media-heavy apps | +| `ultra` | 4 Dedicated CPU / 8GB RAM / 160GB SSD | **$55–65/month** | High performance apps, DB workloads | + +--- + +### ➕ Optional Add-Ons + +- Daily snapshot + rollback: **+$5/mo** +- Weekly DR restore drill: **+$7/mo** +- Telegram alerting + uptime stats: **+$3/mo** +- Genesis Admin SLA Support: **+$10/mo** + +--- + + +Need help choosing? Ping us — real humans respond. 
diff --git a/genesishosting/infra/genesis-shield.md b/genesishosting/infra/genesis-shield.md new file mode 100644 index 0000000..853f6d9 --- /dev/null +++ b/genesishosting/infra/genesis-shield.md @@ -0,0 +1,24 @@ +# Genesis Shield – Security & Threat Monitoring + +Genesis Shield is our custom-built alert and ban system, integrated across our infrastructure. + +## Features + +- Aggregates Fail2Ban logs across all VMs +- Bans pushed in real-time via Mastodon DM and Telegram +- Scripts track: + - Repeated SSH failures + - API abuse + - Web panel brute force attempts + +## Interfaces + +- Terminal dashboard for live bans/unbans +- Role-based control (root/admin only) +- Daily threat summary via Mastodon bot + +## Roadmap + +- WHMCS integration for abuse tickets +- Live threat map by country/IP +- REST API for admin toolkit diff --git a/genesishosting/infra/infra-maintenance-windows.md b/genesishosting/infra/infra-maintenance-windows.md new file mode 100644 index 0000000..0f48e77 --- /dev/null +++ b/genesishosting/infra/infra-maintenance-windows.md @@ -0,0 +1,25 @@ +# Maintenance Window Policy + +To maintain consistency and reduce customer impact, we adhere to a strict maintenance schedule. + +## Standard Window + +- **Every Sunday, 7 PM – 9 PM Eastern** +- Non-emergency changes must occur during this window + +## What’s Allowed + +- OS & kernel updates +- Docker/image upgrades +- ZFS snapshots & cleanup +- Rolling restarts of containers + +## Emergencies + +- Critical security patches can bypass the window +- All emergency changes must be logged and reviewed + +## Notifications + +- Posted on Mastodon at least 1 hour before the window begins +- Clients notified via WHMCS if it will affect their service diff --git a/genesishosting/infra/infra-monitoring-setup.md b/genesishosting/infra/infra-monitoring-setup.md new file mode 100644 index 0000000..e0f6c16 --- /dev/null +++ b/genesishosting/infra/infra-monitoring-setup.md @@ -0,0 +1,25 @@ +# Monitoring Setup + +We use a layered monitoring approach to ensure full visibility and rapid response. + +## Stack + +- **Prometheus** for metrics collection +- **Grafana** for visualization dashboards +- **Fail2Ban** for intrusion attempts +- **Genesis Shield** for aggregated alerts (Telegram + Mastodon) + +## What We Monitor + +| System | Metric Examples | +|----------------|--------------------------------------------| +| PostgreSQL | Replication lag, disk usage, active queries | +| Web Servers | HTTP response time, TLS errors | +| MinIO / Assets | Cache hit ratio, sync status | +| Docker Hosts | Container uptime, memory pressure | + +## Alerting + +- Telegram: Real-time infra alerts +- Mastodon bot: Daily summaries and status posts +- Fallback email alerts for critical failures diff --git a/genesishosting/infra/server-naming-convention.md b/genesishosting/infra/server-naming-convention.md new file mode 100644 index 0000000..0097b1c --- /dev/null +++ b/genesishosting/infra/server-naming-convention.md @@ -0,0 +1,19 @@ +# Server Naming Convention + +To reduce confusion and improve clarity, we follow a clear and themed naming structure. 
+ +## Naming Style + +Examples: + +- `krang.internal` – Master backend server +- `replica.db3.sshjunkie.com` – Staging PostgreSQL replica +- `shredderv2` – ZFS backup server +- `anthony` – Ansible control node +- `nexus` – Main ZFS pool server for assets + +## Guidelines + +- Avoid generic names (`server1`, `host123`) +- Use themed names (e.g., TMNT characters for core infrastructure) +- Include environment tags where needed (`-test`, `-prod`) diff --git a/genesishosting/infra/zfs-strategy.md b/genesishosting/infra/zfs-strategy.md new file mode 100644 index 0000000..a69a1fa --- /dev/null +++ b/genesishosting/infra/zfs-strategy.md @@ -0,0 +1,23 @@ +# ZFS Strategy + +ZFS is used across Genesis Hosting Technologies for performance, integrity, and snapshot-based backup operations. + +## Pool Layout + +- RAIDZ1 or mirrored vdevs depending on use case +- Dataset naming: `genesisassets-secure`, `genesisshows-secure`, etc. +- Dedicated pools for: + - Mastodon media + - Client backups + - Internal scripts and logs + +## Snapshots + +- Hourly: last 24 hours +- Daily: last 7 days +- Weekly: last 4 weeks + +## Send/Receive + +- Used for offsite replication to Servarica and backup nodes +- Verified using checksums and `zfs receive -F` diff --git a/genesishosting/master_compliance_checklist.md b/genesishosting/master_compliance_checklist.md new file mode 100644 index 0000000..10485bc --- /dev/null +++ b/genesishosting/master_compliance_checklist.md @@ -0,0 +1,63 @@ +# ✅ Master Compliance Checklist +*(Status: ☐ = Not Started | 🟨 = In Progress | ✅ = Complete)* + +## 🧑💼 Access & User Management +- [ ] Role-Based Access Control (RBAC) in place (Customer, Admin, etc.) +- [ ] Account creation follows secure onboarding workflows +- [ ] Admin access restricted to SSH keys only +- [ ] Inactive accounts locked or removed quarterly +- [ ] Least privilege enforced across all services + +## 💾 Backups & Disaster Recovery +- [ ] Daily backups enabled for all key services (DirectAdmin, WHMCS, AzuraCast, TeamTalk) +- [ ] Weekly offsite backups with verification +- [ ] ZFS snapshots scheduled (hourly/daily/weekly) +- [ ] Backup integrity validated with checksums or scrubs +- [ ] Quarterly disaster recovery drill completed +- [ ] Restore instructions documented and tested + +## 🔐 Security +- [ ] 2FA enabled on all admin interfaces (WHMCS, SSH, DirectAdmin) +- [ ] SSH password auth disabled, key-only enforced +- [ ] Weekly patching or updates scheduled (Sunday 7–9 PM) +- [ ] Centralized logging active and stored 30–90 days +- [ ] Fail2Ban + Genesis Shield integrated and alerting +- [ ] TLS 1.2+ enforced for all public services +- [ ] AES-256 encryption at rest on backups and sensitive volumes + +## 🖥️ Provisioning & Automation +- [ ] WHMCS integrated with DirectAdmin, AzuraCast, TeamTalk +- [ ] All provisioning scripts tested and logged +- [ ] Post-deploy verification checklist followed +- [ ] DNS + SSL automation in place (Let's Encrypt) +- [ ] Monitoring added after provisioning (Prometheus/Grafana) + +## 📋 Client Policies +- [ ] Acceptable Use Policy posted and enforced +- [ ] Abuse response process defined and working +- [ ] DMCA policy publicly available and followed +- [ ] Suspension and refund rules defined in WHMCS +- [ ] Privacy Policy and Terms of Service available on client portal + +## 🌐 Services Configuration +- [ ] DirectAdmin quotas enforced (disk, bandwidth, email) +- [ ] AzuraCast listener/storage/bitrate limits respected +- [ ] TeamTalk server abuse protection and user limits enforced +- [ ] Domain 
registration/renewal workflows tested +- [ ] SSL auto-renew working correctly (Let's Encrypt + certbot) + +## ⚙️ Infrastructure +- [ ] ZFS pools configured for redundancy (RAIDZ1, mirrors) +- [ ] rclone mount points with caching working and monitored +- [ ] Genesis Shield actively alerting via Telegram/Mastodon +- [ ] All VMs named per convention (e.g., `krang`, `shredderv2`) +- [ ] Sunday maintenance window consistently followed +- [ ] Ansible playbooks used for provisioning/config consistency + +## 🛠️ Tools & Scripts +- [ ] All scripts version-controlled and documented +- [ ] Backups and restore tools tested and working +- [ ] Mastodon alert bot operating with secure tokens +- [ ] Rclone VFS stats monitored regularly +- [ ] Admin tools accessible only by authorized users +""" diff --git a/genesishosting/pmgenesisiorealignment.md b/genesishosting/pmgenesisiorealignment.md new file mode 100644 index 0000000..8789bca --- /dev/null +++ b/genesishosting/pmgenesisiorealignment.md @@ -0,0 +1,83 @@ +# Postmortem: Genesis I/O Realignment + +**Date:** May 8, 2025 +**Author:** Doc +**Systems Involved:** minioraid5, shredder, chatwithus.live, zcluster.technodrome1/2, thevault +**Scope:** Local-first mirroring, permission normalization, MinIO transition + +--- + +## 🎯 Objective + +To realign the Genesis file flow architecture by: + +- Making local block storage the **primary source** of truth for AzuraCast and Genesis buckets +- Transitioning FTP uploads to target local storage instead of MinIO directly +- Establishing **two-way mirroring** between local paths and MinIO buckets +- Correcting inherited permission issues across `/mnt/raid5` using `find + chmod` +- Preserving MinIO buckets as **backup mirrors**, not primary data stores + +--- + +## 🔧 Work Performed + +### ✅ Infrastructure changes: +- Deployed block storage volume to Linode Mastodon instance +- Mirrored MinIO buckets (`genesisassets`, `genesislibrary`, `azuracast`) to local paths +- Configured cron-based `mc mirror` jobs: + - Local ➜ MinIO: every 5 minutes with `--overwrite --remove` + - MinIO ➜ Local: nightly pull, no `--remove` + +### ✅ FTP Pipeline Adjustments: +- Users now upload to `/mnt/spl/ftp/uploads` (local) +- Permissions set so only admins access full `/mnt/spl/ftp` +- FTP directory structure created for SPL automation + +### ✅ System Tuning: +- Set `vm.swappiness=10` on all nodes +- Apache disabled where not in use +- Daily health checks via `pull_health_everywhere.sh` +- Krang Telegram alerts deployed for cleanup and system state + +--- + +## 🧠 Observations + +- **High load** on `minioraid5` during `mc mirror` and `chmod` overlap + - Load ~6.5 due to concurrent I/O pressure + - `chmod` stuck in `D` state (I/O wait) while `mc` dominated disk queues + - Resolved after `mc` completion — `chmod` resumed and completed + +- **MinIO buckets were temporarily inaccessible** due to permissions accidentally inherited by FTP group + - Resolved by recursively resetting permissions on `/mnt/raid5` + +- **Krang telemetry** verified: + - Mastodon swap usage rising under asset load + - All nodes had Apache disabled or dormant + - Health alerts triggered on high swap or load + +--- + +## ✅ Outcome + +- Full Genesis and AzuraCast data now reside locally with resilient S3 mirrors +- Mastodon running on block storage, no longer dependent on MinIO latency +- FTP integration with SPL directory trees complete +- Cleanup script successfully deployed across all nodes via Krang +- Daily health reports operational with alerts for high swap/load + +--- 
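For reference, the cron-based `mc mirror` jobs described under *Work Performed* look roughly like the sketch below. The alias, paths, and log files are illustrative assumptions — the production crontab may differ.

```bash
# (alias 'minio' and the /mnt/assets paths are example values)

# Local ➜ MinIO: push every 5 minutes, keeping the bucket an exact mirror of local storage
*/5 * * * * /usr/local/bin/mc mirror --overwrite --remove /mnt/assets/genesisassets minio/genesisassets >> /var/log/mc-mirror-push.log 2>&1

# MinIO ➜ Local: nightly pull at 02:00 with no --remove, so nothing local is ever deleted
0 2 * * * /usr/local/bin/mc mirror minio/genesisassets /mnt/assets/genesisassets >> /var/log/mc-mirror-pull.log 2>&1
```

The asymmetry is intentional: local storage is the source of truth, so the push may delete bucket objects to stay in lockstep, while the pull only refills local files and never removes them.

---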
+ +## 🔁 Recommendations + +- Consider adding snapshot-based ZFS backups for `/mnt/raid5` +- Build `verify_mirror.sh` to detect drift between MinIO and local storage +- Auto-trigger `chmod` only after `mc mirror` finishes +- Monitor long-running background jobs with Krang watchdogs + +--- + +**Signed,** +Doc +Genesis Hosting Technologies + diff --git a/genesishosting/provisioning/checklist.md b/genesishosting/provisioning/checklist.md new file mode 100644 index 0000000..13741d3 --- /dev/null +++ b/genesishosting/provisioning/checklist.md @@ -0,0 +1,23 @@ +# Provisioning Checklist + +This checklist is followed every time a new service is deployed. + +## Pre-Provisioning + +- [ ] Verify order and payment in WHMCS +- [ ] Confirm product mapping is correct +- [ ] Check available server resources + +## Provisioning + +- [ ] Trigger appropriate script/module +- [ ] Log provisioning result +- [ ] Assign DNS entries if applicable +- [ ] Generate Let’s Encrypt SSL if public-facing + +## Post-Provisioning + +- [ ] Send welcome email via WHMCS +- [ ] Confirm monitoring alert is active +- [ ] Test login credentials and endpoints +- [ ] Label service with client ID in Grafana/Prometheus diff --git a/genesishosting/provisioning/post-deploy-verification.md b/genesishosting/provisioning/post-deploy-verification.md new file mode 100644 index 0000000..d46f727 --- /dev/null +++ b/genesishosting/provisioning/post-deploy-verification.md @@ -0,0 +1,22 @@ +# Post-Deployment Verification + +All services go through a post-deploy QA check to ensure they're live and stable. + +## Verification Tasks + +- [ ] Service reachable from public IP or internal route +- [ ] DNS resolves correctly (for domains/subdomains) +- [ ] SSL certificate is active and trusted +- [ ] Admin login works as expected +- [ ] Usage quotas correctly applied (disk, users, bandwidth) + +## Monitoring + +- [ ] Add to Prometheus for service-specific metrics +- [ ] Set alert thresholds (e.g., disk > 80%) +- [ ] Confirm Telegram/Mastodon alert webhook is functional + +## Documentation + +- [ ] Log final status in WHMCS admin notes +- [ ] Store internal service details in `genesis-inventory.yaml` diff --git a/genesishosting/provisioning/whmcs-integration.md b/genesishosting/provisioning/whmcs-integration.md new file mode 100644 index 0000000..3f5663d --- /dev/null +++ b/genesishosting/provisioning/whmcs-integration.md @@ -0,0 +1,23 @@ +# WHMCS Integration + +WHMCS handles client billing, service provisioning, and support workflows. + +## Services Integrated + +| Service | Method | +|--------------|---------------------------------| +| DirectAdmin | Built-in WHMCS module | +| AzuraCast | Custom provisioning script | +| TeamTalk | API + XML user patching scripts | + +## Auto-Provisioning Steps + +1. Client signs up and completes payment +2. WHMCS triggers product-specific hook +3. Script/module provisions the service +4. Welcome email is sent with credentials + +## Logging & Troubleshooting + +- Logs stored at `/var/log/whmcs-hooks.log` +- Errors generate internal ticket automatically if provisioning fails diff --git a/genesishosting/security/incident-response.md b/genesishosting/security/incident-response.md new file mode 100644 index 0000000..29f7ce5 --- /dev/null +++ b/genesishosting/security/incident-response.md @@ -0,0 +1,25 @@ +# Incident Response Policy + +This document defines how we detect, respond to, and report security incidents. + +## Response Workflow + +1. Detection via monitoring, alert, or client report +2. 
Triage severity and affected systems +3. Contain and isolate threat (e.g., suspend access) +4. Notify stakeholders if client-impacting +5. Perform root cause analysis +6. Patch, re-secure, and document the event + +## Timelines + +- Initial triage: within 2 hours +- Client notification (if impacted): within 24 hours +- Final report delivered internally within 72 hours + +## Tools Used + +- Fail2Ban +- Genesis Shield alerting +- Zabbix/Prometheus incident flags +- Manual log reviews (forensic-level) diff --git a/genesishosting/security/logging-monitoring.md b/genesishosting/security/logging-monitoring.md new file mode 100644 index 0000000..c305627 --- /dev/null +++ b/genesishosting/security/logging-monitoring.md @@ -0,0 +1,24 @@ +# Logging & Monitoring Policy + +We collect and monitor system activity to detect threats, enforce accountability, and assist in incident resolution. + +## Log Types + +- SSH login attempts +- WHMCS access logs +- AzuraCast and TeamTalk server logs +- PostgreSQL query and connection logs +- Fail2Ban logs (ban/unban events) + +## Monitoring Tools + +- Prometheus for metrics +- Grafana dashboards for visual alerts +- Genesis Shield (Telegram + Mastodon alerting) +- Manual log review every 7 days + +## Retention + +- General logs: 30 days +- Security-related logs: 90 days minimum +- Logs archived to encrypted ZFS volume diff --git a/genesishosting/security/security-encryption-standards.md b/genesishosting/security/security-encryption-standards.md new file mode 100644 index 0000000..6d9139c --- /dev/null +++ b/genesishosting/security/security-encryption-standards.md @@ -0,0 +1,23 @@ +# Encryption Standards + +Encryption is applied to all data in transit and at rest across Genesis Hosting Technologies infrastructure. + +## In Transit + +- HTTPS via TLS 1.3 (minimum TLS 1.2 for legacy fallback) +- SFTP for all file transfers +- SSH for all administrative access +- rclone with TLS for object storage replication + +## At Rest + +- ZFS encryption on backup pools +- PostgreSQL encryption at the database or filesystem level +- WHMCS and DirectAdmin credentials hashed and salted +- Backups encrypted with AES-256 before remote transfer + +## Key Management + +- SSH keys rotated every 6 months +- Let's Encrypt certs auto-renew every 90 days +- Master encryption keys stored offline and version-controlled diff --git a/genesishosting/security/security-policy.md b/genesishosting/security/security-policy.md new file mode 100644 index 0000000..7ed282f --- /dev/null +++ b/genesishosting/security/security-policy.md @@ -0,0 +1,23 @@ +# Security Policy + +Genesis Hosting Technologies enforces strict security practices across all infrastructure and services to protect client data and maintain service integrity. 
+ +## Core Principles + +- Least privilege for all users and services +- Regular audits and patching +- Encrypted communication and storage +- Real-time monitoring and alerting + +## Enforcement Areas + +- 2FA required for all admin portals +- SSH access limited to key-based logins +- Centralized log collection and review +- All critical assets monitored via Genesis Shield + +## Review Cycle + +- Policies reviewed quarterly +- Logs retained for 30–90 days depending on system +- Incidents reviewed post-mortem with improvements logged diff --git a/genesishosting/services/azuracast-policy.md b/genesishosting/services/azuracast-policy.md new file mode 100644 index 0000000..3bf3fc4 --- /dev/null +++ b/genesishosting/services/azuracast-policy.md @@ -0,0 +1,32 @@ +# AzuraCast Streaming Policy + +## Features + +- Custom stream URLs (via relay or direct) +- Icecast or SHOUTcast available +- AutoDJ + scheduled playlists +- Web-based file upload + schedule + +## Plans & Limits + +| Plan | Storage | Listeners | Bitrate | +|----------|---------|-----------|---------| +| StreamLite | 2 GB | 25 | 128 kbps| +| StreamPro | 10 GB | 100 | 192 kbps| +| StreamMax | 50 GB | 250 | 320 kbps| + +## Fair Usage Policy + +- No nonstop streaming of static loops to inflate uptime +- Long-form live shows should rotate metadata periodically +- Content must not violate copyright laws + +## Backups + +- Daily backups of config + playlists +- Client media backup is optional (paid add-on) + +## Support + +- Stream diagnostics available in client panel +- WHMCS ticket support for outages or playlist issues diff --git a/genesishosting/services/directadmin-policy.md b/genesishosting/services/directadmin-policy.md new file mode 100644 index 0000000..7d238b2 --- /dev/null +++ b/genesishosting/services/directadmin-policy.md @@ -0,0 +1,27 @@ +# DirectAdmin Hosting Policy + +## Features + +- FTP, webmail, MySQL, file manager, and site statistics +- Optional Let's Encrypt SSL enabled by default +- Nightly site + database backups (7-day retention) + +## Plans & Limits + +| Plan | Disk | Bandwidth | Domains | Email Accounts | +|------------|------|-----------|---------|----------------| +| Starter | 5 GB | 100 GB | 1 | 5 | +| Standard | 20 GB| 500 GB | 5 | 25 | +| Unlimited | 100 GB| ∞ | ∞ | ∞ | + +## Abuse Prevention + +- Email rate limits applied to prevent outbound spam +- CPU usage and inode caps enforced +- Suspicious files scanned automatically + +## Support + +- Available via WHMCS ticket system +- Response within 12 business hours + diff --git a/genesishosting/services/domain-management-policy.md b/genesishosting/services/domain-management-policy.md new file mode 100644 index 0000000..088a009 --- /dev/null +++ b/genesishosting/services/domain-management-policy.md @@ -0,0 +1,22 @@ +# Domain Management Policy + +## Registration + +- Domains registered through our WHMCS interface are managed via third-party registrar API +- Registration typically completes within 5 minutes +- WHOIS privacy included by default (where available) + +## Renewals + +- Auto-renew is enabled by default +- Reminders sent 30, 7, and 1 day before expiration + +## Transfers + +- Domains can be transferred in or out with EPP code +- Support required if domain is locked or expired + +## DNS + +- Free DNS hosting included +- Custom DNS records managed through DirectAdmin or WHMCS panel diff --git a/genesishosting/services/ssl-certs.md b/genesishosting/services/ssl-certs.md new file mode 100644 index 0000000..f6e275c --- /dev/null +++ 
b/genesishosting/services/ssl-certs.md @@ -0,0 +1,23 @@ +# SSL Certificate Policy + +## Free Certificates + +- Let’s Encrypt certificates issued automatically +- Applies to DirectAdmin, AzuraCast, and custom subdomains +- Auto-renews every 60 days with 30-day buffer + +## Premium SSL + +- Custom SSL certs (e.g., EV/OV) available for purchase +- Requires manual install via WHMCS ticket + +## Certificate Management + +- Certbot used for automation +- Custom certs must be supplied in `.crt` + `.key` format +- Broken SSL installs may be reverted to Let’s Encrypt fallback + +## Support + +- Certificate issues resolved within 24h of report +- DNS challenges supported for wildcard certs diff --git a/genesishosting/services/teamtalk-policy.md b/genesishosting/services/teamtalk-policy.md new file mode 100644 index 0000000..4d2e3ff --- /dev/null +++ b/genesishosting/services/teamtalk-policy.md @@ -0,0 +1,26 @@ +# TeamTalk Hosting Policy + +## Features + +- Private and public servers +- Voice chat, file sharing, push-to-talk +- Admin access with room/channel management + +## Plans & Limits + +| Plan | Users | Bitrate Limit | Admin Access | +|--------------|-------|---------------|--------------| +| Basic Chat | 10 | 64 kbps | Yes | +| Pro Voice | 50 | 128 kbps | Yes | +| Broadcast+ | 100 | 256 kbps | Yes | + +## Rules + +- No harassment, spamming, or automated bots without permission +- Abuse may result in temp suspension or permanent ban +- Admins are responsible for moderating their own servers + +## Configuration + +- Clients may request config changes via WHMCS ticket +- Backups of XML configs stored nightly diff --git a/genesishosting/sla.md b/genesishosting/sla.md new file mode 100644 index 0000000..fc6e505 --- /dev/null +++ b/genesishosting/sla.md @@ -0,0 +1,51 @@ +# Genesis Admin SLA Support Policy + +For clients subscribed to the **Genesis Admin SLA Support** tier ($10/month), we offer elevated support priority and limited hands-on assistance beyond standard infrastructure management. + +--- + +## ✅ What’s Included + +| Feature | Description | +|-----------------------------|-----------------------------------------------------------------------------| +| **Priority Support** | Your issues are handled ahead of non-SLA users | +| **Response Window** | We aim to respond within 2–4 hours during SLA hours | +| **Basic Troubleshooting** | Includes disk usage, load issues, stuck processes, reboot help | +| **Snapshot Assist** | We'll manually snapshot your instance on request before major changes | +| **Log Check & App Restart** | We’ll review system logs and restart your services when requested | + +--- + +## 🕒 SLA Hours + +> **Monday–Saturday: 10:00 AM – 10:00 PM ET** +> Responses outside this window are best effort — not guaranteed. + +--- + +## ❌ What’s Not Included + +- Deep debugging of custom app code (e.g., your Laravel, Node, or Python stack) +- Managing third-party services not provisioned by Genesis +- 24/7 paging or phone support (unless you upgrade to Enterprise tier — coming soon) +- SLA outside the hours above + +--- + +## 📬 How to Get Help + +1. Open a support ticket or send a Telegram message to our admin channel +2. Include your VPS label and a brief description of the issue +3. We’ll confirm receipt and begin work within SLA windows + +--- + +## 🧠 Summary + +> SLA support gives you confidence that a real sysadmin will prioritize your issue — quickly, reliably, and with care. + +For more details, email `support@sshjunkie.com` or open a Genesis Hosting support request. 
+ +--- + +Generated by `genesisctl sla-policy` diff --git a/incident_response.md b/incident_response.md new file mode 100644 index 0000000..412f671 --- /dev/null +++ b/incident_response.md @@ -0,0 +1,128 @@ +# ⚠️ Incident Response Checklists for Common Failures + +These checklists are designed to normalize responses and reduce stress during downtime in your infrastructure. + +--- + +## 🔌 Node Reboot or Power Loss + +- [ ] Verify ZFS pools are imported: `zpool status` +- [ ] Check all ZFS mounts: `mount | grep /mnt` +- [ ] Confirm Proxmox VM auto-start behavior +- [ ] Validate system services: PostgreSQL, Mastodon, MinIO, etc. +- [ ] Run `genesis-tools/healthcheck.sh` or equivalent + +--- + +## 🐘 PostgreSQL Database Failure + +- [ ] Ping cluster VIP +- [ ] Check replication lag: `pg_stat_replication` +- [ ] Inspect ClusterControl / Patroni node status +- [ ] Verify HAProxy is routing to correct primary +- [ ] If failover occurred, verify application connections + +--- + +## 🌐 Network Drop or Routing Issue + +- [ ] Check interface status: `ip a`, `nmcli` +- [ ] Ping gateway and internal/external hosts +- [ ] Test inter-VM connectivity +- [ ] Inspect HAProxy or Keepalived logs for failover triggers +- [ ] Validate DNS and NTP services are accessible + +--- + +## 📦 Object Storage Outage (MinIO / rclone) + +- [ ] Confirm rclone mounts: `mount | grep rclone` +- [ ] View VFS cache stats: `rclone rc vfs/stats` +- [ ] Verify MinIO service and disk health +- [ ] Check cache disk space: `df -h` +- [ ] Restart rclone mounts if needed + +--- + +## 🧠 Split Brain in PostgreSQL Cluster (ClusterControl) + +### Symptoms: +- Two nodes think they're primary +- WAL timelines diverge +- Errors in ClusterControl, or inconsistent data in apps + +### Immediate Actions: +- [ ] Use `pg_controldata` to verify state and timeline on both nodes +- [ ] Temporarily pause failover automation +- [ ] Identify true primary (most recent WAL, longest uptime, etc.) 
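  One quick way to compare the two nodes (paths assume PostgreSQL 16 on a Debian/Ubuntu layout — adjust for your install):
  ```bash
  # Run on BOTH nodes — the true primary reports 'f' (not in recovery)
  sudo -u postgres psql -tAc "SELECT pg_is_in_recovery();"

  # Compare cluster state and latest timeline (binary/data paths are examples — adjust to your version)
  sudo -u postgres /usr/lib/postgresql/16/bin/pg_controldata /var/lib/postgresql/16/main | grep -Ei "cluster state|timeline"
  ```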
+- [ ] Stop false primary immediately: `systemctl stop postgresql` + +### Fix the Broken Replica: +- [ ] Rebuild broken node: + ```bash + pg_basebackup -h <true-primary> -D /var/lib/postgresql/XX/main -U replication -P --wal-method=stream + ``` +- [ ] Restart replication and confirm sync + +### Post-Mortem: +- [ ] Audit any split writes for data integrity +- [ ] Review Keepalived/HAProxy fencing logic +- [ ] Add dual-primary alerts with `pg_is_in_recovery()` checks +- [ ] Document findings and update HA policies + +--- + +## 🐘 PostgreSQL Replication Lag / Sync Delay + +- [ ] Query replication status: + ```sql + SELECT client_addr, state, sync_state, sent_lsn, write_lsn, flush_lsn, replay_lsn FROM pg_stat_replication; + ``` +- [ ] Compare LSNs for lag +- [ ] Check for disk I/O, CPU, or network bottlenecks +- [ ] Ensure WAL retention and streaming are healthy +- [ ] Restart replica or sync service if needed + +--- + +## 🪦 MinIO Bucket Inaccessibility or Failure + +- [ ] Run `mc admin info local` to check node status +- [ ] Confirm MinIO access credentials/environment +- [ ] Check rclone and MinIO logs +- [ ] Restart MinIO service: `systemctl restart minio` +- [ ] Check storage backend health/mounts + +--- + +## 🐳 Dockerized Service Crash (e.g., AzuraCast) + +- [ ] Inspect containers: `docker ps -a` +- [ ] View logs: `docker logs <container>` +- [ ] Check disk space: `df -h` +- [ ] Restart with Docker or Compose: + ```bash + docker restart <container> + docker-compose down && docker-compose up -d + ``` + +--- + +## 🔒 Fail2Ban or Genesis Shield Alert Triggered + +- [ ] Tail logs: + ```bash + journalctl -u fail2ban + tail -f /var/log/fail2ban.log + ``` +- [ ] Inspect logs for false positives +- [ ] Unban IP if needed: + ```bash + fail2ban-client set <jail> unbanip <ip> + ``` +- [ ] Notify via Mastodon/Telegram alert system +- [ ] Tune jail thresholds or IP exemptions + +--- + +> ✅ Store these in a Gitea wiki or `/root/checklists/` for quick access under pressure. diff --git a/postmortems/genesisdrivefail.md b/postmortems/genesisdrivefail.md new file mode 100644 index 0000000..6a8c421 --- /dev/null +++ b/postmortems/genesisdrivefail.md @@ -0,0 +1,76 @@ +# Postmortem: SPL Media Disk Incident and Disaster Recovery Drill + +**Date:** [05/19/2025] +**Author:** Doc (Genesis Hosting) +* +--- + +## Summary + +On [05/19/2025], while attempting to remove a deprecated RAID5 drive from the SPL Windows host, the incorrect disk was accidentally detached. This disk contained the live SPL media volume. Due to Windows' handling of dynamic disks, the volume was marked as "Failed" and inaccessible, triggering an immediate DR response. + +Despite the unintentional nature of the incident, it served as a live test of Genesis Hosting's SPL disaster recovery process. The full restore was completed successfully in under an hour using tarball-based SCP transfer from Shredder, validating both the local snapshot source and DR scripting approach. 
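For illustration, the tarball-over-SSH transfer referred to above looks roughly like this — the hostnames, dataset path, and destination drive are assumptions, not the exact commands used in the drill:

```bash
# NOTE: hostnames and paths below are illustrative examples

# On Shredder (source): package the SPL media, skipping the visible .zfs snapshot directory
tar czf /tmp/spl-media.tar.gz --exclude='.zfs' -C /mnt/nexus/spl .

# On the SPL Windows host (recent Windows builds include both scp and tar):
#   scp shredder:/tmp/spl-media.tar.gz C:\restore\
#   tar -xzf C:\restore\spl-media.tar.gz -C R:\
```

Packing one archive and streaming it avoids the per-file overhead that makes a plain recursive SCP slow on large libraries, which is one reason the restore could finish in under an hour.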
+ +--- + +## Timeline + +- **T-0 (Start):** Attempt made to remove deprecated RAID5 disk +- **T+0:** Incorrect disk unplugged (live SPL media) +- **T+2m:** Disk appears in Windows as "Missing/Failed" +- **T+5m:** SCP-based restore initiated from Shredder +- **T+10m:** `.zfs` snapshot artifact detected and ignored +- **T+15m:** Decision made to continue full tarball-based SCP restore +- **T+58m:** Restore completed to `R:\` and SPL resumed functionality + +--- + +## Impact + +- SPL station was temporarily offline (estimated downtime < 1 hour) +- No data was lost +- No external users were affected due to off-air timing + +--- + +## Root Cause + +Human error during manual drive removal in a mixed-disk environment where Windows showed multiple 5TB drives. + +--- + +## Resolution + +- Restore initiated from validated ZFS source (Shredder) +- SCP-based tarball transfer completed +- Permissions and structure preserved +- SPL fully restored to operational state + +--- + +## Lessons Learned + +1. Windows dynamic disks are fragile and easily corrupted by hot-unplug events +2. SCP is reliable but not optimal for large restores +3. `.zfs` snapshot visibility can interfere with SCP unless explicitly excluded +4. Tarball-based transfers dramatically reduce restore time +5. Disaster recovery scripts should log and time every phase + +--- + +## Action Items + +- [x] Set up secondary disk on SPL host for test restores +- [x] Begin alternating restore tests from Shredder and Linode Object Storage +- [x] Convert restore flow to tarball-based for faster execution +- [ ] Formalize `genesisctl drill` command for DR testing +- [ ] Add timed logging to all DR scripts +- [ ] Expand approach to AzuraCast and Mastodon (in progress) + +--- + +## Conclusion + +While the incident began as a misstep, it evolved into a high-value test of Genesis Hosting's disaster recovery capabilities. The successful, timely restore validated the core backup architecture and highlighted key improvements to be made in automation, speed, and DR testing processes moving forward. + +This will serve as Drill #1 in the GenesisOps DR series, codename: **Sterling Forest**. diff --git a/procedures/GROWL.md b/procedures/GROWL.md new file mode 100644 index 0000000..119682d --- /dev/null +++ b/procedures/GROWL.md @@ -0,0 +1,111 @@ +# GROWL — Genesis Radio Commit Style Guide + +--- + +## 🛡️ Purpose + +To keep our Git commit history **clean, calm, and clear** — +even during chaos, downtime, or tired late-night edits. + +Every commit should **GROWL**: + +| Letter | Meaning | +|:---|:---| +| **G** | Good | +| **R** | Readable | +| **O** | Obvious | +| **W** | Well-Scoped | +| **L** | Logical | + +--- + +## 🧠 GROWL Principles + +### **G — Good** + +Write clear, helpful commit messages. +Imagine your future self — tired, panicked — trying to understand what you did. + +**Bad:** +`update` + +**Good:** +`Fix retry logic for mount guardian script` + +--- + +### **R — Readable** + +Use short, plain English sentences. +No cryptic shorthand. No weird abbreviations. + +**Bad:** +`fx psh scrpt` + +**Good:** +`Fix powershell script argument passing error` + +--- + +### **O — Obvious** + +The commit message should explain what changed without needing a diff. + +**Bad:** +`misc` + +**Good:** +`Add dark mode CSS to healthcheck dashboard` + +--- + +### **W — Well-Scoped** + +One logical change per commit. +Don't fix five things at once unless they're tightly related. 
+ +**Bad:** +`fix mount issues, added healthcheck, tweaked retry` + +**Good:** +`Fix asset mount detection timing issue` + +(And then a separate commit for healthcheck tweaks.) + +--- + +### **L — Logical** + +Commits should build logically. +Each one should bring the repo to a **better, deployable state** — not leave it broken. + +**Bad:** +Commit partial broken code just because "I need to leave soon." + +**Good:** +Finish a working block, then commit. + +--- + +## 📋 Quick GROWL Checklist Before You Push: + +- [ ] Is my message clear to a stranger? +- [ ] Did I only change one logical thing? +- [ ] Can I tell from the commit what changed, without a diff? +- [ ] Would sleepy me at 3AM thank me for writing this? + +--- + +## 🎙️ Why We GROWL + +Because panic, fatigue, or adrenaline can't be avoided — +but **good habits under pressure can save a system** (and a future you) every time. + +Stay calm. +Make it obvious. +Let it GROWL. + +--- + +# 🐺 Genesis Radio Operations +*Built with pride. Built to last.* diff --git a/procedures/OPS.md b/procedures/OPS.md new file mode 100644 index 0000000..63f0e28 --- /dev/null +++ b/procedures/OPS.md @@ -0,0 +1,154 @@ +# 🚀 Genesis Radio - Healthcheck Response Runbook + +## Purpose +When an alert fires (Critical or Warning), this guide tells you what to do so that **any team member** can react quickly, even if the admin is not available. + +--- + +## 🛠️ How to Use +- Every Mastodon DM or Dashboard alert gives you a **timestamp**, **server name**, and **issue**. +- Look up the type of issue in the table below. +- Follow the recommended action immediately. + +--- + +## 📋 Quick Response Table + +| Type of Alert | Emoji | What it Means | Immediate Action | +|:---|:---|:---|:---| +| [Critical Service Failure](#critical-service-failure-) | 🔚 | A key service (like Mastodon, MinIO) is **down** | SSH into the server, try `systemctl restart <service>`. | A key service (like Mastodon, MinIO) is **down** | SSH into the server, try `systemctl restart <service>`. | +| [Disk Filling Up](#disk-filling-up-) | 📈 | Disk space critically low (under 10%) | SSH in and delete old logs/backups. Free up space **immediately**. | Disk space critically low (under 10%) | SSH in and delete old logs/backups. Free up space **immediately**. | +| [Rclone Mount Error](#rclone-mount-error-) | 🐢 | Cache failed, mount not healthy | Restart the rclone mount process. (Usually a `systemctl restart rclone@<mount>`, or remount manually.) | Cache failed, mount not healthy | Restart the rclone mount process. (Usually a `systemctl restart rclone@<mount>`, or remount manually.) | +| [PostgreSQL Replication Lag](#postgresql-replication-lag-) | 💥 | Database replicas are falling behind | Check database health. Restart replication if needed. Alert admin if lag is >5 minutes. | Database replicas are falling behind | Check database health. Restart replication if needed. Alert admin if lag is >5 minutes. | +| [RAID Degraded](#raid-degraded-) | 🧸 | RAID array is degraded (missing a disk) | Open server console. Identify failed drive. Replace drive if possible. Otherwise escalate immediately. | RAID array is degraded (missing a disk) | Open server console. Identify failed drive. Replace drive if possible. Otherwise escalate immediately. | +| [Log File Warnings](#log-file-warnings-) | ⚠️ | Error patterns found in logs | Investigate. If system is healthy, **log it for later**. If errors worsen, escalate. | Error patterns found in logs | Investigate. If system is healthy, **log it for later**. 
If errors worsen, escalate. | + +--- + +## 💻 If Dashboard Shows +- ✅ **All Green** = No action needed. +- ⚠️ **Warnings** = Investigate soon. Not urgent unless repeated. +- 🚨 **Criticals** = Drop everything and act immediately. + +--- + +## 🛡️ Emergency Contacts +| Role | Name | Contact | +|:----|:-----|:--------| +| Primary Admin | (You) | [845-453-0820] | +| Secondary | Brice | [BRICE CONTACT INFO] | + +(Replace placeholders with actual contact details.) + +--- + +## ✍️ Example Cheat Sheet for Brice + +**Sample Mastodon DM:** +> 🚨 Genesis Radio Critical Healthcheck 2025-04-28 14:22:33 🚨 +> ⚡ 1 critical issue found: +> - 🔚 [mastodon] CRITICAL: Service mastodon-web not running! + +**Brice should:** +1. SSH into Mastodon server. +2. Run `systemctl restart mastodon-web`. +3. Confirm the service is running again. +4. If it fails or stays down, escalate to admin. + +--- + +# 🌟 TL;DR +- 🚨 Criticals: Act immediately. +- ⚠️ Warnings: Investigate soon. +- ✅ Healthy: No action needed. + +--- + +# 🛠️ Genesis Radio - Detailed Ops Playbook + +## Critical Service Failure (🔚) +**Symptoms:** Service marked as CRITICAL. + +**Fix:** +1. SSH into server. +2. `sudo systemctl status <service>` +3. `sudo systemctl restart <service>` +4. Confirm running. Check logs if it fails. + +--- + +## Disk Filling Up (📈) +**Symptoms:** Disk space critically low. + +**Fix:** +1. SSH into server. +2. `df -h` +3. Delete old logs: + ```bash + sudo rm -rf /var/log/*.gz /var/log/*.[0-9] + sudo journalctl --vacuum-time=2d + ``` +4. If still low, find big files and clean. + +--- + +## Rclone Mount Error (🐢) +**Symptoms:** Mount failure or slowness. + +**Fix:** +1. SSH into SPL server. +2. Unmount & remount: + ```bash + sudo fusermount -uz /path/to/mount + sudo systemctl restart rclone@<mount> + ``` +3. Confirm mount is active. + +--- + +## PostgreSQL Replication Lag (💥) +**Symptoms:** Replica database lagging. + +**Fix:** +1. SSH into replica server. +2. Check lag: + ```bash + sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;" + ``` +3. Restart PostgreSQL if stuck. +4. Monitor replication logs. + +--- + +## RAID Degraded (🧸) +**Symptoms:** RAID missing a disk. + +**Fix:** +1. SSH into server. +2. `cat /proc/mdstat` +3. Find failed drive: + ```bash + sudo mdadm --detail /dev/md0 + ``` +4. Replace failed disk, rebuild array: + ```bash + sudo mdadm --add /dev/md0 /dev/sdX + ``` + +--- + +## Log File Warnings (⚠️) +**Symptoms:** Errors in syslog or nginx. + +**Fix:** +1. SSH into server. +2. Review logs: + ```bash + grep ERROR /var/log/syslog + ``` +3. Investigate. Escalate if necessary. + +--- + +**Stay sharp. Early fixes prevent major downtime!** 🛡️💪 + diff --git a/procedures/buildandbpack.sh b/procedures/buildandbpack.sh new file mode 100755 index 0000000..cdc3564 --- /dev/null +++ b/procedures/buildandbpack.sh @@ -0,0 +1,59 @@ +#!/usr/bin/env bash +set -euo pipefail + +# 🔧 CONFIGURATION +FREEBSD_BRANCH="stable/14" +KERNCONF="STORAGE_ZFS" +MAKEJOBS=$(nproc) +BUILDROOT="$HOME/freebsd-kernel-build" +OBJDIR="/tmp/obj" +TOOLCHAIN_BIN="/tmp/amd64.amd64/usr/bin" + +# 🌱 Step 1: Prep Environment +mkdir -p "$BUILDROOT" +cd "$BUILDROOT" + +# 🔻 Step 2: Get FreeBSD source +if [ ! -d "src" ]; then + git clone https://git.freebsd.org/src.git + cd src + git checkout "$FREEBSD_BRANCH" +else + cd src + git fetch + git checkout "$FREEBSD_BRANCH" + git pull +fi + +# 🛠️ Step 3: Build FreeBSD toolchain (only once) +if [ ! -d "$TOOLCHAIN_BIN" ]; then + echo "[*] Bootstrapping FreeBSD native-xtools..." 
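    # Assumption: `bmake native-xtools` builds the FreeBSD toolchain from the checked-out
    # src tree; it is by far the slowest step, which is why it is skipped once
    # $TOOLCHAIN_BIN exists. If your build drops the toolchain somewhere other than
    # /tmp/amd64.amd64/usr/bin, adjust TOOLCHAIN_BIN at the top of this script.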
+ bmake XDEV=amd64 XDEV_ARCH=amd64 native-xtools +else + echo "[*] Toolchain already built. Skipping..." +fi + +# 🔁 Step 4: Prepare kernel config +cd "$BUILDROOT/src/sys/amd64/conf" +if [ ! -f "$KERNCONF" ]; then + cp GENERIC "$KERNCONF" + echo "[*] Created new kernel config from GENERIC: $KERNCONF" +fi + +# 🧠 Step 5: Build the kernel +export PATH="$TOOLCHAIN_BIN:$PATH" +export MAKEOBJDIRPREFIX="$OBJDIR" + +cd "$BUILDROOT/src" +bmake -j"$MAKEJOBS" buildkernel TARGET=amd64 TARGET_ARCH=amd64 KERNCONF="$KERNCONF" + +# 📦 Step 6: Package the kernel +KERNEL_OUT="$OBJDIR/$BUILDROOT/src/amd64.amd64/sys/$KERNCONF" +PACKAGE_NAME="freebsd-kernel-$(date +%Y%m%d-%H%M%S).tar.gz" + +tar czf "$BUILDROOT/$PACKAGE_NAME" -C "$KERNEL_OUT" kernel + +# 📣 Done +echo "✅ Kernel build and package complete." +echo "➡️ Output: $BUILDROOT/$PACKAGE_NAME" + diff --git a/procedures/databasecluster.md b/procedures/databasecluster.md new file mode 100644 index 0000000..1c26165 --- /dev/null +++ b/procedures/databasecluster.md @@ -0,0 +1,87 @@ +# Database Cluster (baboon.sshjunkie.com) + +## Overview +The database cluster consists of two PostgreSQL database servers hosted on `baboon.sshjunkie.com`. These servers are used to store data for services such as Mastodon and AzuraCast. The cluster ensures high availability and fault tolerance through replication and backup strategies. + +## Installation +Install PostgreSQL on both nodes in the cluster: + +```bash +# Update package list and install PostgreSQL +sudo apt update +sudo apt install -y postgresql postgresql-contrib + +# Ensure PostgreSQL is running +sudo systemctl start postgresql +sudo systemctl enable postgresql +``` + +## Configuration +### PostgreSQL Configuration Files: +- **pg_hba.conf**: + - Allow replication and local connections. + - Example: + ```ini + local all postgres md5 + host replication all 192.168.0.0/16 md5 + ``` +- **postgresql.conf**: + - Set `wal_level` for replication: + ```ini + wal_level = hot_standby + max_wal_senders = 3 + ``` + +### Replication Configuration: +- Set up streaming replication between the two nodes (`baboon.sshjunkie.com` as the master and the second node as the replica). + +1. On the master node, enable replication and restart PostgreSQL. +2. On the replica node, set up replication by copying the data directory from the master node and configure the `recovery.conf` file. + +Example `recovery.conf` on the replica: +```ini +standby_mode = on +primary_conninfo = 'host=baboon.sshjunkie.com port=5432 user=replicator password=your_password' +trigger_file = '/tmp/postgresql.trigger.5432' +``` + +## Usage +- **Check the status of PostgreSQL**: + ```bash + sudo systemctl status postgresql + ``` + +- **Promote the replica to master**: + ```bash + pg_ctl promote -D /var/lib/postgresql/data + ``` + +## Backups +Use `pg_basebackup` to create full backups of the cluster. Example: + +```bash +pg_basebackup -h baboon.sshjunkie.com -U replicator -D /backups/db_backup -Ft -z -P +``` + +Automate backups with cronjobs for regular snapshots. + +## Troubleshooting +- **Issue**: Replica is lagging behind. + - **Solution**: Check network connectivity and ensure the replica is able to connect to the master node. 
Monitor replication lag with: + ```bash + SELECT * FROM pg_stat_replication; + ``` + +## Monitoring +- **Monitor replication status**: + ```bash + SELECT * FROM pg_stat_replication; + ``` + +- **Monitor database health**: + ```bash + pg_isready + ``` + +## Additional Information +- [PostgreSQL Streaming Replication Documentation](https://www.postgresql.org/docs/current/warm-standby.html) diff --git a/procedures/decom.md b/procedures/decom.md new file mode 100644 index 0000000..525e295 --- /dev/null +++ b/procedures/decom.md @@ -0,0 +1,57 @@ +# 🗑️ Decommissioning Checklist for `shredderv1` + +**Date:** 2025-05-01 + +--- + +## 🔐 1. Verify Nothing Critical Is Running +- [ ] Confirm all services (e.g., AzuraCast, Docker containers, media playback) have **been migrated** +- [ ] Double-check DNS entries (e.g., CNAMEs or A records) have been **updated to the new server** +- [ ] Ensure any **active mounts, Rclone remotes, or scheduled tasks** are disabled + +--- + +## 📦 2. Migrate/Preserve Data +- [ ] Backup and copy remaining relevant files (station configs, logs, recordings, playlists) +- [ ] Verify data was successfully migrated to the new ZFS-based AzuraCast VM +- [ ] Remove temporary backup files and export archives + +--- + +## 🧹 3. Remove from Infrastructure +- [ ] Remove from monitoring tools (e.g., Prometheus, Nagios, Grafana) +- [ ] Remove from Ansible inventory or configuration management systems +- [ ] Remove any scheduled crons or automation hooks targeting this VM + +--- + +## 🔧 4. Disable and Secure +- [ ] Power down services (`docker stop`, `systemctl disable`, etc.) +- [ ] Disable remote access (e.g., SSH keys, user accounts) +- [ ] Lock or archive internal credentials (e.g., API tokens, DB creds, rclone configs) + +--- + +## 🧽 5. Wipe or Reclaim Resources +- [ ] If VM: Delete or archive VM snapshot in Proxmox or hypervisor +- [ ] If physical: Securely wipe disks (e.g., `shred`, `blkdiscard`, or DBAN) +- [ ] Reclaim IP address (e.g., assign to new ZFS-based VM) + +--- + +## 📜 6. Documentation & Closure +- [ ] Log the decommission date in your infrastructure inventory or documentation +- [ ] Tag any previous support tickets/issues as “Resolved (Decommissioned)” +- [ ] Inform team members that `shredderv1` has been retired + +--- + +## 🚫 Final Step +```bash +shutdown -h now +``` + +Or if you're feeling dramatic: +```bash +echo "Goodnight, sweet prince." && shutdown -h now +``` diff --git a/procedures/genesis_uptime_monitor.md b/procedures/genesis_uptime_monitor.md new file mode 100644 index 0000000..6505f06 --- /dev/null +++ b/procedures/genesis_uptime_monitor.md @@ -0,0 +1,57 @@ +# Genesis Uptime Monitor + +This package sets up a simple service uptime tracker on your local server (e.g., Krang). It includes: + +- A Python Flask API to report 24-hour uptime +- A bash script to log uptime results every 5 minutes +- A systemd unit to keep the API running + +## Setup Instructions + +### 1. Install Requirements + +```bash +sudo apt install python3-venv curl +cd ~ +python3 -m venv genesis_api +source genesis_api/bin/activate +pip install flask +``` + +### 2. Place Files + +- `uptime_server.py` → `/home/doc/uptime_server.py` +- `genesis_check.sh` → `/usr/local/bin/genesis_check.sh` (make it executable) +- `genesis_uptime_api.service` → `/etc/systemd/system/genesis_uptime_api.service` + +### 3. Enable Cron + +Edit your crontab with `crontab -e` and add: + +```cron +*/5 * * * * /usr/local/bin/genesis_check.sh +``` + +### 4. 
Start API Service + +```bash +sudo systemctl daemon-reload +sudo systemctl enable --now genesis_uptime_api +``` + +Then browse to `http://localhost:5000/api/uptime/radio` + +## Web Integration + +In your HTML, add a div and script like this: + +```html +<div id="radioUptime"><small>Uptime: Loading…</small></div> +<script> +fetch('/api/uptime/radio') + .then(r => r.json()) + .then(data => { + document.getElementById('radioUptime').innerHTML = `<small>Uptime: ${data.uptime}% (24h)</small>`; + }); +</script> +``` diff --git a/procedures/infrastructure.md b/procedures/infrastructure.md new file mode 100644 index 0000000..65c8eb8 --- /dev/null +++ b/procedures/infrastructure.md @@ -0,0 +1,86 @@ +# 📊 Genesis Radio Infrastructure Overview +**Date:** April 30, 2025 +**Prepared by:** Doc + +--- + +## 🏗️ Infrastructure Summary + +Genesis Radio now operates a fully segmented, secure, and performance-tuned backend suitable for enterprise-grade broadcasting and media delivery. The infrastructure supports high availability (HA) principles for storage and platform independence for core services. + +--- + +## 🧱 Core Components + +### 🎙️ Genesis Radio Services +- **StationPlaylist (SPL)**: Windows-based automation system, mounts secure object storage as drives via rclone +- **Voice Tracker (Remote Access)**: Synced with SPL backend and available to authorized remote users +- **Azuracast (Secondary automation)**: Dockerized platform running on dedicated VM +- **Mastodon (Community)**: Hosted in Docker with separate PostgreSQL cluster and MinIO object storage + +--- + +## 💾 Storage Architecture + +| Feature | Status | +|-----------------------------|---------------------------| +| Primary Storage Backend | MinIO on `shredderv2` | +| Storage Filesystem | ZFS RAID-Z1 | +| Encryption | Enabled (per-bucket S3 SSE) | +| Buckets (Scoped) | `genesislibrary-secure`, `genesisassets-secure`, `genesisshows-secure`, `mastodonassets-secure` | +| Snapshot Capability | ✅ (ZFS native snapshots) | +| Caching | SSD-backed rclone VFS cache per mount | + +--- + +## 🛡️ Security & Access Control + +- TLS for all services (Let's Encrypt) +- MinIO Console behind HTTPS (`consolev2.sshjunkie.com`) +- User policies applied per-bucket (read/write scoped) +- Server-to-server rsync/rclone over SSH + +--- + +## 🔄 Backup & Recovery + +- Dedicated backup server with SSH access +- Nightly rsync for show archives and Mastodon data +- Snapshot replication via `zfs send | ssh backup zfs recv` planned +- Manual and automated snapshot tools + +--- + +## 🔍 Monitoring & Observability + +| Component | Status | Notes | +|------------------|--------------|------------------------------| +| System Monitoring| `vmstat`, `watch`, custom CLI tools | +| Log Aggregation | Centralized on pyapps VM | +| Prometheus | Partial (used with ClusterControl) | +| Alerts | Mastodon warning bot, Telegram planned | + +--- + +## 🚦 Current Migration Status + +| Component | Status | Notes | +|------------------|----------------|---------------------------------| +| Mastodon Assets | ✅ Migrated | Verified, encrypted, ZFS snapshotted | +| Genesis Library | ✅ Migrated | Synced from backup server | +| Genesis Assets | ✅ Migrated | Cleanup of shows in progress | +| Genesis Shows | ✅ Migrated | Pulled from same source, cleanup to follow | +| Azuracast | Migrated | Staged and restored from staging + +--- + +## 🧭 Next Steps + +- Clean up misplaced show files in assets bucket +- Automate ZFS snapshot replication +- Consider Grafana/Prometheus dashboard for real-time metrics +- 
Continue phasing out legacy containers (LXC → full VMs) + +--- + +This infrastructure is stable, secure, and built for scale. Further improvements will refine observability, automate recovery, and enhance multi-user coordination. diff --git a/procedures/map.md b/procedures/map.md new file mode 100644 index 0000000..3fd39a7 --- /dev/null +++ b/procedures/map.md @@ -0,0 +1,85 @@ +# Genesis Radio Internal Architecture Map + +--- + +## 🏢 Core Infrastructure + +| System | Purpose | Location | +|:---|:---|:---| +| Krang | Main admin server / script runner / monitoring node | On-premises / VM | +| SPL Server (Windows) | StationPlaylist Studio automation and playout system | On-premises / VM | +| Shredder | MinIO Object Storage / Cache server | On-premises / VM | +| PostgreSQL Cluster (db1/db2) | Mastodon database backend / Other app storage | Clustered VMs | +| Mastodon Server | Frontend social interface for alerts, community | Hosted at `chatwithus.live` | + +--- + +## 🧠 Automation Components + +| Component | Description | Hosted Where | +|:---|:---|:---| +| `mount_guardian.ps1` | Automatically ensures Rclone mounts (Q:\ and R:\) are up | SPL Server (Windows) | +| `rotate_mount_logs.ps1` | Weekly log rotation for mount logs | SPL Server (Windows) | +| `healthcheck.py` | Multi-node health and service monitor | Krang | +| Mastodon DM Alerts | Immediate alerting if something breaks (Mounts, Services) | Krang via API | +| Genesis Mission Control Landing Page | Web dashboard with Commandments + Live Healthcheck | Hosted on Krang's Nginx | + +--- + +## 🎙️ Storage and Streaming + +| Mount | Purpose | Backed by | +|:---|:---|:---| +| Q:\ (Assets) | Station IDs, sweepers, intro/outros, promos | GenesisAssets Bucket (Rclone) | +| R:\ (Library) | Full music library content | GenesisLibrary Bucket (Rclone) | + +✅ Primary Cache: `L:\` (SSD) +✅ Secondary Cache: `X:\` (Spinning HDD) + +--- + +## 📡 Communications + +| Alert Type | How Sent | +|:---|:---| +| Mount Failures | Direct Mastodon DM | +| Healthcheck Failures (Disk, Service, SMART, RAID) | Direct Mastodon DM | +| Git Push Auto-Retry Failures (optional future upgrade) | Potential Mastodon DM | + +--- + +## 📋 GitOps Flow + +| Step | Script | Behavior | +|:---|:---|:---| +| Save changes | giteapush.sh | Auto stage, commit (timestamped), push to Gitea | +| Retry failed push | giteapush.sh auto-retry block | Up to 3x tries with 5-second gaps | +| Repo status summary | giteapush.sh final step | Clean `git status -sb` displayed | + +✅ Follows GROWL commit style: +Good, Readable, Obvious, Well-Scoped, Logical. + +--- + +## 📜 Policies and Procedures + +| Document | Purpose | +|:---|:---| +| `OPS.md` | Healthcheck Runbook and Service Recovery Instructions | +| `GROWL.md` | Git Commit Message Style Guide | +| `Mission Control Landing Page` | Browser homepage with live dashboard + ops philosophy | + +--- + +## 🛡️ Key Principles + +- Calm is Contagious. +- Go Slow to Go Fast. +- Snappy Snaps Save Lives. +- Scripts are Smarter Than Sleepy Admins. +- If You Didn't Write It Down, It Didn't Happen. + +--- + +# 🎙️ Genesis Radio Ops +Built with pride. Built to last. 🛡️🚀 diff --git a/procedures/mastodon/mastodon-content-policy.md b/procedures/mastodon/mastodon-content-policy.md new file mode 100644 index 0000000..09bb359 --- /dev/null +++ b/procedures/mastodon/mastodon-content-policy.md @@ -0,0 +1,24 @@ +# Mastodon Content Policy + +Genesis Hosting Technologies supports a variety of voices on **chatwithus.live** — but not at the cost of safety or legality. 
+ +## Allowed Content + +- Personal posts, art, tech content, memes, news + + +## Prohibited Content + +- Hate speech or glorification of hate groups +- Violent extremism +- Sexual content involving minors (real or fictional) +- Cryptocurrency scams, pyramid schemes + +## Bots & Automation + +- Allowed only with prior approval +- Must include a descriptive profile and clear opt-out methods + +## Creative Commons / Attribution + +- Users posting CC-licensed or open-source content should include attribution where applicable diff --git a/procedures/mastodon/mastodon-maintenance-policy.md b/procedures/mastodon/mastodon-maintenance-policy.md new file mode 100644 index 0000000..7dc56c9 --- /dev/null +++ b/procedures/mastodon/mastodon-maintenance-policy.md @@ -0,0 +1,24 @@ +# Mastodon Maintenance Policy + +We adhere to structured maintenance windows for **chatwithus.live** to ensure reliability without disrupting users. + +## Weekly Maintenance + +- **Window**: Sundays, 7 PM – 9 PM Eastern Time +- Routine updates (OS, Docker images, dependencies) +- Asset rebuilds, minor database tune-ups + +## Emergency Maintenance + +- Patching vulnerabilities (e.g., CVEs) +- Redis/PostgreSQL crash recovery +- Federation or relay failures + +## Notifications + +- Posted to Mastodon via @administration at least 1 hour in advance +- Maintenance announcements also pushed to the server status page + +## Failures During Maintenance + +- If the instance does not recover within 30 minutes, full rollback initiated diff --git a/procedures/mastodon/mastodon-moderation-policy.md b/procedures/mastodon/mastodon-moderation-policy.md new file mode 100644 index 0000000..4ac13da --- /dev/null +++ b/procedures/mastodon/mastodon-moderation-policy.md @@ -0,0 +1,26 @@ +# Mastodon Moderation Policy + +Moderation is essential to protecting the health of **chatwithus.live**. + +## Enforcement + +- Reports reviewed by admin/mod team within 24 hours +- Immediate suspension for: + - Threats of violence + - Doxxing or credible harassment + - Hosting or linking CSAM, gore, or hate groups + +## Report Processing + +- All reports logged with timestamps and notes +- Outcomes recorded and reviewed monthly for fairness + +## Appeal Process + +- Users may appeal a moderation decision by opening a ticket via WHMCS +- Appeals are reviewed by at least two moderators + +## Transparency + +- Moderation decisions and defederation actions are optionally listed at `/about/more` +- Annual transparency reports summarize key moderation stats diff --git a/procedures/mastodon/mastodon-uptime-policy.md b/procedures/mastodon/mastodon-uptime-policy.md new file mode 100644 index 0000000..58fc5bf --- /dev/null +++ b/procedures/mastodon/mastodon-uptime-policy.md @@ -0,0 +1,22 @@ +# Mastodon Uptime Policy + +Genesis Hosting Technologies strives to maintain high availability for our Mastodon instance at **chatwithus.live**. + +## Availability Target + +- **Uptime Goal**: 99.5% monthly (approx. 
3.7 hours of downtime max)
+- We consider chatwithus.live "unavailable" when:
+  - The web UI fails to load or times out
+  - Toot delivery is delayed by >10 minutes
+  - Federation is broken for more than 30 minutes
+
+## Redundancy
+
+- PostgreSQL cluster with HA failover
+- Redis and Sidekiq monitored 24/7
+- Mastodon is backed by ZFS storage and hourly snapshots
+
+## Exceptions
+
+- Scheduled maintenance (see Maintenance Policy)
+- DDoS or external platform failures (e.g., relay outages)
diff --git a/procedures/mastodon/mastodon-user-policy.md b/procedures/mastodon/mastodon-user-policy.md new file mode 100644 index 0000000..139b53c --- /dev/null +++ b/procedures/mastodon/mastodon-user-policy.md @@ -0,0 +1,26 @@
+# Mastodon User Policy
+
+This document governs behavior on our Mastodon instance **chatwithus.live**.
+
+## Behavior Expectations
+
+- No harassment, hate speech, or targeted abuse
+- No spam, bots, or auto-posting without permission
+- No doxxing or sharing of private information
+
+## Federation
+
+- Users may not interact with defederated instances through this server
+- Federation decisions are made by the moderation team
+
+## Account Management
+
+- Inactive accounts with 0 posts may be purged after 90 days
+- Users must keep a valid email address on file
+- Multiple accounts are allowed, but abuse may result in bans
+
+## Banned Activities
+
+- Disruptive scraping or crawling of the API
+- Hosting or linking to malware/phishing content
+- Evading moderation decisions with alternate accounts
diff --git a/procedures/planned_db_cluster_ZFS.md b/procedures/planned_db_cluster_ZFS.md new file mode 100644 index 0000000..63be9d6 --- /dev/null +++ b/procedures/planned_db_cluster_ZFS.md @@ -0,0 +1,34 @@
+# 🗺️ PostgreSQL High-Availability Architecture with ZFS (Genesis Hosting)
+
+```plaintext
+            ┌──────────────────────────────┐
+            │     Client Applications      │
+            └────────────┬─────────────────┘
+                         │
+                         ▼
+                ┌─────────────────┐
+                │     HAProxy     │
+                │ (Load Balancer) │
+                └────────┬────────┘
+                         │
+            ┌────────────┴────────────┐
+            │                         │
+            ▼                         ▼
+     ┌──────────────┐          ┌──────────────┐
+     │ Primary Node │          │ Replica Node │
+     │ (DB Server)  │          │ (DB Server)  │
+     └──────┬───────┘          └──────┬───────┘
+            │                         │
+            ▼                         ▼
+     ┌──────────────┐          ┌──────────────┐
+     │ ZFS Storage  │          │ ZFS Storage  │
+     │  (RAIDZ1)    │          │  (RAIDZ1)    │
+     └──────┬───────┘          └──────┬───────┘
+            │                         │
+            └────────────┬────────────┘
+                         │
+                         ▼
+                  ┌──────────────┐
+                  │ Backup Node  │
+                  │ (ZFS RAIDZ1) │
+                  └──────────────┘
+```
diff --git a/procedures/runv1.md b/procedures/runv1.md new file mode 100644 index 0000000..6c78a31 --- /dev/null +++ b/procedures/runv1.md @@ -0,0 +1,107 @@
+📜 Genesis Radio Mission Control Runbook (v1)
+🛡️ Genesis Radio Mission Control: Ops Runbook
+
+    Purpose:
+    Quickly diagnose and fix common Genesis Radio infrastructure issues without guesswork, even under pressure.
+
+🚨 If a Mount is Lost (Q:\ or R:\)
+
+Symptoms:
+
+    Station playback errors
+
+    Skipping or dead air after a Station ID
+
+    Log shows: Audio Engine Timeout on Q:\ or R:\ paths
+
+Immediate Actions:
+
+    Check if drives Q:\ and R:\ are visible in Windows Explorer.
+
+    Open C:\genesis_rclone_mount.log and check the last 10 lines.
+
+    Run Mount Guardian manually:
+
+    powershell.exe -ExecutionPolicy Bypass -File "C:\scripts\mount_guardian.ps1"
+
+    Wait 15 seconds.
+
+    Verify that Q:\ and R:\ reappear.
+
+    If re-mounted, check the log for a successful ✅ mount entry.
+
+If Mount Guardian fails to remount:
+
+    Check if rclone.exe is missing or was updated incorrectly.
+
+    Check disk space on L:\ and X:\ cache drives.
+ + Manually run rclone mounts with correct flags (see below). + +🛠️ Manual Rclone Mount Commands (Emergency) + +rclone mount genesisassets:genesisassets Q:\ --vfs-cache-mode writes --vfs-cache-max-size 3T --vfs-cache-max-age 48h --vfs-read-ahead 1G --buffer-size 1G --cache-dir L:\assetcache --cache-dir X:\cache --no-traverse --rc --rc-addr :5572 + +rclone mount genesislibrary:genesislibrary R:\ --vfs-cache-mode writes --vfs-cache-max-size 3T --vfs-cache-max-age 48h --vfs-read-ahead 1G --buffer-size 1G --cache-dir L:\assetcache --cache-dir X:\cache --no-traverse --rc --rc-addr :5572 + +✅ Always mount assets (Q:) first, then library (R:). +📬 If Mastodon DMs a Mount Failure Alert + +Message example: + + 🚨 Genesis Radio Ops: Failed to mount Q:\ after recovery attempt! + +Actions: + + Immediately check C:\genesis_rclone_mount.log + + Verify if the mount succeeded after retry + + If not: manually run Mount Guardian + + Escalate if disk space or critical cache drive failure suspected + +📊 If Dashboard Data Looks Broken + +Symptoms: + + Health dashboard empty + + No refresh + + Tables missing + +Actions: + + Check that healthcheck HTML generator is still scheduled. + + SSH into Krang: + +systemctl status healthcheck.timer + +Restart healthcheck if necessary: + + systemctl restart healthcheck.timer + + Check /var/www/html/healthcheck.html timestamp. + +🧹 Log Rotation and Space + + Logfile is rotated automatically weekly if over 5MB. + + If needed manually: + + powershell.exe -ExecutionPolicy Bypass -File "C:\scripts\rotate_mount_logs.ps1" + +🐢 Critical Reminders (Go Slow to Go Fast) + + Breathe. Double-check before restarting services. + + Don't panic-restart Windows unless all mount attempts fail. + + Document what you changed. Always. + +🛡️ Mission: Keep Genesis Radio running, clean, and stable. + +Scripters are smarter than panickers. +Calm is contagious. |
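+
+📎 Appendix: Quick Dashboard Freshness Check (Krang)
+
+The dashboard section above says to check the timestamp on /var/www/html/healthcheck.html. A sketch of how to do that from a shell on Krang — the 10-minute staleness threshold is an assumption, so match it to your healthcheck.timer schedule:
+
+    stat -c 'healthcheck.html last written: %y' /var/www/html/healthcheck.html
+
+    find /var/www/html/healthcheck.html -mmin +10 -printf 'STALE: no update in the last 10 minutes\n'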