Diffstat (limited to 'procedures')
-rw-r--r--procedures/GROWL.md111
-rw-r--r--procedures/OPS.md154
-rwxr-xr-xprocedures/buildandbpack.sh59
-rw-r--r--procedures/databasecluster.md87
-rw-r--r--procedures/decom.md57
-rw-r--r--procedures/genesis_uptime_monitor.md57
-rw-r--r--procedures/infrastructure.md86
-rw-r--r--procedures/map.md85
-rw-r--r--procedures/mastodon/mastodon-content-policy.md24
-rw-r--r--procedures/mastodon/mastodon-maintenance-policy.md24
-rw-r--r--procedures/mastodon/mastodon-moderation-policy.md26
-rw-r--r--procedures/mastodon/mastodon-uptime-policy.md22
-rw-r--r--procedures/mastodon/mastodon-user-policy.md26
-rw-r--r--procedures/planned_db_cluster_ZFS.md34
-rw-r--r--procedures/runv1.md107
15 files changed, 959 insertions, 0 deletions
diff --git a/procedures/GROWL.md b/procedures/GROWL.md
new file mode 100644
index 0000000..119682d
--- /dev/null
+++ b/procedures/GROWL.md
@@ -0,0 +1,111 @@
+# GROWL β€” Genesis Radio Commit Style Guide
+
+---
+
+## πŸ›‘οΈ Purpose
+
+To keep our Git commit history **clean, calm, and clear** β€”
+even during chaos, downtime, or tired late-night edits.
+
+Every commit should **GROWL**:
+
+| Letter | Meaning |
+|:---|:---|
+| **G** | Good |
+| **R** | Readable |
+| **O** | Obvious |
+| **W** | Well-Scoped |
+| **L** | Logical |
+
+---
+
+## 🧠 GROWL Principles
+
+### **G β€” Good**
+
+Write clear, helpful commit messages.
+Imagine your future self β€” tired, panicked β€” trying to understand what you did.
+
+**Bad:**
+`update`
+
+**Good:**
+`Fix retry logic for mount guardian script`
+
+---
+
+### **R β€” Readable**
+
+Use short, plain English sentences.
+No cryptic shorthand. No weird abbreviations.
+
+**Bad:**
+`fx psh scrpt`
+
+**Good:**
+`Fix powershell script argument passing error`
+
+---
+
+### **O β€” Obvious**
+
+The commit message should explain what changed without needing a diff.
+
+**Bad:**
+`misc`
+
+**Good:**
+`Add dark mode CSS to healthcheck dashboard`
+
+---
+
+### **W β€” Well-Scoped**
+
+One logical change per commit.
+Don't fix five things at once unless they're tightly related.
+
+**Bad:**
+`fix mount issues, added healthcheck, tweaked retry`
+
+**Good:**
+`Fix asset mount detection timing issue`
+
+(And then a separate commit for healthcheck tweaks.)
+
+---
+
+### **L β€” Logical**
+
+Commits should build logically.
+Each one should bring the repo to a **better, deployable state** β€” not leave it broken.
+
+**Bad:**
+Commit partial broken code just because "I need to leave soon."
+
+**Good:**
+Finish a working block, then commit.
+
+---
+
+## πŸ“‹ Quick GROWL Checklist Before You Push:
+
+- [ ] Is my message clear to a stranger?
+- [ ] Did I only change one logical thing?
+- [ ] Can I tell from the commit what changed, without a diff?
+- [ ] Would sleepy me at 3AM thank me for writing this?
+
+---
+
+## πŸŽ™οΈ Why We GROWL
+
+Because panic, fatigue, or adrenaline can't be avoided β€”
+but **good habits under pressure can save a system** (and a future you) every time.
+
+Stay calm.
+Make it obvious.
+Let it GROWL.
+
+---
+
+# 🐺 Genesis Radio Operations
+*Built with pride. Built to last.*
diff --git a/procedures/OPS.md b/procedures/OPS.md
new file mode 100644
index 0000000..63f0e28
--- /dev/null
+++ b/procedures/OPS.md
@@ -0,0 +1,154 @@
+# πŸš€ Genesis Radio - Healthcheck Response Runbook
+
+## Purpose
+When an alert fires (Critical or Warning), this guide tells you what to do so that **any team member** can react quickly, even if the admin is not available.
+
+---
+
+## πŸ› οΈ How to Use
+- Every Mastodon DM or Dashboard alert gives you a **timestamp**, **server name**, and **issue**.
+- Look up the type of issue in the table below.
+- Follow the recommended action immediately.
+
+---
+
+## πŸ“‹ Quick Response Table
+
+| Type of Alert | Emoji | What it Means | Immediate Action |
+|:---|:---|:---|:---|
+| [Critical Service Failure](#critical-service-failure-) | πŸ”š | A key service (like Mastodon, MinIO) is **down** | SSH into the server, try `systemctl restart <service>`. |
+| [Disk Filling Up](#disk-filling-up-) | πŸ“ˆ | Disk space critically low (under 10%) | SSH in and delete old logs/backups. Free up space **immediately**. |
+| [Rclone Mount Error](#rclone-mount-error-) | 🐒 | Cache failed, mount not healthy | Restart the rclone mount process (usually `systemctl restart rclone@<mount>`, or remount manually). |
+| [PostgreSQL Replication Lag](#postgresql-replication-lag-) | πŸ’₯ | Database replicas are falling behind | Check database health. Restart replication if needed. Alert admin if lag is >5 minutes. |
+| [RAID Degraded](#raid-degraded-) | 🧸 | RAID array is degraded (missing a disk) | Open server console. Identify failed drive. Replace drive if possible. Otherwise escalate immediately. |
+| [Log File Warnings](#log-file-warnings-) | ⚠️ | Error patterns found in logs | Investigate. If system is healthy, **log it for later**. If errors worsen, escalate. |
+
+---
+
+## πŸ’» If Dashboard Shows
+- βœ… **All Green** = No action needed.
+- ⚠️ **Warnings** = Investigate soon. Not urgent unless repeated.
+- 🚨 **Criticals** = Drop everything and act immediately.
+
+---
+
+## πŸ›‘οΈ Emergency Contacts
+| Role | Name | Contact |
+|:----|:-----|:--------|
+| Primary Admin | (You) | [845-453-0820] |
+| Secondary | Brice | [BRICE CONTACT INFO] |
+
+(Replace placeholders with actual contact details.)
+
+---
+
+## ✍️ Example Cheat Sheet for Brice
+
+**Sample Mastodon DM:**
+> 🚨 Genesis Radio Critical Healthcheck 2025-04-28 14:22:33 🚨
+> ⚑ 1 critical issue found:
+> - πŸ”š [mastodon] CRITICAL: Service mastodon-web not running!
+
+**Brice should:**
+1. SSH into Mastodon server.
+2. Run `systemctl restart mastodon-web`.
+3. Confirm the service is running again.
+4. If it fails or stays down, escalate to admin.
+
+---
+
+# 🌟 TL;DR
+- 🚨 Criticals: Act immediately.
+- ⚠️ Warnings: Investigate soon.
+- βœ… Healthy: No action needed.
+
+---
+
+# πŸ› οΈ Genesis Radio - Detailed Ops Playbook
+
+## Critical Service Failure (πŸ”š)
+**Symptoms:** Service marked as CRITICAL.
+
+**Fix:**
+1. SSH into server.
+2. `sudo systemctl status <service>`
+3. `sudo systemctl restart <service>`
+4. Confirm running. Check logs if it fails.
+
+---
+
+## Disk Filling Up (πŸ“ˆ)
+**Symptoms:** Disk space critically low.
+
+**Fix:**
+1. SSH into server.
+2. `df -h`
+3. Delete old logs:
+ ```bash
+ sudo rm -rf /var/log/*.gz /var/log/*.[0-9]
+ sudo journalctl --vacuum-time=2d
+ ```
+4. If still low, find big files and clean.
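To locate what is eating the disk, one common approach is a size-sorted `du` sweep (standard tooling, not specific to this runbook; adjust the path as needed):

```bash
# Show the 20 largest files/directories under /var/log, human-readable.
# Run with sudo if permission errors hide results.
du -xah /var/log 2>/dev/null | sort -rh | head -20
```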
+
+---
+
+## Rclone Mount Error (🐒)
+**Symptoms:** Mount failure or slowness.
+
+**Fix:**
+1. SSH into SPL server.
+2. Unmount & remount:
+ ```bash
+ sudo fusermount -uz /path/to/mount
+ sudo systemctl restart rclone@<mount>
+ ```
+3. Confirm mount is active.
+
+---
+
+## PostgreSQL Replication Lag (πŸ’₯)
+**Symptoms:** Replica database lagging.
+
+**Fix:**
+1. SSH into replica server.
+2. Check lag:
+ ```bash
+ sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
+ ```
+3. Restart PostgreSQL if stuck.
+4. Monitor replication logs.
+
+---
+
+## RAID Degraded (🧸)
+**Symptoms:** RAID missing a disk.
+
+**Fix:**
+1. SSH into server.
+2. `cat /proc/mdstat`
+3. Find failed drive:
+ ```bash
+ sudo mdadm --detail /dev/md0
+ ```
+4. Replace failed disk, rebuild array:
+ ```bash
+ sudo mdadm --add /dev/md0 /dev/sdX
+ ```
+
+---
+
+## Log File Warnings (⚠️)
+**Symptoms:** Errors in syslog or nginx.
+
+**Fix:**
+1. SSH into server.
+2. Review logs:
+ ```bash
+ grep ERROR /var/log/syslog
+ ```
+3. Investigate. Escalate if necessary.
+
+---
+
+**Stay sharp. Early fixes prevent major downtime!** πŸ›‘οΈπŸ’ͺ
+
diff --git a/procedures/buildandbpack.sh b/procedures/buildandbpack.sh
new file mode 100755
index 0000000..cdc3564
--- /dev/null
+++ b/procedures/buildandbpack.sh
@@ -0,0 +1,59 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# πŸ”§ CONFIGURATION
+FREEBSD_BRANCH="stable/14"
+KERNCONF="STORAGE_ZFS"
+MAKEJOBS=$(nproc)
+BUILDROOT="$HOME/freebsd-kernel-build"
+OBJDIR="/tmp/obj"
+TOOLCHAIN_BIN="/tmp/amd64.amd64/usr/bin"
+
+# 🌱 Step 1: Prep Environment
+mkdir -p "$BUILDROOT"
+cd "$BUILDROOT"
+
+# πŸ”» Step 2: Get FreeBSD source
+if [ ! -d "src" ]; then
+ git clone https://git.freebsd.org/src.git
+ cd src
+ git checkout "$FREEBSD_BRANCH"
+else
+ cd src
+ git fetch
+ git checkout "$FREEBSD_BRANCH"
+ git pull
+fi
+
+# πŸ› οΈ Step 3: Build FreeBSD toolchain (only once)
+if [ ! -d "$TOOLCHAIN_BIN" ]; then
+ echo "[*] Bootstrapping FreeBSD native-xtools..."
+ bmake XDEV=amd64 XDEV_ARCH=amd64 native-xtools
+else
+ echo "[*] Toolchain already built. Skipping..."
+fi
+
+# πŸ” Step 4: Prepare kernel config
+cd "$BUILDROOT/src/sys/amd64/conf"
+if [ ! -f "$KERNCONF" ]; then
+ cp GENERIC "$KERNCONF"
+ echo "[*] Created new kernel config from GENERIC: $KERNCONF"
+fi
+
+# 🧠 Step 5: Build the kernel
+export PATH="$TOOLCHAIN_BIN:$PATH"
+export MAKEOBJDIRPREFIX="$OBJDIR"
+
+cd "$BUILDROOT/src"
+bmake -j"$MAKEJOBS" buildkernel TARGET=amd64 TARGET_ARCH=amd64 KERNCONF="$KERNCONF"
+
+# πŸ“¦ Step 6: Package the kernel
+KERNEL_OUT="$OBJDIR/$BUILDROOT/src/amd64.amd64/sys/$KERNCONF"
+PACKAGE_NAME="freebsd-kernel-$(date +%Y%m%d-%H%M%S).tar.gz"
+
+tar czf "$BUILDROOT/$PACKAGE_NAME" -C "$KERNEL_OUT" kernel
+
+# πŸ“£ Done
+echo "βœ… Kernel build and package complete."
+echo "➑️ Output: $BUILDROOT/$PACKAGE_NAME"
+
diff --git a/procedures/databasecluster.md b/procedures/databasecluster.md
new file mode 100644
index 0000000..1c26165
--- /dev/null
+++ b/procedures/databasecluster.md
@@ -0,0 +1,87 @@
+# Database Cluster (baboon.sshjunkie.com)
+
+## Overview
+The database cluster consists of two PostgreSQL database servers hosted on `baboon.sshjunkie.com`. These servers are used to store data for services such as Mastodon and AzuraCast. The cluster ensures high availability and fault tolerance through replication and backup strategies.
+
+## Installation
+Install PostgreSQL on both nodes in the cluster:
+
+```bash
+# Update package list and install PostgreSQL
+sudo apt update
+sudo apt install -y postgresql postgresql-contrib
+
+# Ensure PostgreSQL is running
+sudo systemctl start postgresql
+sudo systemctl enable postgresql
+```
+
+## Configuration
+### PostgreSQL Configuration Files:
+- **pg_hba.conf**:
+ - Allow replication and local connections.
+ - Example:
+ ```ini
+ local all postgres md5
+ host replication all 192.168.0.0/16 md5
+ ```
+- **postgresql.conf**:
+ - Set `wal_level` for replication:
+ ```ini
+  wal_level = replica   # 'hot_standby' is the pre-9.6 name; it is accepted as an alias
+ max_wal_senders = 3
+ ```
+
+### Replication Configuration:
+- Set up streaming replication between the two nodes (`baboon.sshjunkie.com` as the master and the second node as the replica).
+
+1. On the master node, enable replication and restart PostgreSQL.
+2. On the replica node, set up replication by copying the data directory from the master node and configure the `recovery.conf` file.
+
+Example `recovery.conf` on the replica:
+```ini
+standby_mode = on
+primary_conninfo = 'host=baboon.sshjunkie.com port=5432 user=replicator password=your_password'
+trigger_file = '/tmp/postgresql.trigger.5432'
+```
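Note that `recovery.conf` only exists on PostgreSQL 11 and earlier. On PostgreSQL 12+, a hedged sketch of the equivalent setup (paths assume the default Debian data directory):

```ini
# postgresql.conf on the replica (PostgreSQL 12+); recovery.conf was removed.
# Also create an empty file named standby.signal in the data directory,
# or let pg_basebackup -R generate both pieces for you.
primary_conninfo = 'host=baboon.sshjunkie.com port=5432 user=replicator password=your_password'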
+
+## Usage
+- **Check the status of PostgreSQL**:
+ ```bash
+ sudo systemctl status postgresql
+ ```
+
+- **Promote the replica to master**:
+ ```bash
+  sudo -u postgres pg_ctl promote -D /var/lib/postgresql/data
+ ```
+
+## Backups
+Use `pg_basebackup` to create full backups of the cluster. Example:
+
+```bash
+pg_basebackup -h baboon.sshjunkie.com -U replicator -D /backups/db_backup -Ft -z -P
+```
+
+Automate backups with cronjobs for regular snapshots.
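For example, a nightly cron entry might look like this (schedule, path, and datestamped directory are assumptions, not taken from the current setup):

```cron
# /etc/cron.d/pg-backup (hypothetical): full base backup nightly at 02:30
30 2 * * * postgres pg_basebackup -h baboon.sshjunkie.com -U replicator -D /backups/db_backup_$(date +\%F) -Ft -z -P
```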
+
+## Troubleshooting
+- **Issue**: Replica is lagging behind.
+ - **Solution**: Check network connectivity and ensure the replica is able to connect to the master node. Monitor replication lag with:
+  ```sql
+ SELECT * FROM pg_stat_replication;
+ ```
+
+## Monitoring
+- **Monitor replication status**:
+  ```sql
+ SELECT * FROM pg_stat_replication;
+ ```
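  To quantify lag rather than just list replicas, a byte-based query can be run on the master (column and function names are the PostgreSQL 10+ forms):

  ```sql
  -- Replay lag in bytes per replica, as seen from the master (PostgreSQL 10+)
  SELECT client_addr,
         pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
  FROM pg_stat_replication;
  ```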
+
+- **Monitor database health**:
+ ```bash
+ pg_isready
+ ```
+
+## Additional Information
+- [PostgreSQL Streaming Replication Documentation](https://www.postgresql.org/docs/current/warm-standby.html)
diff --git a/procedures/decom.md b/procedures/decom.md
new file mode 100644
index 0000000..525e295
--- /dev/null
+++ b/procedures/decom.md
@@ -0,0 +1,57 @@
+# πŸ—‘οΈ Decommissioning Checklist for `shredderv1`
+
+**Date:** 2025-05-01
+
+---
+
+## πŸ” 1. Verify Nothing Critical Is Running
+- [ ] Confirm all services (e.g., AzuraCast, Docker containers, media playback) have **been migrated**
+- [ ] Double-check DNS entries (e.g., CNAMEs or A records) have been **updated to the new server**
+- [ ] Ensure any **active mounts, Rclone remotes, or scheduled tasks** are disabled
+
+---
+
+## πŸ“¦ 2. Migrate/Preserve Data
+- [ ] Backup and copy remaining relevant files (station configs, logs, recordings, playlists)
+- [ ] Verify data was successfully migrated to the new ZFS-based AzuraCast VM
+- [ ] Remove temporary backup files and export archives
+
+---
+
+## 🧹 3. Remove from Infrastructure
+- [ ] Remove from monitoring tools (e.g., Prometheus, Nagios, Grafana)
+- [ ] Remove from Ansible inventory or configuration management systems
+- [ ] Remove any scheduled crons or automation hooks targeting this VM
+
+---
+
+## πŸ”§ 4. Disable and Secure
+- [ ] Power down services (`docker stop`, `systemctl disable`, etc.)
+- [ ] Disable remote access (e.g., SSH keys, user accounts)
+- [ ] Lock or archive internal credentials (e.g., API tokens, DB creds, rclone configs)
+
+---
+
+## 🧽 5. Wipe or Reclaim Resources
+- [ ] If VM: Delete or archive VM snapshot in Proxmox or hypervisor
+- [ ] If physical: Securely wipe disks (e.g., `shred`, `blkdiscard`, or DBAN)
+- [ ] Reclaim IP address (e.g., assign to new ZFS-based VM)
+
+---
+
+## πŸ“œ 6. Documentation & Closure
+- [ ] Log the decommission date in your infrastructure inventory or documentation
+- [ ] Tag any previous support tickets/issues as β€œResolved (Decommissioned)”
+- [ ] Inform team members that `shredderv1` has been retired
+
+---
+
+## 🚫 Final Step
+```bash
+shutdown -h now
+```
+
+Or if you're feeling dramatic:
+```bash
+echo "Goodnight, sweet prince." && shutdown -h now
+```
diff --git a/procedures/genesis_uptime_monitor.md b/procedures/genesis_uptime_monitor.md
new file mode 100644
index 0000000..6505f06
--- /dev/null
+++ b/procedures/genesis_uptime_monitor.md
@@ -0,0 +1,57 @@
+# Genesis Uptime Monitor
+
+This package sets up a simple service uptime tracker on your local server (e.g., Krang). It includes:
+
+- A Python Flask API to report 24-hour uptime
+- A bash script to log uptime results every 5 minutes
+- A systemd unit to keep the API running
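The 24-hour figure the API reports can be computed from the check log; a minimal sketch, assuming a hypothetical one-record-per-line log format (the real `genesis_check.sh` output may differ):

```python
from datetime import datetime, timedelta, timezone

def uptime_24h(lines, now=None):
    """Percent of 'UP' probes among records from the last 24 hours.

    Each line is assumed to look like: 2025-05-01T12:00:00Z radio UP
    (a hypothetical format; adjust parsing to match your checker).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=24)
    total = up = 0
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed records
        ts = datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        if ts >= cutoff:
            total += 1
            up += parts[2] == "UP"
    return round(100.0 * up / total, 2) if total else 0.0
```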
+
+## Setup Instructions
+
+### 1. Install Requirements
+
+```bash
+sudo apt install python3-venv curl
+cd ~
+python3 -m venv genesis_api
+source genesis_api/bin/activate
+pip install flask
+```
+
+### 2. Place Files
+
+- `uptime_server.py` β†’ `/home/doc/uptime_server.py`
+- `genesis_check.sh` β†’ `/usr/local/bin/genesis_check.sh` (make it executable)
+- `genesis_uptime_api.service` β†’ `/etc/systemd/system/genesis_uptime_api.service`
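The `genesis_uptime_api.service` unit itself is not shown here; a minimal sketch might look like this (user, venv path, and script location are assumptions based on the steps above):

```ini
# /etc/systemd/system/genesis_uptime_api.service (hypothetical sketch)
[Unit]
Description=Genesis uptime API (Flask)
After=network.target

[Service]
User=doc
ExecStart=/home/doc/genesis_api/bin/python /home/doc/uptime_server.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
```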
+
+### 3. Enable Cron
+
+Edit your crontab with `crontab -e` and add:
+
+```cron
+*/5 * * * * /usr/local/bin/genesis_check.sh
+```
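The logging script referenced by that cron line is not included here; a hedged sketch of what `genesis_check.sh` might do (URL, log path, and record format are assumptions, not taken from the repo):

```bash
#!/usr/bin/env bash
# Hypothetical sketch of genesis_check.sh: probe the stream endpoint and
# append an UP/DOWN record for uptime_server.py to aggregate later.
LOG="${GENESIS_LOG:-/tmp/genesis_uptime.log}"
URL="${GENESIS_URL:-https://chatwithus.live}"

if curl -fsS --max-time 10 "$URL" >/dev/null 2>&1; then
    STATUS="UP"
else
    STATUS="DOWN"
fi

# One record per probe: ISO timestamp, service name, status
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) radio $STATUS" >> "$LOG"
```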
+
+### 4. Start API Service
+
+```bash
+sudo systemctl daemon-reload
+sudo systemctl enable --now genesis_uptime_api
+```
+
+Then browse to `http://localhost:5000/api/uptime/radio`
+
+## Web Integration
+
+In your HTML, add a div and script like this:
+
+```html
+<div id="radioUptime"><small>Uptime: Loading…</small></div>
+<script>
+fetch('/api/uptime/radio')
+ .then(r => r.json())
+ .then(data => {
+ document.getElementById('radioUptime').innerHTML = `<small>Uptime: ${data.uptime}% (24h)</small>`;
+ });
+</script>
+```
diff --git a/procedures/infrastructure.md b/procedures/infrastructure.md
new file mode 100644
index 0000000..65c8eb8
--- /dev/null
+++ b/procedures/infrastructure.md
@@ -0,0 +1,86 @@
+# πŸ“Š Genesis Radio Infrastructure Overview
+**Date:** April 30, 2025
+**Prepared by:** Doc
+
+---
+
+## πŸ—οΈ Infrastructure Summary
+
+Genesis Radio now operates a fully segmented, secure, and performance-tuned backend suitable for enterprise-grade broadcasting and media delivery. The infrastructure supports high availability (HA) principles for storage and platform independence for core services.
+
+---
+
+## 🧱 Core Components
+
+### πŸŽ™οΈ Genesis Radio Services
+- **StationPlaylist (SPL)**: Windows-based automation system, mounts secure object storage as drives via rclone
+- **Voice Tracker (Remote Access)**: Synced with SPL backend and available to authorized remote users
+- **Azuracast (Secondary automation)**: Dockerized platform running on dedicated VM
+- **Mastodon (Community)**: Hosted in Docker with separate PostgreSQL cluster and MinIO object storage
+
+---
+
+## πŸ’Ύ Storage Architecture
+
+| Feature | Status |
+|-----------------------------|---------------------------|
+| Primary Storage Backend | MinIO on `shredderv2` |
+| Storage Filesystem | ZFS RAID-Z1 |
+| Encryption | Enabled (per-bucket S3 SSE) |
+| Buckets (Scoped) | `genesislibrary-secure`, `genesisassets-secure`, `genesisshows-secure`, `mastodonassets-secure` |
+| Snapshot Capability | βœ… (ZFS native snapshots) |
+| Caching | SSD-backed rclone VFS cache per mount |
+
+---
+
+## πŸ›‘οΈ Security & Access Control
+
+- TLS for all services (Let's Encrypt)
+- MinIO Console behind HTTPS (`consolev2.sshjunkie.com`)
+- User policies applied per-bucket (read/write scoped)
+- Server-to-server rsync/rclone over SSH
+
+---
+
+## πŸ”„ Backup & Recovery
+
+- Dedicated backup server with SSH access
+- Nightly rsync for show archives and Mastodon data
+- Snapshot replication via `zfs send | ssh backup zfs recv` planned
+- Manual and automated snapshot tools
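The planned `zfs send | ssh backup zfs recv` replication could be sketched as follows (pool, dataset, and host names are hypothetical placeholders):

```bash
# Hypothetical: replicate the latest nightly snapshot incrementally to the backup host
zfs snapshot tank/minio@nightly-$(date +%F)
zfs send -i tank/minio@nightly-prev tank/minio@nightly-$(date +%F) | \
    ssh backup zfs recv -u tank/minio-backup
```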
+
+---
+
+## πŸ” Monitoring & Observability
+
+| Component | Status | Notes |
+|------------------|--------------|------------------------------|
+| System Monitoring | Active | `vmstat`, `watch`, custom CLI tools |
+| Log Aggregation | Active | Centralized on pyapps VM |
+| Prometheus | Partial | Used with ClusterControl |
+| Alerts | Active | Mastodon warning bot; Telegram planned |
+
+---
+
+## 🚦 Current Migration Status
+
+| Component | Status | Notes |
+|------------------|----------------|---------------------------------|
+| Mastodon Assets | βœ… Migrated | Verified, encrypted, ZFS snapshotted |
+| Genesis Library | βœ… Migrated | Synced from backup server |
+| Genesis Assets | βœ… Migrated | Cleanup of shows in progress |
+| Genesis Shows | βœ… Migrated | Pulled from same source, cleanup to follow |
+| Azuracast | βœ… Migrated | Staged and restored from staging |
+
+---
+
+## 🧭 Next Steps
+
+- Clean up misplaced show files in assets bucket
+- Automate ZFS snapshot replication
+- Consider Grafana/Prometheus dashboard for real-time metrics
+- Continue phasing out legacy containers (LXC β†’ full VMs)
+
+---
+
+This infrastructure is stable, secure, and built for scale. Further improvements will refine observability, automate recovery, and enhance multi-user coordination.
diff --git a/procedures/map.md b/procedures/map.md
new file mode 100644
index 0000000..3fd39a7
--- /dev/null
+++ b/procedures/map.md
@@ -0,0 +1,85 @@
+# Genesis Radio Internal Architecture Map
+
+---
+
+## 🏒 Core Infrastructure
+
+| System | Purpose | Location |
+|:---|:---|:---|
+| Krang | Main admin server / script runner / monitoring node | On-premises / VM |
+| SPL Server (Windows) | StationPlaylist Studio automation and playout system | On-premises / VM |
+| Shredder | MinIO Object Storage / Cache server | On-premises / VM |
+| PostgreSQL Cluster (db1/db2) | Mastodon database backend / Other app storage | Clustered VMs |
+| Mastodon Server | Frontend social interface for alerts, community | Hosted at `chatwithus.live` |
+
+---
+
+## 🧠 Automation Components
+
+| Component | Description | Hosted Where |
+|:---|:---|:---|
+| `mount_guardian.ps1` | Automatically ensures Rclone mounts (Q:\ and R:\) are up | SPL Server (Windows) |
+| `rotate_mount_logs.ps1` | Weekly log rotation for mount logs | SPL Server (Windows) |
+| `healthcheck.py` | Multi-node health and service monitor | Krang |
+| Mastodon DM Alerts | Immediate alerting if something breaks (Mounts, Services) | Krang via API |
+| Genesis Mission Control Landing Page | Web dashboard with Commandments + Live Healthcheck | Hosted on Krang's Nginx |
+
+---
+
+## πŸŽ™οΈ Storage and Streaming
+
+| Mount | Purpose | Backed by |
+|:---|:---|:---|
+| Q:\ (Assets) | Station IDs, sweepers, intro/outros, promos | GenesisAssets Bucket (Rclone) |
+| R:\ (Library) | Full music library content | GenesisLibrary Bucket (Rclone) |
+
+βœ… Primary Cache: `L:\` (SSD)
+βœ… Secondary Cache: `X:\` (Spinning HDD)
+
+---
+
+## πŸ“‘ Communications
+
+| Alert Type | How Sent |
+|:---|:---|
+| Mount Failures | Direct Mastodon DM |
+| Healthcheck Failures (Disk, Service, SMART, RAID) | Direct Mastodon DM |
+| Git Push Auto-Retry Failures (optional future upgrade) | Potential Mastodon DM |
+
+---
+
+## πŸ“‹ GitOps Flow
+
+| Step | Script | Behavior |
+|:---|:---|:---|
+| Save changes | giteapush.sh | Auto stage, commit (timestamped), push to Gitea |
+| Retry failed push | giteapush.sh auto-retry block | Up to 3x tries with 5-second gaps |
+| Repo status summary | giteapush.sh final step | Clean `git status -sb` displayed |
+
+βœ… Follows GROWL commit style:
+Good, Readable, Obvious, Well-Scoped, Logical.
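The auto-retry behavior described above (up to 3 tries with 5-second gaps) can be sketched as a small helper; this is a hypothetical reconstruction, as the actual `giteapush.sh` is not shown here:

```bash
# Hypothetical sketch of the push retry block in giteapush.sh
retry() {  # usage: retry <max_tries> <delay_seconds> <command...>
    max=$1; delay=$2; shift 2
    n=1
    until "$@"; do
        [ "$n" -ge "$max" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# e.g.: retry 3 5 git push origin main
```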
+
+---
+
+## πŸ“œ Policies and Procedures
+
+| Document | Purpose |
+|:---|:---|
+| `OPS.md` | Healthcheck Runbook and Service Recovery Instructions |
+| `GROWL.md` | Git Commit Message Style Guide |
+| `Mission Control Landing Page` | Browser homepage with live dashboard + ops philosophy |
+
+---
+
+## πŸ›‘οΈ Key Principles
+
+- Calm is Contagious.
+- Go Slow to Go Fast.
+- Snappy Snaps Save Lives.
+- Scripts are Smarter Than Sleepy Admins.
+- If You Didn't Write It Down, It Didn't Happen.
+
+---
+
+# πŸŽ™οΈ Genesis Radio Ops
+Built with pride. Built to last. πŸ›‘οΈπŸš€
diff --git a/procedures/mastodon/mastodon-content-policy.md b/procedures/mastodon/mastodon-content-policy.md
new file mode 100644
index 0000000..09bb359
--- /dev/null
+++ b/procedures/mastodon/mastodon-content-policy.md
@@ -0,0 +1,24 @@
+# Mastodon Content Policy
+
+Genesis Hosting Technologies supports a variety of voices on **chatwithus.live** β€” but not at the cost of safety or legality.
+
+## Allowed Content
+
+- Personal posts, art, tech content, memes, news
+
+
+## Prohibited Content
+
+- Hate speech or glorification of hate groups
+- Violent extremism
+- Sexual content involving minors (real or fictional)
+- Cryptocurrency scams, pyramid schemes
+
+## Bots & Automation
+
+- Allowed only with prior approval
+- Must include a descriptive profile and clear opt-out methods
+
+## Creative Commons / Attribution
+
+- Users posting CC-licensed or open-source content should include attribution where applicable
diff --git a/procedures/mastodon/mastodon-maintenance-policy.md b/procedures/mastodon/mastodon-maintenance-policy.md
new file mode 100644
index 0000000..7dc56c9
--- /dev/null
+++ b/procedures/mastodon/mastodon-maintenance-policy.md
@@ -0,0 +1,24 @@
+# Mastodon Maintenance Policy
+
+We adhere to structured maintenance windows for **chatwithus.live** to ensure reliability without disrupting users.
+
+## Weekly Maintenance
+
+- **Window**: Sundays, 7 PM – 9 PM Eastern Time
+- Routine updates (OS, Docker images, dependencies)
+- Asset rebuilds, minor database tune-ups
+
+## Emergency Maintenance
+
+- Patching vulnerabilities (e.g., CVEs)
+- Redis/PostgreSQL crash recovery
+- Federation or relay failures
+
+## Notifications
+
+- Posted to Mastodon via @administration at least 1 hour in advance
+- Maintenance announcements also pushed to the server status page
+
+## Failures During Maintenance
+
+- If the instance does not recover within 30 minutes, full rollback initiated
diff --git a/procedures/mastodon/mastodon-moderation-policy.md b/procedures/mastodon/mastodon-moderation-policy.md
new file mode 100644
index 0000000..4ac13da
--- /dev/null
+++ b/procedures/mastodon/mastodon-moderation-policy.md
@@ -0,0 +1,26 @@
+# Mastodon Moderation Policy
+
+Moderation is essential to protecting the health of **chatwithus.live**.
+
+## Enforcement
+
+- Reports reviewed by admin/mod team within 24 hours
+- Immediate suspension for:
+ - Threats of violence
+ - Doxxing or credible harassment
+ - Hosting or linking CSAM, gore, or hate groups
+
+## Report Processing
+
+- All reports logged with timestamps and notes
+- Outcomes recorded and reviewed monthly for fairness
+
+## Appeal Process
+
+- Users may appeal a moderation decision by opening a ticket via WHMCS
+- Appeals are reviewed by at least two moderators
+
+## Transparency
+
+- Moderation decisions and defederation actions are optionally listed at `/about/more`
+- Annual transparency reports summarize key moderation stats
diff --git a/procedures/mastodon/mastodon-uptime-policy.md b/procedures/mastodon/mastodon-uptime-policy.md
new file mode 100644
index 0000000..58fc5bf
--- /dev/null
+++ b/procedures/mastodon/mastodon-uptime-policy.md
@@ -0,0 +1,22 @@
+# Mastodon Uptime Policy
+
+Genesis Hosting Technologies strives to maintain high availability for our Mastodon instance at **chatwithus.live**.
+
+## Availability Target
+
+- **Uptime Goal**: 99.5% monthly (approx. 3.5 hours of downtime max)
+- We consider chatwithus.live "unavailable" when:
+ - The web UI fails to load or times out
+ - Toot delivery is delayed by >10 minutes
+ - Federation is broken for more than 30 minutes
+
+## Redundancy
+
+- PostgreSQL cluster with HA failover
+- Redis and Sidekiq monitored 24/7
+- Mastodon is backed by ZFS storage and hourly snapshots
+
+## Exceptions
+
+- Scheduled maintenance (see Maintenance Policy)
+- DDoS or external platform failures (e.g., relay outages)
diff --git a/procedures/mastodon/mastodon-user-policy.md b/procedures/mastodon/mastodon-user-policy.md
new file mode 100644
index 0000000..139b53c
--- /dev/null
+++ b/procedures/mastodon/mastodon-user-policy.md
@@ -0,0 +1,26 @@
+# Mastodon User Policy
+
+This document governs behavior on our Mastodon instance **chatwithus.live**.
+
+## Behavior Expectations
+
+- No harassment, hate speech, or targeted abuse
+- No spam, bots, or auto-posting without permission
+- No doxxing or sharing of private information
+
+## Federation
+
+- Defederated instances may not be interacted with via this server
+- Federation decisions are made by the moderation team
+
+## Account Management
+
+- Inactive accounts with 0 posts may be purged after 90 days
+- Users must keep a valid email address on file
+- Multiple accounts are allowed, but abuse may result in bans
+
+## Banned Activities
+
+- Disruptive scraping or crawling of the API
+- Hosting or linking to malware/phishing content
+- Evading moderation decisions with alternate accounts
diff --git a/procedures/planned_db_cluster_ZFS.md b/procedures/planned_db_cluster_ZFS.md
new file mode 100644
index 0000000..63be9d6
--- /dev/null
+++ b/procedures/planned_db_cluster_ZFS.md
@@ -0,0 +1,34 @@
+# πŸ—ΊοΈ PostgreSQL High-Availability Architecture with ZFS (Genesis Hosting)
+
+```plaintext
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+ β”‚ Client Applications β”‚
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+ β”‚
+ β–Ό
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+ β”‚ HAProxy β”‚
+ β”‚ (Load Balancer) β”‚
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+ β”‚
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+ β”‚ β”‚
+ β–Ό β–Ό
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+ β”‚ Primary Node β”‚ β”‚ Replica Node β”‚
+ β”‚ (DB Server) β”‚ β”‚ (DB Server) β”‚
+ β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
+ β”‚ β”‚
+ β–Ό β–Ό
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+ β”‚ ZFS Storage β”‚ β”‚ ZFS Storage β”‚
+ β”‚ (RAIDZ1) β”‚ β”‚ (RAIDZ1) β”‚
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+ β”‚ β”‚
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+ β”‚ β”‚
+ β–Ό β–Ό
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+ β”‚ Backup Node β”‚
+ β”‚ (ZFS RAIDZ1) β”‚
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+```
diff --git a/procedures/runv1.md b/procedures/runv1.md
new file mode 100644
index 0000000..6c78a31
--- /dev/null
+++ b/procedures/runv1.md
@@ -0,0 +1,107 @@
+# πŸ“œ Genesis Radio Mission Control Runbook (v1)
+
+## πŸ›‘οΈ Genesis Radio Mission Control: Ops Runbook
+
+**Purpose:** Quickly diagnose and fix common Genesis Radio infrastructure issues without guesswork, even under pressure.
+
+---
+
+## 🚨 If a Mount is Lost (Q:\ or R:\)
+
+**Symptoms:**
+
+- Station playback errors
+- Skipping or dead air after a Station ID
+- Log shows: `Audio Engine Timeout` on Q:\ or R:\ paths
+
+**Immediate Actions:**
+
+1. Check if drives Q:\ and R:\ are visible in Windows Explorer.
+2. Open `C:\genesis_rclone_mount.log` and check the last 10 lines.
+3. Run Mount Guardian manually:
+
+   ```powershell
+   powershell.exe -ExecutionPolicy Bypass -File "C:\scripts\mount_guardian.ps1"
+   ```
+
+4. Wait 15 seconds.
+5. Verify that Q:\ and R:\ reappear.
+6. If re-mounted, check logs for a successful βœ… mount entry.
+
+**If Mount Guardian fails to remount:**
+
+- Check if `rclone.exe` is missing or was updated incorrectly.
+- Check disk space on the L:\ and X:\ cache drives.
+- Manually run the rclone mounts with the correct flags (see below).
+
+---
+
+## πŸ› οΈ Manual Rclone Mount Commands (Emergency)
+
+```powershell
+rclone mount genesisassets:genesisassets Q:\ --vfs-cache-mode writes --vfs-cache-max-size 3T --vfs-cache-max-age 48h --vfs-read-ahead 1G --buffer-size 1G --cache-dir L:\assetcache --cache-dir X:\cache --no-traverse --rc --rc-addr :5572
+
+rclone mount genesislibrary:genesislibrary R:\ --vfs-cache-mode writes --vfs-cache-max-size 3T --vfs-cache-max-age 48h --vfs-read-ahead 1G --buffer-size 1G --cache-dir L:\assetcache --cache-dir X:\cache --no-traverse --rc --rc-addr :5572
+```
+
+βœ… Always mount assets (Q:\) first, then library (R:\).
+
+---
+
+## πŸ“¬ If Mastodon DMs a Mount Failure Alert
+
+Message example:
+
+> 🚨 Genesis Radio Ops: Failed to mount Q:\ after recovery attempt!
+
+**Actions:**
+
+1. Immediately check `C:\genesis_rclone_mount.log`.
+2. Verify whether the mount succeeded after the retry.
+3. If not: manually run Mount Guardian.
+4. Escalate if disk space or a critical cache drive failure is suspected.
+
+---
+
+## πŸ“Š If Dashboard Data Looks Broken
+
+**Symptoms:**
+
+- Health dashboard empty
+- No refresh
+- Tables missing
+
+**Actions:**
+
+1. Check that the healthcheck HTML generator is still scheduled.
+2. SSH into Krang:
+
+   ```bash
+   systemctl status healthcheck.timer
+   ```
+
+3. Restart the healthcheck if necessary:
+
+   ```bash
+   systemctl restart healthcheck.timer
+   ```
+
+4. Check the `/var/www/html/healthcheck.html` timestamp.
+
+---
+
+## 🧹 Log Rotation and Space
+
+- The logfile is rotated automatically each week if it exceeds 5MB.
+- To rotate manually:
+
+  ```powershell
+  powershell.exe -ExecutionPolicy Bypass -File "C:\scripts\rotate_mount_logs.ps1"
+  ```
+
+---
+
+## 🐒 Critical Reminders (Go Slow to Go Fast)
+
+- Breathe. Double-check before restarting services.
+- Don't panic-restart Windows unless all mount attempts fail.
+- Document what you changed. Always.
+
+πŸ›‘οΈ **Mission:** Keep Genesis Radio running, clean, and stable.
+
+Scripters are smarter than panickers.
+Calm is contagious.