author     doc <doc@filenotfound.org>    2025-06-30 20:06:28 +0000
committer  doc <doc@filenotfound.org>    2025-06-30 20:06:28 +0000
commit     717fcb9c81d2bc3cc7a84a3ebea6572d7ff0f5cf (patch)
tree       7cbd6a8d5046409a82b22d34b01aac93b3e24818 /procedures
parent     8368ff389ec596dee6212ebeb85e01c638364fb3 (diff)
Diffstat (limited to 'procedures')
-rw-r--r-- | procedures/GROWL.md | 111
-rw-r--r-- | procedures/OPS.md | 154
-rwxr-xr-x | procedures/buildandbpack.sh | 59
-rw-r--r-- | procedures/databasecluster.md | 87
-rw-r--r-- | procedures/decom.md | 57
-rw-r--r-- | procedures/genesis_uptime_monitor.md | 57
-rw-r--r-- | procedures/infrastructure.md | 86
-rw-r--r-- | procedures/map.md | 85
-rw-r--r-- | procedures/mastodon/mastodon-content-policy.md | 24
-rw-r--r-- | procedures/mastodon/mastodon-maintenance-policy.md | 24
-rw-r--r-- | procedures/mastodon/mastodon-moderation-policy.md | 26
-rw-r--r-- | procedures/mastodon/mastodon-uptime-policy.md | 22
-rw-r--r-- | procedures/mastodon/mastodon-user-policy.md | 26
-rw-r--r-- | procedures/planned_db_cluster_ZFS.md | 34
-rw-r--r-- | procedures/runv1.md | 107
15 files changed, 959 insertions, 0 deletions
diff --git a/procedures/GROWL.md b/procedures/GROWL.md
new file mode 100644
index 0000000..119682d
--- /dev/null
+++ b/procedures/GROWL.md
@@ -0,0 +1,111 @@
# GROWL - Genesis Radio Commit Style Guide

---

## Purpose

To keep our Git commit history **clean, calm, and clear**, even during chaos, downtime, or tired late-night edits.

Every commit should **GROWL**:

| Letter | Meaning |
|:---|:---|
| **G** | Good |
| **R** | Readable |
| **O** | Obvious |
| **W** | Well-Scoped |
| **L** | Logical |

---

## GROWL Principles

### **G - Good**

Write clear, helpful commit messages.
Imagine your future self, tired and panicked, trying to understand what you did.

**Bad:**
`update`

**Good:**
`Fix retry logic for mount guardian script`

---

### **R - Readable**

Use short, plain English sentences.
No cryptic shorthand. No weird abbreviations.

**Bad:**
`fx psh scrpt`

**Good:**
`Fix PowerShell script argument passing error`

---

### **O - Obvious**

The commit message should explain what changed without needing a diff.

**Bad:**
`misc`

**Good:**
`Add dark mode CSS to healthcheck dashboard`

---

### **W - Well-Scoped**

One logical change per commit.
Don't fix five things at once unless they're tightly related.

**Bad:**
`fix mount issues, added healthcheck, tweaked retry`

**Good:**
`Fix asset mount detection timing issue`

(And then a separate commit for the healthcheck tweaks; see the sketch below.)
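To make "well-scoped" concrete, here is a minimal sketch of splitting a mixed working tree into two commits with `git add -p`; the file paths are illustrative only.

```bash
# Stage only the hunks that belong to the mount fix, then commit it on its own
git add -p scripts/mount_guardian.ps1
git commit -m "Fix asset mount detection timing issue"

# Stage and commit the unrelated healthcheck tweak separately
git add healthcheck.py
git commit -m "Add retry summary to healthcheck dashboard output"
```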
---

### **L - Logical**

Commits should build logically.
Each one should bring the repo to a **better, deployable state**, not leave it broken.

**Bad:**
Commit partial broken code just because "I need to leave soon."

**Good:**
Finish a working block, then commit.

---

## Quick GROWL Checklist Before You Push:

- [ ] Is my message clear to a stranger?
- [ ] Did I only change one logical thing?
- [ ] Can I tell from the commit what changed, without a diff?
- [ ] Would sleepy me at 3AM thank me for writing this?

---

## Why We GROWL

Because panic, fatigue, or adrenaline can't be avoided, but **good habits under pressure can save a system** (and a future you) every time.

Stay calm.
Make it obvious.
Let it GROWL.

---

# Genesis Radio Operations
*Built with pride. Built to last.*

diff --git a/procedures/OPS.md b/procedures/OPS.md
new file mode 100644
index 0000000..63f0e28
--- /dev/null
+++ b/procedures/OPS.md
@@ -0,0 +1,154 @@
# Genesis Radio - Healthcheck Response Runbook

## Purpose
When an alert fires (Critical or Warning), this guide tells you what to do so that **any team member** can react quickly, even if the admin is not available.

---

## How to Use
- Every Mastodon DM or Dashboard alert gives you a **timestamp**, **server name**, and **issue**.
- Look up the type of issue in the table below.
- Follow the recommended action immediately.

---

## Quick Response Table

| Type of Alert | What it Means | Immediate Action |
|:---|:---|:---|
| [Critical Service Failure](#critical-service-failure) | A key service (like Mastodon, MinIO) is **down** | SSH into the server, try `systemctl restart <service>`. |
| [Disk Filling Up](#disk-filling-up) | Disk space critically low (under 10%) | SSH in and delete old logs/backups. Free up space **immediately**. |
| [Rclone Mount Error](#rclone-mount-error) | Cache failed, mount not healthy | Restart the rclone mount process (usually `systemctl restart rclone@<mount>`, or remount manually). |
| [PostgreSQL Replication Lag](#postgresql-replication-lag) | Database replicas are falling behind | Check database health. Restart replication if needed. Alert admin if lag is >5 minutes. |
| [RAID Degraded](#raid-degraded) | RAID array is degraded (missing a disk) | Open server console. Identify failed drive. Replace drive if possible. Otherwise escalate immediately. |
| [Log File Warnings](#log-file-warnings) | Error patterns found in logs | Investigate. If system is healthy, **log it for later**. If errors worsen, escalate. |

---

## If Dashboard Shows
- **All Green** = No action needed.
- **Warnings** = Investigate soon. Not urgent unless repeated.
- **Criticals** = Drop everything and act immediately.

---

## Emergency Contacts
| Role | Name | Contact |
|:----|:-----|:--------|
| Primary Admin | (You) | [845-453-0820] |
| Secondary | Brice | [BRICE CONTACT INFO] |

(Replace placeholders with actual contact details.)

---

## Example Cheat Sheet for Brice

**Sample Mastodon DM:**
> Genesis Radio Critical Healthcheck 2025-04-28 14:22:33
> 1 critical issue found:
> - [mastodon] CRITICAL: Service mastodon-web not running!

**Brice should:**
1. SSH into the Mastodon server.
2. Run `systemctl restart mastodon-web`.
3. Confirm the service is running again.
4. If it fails or stays down, escalate to admin.

---

# TL;DR
- Criticals: Act immediately.
- Warnings: Investigate soon.
- Healthy: No action needed.

---

# Genesis Radio - Detailed Ops Playbook

## Critical Service Failure
**Symptoms:** Service marked as CRITICAL.

**Fix:**
1. SSH into server.
2. `sudo systemctl status <service>`
3. `sudo systemctl restart <service>`
4. Confirm running. Check logs if it fails.

---

## Disk Filling Up
**Symptoms:** Disk space critically low.

**Fix:**
1. SSH into server.
2. `df -h`
3. Delete old logs:
   ```bash
   sudo rm -rf /var/log/*.gz /var/log/*.[0-9]
   sudo journalctl --vacuum-time=2d
   ```
4. If still low, find big files and clean (see the sketch below).
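A minimal sketch for step 4, assuming a Linux host; point the starting path at whichever filesystem `df -h` showed as nearly full.

```bash
# Show which top-level directories are eating the disk (stay on one filesystem)
sudo du -xh --max-depth=1 / 2>/dev/null | sort -rh | head -n 15

# List individual files over 500 MB as cleanup candidates
sudo find / -xdev -type f -size +500M -exec ls -lh {} + 2>/dev/null
```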
---

## Rclone Mount Error
**Symptoms:** Mount failure or slowness.

**Fix:**
1. SSH into SPL server.
2. Unmount & remount:
   ```bash
   sudo fusermount -uz /path/to/mount
   sudo systemctl restart rclone@<mount>
   ```
3. Confirm mount is active.

---

## PostgreSQL Replication Lag
**Symptoms:** Replica database lagging.

**Fix:**
1. SSH into replica server.
2. Check lag:
   ```bash
   sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
   ```
3. Restart PostgreSQL if stuck.
4. Monitor replication logs (a quick lag query is sketched below).
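Note that `pg_stat_replication` is populated on the primary, not on the replica. To put a number on the lag, a minimal sketch assuming PostgreSQL 10 or newer:

```bash
# On the replica: how far behind replay is, in wall-clock time
sudo -u postgres psql -c "SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;"

# On the primary: byte lag per connected replica
sudo -u postgres psql -c "SELECT client_addr, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS byte_lag FROM pg_stat_replication;"
```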
---

## RAID Degraded
**Symptoms:** RAID missing a disk.

**Fix:**
1. SSH into server.
2. `cat /proc/mdstat`
3. Find failed drive:
   ```bash
   sudo mdadm --detail /dev/md0
   ```
4. Replace failed disk, rebuild array:
   ```bash
   sudo mdadm --add /dev/md0 /dev/sdX
   ```

---

## Log File Warnings
**Symptoms:** Errors in syslog or nginx.

**Fix:**
1. SSH into server.
2. Review logs:
   ```bash
   grep ERROR /var/log/syslog
   ```
3. Investigate. Escalate if necessary.

---

**Stay sharp. Early fixes prevent major downtime!**

diff --git a/procedures/buildandbpack.sh b/procedures/buildandbpack.sh
new file mode 100755
index 0000000..cdc3564
--- /dev/null
+++ b/procedures/buildandbpack.sh
@@ -0,0 +1,59 @@
#!/usr/bin/env bash
set -euo pipefail

# CONFIGURATION
FREEBSD_BRANCH="stable/14"
KERNCONF="STORAGE_ZFS"
MAKEJOBS=$(nproc)
BUILDROOT="$HOME/freebsd-kernel-build"
OBJDIR="/tmp/obj"
TOOLCHAIN_BIN="/tmp/amd64.amd64/usr/bin"

# Step 1: Prep environment
mkdir -p "$BUILDROOT"
cd "$BUILDROOT"

# Step 2: Get FreeBSD source
if [ ! -d "src" ]; then
    git clone https://git.freebsd.org/src.git
    cd src
    git checkout "$FREEBSD_BRANCH"
else
    cd src
    git fetch
    git checkout "$FREEBSD_BRANCH"
    git pull
fi

# Step 3: Build FreeBSD toolchain (only once)
if [ ! -d "$TOOLCHAIN_BIN" ]; then
    echo "[*] Bootstrapping FreeBSD native-xtools..."
    bmake XDEV=amd64 XDEV_ARCH=amd64 native-xtools
else
    echo "[*] Toolchain already built. Skipping..."
fi

# Step 4: Prepare kernel config
cd "$BUILDROOT/src/sys/amd64/conf"
if [ ! -f "$KERNCONF" ]; then
    cp GENERIC "$KERNCONF"
    echo "[*] Created new kernel config from GENERIC: $KERNCONF"
fi

# Step 5: Build the kernel
export PATH="$TOOLCHAIN_BIN:$PATH"
export MAKEOBJDIRPREFIX="$OBJDIR"

cd "$BUILDROOT/src"
bmake -j"$MAKEJOBS" buildkernel TARGET=amd64 TARGET_ARCH=amd64 KERNCONF="$KERNCONF"

# Step 6: Package the kernel
KERNEL_OUT="$OBJDIR/$BUILDROOT/src/amd64.amd64/sys/$KERNCONF"
PACKAGE_NAME="freebsd-kernel-$(date +%Y%m%d-%H%M%S).tar.gz"

tar czf "$BUILDROOT/$PACKAGE_NAME" -C "$KERNEL_OUT" kernel

# Done
echo "Kernel build and package complete."
echo "Output: $BUILDROOT/$PACKAGE_NAME"

diff --git a/procedures/databasecluster.md b/procedures/databasecluster.md
new file mode 100644
index 0000000..1c26165
--- /dev/null
+++ b/procedures/databasecluster.md
@@ -0,0 +1,87 @@
# Database Cluster (baboon.sshjunkie.com)

## Overview
The database cluster consists of two PostgreSQL database servers hosted on `baboon.sshjunkie.com`. These servers store data for services such as Mastodon and AzuraCast. The cluster provides high availability and fault tolerance through replication and backup strategies.

## Installation
Install PostgreSQL on both nodes in the cluster:

```bash
# Update package list and install PostgreSQL
sudo apt update
sudo apt install -y postgresql postgresql-contrib

# Ensure PostgreSQL is running
sudo systemctl start postgresql
sudo systemctl enable postgresql
```

## Configuration
### PostgreSQL Configuration Files:
- **pg_hba.conf**:
  - Allow replication and local connections.
  - Example:
    ```ini
    local   all           postgres                        md5
    host    replication   all           192.168.0.0/16    md5
    ```
- **postgresql.conf**:
  - Set `wal_level` for replication:
    ```ini
    wal_level = hot_standby
    max_wal_senders = 3
    ```

### Replication Configuration:
- Set up streaming replication between the two nodes (`baboon.sshjunkie.com` as the master and the second node as the replica).

1. On the master node, enable replication and restart PostgreSQL.
2. On the replica node, set up replication by copying the data directory from the master node (see the sketch below) and configure the `recovery.conf` file.

Example `recovery.conf` on the replica:
```ini
standby_mode = on
primary_conninfo = 'host=baboon.sshjunkie.com port=5432 user=replicator password=your_password'
trigger_file = '/tmp/postgresql.trigger.5432'
```
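A minimal sketch of step 2 using `pg_basebackup`, assuming the `replicator` role from the example above and a Debian-style data directory (adjust the path and version to your install); on PostgreSQL 12+ the recovery settings land in `postgresql.auto.conf` plus a `standby.signal` file instead of `recovery.conf`.

```bash
# On the replica: stop PostgreSQL and clear the stale data directory
sudo systemctl stop postgresql
sudo -u postgres rm -rf /var/lib/postgresql/14/main/*

# Clone the primary and write recovery settings automatically (-R)
sudo -u postgres pg_basebackup \
  -h baboon.sshjunkie.com -p 5432 -U replicator \
  -D /var/lib/postgresql/14/main \
  -X stream -R -P

sudo systemctl start postgresql
```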
## Usage
- **Check the status of PostgreSQL**:
  ```bash
  sudo systemctl status postgresql
  ```

- **Promote the replica to master**:
  ```bash
  pg_ctl promote -D /var/lib/postgresql/data
  ```

## Backups
Use `pg_basebackup` to create full backups of the cluster. Example:

```bash
pg_basebackup -h baboon.sshjunkie.com -U replicator -D /backups/db_backup -Ft -z -P
```

Automate backups with cron jobs for regular snapshots, for example as sketched below.
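One possible cron entry for that, assuming passwordless access for the `postgres` user via `.pgpass` and an existing `/backups` directory; the retention cleanup is illustrative only.

```cron
# /etc/cron.d/pg-basebackup: nightly full backup at 02:30, keep 7 days
30 2 * * * postgres pg_basebackup -h baboon.sshjunkie.com -U replicator -D /backups/db_backup_$(date +\%F) -Ft -z -P && find /backups -maxdepth 1 -name 'db_backup_*' -mtime +7 -exec rm -rf {} +
```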
## Troubleshooting
- **Issue**: Replica is lagging behind.
  - **Solution**: Check network connectivity and ensure the replica can connect to the master node. Monitor replication lag with:
    ```sql
    SELECT * FROM pg_stat_replication;
    ```

## Monitoring
- **Monitor replication status**:
  ```sql
  SELECT * FROM pg_stat_replication;
  ```

- **Monitor database health**:
  ```bash
  pg_isready
  ```

## Additional Information
- [PostgreSQL Streaming Replication Documentation](https://www.postgresql.org/docs/current/warm-standby.html)

diff --git a/procedures/decom.md b/procedures/decom.md
new file mode 100644
index 0000000..525e295
--- /dev/null
+++ b/procedures/decom.md
@@ -0,0 +1,57 @@
# Decommissioning Checklist for `shredderv1`

**Date:** 2025-05-01

---

## 1. Verify Nothing Critical Is Running
- [ ] Confirm all services (e.g., AzuraCast, Docker containers, media playback) have **been migrated**
- [ ] Double-check DNS entries (e.g., CNAMEs or A records) have been **updated to the new server**
- [ ] Ensure any **active mounts, Rclone remotes, or scheduled tasks** are disabled

---

## 2. Migrate/Preserve Data
- [ ] Back up and copy remaining relevant files (station configs, logs, recordings, playlists)
- [ ] Verify data was successfully migrated to the new ZFS-based AzuraCast VM
- [ ] Remove temporary backup files and export archives

---

## 3. Remove from Infrastructure
- [ ] Remove from monitoring tools (e.g., Prometheus, Nagios, Grafana)
- [ ] Remove from Ansible inventory or configuration management systems
- [ ] Remove any scheduled crons or automation hooks targeting this VM

---

## 4. Disable and Secure
- [ ] Power down services (`docker stop`, `systemctl disable`, etc.)
- [ ] Disable remote access (e.g., SSH keys, user accounts)
- [ ] Lock or archive internal credentials (e.g., API tokens, DB creds, rclone configs)

---

## 5. Wipe or Reclaim Resources
- [ ] If VM: Delete or archive VM snapshot in Proxmox or hypervisor
- [ ] If physical: Securely wipe disks (e.g., `shred`, `blkdiscard`, or DBAN)
- [ ] Reclaim IP address (e.g., assign to new ZFS-based VM)

---

## 6. Documentation & Closure
- [ ] Log the decommission date in your infrastructure inventory or documentation
- [ ] Tag any previous support tickets/issues as "Resolved (Decommissioned)"
- [ ] Inform team members that `shredderv1` has been retired

---

## Final Step
```bash
shutdown -h now
```

Or if you're feeling dramatic:
```bash
echo "Goodnight, sweet prince." && shutdown -h now
```

diff --git a/procedures/genesis_uptime_monitor.md b/procedures/genesis_uptime_monitor.md
new file mode 100644
index 0000000..6505f06
--- /dev/null
+++ b/procedures/genesis_uptime_monitor.md
@@ -0,0 +1,57 @@
# Genesis Uptime Monitor

This package sets up a simple service uptime tracker on your local server (e.g., Krang). It includes:

- A Python Flask API to report 24-hour uptime
- A bash script to log uptime results every 5 minutes
- A systemd unit to keep the API running

## Setup Instructions

### 1. Install Requirements

```bash
sudo apt install python3-venv curl
cd ~
python3 -m venv genesis_api
source genesis_api/bin/activate
pip install flask
```

### 2. Place Files

- `uptime_server.py` -> `/home/doc/uptime_server.py`
- `genesis_check.sh` -> `/usr/local/bin/genesis_check.sh` (make it executable)
- `genesis_uptime_api.service` -> `/etc/systemd/system/genesis_uptime_api.service` (a sketch of this unit is shown below)
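The unit file itself is not included in this commit; a minimal sketch of what `genesis_uptime_api.service` might look like, assuming the virtualenv from step 1 and `uptime_server.py` listening on port 5000:

```ini
[Unit]
Description=Genesis uptime API (Flask)
After=network.target

[Service]
User=doc
WorkingDirectory=/home/doc
ExecStart=/home/doc/genesis_api/bin/python /home/doc/uptime_server.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
```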
### 3. Enable Cron

Edit your crontab with `crontab -e` and add:

```cron
*/5 * * * * /usr/local/bin/genesis_check.sh
```

### 4. Start API Service

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now genesis_uptime_api
```

Then browse to `http://localhost:5000/api/uptime/radio`.

## Web Integration

In your HTML, add a div and script like this:

```html
<div id="radioUptime"><small>Uptime: Loading...</small></div>
<script>
fetch('/api/uptime/radio')
  .then(r => r.json())
  .then(data => {
    document.getElementById('radioUptime').innerHTML = `<small>Uptime: ${data.uptime}% (24h)</small>`;
  });
</script>
```

diff --git a/procedures/infrastructure.md b/procedures/infrastructure.md
new file mode 100644
index 0000000..65c8eb8
--- /dev/null
+++ b/procedures/infrastructure.md
@@ -0,0 +1,86 @@
# Genesis Radio Infrastructure Overview
**Date:** April 30, 2025
**Prepared by:** Doc

---

## Infrastructure Summary

Genesis Radio now operates a fully segmented, secure, and performance-tuned backend suitable for enterprise-grade broadcasting and media delivery. The infrastructure supports high availability (HA) principles for storage and platform independence for core services.

---

## Core Components

### Genesis Radio Services
- **StationPlaylist (SPL)**: Windows-based automation system; mounts secure object storage as drives via rclone
- **Voice Tracker (Remote Access)**: Synced with SPL backend and available to authorized remote users
- **AzuraCast (secondary automation)**: Dockerized platform running on a dedicated VM
- **Mastodon (community)**: Hosted in Docker with a separate PostgreSQL cluster and MinIO object storage

---

## Storage Architecture

| Feature | Status |
|-----------------------------|---------------------------|
| Primary Storage Backend | MinIO on `shredderv2` |
| Storage Filesystem | ZFS RAID-Z1 |
| Encryption | Enabled (per-bucket S3 SSE) |
| Buckets (Scoped) | `genesislibrary-secure`, `genesisassets-secure`, `genesisshows-secure`, `mastodonassets-secure` |
| Snapshot Capability | Yes (ZFS native snapshots) |
| Caching | SSD-backed rclone VFS cache per mount |

---

## Security & Access Control

- TLS for all services (Let's Encrypt)
- MinIO Console behind HTTPS (`consolev2.sshjunkie.com`)
- User policies applied per bucket (read/write scoped)
- Server-to-server rsync/rclone over SSH

---

## Backup & Recovery

- Dedicated backup server with SSH access
- Nightly rsync for show archives and Mastodon data
- Snapshot replication via `zfs send | ssh backup zfs recv` planned (see the sketch below)
- Manual and automated snapshot tools
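A minimal sketch of the planned replication, with hypothetical pool/dataset names; the first send is a full stream, after which `zfs send -i` ships only the nightly delta.

```bash
# First full replication of the assets dataset (pool/dataset names are illustrative)
zfs snapshot tank/genesisassets@nightly-2025-04-29
zfs send tank/genesisassets@nightly-2025-04-29 | ssh backup zfs recv -u backuppool/genesisassets

# Following nights: send only the delta between the last two snapshots
zfs snapshot tank/genesisassets@nightly-2025-04-30
zfs send -i @nightly-2025-04-29 tank/genesisassets@nightly-2025-04-30 | ssh backup zfs recv -u backuppool/genesisassets
```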
---

## Monitoring & Observability

| Component | Status / Notes |
|------------------|------------------------------|
| System Monitoring | `vmstat`, `watch`, custom CLI tools |
| Log Aggregation | Centralized on pyapps VM |
| Prometheus | Partial (used with ClusterControl) |
| Alerts | Mastodon warning bot, Telegram planned |

---

## Current Migration Status

| Component | Status | Notes |
|------------------|----------------|---------------------------------|
| Mastodon Assets | Migrated | Verified, encrypted, ZFS snapshotted |
| Genesis Library | Migrated | Synced from backup server |
| Genesis Assets | Migrated | Cleanup of shows in progress |
| Genesis Shows | Migrated | Pulled from same source, cleanup to follow |
| AzuraCast | Migrated | Staged and restored from staging |

---

## Next Steps

- Clean up misplaced show files in the assets bucket
- Automate ZFS snapshot replication
- Consider a Grafana/Prometheus dashboard for real-time metrics
- Continue phasing out legacy containers (LXC -> full VMs)

---

This infrastructure is stable, secure, and built for scale. Further improvements will refine observability, automate recovery, and enhance multi-user coordination.

diff --git a/procedures/map.md b/procedures/map.md
new file mode 100644
index 0000000..3fd39a7
--- /dev/null
+++ b/procedures/map.md
@@ -0,0 +1,85 @@
# Genesis Radio Internal Architecture Map

---

## Core Infrastructure

| System | Purpose | Location |
|:---|:---|:---|
| Krang | Main admin server / script runner / monitoring node | On-premises / VM |
| SPL Server (Windows) | StationPlaylist Studio automation and playout system | On-premises / VM |
| Shredder | MinIO Object Storage / Cache server | On-premises / VM |
| PostgreSQL Cluster (db1/db2) | Mastodon database backend / Other app storage | Clustered VMs |
| Mastodon Server | Frontend social interface for alerts, community | Hosted at `chatwithus.live` |

---

## Automation Components

| Component | Description | Hosted Where |
|:---|:---|:---|
| `mount_guardian.ps1` | Automatically ensures rclone mounts (Q:\ and R:\) are up | SPL Server (Windows) |
| `rotate_mount_logs.ps1` | Weekly log rotation for mount logs | SPL Server (Windows) |
| `healthcheck.py` | Multi-node health and service monitor | Krang |
| Mastodon DM Alerts | Immediate alerting if something breaks (mounts, services) | Krang via API |
| Genesis Mission Control Landing Page | Web dashboard with Commandments + live healthcheck | Hosted on Krang's Nginx |

---

## Storage and Streaming

| Mount | Purpose | Backed by |
|:---|:---|:---|
| Q:\ (Assets) | Station IDs, sweepers, intro/outros, promos | GenesisAssets Bucket (Rclone) |
| R:\ (Library) | Full music library content | GenesisLibrary Bucket (Rclone) |

Primary Cache: `L:\` (SSD)
Secondary Cache: `X:\` (Spinning HDD)

---

## Communications

| Alert Type | How Sent |
|:---|:---|
| Mount Failures | Direct Mastodon DM |
| Healthcheck Failures (Disk, Service, SMART, RAID) | Direct Mastodon DM |
| Git Push Auto-Retry Failures (optional future upgrade) | Potential Mastodon DM |
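For reference, the DM alerts amount to a single Mastodon API call from Krang; a minimal sketch, where the bot token and the recipient handle are placeholders:

```bash
# Post a direct-visibility status (a "DM") to the admin account
curl -s -X POST "https://chatwithus.live/api/v1/statuses" \
  -H "Authorization: Bearer $MASTODON_BOT_TOKEN" \
  --data-urlencode "status=@doc Genesis Radio Ops: Failed to mount Q:\\ after recovery attempt!" \
  --data-urlencode "visibility=direct"
```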
---

## GitOps Flow

| Step | Script | Behavior |
|:---|:---|:---|
| Save changes | giteapush.sh | Auto stage, commit (timestamped), push to Gitea |
| Retry failed push | giteapush.sh auto-retry block | Up to 3 tries with 5-second gaps |
| Repo status summary | giteapush.sh final step | Clean `git status -sb` displayed |

Follows the GROWL commit style:
Good, Readable, Obvious, Well-Scoped, Logical.
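`giteapush.sh` itself is not part of this commit; a minimal sketch of the auto-retry block it describes, assuming the remote is named `origin`:

```bash
# Push with up to 3 attempts, pausing 5 seconds between failures
for attempt in 1 2 3; do
    if git push origin HEAD; then
        break
    elif [ "$attempt" -lt 3 ]; then
        echo "Push attempt $attempt failed; retrying in 5 seconds..."
        sleep 5
    else
        echo "Push failed after 3 attempts."
    fi
done
```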
---

## Policies and Procedures

| Document | Purpose |
|:---|:---|
| `OPS.md` | Healthcheck runbook and service recovery instructions |
| `GROWL.md` | Git commit message style guide |
| `Mission Control Landing Page` | Browser homepage with live dashboard + ops philosophy |

---

## Key Principles

- Calm is Contagious.
- Go Slow to Go Fast.
- Snappy Snaps Save Lives.
- Scripts are Smarter Than Sleepy Admins.
- If You Didn't Write It Down, It Didn't Happen.

---

# Genesis Radio Ops
Built with pride. Built to last.

diff --git a/procedures/mastodon/mastodon-content-policy.md b/procedures/mastodon/mastodon-content-policy.md
new file mode 100644
index 0000000..09bb359
--- /dev/null
+++ b/procedures/mastodon/mastodon-content-policy.md
@@ -0,0 +1,24 @@
# Mastodon Content Policy

Genesis Hosting Technologies supports a variety of voices on **chatwithus.live**, but not at the cost of safety or legality.

## Allowed Content

- Personal posts, art, tech content, memes, news

## Prohibited Content

- Hate speech or glorification of hate groups
- Violent extremism
- Sexual content involving minors (real or fictional)
- Cryptocurrency scams, pyramid schemes

## Bots & Automation

- Allowed only with prior approval
- Must include a descriptive profile and clear opt-out methods

## Creative Commons / Attribution

- Users posting CC-licensed or open-source content should include attribution where applicable

diff --git a/procedures/mastodon/mastodon-maintenance-policy.md b/procedures/mastodon/mastodon-maintenance-policy.md
new file mode 100644
index 0000000..7dc56c9
--- /dev/null
+++ b/procedures/mastodon/mastodon-maintenance-policy.md
@@ -0,0 +1,24 @@
# Mastodon Maintenance Policy

We adhere to structured maintenance windows for **chatwithus.live** to ensure reliability without disrupting users.

## Weekly Maintenance

- **Window**: Sundays, 7 PM - 9 PM Eastern Time
- Routine updates (OS, Docker images, dependencies)
- Asset rebuilds, minor database tune-ups

## Emergency Maintenance

- Patching vulnerabilities (e.g., CVEs)
- Redis/PostgreSQL crash recovery
- Federation or relay failures

## Notifications

- Posted to Mastodon via @administration at least 1 hour in advance
- Maintenance announcements also pushed to the server status page

## Failures During Maintenance

- If the instance does not recover within 30 minutes, a full rollback is initiated

diff --git a/procedures/mastodon/mastodon-moderation-policy.md b/procedures/mastodon/mastodon-moderation-policy.md
new file mode 100644
index 0000000..4ac13da
--- /dev/null
+++ b/procedures/mastodon/mastodon-moderation-policy.md
@@ -0,0 +1,26 @@
# Mastodon Moderation Policy

Moderation is essential to protecting the health of **chatwithus.live**.

## Enforcement

- Reports reviewed by admin/mod team within 24 hours
- Immediate suspension for:
  - Threats of violence
  - Doxxing or credible harassment
  - Hosting or linking CSAM, gore, or hate groups

## Report Processing

- All reports logged with timestamps and notes
- Outcomes recorded and reviewed monthly for fairness

## Appeal Process

- Users may appeal a moderation decision by opening a ticket via WHMCS
- Appeals are reviewed by at least two moderators

## Transparency

- Moderation decisions and defederation actions are optionally listed at `/about/more`
- Annual transparency reports summarize key moderation stats

diff --git a/procedures/mastodon/mastodon-uptime-policy.md b/procedures/mastodon/mastodon-uptime-policy.md
new file mode 100644
index 0000000..58fc5bf
--- /dev/null
+++ b/procedures/mastodon/mastodon-uptime-policy.md
@@ -0,0 +1,22 @@
# Mastodon Uptime Policy

Genesis Hosting Technologies strives to maintain high availability for our Mastodon instance at **chatwithus.live**.

## Availability Target

- **Uptime Goal**: 99.5% monthly (approx. 3.5 hours of downtime max)
- We consider chatwithus.live "unavailable" when:
  - The web UI fails to load or times out
  - Toot delivery is delayed by >10 minutes
  - Federation is broken for more than 30 minutes

## Redundancy

- PostgreSQL cluster with HA failover
- Redis and Sidekiq monitored 24/7
- Mastodon is backed by ZFS storage and hourly snapshots

## Exceptions

- Scheduled maintenance (see Maintenance Policy)
- DDoS or external platform failures (e.g., relay outages)

diff --git a/procedures/mastodon/mastodon-user-policy.md b/procedures/mastodon/mastodon-user-policy.md
new file mode 100644
index 0000000..139b53c
--- /dev/null
+++ b/procedures/mastodon/mastodon-user-policy.md
@@ -0,0 +1,26 @@
# Mastodon User Policy

This document governs behavior on our Mastodon instance, **chatwithus.live**.

## Behavior Expectations

- No harassment, hate speech, or targeted abuse
- No spam, bots, or auto-posting without permission
- No doxxing or sharing of private information

## Federation

- Defederated instances may not be interacted with via this server
- Federation decisions are made by the moderation team

## Account Management

- Inactive accounts with 0 posts may be purged after 90 days
- Users must keep a valid email address on file
- Multiple accounts are allowed, but abuse may result in bans

## Banned Activities

- Disruptive scraping or crawling of the API
- Hosting or linking to malware/phishing content
- Evading moderation decisions with alternate accounts
diff --git a/procedures/planned_db_cluster_ZFS.md b/procedures/planned_db_cluster_ZFS.md
new file mode 100644
index 0000000..63be9d6
--- /dev/null
+++ b/procedures/planned_db_cluster_ZFS.md
@@ -0,0 +1,34 @@
# PostgreSQL High-Availability Architecture with ZFS (Genesis Hosting)

```plaintext
            +-------------------------------+
            |      Client Applications      |
            +---------------+---------------+
                            |
                            v
                   +-----------------+
                   |     HAProxy     |
                   | (Load Balancer) |
                   +--------+--------+
                            |
              +-------------+-------------+
              |                           |
              v                           v
      +---------------+           +---------------+
      | Primary Node  |           | Replica Node  |
      |  (DB Server)  |           |  (DB Server)  |
      +-------+-------+           +-------+-------+
              |                           |
              v                           v
      +---------------+           +---------------+
      |  ZFS Storage  |           |  ZFS Storage  |
      |   (RAIDZ1)    |           |   (RAIDZ1)    |
      +-------+-------+           +-------+-------+
              |                           |
              +-------------+-------------+
                            |
                            v
                   +---------------+
                   |  Backup Node  |
                   | (ZFS RAIDZ1)  |
                   +---------------+
```
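To make the HAProxy layer concrete, here is a minimal sketch of a `haproxy.cfg` backend for this layout; the host addresses are placeholders, and a production setup would normally use a primary-aware health check rather than a plain TCP check.

```
# Illustrative haproxy.cfg fragment: send PostgreSQL traffic to the primary,
# fail over to the replica only if the primary stops answering.
listen postgres
    bind *:5432
    mode tcp
    option tcp-check
    default-server inter 3s fall 3 rise 2
    server db1 10.0.0.11:5432 check
    server db2 10.0.0.12:5432 check backup
```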
diff --git a/procedures/runv1.md b/procedures/runv1.md
new file mode 100644
index 0000000..6c78a31
--- /dev/null
+++ b/procedures/runv1.md
@@ -0,0 +1,107 @@
Genesis Radio Mission Control Runbook (v1)
Genesis Radio Mission Control: Ops Runbook

    Purpose:
    Quickly diagnose and fix common Genesis Radio infrastructure issues without guesswork, even under pressure.

If a Mount is Lost (Q:\ or R:\)

Symptoms:

    Station playback errors

    Skipping or dead air after a Station ID

    Log shows: Audio Engine Timeout on Q:\ or R:\ paths

Immediate Actions:

    Check if drives Q:\ and R:\ are visible in Windows Explorer.

    Open C:\genesis_rclone_mount.log and check the last 10 lines.

    Run Mount Guardian manually:

    powershell.exe -ExecutionPolicy Bypass -File "C:\scripts\mount_guardian.ps1"

    Wait 15 seconds.

    Verify that Q:\ and R:\ reappear.

    If re-mounted, check the log for a successful mount entry.

If Mount Guardian fails to remount:

    Check if rclone.exe is missing or updated incorrectly.

    Check disk space on L:\ and X:\ cache drives.

    Manually run rclone mounts with correct flags (see below).

Manual Rclone Mount Commands (Emergency)

rclone mount genesisassets:genesisassets Q:\ --vfs-cache-mode writes --vfs-cache-max-size 3T --vfs-cache-max-age 48h --vfs-read-ahead 1G --buffer-size 1G --cache-dir L:\assetcache --cache-dir X:\cache --no-traverse --rc --rc-addr :5572

rclone mount genesislibrary:genesislibrary R:\ --vfs-cache-mode writes --vfs-cache-max-size 3T --vfs-cache-max-age 48h --vfs-read-ahead 1G --buffer-size 1G --cache-dir L:\assetcache --cache-dir X:\cache --no-traverse --rc --rc-addr :5572

Always mount assets (Q:\) first, then library (R:\).
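Because the mount commands start rclone's remote-control listener on port 5572, you can also sanity-check the mounts from the SPL server without touching them. A small sketch, assuming a recent rclone (the exact rc commands vary by version, and both default to http://localhost:5572):

rclone rc core/stats

rclone rc mount/listmounts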
If Mastodon DMs a Mount Failure Alert

Message example:

    Genesis Radio Ops: Failed to mount Q:\ after recovery attempt!

Actions:

    Immediately check C:\genesis_rclone_mount.log

    Verify whether the mount succeeded after the retry

    If not: manually run Mount Guardian

    Escalate if disk space or a critical cache drive failure is suspected

If Dashboard Data Looks Broken

Symptoms:

    Health dashboard empty

    No refresh

    Tables missing

Actions:

    Check that the healthcheck HTML generator is still scheduled.

    SSH into Krang:

    systemctl status healthcheck.timer

    Restart healthcheck if necessary:

    systemctl restart healthcheck.timer

    Check the /var/www/html/healthcheck.html timestamp.

Log Rotation and Space

    The logfile is rotated automatically each week if it is over 5 MB.

    If needed, rotate manually:

    powershell.exe -ExecutionPolicy Bypass -File "C:\scripts\rotate_mount_logs.ps1"

Critical Reminders (Go Slow to Go Fast)

    Breathe. Double-check before restarting services.

    Don't panic-restart Windows unless all mount attempts fail.

    Document what you changed. Always.

Mission: Keep Genesis Radio running, clean, and stable.

Scripters are smarter than panickers.
Calm is contagious.