# Architecture & Design Decisions

This document explains the deeper design choices behind cg.cx - the trade-offs, threat models, and engineering rationale that shaped the system.

---

## Why XChaCha20-Poly1305 over AES-GCM?
We chose **XChaCha20-Poly1305** (via libsodium's `crypto_secretstream_xchacha20poly1305`) as the bulk encryption primitive for several reasons:

1. **Negligible nonce-collision risk**: AES-GCM's security collapses catastrophically if a nonce is ever reused under the same key, and its 96-bit nonce leaves little margin when nonces are chosen at random. XChaCha20 uses a 192-bit nonce, so randomly generated nonces will not collide in practice even with billions of files. Neither construction is nonce-misuse resistant in the formal sense; the larger nonce simply makes accidental reuse implausible, which removes an entire class of operator error.
2. **No hardware dependency**: AES-GCM performance relies heavily on AES-NI. XChaCha20 performs well on all platforms - including older or virtualized CPUs where AES-NI may be unavailable or disabled.
3. **Streaming integrity**: libsodium's `secretstream` API provides built-in chunked authenticated encryption with `Message` and `Final` tags. This gives us streaming decryption with per-chunk integrity checks without inventing our own framing protocol.
4. **Simpler key management**: Because nonce collisions are not a practical concern, we can generate a fresh random key for every file without tracking nonce counters or key lifecycles.
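To put the nonce argument in point 1 into numbers, the sketch below applies the birthday approximation p ≈ n² / 2^(bits+1) to random nonces. It assumes, purely for illustration, one billion nonces drawn under a single key; cg.cx uses a fresh key per file, so the real exposure is smaller still. With those assumptions the result is roughly 6e-12 for a 96-bit nonce and roughly 8e-41 for a 192-bit nonce. The function name is hypothetical.

```rust
/// Approximate probability that at least two of `n` uniformly random
/// `bits`-bit nonces collide (birthday bound: p ≈ n² / 2^(bits+1)).
/// The approximation is only meaningful while the result is far below 1.
fn collision_probability(n: f64, bits: i32) -> f64 {
    n * n / 2f64.powi(bits + 1)
}

fn main() {
    // Assumption for illustration: one billion random nonces under one key.
    let files = 1e9;
    let gcm = collision_probability(files, 96);
    let xchacha = collision_probability(files, 192);
    println!("96-bit nonce:  p ≈ {gcm:e}");
    println!("192-bit nonce: p ≈ {xchacha:e}");
}
```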
AES is still present in the system - we use **AES-256-KW** (Key Wrap) to encrypt the per-file content keys (CEKs) with the master key. AES-KW was chosen because it is a standard, deterministic, and widely audited key-wrapping algorithm with built-in integrity.

---

## Why SQLite over PostgreSQL?
For a self-hosted, single-tenant service handling encrypted file metadata, **SQLite** is the correct default:

1. **Operational simplicity**: No separate database server to install, upgrade, or network-secure. A single `.sqlite` file is trivial to back up, replicate, or inspect.
2. **WAL mode performance**: With `PRAGMA journal_mode = WAL`, SQLite handles concurrent readers and a single writer efficiently - enough for a bot + web server pair.
3. **Schema simplicity**: The schema is small (10 tables, 7 migration files). The overhead of a client/server RDBMS is unjustified.
4. **Deployment footprint**: Ideal for running on a small VPS or even an embedded edge device without container orchestration.

If future requirements demand horizontal scaling or heavy analytics, the repository pattern in `cgcx-db` makes it straightforward to swap in PostgreSQL without touching the bot or server code.

---

## Why a Modular 10-Crate Workspace?
The crate graph was designed to enforce architectural boundaries at compile time:

```
cgcx-core
    ▲
    ├── cgcx-config
    ├── cgcx-crypto
    ├── cgcx-db
    ├── cgcx-storage
    ├── cgcx-content-typing
    │       ▲
    │       └── cgcx-file-pipeline
    ├── cgcx-moderation
    │
    └── binaries: cgcx-bot, cgcx-server
```
- **cgcx-core** sits at the root and contains only pure data types. It has no I/O dependencies, making it safe to import anywhere.
- **cgcx-crypto** depends only on `cgcx-core`. It is side-effect-free and easy to property-test.
- **cgcx-db** and **cgcx-storage** are I/O crates but know nothing about Telegram or HTTP.
- **cgcx-file-pipeline** composes crypto, storage, typing, and DB into the upload workflow.
- The **binaries** are thin shells that wire configuration to the library crates.

This structure makes it impossible for a database query to accidentally invoke Telegram API code, or for HTTP handlers to directly touch the filesystem without going through the storage abstraction.

---

## Streaming Design for Large Files
Uploads from Telegram are bounded by Telegram's own file size limits (currently up to 2 GB for bots that use a self-hosted Bot API server; the hosted Bot API limits bot downloads to 20 MB), but we still treat streaming as a first-class concern.

### Upload Path

1. The bot downloads the file into a `Vec<u8>` in memory.
2. The file pipeline encrypts the data in 1 MiB chunks, writing ciphertext directly to a temp file on disk.
3. After the final chunk is written and flushed, the temp file is atomically renamed to its final destination.
4. Only metadata (original name, MIME type, wrapped key, BLAKE3 hash) hits the database.

The plaintext is still buffered in full in step 1, but the ciphertext never is: even a 1 GB upload does not require a second 1 GB contiguous memory allocation for ciphertext.
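Steps 2 and 3 can be sketched with the standard library alone. This is a minimal illustration, not the pipeline's actual code: `encrypt_chunk` is a hypothetical stand-in that only XORs bytes (the real step is libsodium's secretstream), and `store_encrypted` and the `.tmp` naming are assumptions made for the example.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

const CHUNK_SIZE: usize = 1024 * 1024; // 1 MiB, as in step 2

/// Hypothetical stand-in for the per-chunk AEAD step. It only XORs bytes so
/// the sketch runs without libsodium; `_is_final` mirrors the `Final` tag
/// that would mark the last chunk of a secretstream.
fn encrypt_chunk(chunk: &[u8], _is_final: bool) -> Vec<u8> {
    chunk.iter().map(|b| b ^ 0xAA).collect()
}

/// Writes the "ciphertext" chunk by chunk to a temp file next to `dest`,
/// then renames it into place so readers never observe a partial file.
fn store_encrypted(plaintext: &[u8], dest: &Path) -> io::Result<()> {
    let tmp = dest.with_extension("tmp");
    let mut out = BufWriter::new(File::create(&tmp)?);
    let total = plaintext.chunks(CHUNK_SIZE).count();
    for (i, chunk) in plaintext.chunks(CHUNK_SIZE).enumerate() {
        out.write_all(&encrypt_chunk(chunk, i + 1 == total))?;
    }
    out.flush()?;
    out.get_ref().sync_all()?; // push the data to disk before renaming
    fs::rename(&tmp, dest) // atomic when both paths are on the same filesystem
}

fn main() -> io::Result<()> {
    let dest = std::env::temp_dir().join("cgcx-upload-sketch.bin");
    let plaintext = vec![7u8; 2 * CHUNK_SIZE + 123];
    store_encrypted(&plaintext, &dest)?;
    println!("wrote {} bytes", fs::metadata(&dest)?.len());
    fs::remove_file(&dest)
}
```

Only one chunk of ciphertext is held in memory at a time; the rename is the single point at which the file becomes visible under its final name.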
### Download Path

1. The Axum handler spawns a Tokio task that opens the encrypted file.
2. It reads the 24-byte secretstream header, unwraps the CEK, and initializes a `DecryptStream`.
3. A bounded MPSC channel (`capacity = 4`) decouples disk I/O from the HTTP response stream.
4. Ciphertext is read from disk in ~1 MiB chunks, decrypted, and sent through the channel.
5. Axum's `Body::from_stream` forwards plaintext chunks to the client as they are produced.
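The backpressure and shutdown behaviour of steps 3 to 5 can be illustrated with the standard library. This sketch is an analogy under stated assumptions: it uses `std::sync::mpsc::sync_channel` and an OS thread in place of Tokio's bounded channel and task, and `spawn_decryptor` is a hypothetical name. The producer blocks once four chunks are queued, and it stops as soon as the receiving side (the "client") goes away.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread::{self, JoinHandle};

const CHANNEL_CAPACITY: usize = 4; // matches `capacity = 4` in step 3

/// Spawns a producer that "decrypts" `total_chunks` chunks and sends them
/// through a bounded channel. The join handle yields the number of chunks
/// that were actually sent before the producer stopped.
fn spawn_decryptor(total_chunks: usize) -> (Receiver<Vec<u8>>, JoinHandle<usize>) {
    let (tx, rx) = sync_channel::<Vec<u8>>(CHANNEL_CAPACITY);
    let handle = thread::spawn(move || {
        let mut sent = 0;
        for i in 0..total_chunks {
            let chunk = vec![i as u8; 16]; // placeholder for a decrypted chunk
            // `send` blocks while the channel is full (backpressure) and
            // returns an error once the receiver has been dropped.
            if tx.send(chunk).is_err() {
                break;
            }
            sent += 1;
        }
        sent
    });
    (rx, handle)
}

fn main() {
    let (rx, producer) = spawn_decryptor(1_000);
    // The "client" reads two chunks and then disconnects.
    for _ in 0..2 {
        let chunk = rx.recv().expect("producer is still running");
        println!("received {} bytes", chunk.len());
    }
    drop(rx);
    let sent = producer.join().expect("producer panicked");
    println!("producer stopped after {sent} of 1000 chunks");
}
```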
If the client disconnects mid-stream, the response body is dropped and the receiver half of the channel goes with it; the task's next `send` fails and the decryption task exits cleanly. No full-file buffering occurs on the server.

---

## Security Threat Model

### What We Protect Against
| Threat | Mitigation |
|--------|------------|
| **Server compromise (passive)** | All files are encrypted at rest with per-file keys. An attacker with disk access cannot read plaintext without the master key. |
| **Database leak** | The database contains only wrapped keys, ciphertext and plaintext hashes, and metadata. It does not contain plaintext or unwrapped CEKs. |
| **Ciphertext tampering** | XChaCha20-Poly1305 authenticates every chunk. Tampered files fail decryption and the stream aborts. |
| **Brute-force password guessing** | Per-content passwords are hashed with bcrypt. Rate limiting on `/api/content/:cxid/verify-password` slows online attacks. |
| **Cookie forgery** | Password session cookies include a BLAKE3 MAC keyed by the master key. Forging a cookie requires knowledge of the master key. |
| **Replay / enumeration** | Content IDs are 12-character random strings with ~71 bits of entropy. They are not sequential. |
| **Malicious uploads** | Content typing flags executable, HTML, and script MIME types. The frontend refuses to inline dangerous files. |
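The "~71 bits" figure for content IDs is consistent with 12 characters drawn uniformly from a 62-symbol alphanumeric alphabet. The alphabet is an assumption here (the table does not state it), and the function name is hypothetical; the arithmetic is simply length × log2(alphabet size).

```rust
/// Bits of entropy in an ID of `len` characters drawn uniformly and
/// independently from an alphabet of `alphabet_size` symbols.
fn id_entropy_bits(len: u32, alphabet_size: u32) -> f64 {
    f64::from(len) * f64::from(alphabet_size).log2()
}

fn main() {
    // Assumption: a 62-symbol alphanumeric alphabet (a-z, A-Z, 0-9).
    let bits = id_entropy_bits(12, 62);
    println!("12 chars over 62 symbols ≈ {bits:.1} bits");
}
```

At that size, guessing a valid ID by enumeration is impractical, which is why the IDs can double as unguessable share links.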
### What We Do Not Protect Against

| Threat | Rationale |
|--------|-----------|
| **Active server compromise (key extraction)** | If an attacker gains code execution and reads the master key from memory or env, they can decrypt all content. This is an inherent limitation of server-side encryption. |
| **Telegram MitM** | We trust Telegram's bot API transport (HTTPS) and file CDN. |
| **Client-side malware** | The user's browser or device may be compromised; we cannot protect plaintext after decryption. |
| **Denial of Service** | Large uploads and high request volumes can exhaust disk or bandwidth. Rate limiting and upload size caps mitigate but do not eliminate this risk. |
### Trust Boundaries

```
[User Device] --HTTPS--> [Telegram Cloud] --HTTPS--> [cg.cx Bot]
                                                          |
[Browser] <--HTTPS--> [cg.cx Server] <--------------------┘
    |
Decrypted plaintext rendered in browser
```
The **cg.cx server** is a trusted party for decryption and delivery. It is not a true "end-to-end" system in the Signal sense, because the server must unwrap keys to stream content to browsers that do not possess the master key. The architecture prioritizes **usable sharing** (anyone with a link can view) over **true E2EE** (which would require client-side JavaScript crypto and key distribution).

---

## Hashing for Deduplication and Blacklist
`cgcx-crypto` computes a **BLAKE3 hash over the ciphertext stream** (including the secretstream header) for tamper detection. This hash is stored per-file in `content_files.encrypted_hash`.

In addition, the file pipeline computes a **plaintext BLAKE3 hash** during ingestion:

1. A running hash of the plaintext chunks is computed alongside encryption.
2. The resulting `plaintext_hash` is stored in `content_files` and used for deduplication: when identical plaintext is uploaded, the existing encrypted file is reused and its `ref_count` is incremented.
3. A `hash_blacklist` table (migration `007_hash_blacklist.sql`) allows moderators to block re-uploads of known-banned content by its plaintext hash. The pipeline checks this blacklist before storing any new file and rejects blocked content with a `BlockedHash` error.
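The decision logic in steps 2 and 3 can be sketched in memory. This is an illustration under assumptions, not the pipeline's code: `Index`, `IngestOutcome`, and `IngestError` are hypothetical types, a `HashMap` and `HashSet` stand in for the `content_files` and `hash_blacklist` tables, and the 32-byte hash is taken as an input rather than computed with BLAKE3.

```rust
use std::collections::{HashMap, HashSet};

type PlaintextHash = [u8; 32]; // a BLAKE3 digest is 32 bytes

#[derive(Debug, PartialEq)]
enum IngestOutcome {
    Stored,            // new content: a new encrypted file is kept
    Deduplicated(u32), // existing file reused; carries the new ref_count
}

#[derive(Debug, PartialEq)]
enum IngestError {
    BlockedHash,
}

#[derive(Default)]
struct Index {
    ref_counts: HashMap<PlaintextHash, u32>, // stands in for content_files
    blacklist: HashSet<PlaintextHash>,       // stands in for hash_blacklist
}

impl Index {
    /// Rejects blacklisted hashes first, then either records new content
    /// or bumps the reference count of content that is already stored.
    fn ingest(&mut self, hash: PlaintextHash) -> Result<IngestOutcome, IngestError> {
        if self.blacklist.contains(&hash) {
            return Err(IngestError::BlockedHash);
        }
        let count = self.ref_counts.entry(hash).or_insert(0);
        *count += 1;
        Ok(if *count == 1 {
            IngestOutcome::Stored
        } else {
            IngestOutcome::Deduplicated(*count)
        })
    }
}

fn main() {
    let mut index = Index::default();
    let cat = [1u8; 32];
    let banned = [2u8; 32];
    index.blacklist.insert(banned);
    println!("{:?}", index.ingest(cat)); // first upload
    println!("{:?}", index.ingest(cat)); // identical plaintext again
    println!("{:?}", index.ingest(banned)); // blacklisted content
}
```

Checking the blacklist before touching the reference counts means blocked content never gains a row, which matches the "before storing any new file" ordering described above.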
---

## Future Considerations
- **Client-side decryption**: A future iteration could deliver the wrapped CEK to the browser and decrypt via WebAssembly / libsodium-js. This would remove the server from the trust boundary for delivery.
- **S3-compatible backends**: `cgcx-storage` could be abstracted into a trait to support object storage.
- **PostgreSQL backend**: The repository trait pattern in `cgcx-db` is amenable to an async SQLx implementation.
- **Metrics and alerting**: Structured tracing is in place; a metrics exporter (Prometheus) could be added to `cgcx-server` without touching business logic.
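As one possible shape for the storage abstraction mentioned above, the sketch below defines a small trait with an in-memory backend. Everything in it is hypothetical: `BlobStore`, `MemoryStore`, and the method names are illustrative and do not describe the current `cgcx-storage` API. A filesystem or S3-compatible backend would implement the same trait.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical storage abstraction; names are illustrative only.
trait BlobStore {
    fn put(&mut self, key: &str, bytes: &[u8]) -> io::Result<()>;
    fn get(&self, key: &str) -> io::Result<Vec<u8>>;
}

/// In-memory backend, convenient for tests.
#[derive(Default)]
struct MemoryStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl BlobStore for MemoryStore {
    fn put(&mut self, key: &str, bytes: &[u8]) -> io::Result<()> {
        self.blobs.insert(key.to_owned(), bytes.to_vec());
        Ok(())
    }

    fn get(&self, key: &str) -> io::Result<Vec<u8>> {
        self.blobs
            .get(key)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such blob"))
    }
}

/// Callers depend on the trait, not on a concrete backend.
fn roundtrip(store: &mut dyn BlobStore) -> io::Result<Vec<u8>> {
    store.put("abc", b"ciphertext")?;
    store.get("abc")
}

fn main() -> io::Result<()> {
    let mut store = MemoryStore::default();
    let bytes = roundtrip(&mut store)?;
    println!("read back {} bytes", bytes.len());
    Ok(())
}
```

Because callers such as `roundtrip` only see `dyn BlobStore`, swapping the backend would not require changes in the pipeline or the binaries.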