# cg.cx

> End-to-end encrypted content sharing via Telegram - with a modern web frontend.

**cg.cx** is a privacy-first file and text sharing platform built as a Telegram bot and Axum web service. Users upload content through a Telegram bot; the service encrypts every file with its own key, stores it securely, and shares it via a short 12-character ID. Recipients view or download content through a lightweight Svelte 5 web interface with automatic decryption on the fly.

---

## Project Overview

### What it is

cg.cx lets Telegram users upload media, documents, or plain text and receive a short shareable link (`https://cg.cx/?cxid=AbCdEfGhIjKl`). All content is encrypted at rest using **XChaCha20-Poly1305** with per-file content encryption keys (CEKs) wrapped by a master key. Because the server holds the master key, it handles plaintext during upload and delivery; only ciphertext is stored.

### Key Features

| Feature | Description |
| --- | --- |
| **End-to-End Encryption** | Every file is encrypted with a unique CEK using XChaCha20 secretstream; only the server (with the master key) can decrypt for delivery. |
| **Short Shareable IDs** | Content is addressed by 12-character alphanumeric IDs (e.g., `AbCdEfGhIjKl`). |
| **Auto-Destruct** | Uploaders can set a max view count; content self-destructs once the limit is reached. |
| **Password Protection** | Optional per-content passwords with Argon2id-hashed verification and HMAC-SHA256 session cookies. |
| **Admin Moderation** | Blacklist / whitelist user IDs, delete content, review reports via Telegram admin groups. |
| **Reporting** | Users can report content via the homepage Misc section or the Telegram bot; reports are routed to review groups with inline admin actions. |
| **Author Visibility** | Uploaders can toggle whether their Telegram username/ID is shown on the share page. |
| **Username Tracking** | Username changes are logged to a JSON file for audit and moderation purposes. |
| **Global Ban Config** | Optional `global_ban` flag propagates punishments across all configured admin groups, review groups, and active forward chats. |
| **Content Deduplication** | BLAKE3 plaintext hashing enables automatic reuse of existing encrypted files when identical content is re-uploaded. |
| **Hash Blacklist** | Moderators can block re-uploads of known-banned content by its plaintext hash at ingestion time. |
| **Streaming Decryption** | Large encrypted files are decrypted and streamed chunk-by-chunk without loading the whole file into memory. |
| **Content Typing & Safety** | Automatic MIME detection and render flags mark dangerous or executable files for safe handling. |

### Architecture at a Glance

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Telegram User  │────▶│    cgcx-bot     │────▶│   cgcx-server   │
│ (upload / cmd)  │     │   (Teloxide)    │     │  (Axum / web)   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │                       │
                                 ▼                       ▼
                        ┌─────────────────┐     ┌─────────────────┐
                        │   cgcx-file-    │────▶│    Svelte 5     │
                        │    pipeline     │     │    Frontend     │
                        │ (encrypt/store) │     │    (viewer)     │
                        └─────────────────┘     └─────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │  SQLite3 + WAL  │
                        │   (metadata)    │
                        └─────────────────┘
```

---

## Architecture

cg.cx is organized as a **Rust workspace** with 10 focused crates. This modular design separates concerns, enables independent unit testing, and allows the bot and server binaries to pull in only the crates they need.

| Crate | Purpose |
| --- | --- |
| `cgcx-core` | Shared domain types: `ContentId`, `User`, `Content`, `ContentFile`, `Report`, error enums, and result types. Zero external dependencies beyond `serde` and `chrono`. |
| `cgcx-config` | Hierarchical configuration loader (`config/default.toml` → `config/local.toml` → `CGCX_*` env vars) with validation. |
| `cgcx-crypto` | Cryptographic primitives: XChaCha20 secretstream encryption/decryption, AES-KW key wrapping, BLAKE3 hashing, master key loading. |
| `cgcx-db` | SQLite access layer with `rusqlite`, embedded migrations (`rusqlite_migration`), and async repository patterns for users, content, files, reports, and admin actions. |
| `cgcx-storage` | Filesystem abstraction: path generation by MIME type, directory creation, temp file handling, and cleanup. |
| `cgcx-content-typing` | MIME type detection (`infer` + `mime_guess`) and render-flag computation for safe UI handling of dangerous files. |
| `cgcx-file-pipeline` | High-level upload orchestration: ingests raw bytes, detects type, encrypts via `cgcx-crypto`, stores via `cgcx-storage`, and records metadata via `cgcx-db`. |
| `cgcx-moderation` | Runtime moderation lists (blacklist / whitelist) loaded from JSON, with configurable share modes (`b` = blocklist, `w` = allowlist) and auto-reload. |
| `cgcx-bot` | **Binary crate** - Telegram bot built on `teloxide`. Handles dialogue flows, uploads, terms acceptance, reporting, and admin commands. |
| `cgcx-server` | **Binary crate** - Axum HTTP server. Serves the Svelte frontend, streams decrypted files, enforces view limits, and validates password cookies. |

### Why a Modular Crate Structure?

- **Separation of concerns**: Crypto logic cannot accidentally depend on Telegram bot internals; the database layer knows nothing about HTTP.
- **Testability**: Each crate can be unit-tested in isolation. `cgcx-core` and `cgcx-crypto` have no async runtime requirements, making them fast to test.
- **Independent deployment**: In the future, the bot and server could be built as separate container images sharing only the library crates.
- **Compile-time enforcement**: Cargo checks the workspace dependency graph, so `cgcx-crypto`, for example, cannot call into bot or HTTP server code unless that dependency is declared explicitly.

---

## Security Design

### Cryptographic Primitives

| Layer | Algorithm | Purpose |
| --- | --- | --- |
| **Secretstream** | XChaCha20-Poly1305 (libsodium) | Encrypts file plaintext into an authenticated ciphertext stream. |
| **Key Wrapping** | AES-KW (AES-256 Key Wrap, RFC 3394) | Wraps each per-file CEK with the master key. |
| **Integrity Hash** | BLAKE3 | Computes a hash over the ciphertext stream (including the secretstream header) for tamper detection. |
| **Password Hashing** | Argon2id | Hashes optional per-content passwords. |
| **Cookie MAC** | HMAC-SHA256 | Integrity MAC for password-verification session cookies using constant-time comparison. |
| **ID Entropy** | Rejection sampling over `[A-Za-z0-9]` | 12-character IDs provide ~71 bits of entropy. |

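
The ID row above can be illustrated with a minimal sketch of rejection sampling. This is not the project's actual generator: `id_from_bytes` and `id_entropy_bits` are hypothetical names, and the byte source is assumed to be a cryptographically secure RNG supplied by the caller. Bytes of 248 or more are discarded because 248 = 4 × 62, so `byte % 62` stays uniform over the remaining values.

```rust
/// The 62-symbol alphabet `[A-Za-z0-9]` used for content IDs.
const ALPHABET: &[u8; 62] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
/// Length of a content ID in characters.
const ID_LEN: usize = 12;

/// Builds a 12-character ID from a stream of random bytes (hypothetical helper).
/// Bytes >= 248 are rejected so that `byte % 62` has no modulo bias.
/// Returns `None` if the source runs dry before 12 symbols are accepted.
pub fn id_from_bytes<I: IntoIterator<Item = u8>>(random: I) -> Option<String> {
    let mut id = String::with_capacity(ID_LEN);
    for byte in random {
        if byte >= 248 {
            continue; // rejected: would bias the first 8 symbols
        }
        id.push(ALPHABET[(byte % 62) as usize] as char);
        if id.len() == ID_LEN {
            return Some(id);
        }
    }
    None
}

/// Entropy in bits of a uniformly random ID: 12 * log2(62).
pub fn id_entropy_bits() -> f64 {
    ID_LEN as f64 * (ALPHABET.len() as f64).log2()
}
```

With a uniform byte source, each accepted byte selects one of 62 symbols with equal probability, which is where the roughly 71-bit figure (12 × log2 62) comes from.
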
### Encryption Flow

1. **Generate CEK**: For every uploaded file, `cgcx-crypto` generates a random 256-bit `ContentKey`.
2. **Encrypt**: The file is fed through `sodiumoxide::crypto::secretstream::xchacha20poly1305` in chunks (up to 1 MiB). The final chunk is tagged `Final`.
3. **Hash**: A running BLAKE3 hash covers the secretstream header and every ciphertext chunk.
4. **Wrap Key**: The CEK is wrapped with AES-KW using the 256-bit master key. The wrapped key + a version byte is stored in SQLite.
5. **Store**: The ciphertext file is moved from temp storage to its final path (`data/media|documents|text/<cxid>/...`).

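
Step 2 above can be sketched without any crypto dependency by looking only at how a buffer is cut into chunks and which chunk carries the `Final` tag. `ChunkTag`, `CHUNK_SIZE`, and `plan_chunks` are illustrative names, not the project's API, and the handling of empty input (a single empty `Final` chunk) is an assumption of this sketch.

```rust
/// Tag attached to each secretstream message in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChunkTag {
    Message,
    Final,
}

/// Assumed maximum chunk size: 1 MiB.
pub const CHUNK_SIZE: usize = 1024 * 1024;

/// Splits `plaintext` into chunks of at most `chunk_size` bytes and tags
/// the last one `Final`. Empty input yields one empty `Final` chunk so the
/// stream is always explicitly terminated.
pub fn plan_chunks(plaintext: &[u8], chunk_size: usize) -> Vec<(&[u8], ChunkTag)> {
    assert!(chunk_size > 0, "chunk size must be non-zero");
    if plaintext.is_empty() {
        return vec![(plaintext, ChunkTag::Final)];
    }
    // Number of chunks, rounding up.
    let total = (plaintext.len() + chunk_size - 1) / chunk_size;
    plaintext
        .chunks(chunk_size)
        .enumerate()
        .map(|(i, chunk)| {
            let tag = if i + 1 == total {
                ChunkTag::Final
            } else {
                ChunkTag::Message
            };
            (chunk, tag)
        })
        .collect()
}
```

In a real pipeline each `(chunk, tag)` pair would be passed to the secretstream push call; tagging the last chunk lets the decrypting side detect truncation, because a stream that ends without `Final` is incomplete.
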
### Decryption Flow

1. **Unwrap CEK**: The server unwraps the per-file CEK using the master key.
2. **Init Stream**: `DecryptStream` is initialized with the stored secretstream header.
3. **Stream**: Ciphertext is read from disk in ~1 MiB chunks, decrypted, and pushed to the HTTP response body via a Tokio channel.
4. **Verify**: Every chunk is authenticated as it is decrypted. If authentication fails (tampered or truncated data), the stream aborts and the client receives a truncated response.

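
The producer/consumer shape of steps 3 and 4 can be sketched with the standard library alone. This is an illustration under stated assumptions: `std::sync::mpsc` and a thread stand in for the Tokio channel and task, `decrypt_chunk` is a placeholder that treats a chunk starting with `0xFF` as tampered, and all names are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Error sent down the channel when a chunk fails authentication.
#[derive(Debug, PartialEq)]
pub struct DecryptError {
    pub chunk_index: usize,
}

/// Placeholder for authenticated decryption: a chunk whose first byte is
/// 0xFF is treated as tampered; anything else is passed through unchanged.
fn decrypt_chunk(index: usize, ciphertext: &[u8]) -> Result<Vec<u8>, DecryptError> {
    if ciphertext.first() == Some(&0xFF) {
        Err(DecryptError { chunk_index: index })
    } else {
        Ok(ciphertext.to_vec())
    }
}

/// Decrypts chunks on a worker thread and forwards them over a bounded
/// channel. The worker stops at the first failure, so nothing after a bad
/// chunk is ever emitted.
pub fn stream_chunks(chunks: Vec<Vec<u8>>) -> mpsc::Receiver<Result<Vec<u8>, DecryptError>> {
    let (tx, rx) = mpsc::sync_channel(2);
    thread::spawn(move || {
        for (index, chunk) in chunks.iter().enumerate() {
            let result = decrypt_chunk(index, chunk);
            let failed = result.is_err();
            // Stop if the receiver went away or this chunk was rejected.
            if tx.send(result).is_err() || failed {
                break;
            }
        }
    });
    rx
}

/// Collects the body as a client would see it: the bytes received so far
/// and whether the stream completed cleanly.
pub fn collect_body(rx: mpsc::Receiver<Result<Vec<u8>, DecryptError>>) -> (Vec<u8>, bool) {
    let mut body = Vec::new();
    for item in rx {
        match item {
            Ok(bytes) => body.extend_from_slice(&bytes),
            Err(_) => return (body, false),
        }
    }
    (body, true)
}
```

The bounded channel keeps at most a couple of chunks in flight, which is what lets large files be served without holding them in memory.
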
### Password Protection

- Passwords are hashed with **Argon2id** and stored in the `contents` table.
- On successful verification, the server issues an `__Host-pw` cookie containing a base64-encoded `cxid:MAC` pair.
- The MAC is computed via **HMAC-SHA256** over the content ID using a server-side secret (derived from the master key).
- Cookie attributes: `Secure`, `HttpOnly`, `SameSite=Strict`, `Max-Age=3600`.

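
Two details of the cookie handling above can be sketched in plain Rust: comparing a received MAC against the expected one without an early exit, and assembling the `Set-Cookie` value. Both function names are hypothetical. `Path=/` is included as an assumption about the real header: it is not in the attribute list above, but browsers only accept `__Host-` cookies that carry it.

```rust
/// Compares two byte slices without returning early on the first mismatch.
/// The length check is not secret-dependent for fixed-size MACs.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences instead of branching on each byte.
        diff |= x ^ y;
    }
    diff == 0
}

/// Formats the `Set-Cookie` value for an already base64-encoded `cxid:MAC`
/// payload, using the attributes listed above plus `Path=/`.
pub fn pw_cookie_header(encoded_value: &str) -> String {
    format!(
        "__Host-pw={}; Path=/; Secure; HttpOnly; SameSite=Strict; Max-Age=3600",
        encoded_value
    )
}
```

A hand-written loop like this shows the idea, but an optimizing compiler is free to rewrite it; production code would normally rely on a vetted constant-time comparison from a crypto library.
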
### Master Key Handling

- The master key is a 256-bit value loaded from either an environment variable (`CGCX_AES_MASTER_KEY`) or a file.
- If loaded from a file, the key is expected as 64 hex characters.
- On Unix systems, newly generated key files are automatically chmodded to `0o600`.
- The key fingerprint (first 8 bytes of BLAKE3 hash) is logged at startup for audit purposes; the full key is never logged.

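
Turning the 64-character hex form into a 32-byte key can be sketched as follows. `parse_master_key` and `KeyError` are hypothetical names, and trimming surrounding whitespace (such as a trailing newline in a key file) is an assumption of this sketch rather than documented behavior.

```rust
/// Reasons a master key string can be rejected in this sketch.
#[derive(Debug, PartialEq)]
pub enum KeyError {
    /// The trimmed input was not 64 characters long (actual length given).
    WrongLength(usize),
    /// A non-hex character was found at this byte offset.
    InvalidHex(usize),
}

/// Parses 64 hex characters (surrounding whitespace ignored) into a
/// 256-bit key.
pub fn parse_master_key(input: &str) -> Result<[u8; 32], KeyError> {
    let hex = input.trim().as_bytes();
    if hex.len() != 64 {
        return Err(KeyError::WrongLength(hex.len()));
    }
    let mut key = [0u8; 32];
    for (i, pair) in hex.chunks(2).enumerate() {
        let hi = hex_value(pair[0]).ok_or(KeyError::InvalidHex(2 * i))?;
        let lo = hex_value(pair[1]).ok_or(KeyError::InvalidHex(2 * i + 1))?;
        key[i] = (hi << 4) | lo;
    }
    Ok(key)
}

/// Maps one ASCII hex digit to its value, accepting both cases.
fn hex_value(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10),
        b'A'..=b'F' => Some(c - b'A' + 10),
        _ => None,
    }
}
```

Rejecting anything other than exactly 64 hex digits at load time means a truncated or mistyped key fails fast instead of silently producing a weaker key.
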

---

## Tech Stack

| Layer | Technology |
| --- | --- |
| **Backend** | Rust (edition 2021), Tokio async runtime |
| **Web Server** | Axum 0.7, Tower HTTP middleware |
| **Telegram Bot** | Teloxide 0.13 |
| **Frontend** | Svelte 5, Vite 5 |
| **Database** | SQLite 3 (WAL mode), `rusqlite` + `rusqlite_migration` |
| **Cryptography** | libsodium (via `sodiumoxide`), `aes-kw`, `blake3`, `argon2`, `hmac`, `sha2` |
| **Serialization** | `serde`, `serde_json` |
| **Observability** | `tracing` + `tracing-subscriber` |

---

## Prerequisites

- **Rust** toolchain: stable 1.78 or newer (nightly also works)
- **Node.js** 20+ and `npm` (for the frontend)
- **SQLite 3** (bundled via `rusqlite`, but the CLI is useful for inspection)
- A **Telegram Bot Token** from [@BotFather](https://t.me/botfather)
- A 256-bit master key (64 hex characters) for encryption

---

## Building

### Rust Workspace

Build all crates (library + binaries):

```bash
cargo build --workspace
```

Build optimized release binaries:

```bash
cargo build --workspace --release
```

The release profile enables thin LTO, a single codegen unit, and binary stripping for minimal size.

### Frontend

```bash
cd frontend
npm install
npm run build
```

The static assets are emitted to `frontend/dist/` and served by `cgcx-server` at runtime.

---

## Configuration

cg.cx uses a layered configuration system; later layers override earlier ones:

1. `config/default.toml` - committed defaults
2. `config/local.toml` - local overrides (gitignored)
3. `CGCX_*` environment variables - runtime overrides

Environment variables use a double underscore as the separator between nesting levels, e.g.:

```bash
export CGCX_SERVER__PORT=3000
export CGCX_TELEGRAM__BOT_TOKEN="your_token_here"
export CGCX_CRYPTO__AES_MASTER_KEY_SOURCE__TYPE="env"
export CGCX_CRYPTO__AES_MASTER_KEY_SOURCE__VAR="CGCX_AES_MASTER_KEY"
export CGCX_AES_MASTER_KEY="aabbccdd..." # 64 hex chars
```

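
How a variable name maps onto a config key, and how the three layers combine, can be sketched as below. This is not the actual loader in `cgcx-config`: `env_to_path` and `resolve` are hypothetical helpers, and lowercasing each segment is an assumption based on the section names shown in this README.

```rust
/// Splits a name such as `CGCX_SERVER__PORT` into its config path,
/// `["server", "port"]`. Returns `None` for names without the `CGCX_`
/// prefix or with nothing after it.
pub fn env_to_path(name: &str) -> Option<Vec<String>> {
    let rest = name.strip_prefix("CGCX_")?;
    if rest.is_empty() {
        return None;
    }
    // A double underscore separates nesting levels; single underscores
    // stay part of the key name.
    Some(rest.split("__").map(|part| part.to_lowercase()).collect())
}

/// Resolves one setting across the three layers: an environment override
/// wins over the local file, which wins over the committed default.
pub fn resolve<T>(default: T, local: Option<T>, env: Option<T>) -> T {
    env.or(local).unwrap_or(default)
}
```

For example, `CGCX_CRYPTO__AES_MASTER_KEY_SOURCE__TYPE` splits into `crypto`, `aes_master_key_source`, and `type`, which is why single underscores can safely appear inside key names.
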
### Config Sections

| Section | Description |
| --- | --- |
| `[content]` | Auto-destruct behavior (`keep_content`, `share_mode`, `default_allow_download`, `default_max_views`). |
| `[crypto]` | Master key source (`env` or `file`). |
| `[telegram]` | Bot token and optional custom API URL. |
| `[groups]` | `admin_group_ids` and `review_group_ids` (Telegram chat IDs). |
| `[storage]` | Filesystem paths for `media`, `documents`, `text`, `temp`, and the streaming chunk size. |
| `[upload_limits]` | `max_batch_size`, `max_file_size_bytes`, `max_total_batch_bytes`. |
| `[server]` | `base_url`, `bind_address`, `port`. |
| `[rate_limiting]` | Per-minute request limits, burst capacity, and password-attempt limits. |
| `[logging]` | `level` (e.g., `info`, `debug`). |
| `[frontend.behavior_toggles]` | Feature flags for retro animations and particles. |

### Validating Config

Both binaries validate configuration on startup. Key checks include:

- Chunk size between 8 MiB and 256 MiB
- Bot token is set and not the placeholder
- Upload and rate-limiting values are non-zero
- Master key source is fully specified

---

## Running

### Run the Web Server

```bash
cargo run -p cgcx-server
```

The server binds to `127.0.0.1:8080` by default and serves:

- `/` - Svelte frontend
- `/api/health` - health check
- `/api/content/:cxid` - metadata JSON
- `/api/content/:cxid/verify-password` - password verification
- `/api/content/:cxid/file/:file_idx` - streamed decrypted file
- `/assets/*` - static frontend assets

### Run the Telegram Bot

```bash
cargo run -p cgcx-bot
```

The bot processes updates from Telegram, handles user dialogues, and triggers the file pipeline for uploads.

### Run Both Simultaneously

Because the bot and server are separate binaries, they can run side-by-side sharing the same SQLite database and data directories:

```bash
# Terminal 1
cargo run -p cgcx-server

# Terminal 2
cargo run -p cgcx-bot
```

Ensure both processes point to the same database path and storage directories via shared configuration.

---

## Database Migrations

Migrations are managed by `rusqlite_migration` and embedded into the `cgcx-db` crate at compile time.

- `migrations/001_init.sql` - Creates `users`, `contents`, `content_files`, `reports`, and `admin_actions` tables.
- `migrations/002_indexes.sql` - Adds performance indexes on foreign keys, status columns, and report state.
- `migrations/003_forward_system.sql` - Forward definitions, submissions, and per-forward access lists.
- `migrations/004_punishments.sql` - Punishment records with auto-expiration support.
- `migrations/005_show_author.sql` - Adds `show_author` column to `contents`.
- `migrations/006_dedup.sql` - Adds `plaintext_hash` and `ref_count` to `content_files` for deduplication.
- `migrations/007_hash_blacklist.sql` - Creates the `hash_blacklist` table for blocked content hashes.

On startup, both the bot and server call `db.run_migrations()`, which applies any pending migrations automatically. The database is opened with:

- `PRAGMA journal_mode = WAL;`
- `PRAGMA foreign_keys = ON;`
- `PRAGMA busy_timeout = 5000;`

### Manual Inspection

```bash
sqlite3 data/db.sqlite ".schema"
sqlite3 data/db.sqlite ".indexes"
```

---

## Deployment

### systemd Service

Create `/etc/systemd/system/cgcx-server.service`:

```ini
[Unit]
Description=cg.cx Web Server
After=network.target

[Service]
Type=simple
User=cgcx
Group=cgcx
WorkingDirectory=/opt/cgcx
Environment="RUST_LOG=info"
Environment="CGCX_AES_MASTER_KEY=<64-hex-chars>"
ExecStart=/opt/cgcx/cgcx-server
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Create a similar service for `cgcx-bot`. Reload and enable:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now cgcx-server cgcx-bot
```

### Reverse Proxy (nginx)

```nginx
server {
    listen 443 ssl http2;
    server_name cg.cx;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Support streaming
        proxy_buffering off;
        proxy_request_buffering off;
    }
}
```

### TLS

Use Let's Encrypt (certbot) or a managed TLS terminator. The `__Host-pw` cookie requires HTTPS (`Secure` flag).

### File Permissions

- The master key file (if used instead of env) **must** be readable only by the service user:
  ```bash
  chmod 600 /opt/cgcx/master.key
  chown cgcx:cgcx /opt/cgcx/master.key
  ```
- Data directories (`data/media`, `data/documents`, `data/text`, `data/temp`) should be owned by the service user.

---

## Administration

### Admin Commands

Admin commands are restricted to users in configured `admin_group_ids` who also have the `admin` role in the database.

| Command | Usage | Description |
| --- | --- | --- |
| `/reload` | `/reload` | Reloads moderation lists from disk (`data/blacklisted_ids.json`, `data/whitelisted_ids.json`). |
| `/blacklist_uid` | `/blacklist_uid <user_id>` | Blacklists a Telegram user ID globally and sets their role to `banned`. Shows usage info if the ID is missing. |
| `/whitelist_uid` | `/whitelist_uid <user_id>` | Removes a user from the global blacklist and restores their role to `user`. Shows usage info if the ID is missing. |

### Review Groups

Reports submitted by users are forwarded to all configured `review_group_ids` with an inline keyboard:

- **🗑⛔ Rmv + Ban** - Deletes the reported content and blacklists the uploader.
- **🗑 Delete Only** - Deletes the reported content.
- **⛔ Blacklist Only** - Blacklists the uploader and sets their role to `banned`.
- **📝 Ignore** - Dismisses the report.

### Moderation Modes

- **Blocklist mode (`share_mode = "b"`)**: Everyone can upload except blacklisted IDs.
- **Allowlist mode (`share_mode = "w"`)**: Only whitelisted IDs can upload.

Moderation lists are hot-reloaded every 30 seconds by a background task, or immediately via `/reload`.

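
The two share modes reduce to a small decision function. The sketch below is an illustration, not the code in `cgcx-moderation`: the type and method names are hypothetical, user IDs are assumed to fit in `i64`, and in allowlist mode only the whitelist is consulted, which is an assumption about how a user on both lists would be treated.

```rust
use std::collections::HashSet;

/// Upload policy selected by `share_mode` in the config.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShareMode {
    Blocklist,
    Allowlist,
}

impl ShareMode {
    /// Parses the config value: "b" for blocklist, "w" for allowlist.
    pub fn parse(value: &str) -> Option<ShareMode> {
        match value {
            "b" => Some(ShareMode::Blocklist),
            "w" => Some(ShareMode::Allowlist),
            _ => None,
        }
    }
}

/// In-memory view of the two JSON lists.
pub struct ModerationLists {
    pub blacklist: HashSet<i64>,
    pub whitelist: HashSet<i64>,
}

impl ModerationLists {
    /// Decides whether `user_id` may upload under the given mode.
    pub fn can_upload(&self, mode: ShareMode, user_id: i64) -> bool {
        match mode {
            // Everyone except blacklisted IDs.
            ShareMode::Blocklist => !self.blacklist.contains(&user_id),
            // Only whitelisted IDs.
            ShareMode::Allowlist => self.whitelist.contains(&user_id),
        }
    }
}
```

Keeping the lists in hash sets makes each upload check a constant-time lookup, so a periodic reload only has to swap in a freshly parsed pair of sets.
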

---

## Development

### Dev Mode (Frontend)

```bash
cd frontend
npm install
npm run dev
```

The Vite dev server runs separately; if needed, point `server.base_url` in `config/local.toml` at your local frontend proxy.

### Dev Mode (Backend)

```bash
# Server with tracing
cargo run -p cgcx-server

# Bot with tracing
cargo run -p cgcx-bot
```

Set `RUST_LOG=debug` for verbose output:

```bash
RUST_LOG=debug cargo run -p cgcx-server
```

### Testing

Run workspace tests:

```bash
cargo test --workspace
```

Individual crate tests:

```bash
cargo test -p cgcx-core
cargo test -p cgcx-crypto
cargo test -p cgcx-content-typing
```

### Useful Debug Tips

- Inspect SQLite directly: `sqlite3 data/db.sqlite "SELECT * FROM contents;"`
- Check moderation lists: `cat data/blacklisted_ids.json`
- Verify the master key fingerprint in the startup logs.

---

## License

MIT License - see [LICENSE](LICENSE) for details.

---

## Security Disclosure

If you discover a security vulnerability, please do not open a public issue. Contact the maintainers directly through the admin channels configured in the bot.