Initial commit
# cg.cx

> End-to-end encrypted content sharing via Telegram, with a modern web frontend.

**cg.cx** is a privacy-first file and text sharing platform built as a Telegram bot and Axum web service. Users upload content through a Telegram bot; the service encrypts every file with unique per-content keys, stores them securely, and shares them via short 12-character IDs. Recipients view or download content through a lightweight Svelte 5 web interface with automatic decryption on the fly.

---

## Project Overview

### What it is

cg.cx lets Telegram users upload media, documents, or plain text and receive a short shareable link (`https://cg.cx/?cxid=AbCdEfGhIjKl`). All content is encrypted at rest using **XChaCha20-Poly1305** with per-file content encryption keys (CEKs) wrapped by a master key. The server holds the master key and decrypts content only when streaming it to a recipient, so plaintext exists on the server in transit but not at rest.

### Key Features

| Feature | Description |
| --------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| **End-to-End Encryption** | Every file is encrypted with a unique CEK using XChaCha20-Poly1305 secretstream; only the server (with the master key) can decrypt for delivery. |
| **Short Shareable IDs** | Content is addressed by 12-character alphanumeric IDs (e.g., `AbCdEfGhIjKl`). |
| **Auto-Destruct** | Uploaders can set a max view count; content self-destructs once the limit is reached. |
| **Password Protection** | Optional per-content passwords with Argon2id-hashed verification and HMAC-SHA256 session cookies. |
| **Admin Moderation** | Blacklist / whitelist user IDs, delete content, review reports via Telegram admin groups. |
| **Reporting** | Users can report content; reports are routed to review groups with inline admin actions. |
| **Streaming Decryption** | Large encrypted files are decrypted and streamed chunk-by-chunk without loading into memory. |
| **Content Typing & Safety** | Automatic MIME detection, plus render flags that mark dangerous or executable files for safe handling. |

### Architecture at a Glance

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Telegram User  │────▶│    cgcx-bot     │────▶│   cgcx-server   │
│  (upload / cmd) │     │   (Teloxide)    │     │  (Axum / web)   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │                       │
                                 ▼                       ▼
                        ┌─────────────────┐     ┌─────────────────┐
                        │   cgcx-file-    │────▶│    Svelte 5     │
                        │    pipeline     │     │    Frontend     │
                        │ (encrypt/store) │     │    (viewer)     │
                        └─────────────────┘     └─────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │  SQLite3 + WAL  │
                        │   (metadata)    │
                        └─────────────────┘
```

---

## Architecture

cg.cx is organized as a **Rust workspace** with 10 focused crates. This modular design separates concerns, enables independent unit testing, and allows the bot and server binaries to pull in only the crates they need.

| Crate | Purpose |
| --------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `cgcx-core` | Shared domain types: `ContentId`, `User`, `Content`, `ContentFile`, `Report`, error enums, and result types. Zero external dependencies beyond `serde` and `chrono`. |
| `cgcx-config` | Hierarchical configuration loader (`config/default.toml` → `config/local.toml` → `CGCX_*` env vars) with validation. |
| `cgcx-crypto` | Cryptographic primitives: XChaCha20 secretstream encryption/decryption, AES-KW key wrapping, BLAKE3 hashing, master key loading. |
| `cgcx-db` | SQLite access layer with `rusqlite`, embedded migrations (`rusqlite_migration`), and async repository patterns for users, content, files, reports, and admin actions. |
| `cgcx-storage` | Filesystem abstraction: path generation by MIME type, directory creation, temp file handling, and cleanup. |
| `cgcx-content-typing` | MIME type detection (`infer` + `mime_guess`) and render-flag computation for safe UI handling of dangerous files. |
| `cgcx-file-pipeline` | High-level upload orchestration: ingests raw bytes, detects type, encrypts via `cgcx-crypto`, stores via `cgcx-storage`, and records metadata via `cgcx-db`. |
| `cgcx-moderation` | Runtime moderation lists (blacklist / whitelist) loaded from JSON, with configurable share modes (`b` = blocklist, `w` = allowlist) and auto-reload. |
| `cgcx-bot` | **Binary crate**: Telegram bot built on `teloxide`. Handles dialogue flows, uploads, terms acceptance, reporting, and admin commands. |
| `cgcx-server` | **Binary crate**: Axum HTTP server. Serves the Svelte frontend, streams decrypted files, enforces view limits, and validates password cookies. |

### Why a Modular Crate Structure?

- **Separation of concerns**: Crypto logic cannot accidentally depend on Telegram bot internals; the database layer knows nothing about HTTP.
- **Testability**: Each crate can be unit-tested in isolation. `cgcx-core` and `cgcx-crypto` have no async runtime requirements, making them fast to test.
- **Independent deployment**: In the future, the bot and server could be built as separate container images sharing only the library crates.
- **Compile-time enforcement**: The workspace dependency graph is checked at build time, so `cgcx-crypto`, for example, cannot call into the bot, server, or database crates unless it declares them as dependencies.

---

## Security Design

### Cryptographic Primitives

| Layer | Algorithm | Purpose |
| -------------------- | ------------------------------------- | ---------------------------------------------------------------------------------------------------- |
| **Secretstream** | XChaCha20-Poly1305 (libsodium) | Encrypts file plaintext into an authenticated ciphertext stream. |
| **Key Wrapping** | AES-KW (AES-256 Key Wrap, RFC 3394) | Wraps each per-file CEK with the master key. |
| **Integrity Hash** | BLAKE3 | Computes a hash over the ciphertext stream (including the secretstream header) for tamper detection. |
| **Password Hashing** | Argon2id | Hashes optional per-content passwords. |
| **Cookie MAC** | HMAC-SHA256 | Integrity MAC for password-verification session cookies using constant-time comparison. |
| **ID Entropy** | Rejection sampling over `[A-Za-z0-9]` | 12-character IDs provide ~71 bits of entropy. |

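
The ID row above can be illustrated with a minimal sketch of rejection sampling, using only the standard library. The function names, the alphabet ordering, and the idea of passing random bytes in as an iterator are assumptions made for illustration; the actual `cgcx` code may differ. Bytes of 248 or more are discarded because 248 = 4 × 62 is the largest multiple of 62 that fits in a byte, which keeps `byte % 62` uniform. The entropy figure follows from 12 × log2(62) ≈ 71.45 bits.

```rust
/// The 62-symbol alphabet `[A-Za-z0-9]` (ordering is an assumption).
const ALPHABET: &[u8; 62] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
const ID_LEN: usize = 12;

/// Builds a 12-character ID from a stream of random bytes (hypothetical helper).
/// The caller is expected to supply bytes from a cryptographic RNG.
pub fn id_from_random_bytes(bytes: impl IntoIterator<Item = u8>) -> Option<String> {
    let mut id = String::with_capacity(ID_LEN);
    for b in bytes {
        if b >= 248 {
            continue; // reject: keeping these would favor the first 8 symbols
        }
        id.push(ALPHABET[(b % 62) as usize] as char);
        if id.len() == ID_LEN {
            return Some(id);
        }
    }
    None // ran out of random bytes before the ID was complete
}

/// Entropy in bits of a uniformly random ID: 12 * log2(62).
pub fn id_entropy_bits() -> f64 {
    ID_LEN as f64 * (ALPHABET.len() as f64).log2()
}

/// Checks that a string has the shape of a content ID.
pub fn is_valid_id(s: &str) -> bool {
    s.len() == ID_LEN && s.bytes().all(|b| b.is_ascii_alphanumeric())
}
```

Rejecting out-of-range bytes costs about 3% extra random input (8 of 256 values) and avoids the modulo bias a plain `byte % 62` would introduce.
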

### Encryption Flow

1. **Generate CEK**: For every uploaded file, `cgcx-crypto` generates a random 256-bit `ContentKey`.
2. **Encrypt**: The file is fed through `sodiumoxide::crypto::secretstream::xchacha20poly1305` in chunks (up to 1 MiB). The final chunk is tagged `Final`.
3. **Hash**: A running BLAKE3 hash covers the secretstream header and every ciphertext chunk.
4. **Wrap Key**: The CEK is wrapped with AES-KW using the 256-bit master key. The wrapped key + a version byte is stored in SQLite.
5. **Store**: The ciphertext file is moved from temp storage to its final path (`data/media|documents|text/<cxid>/...`).

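
The chunking in step 2 can be sketched without any crypto dependency. The snippet below only shows how an input might be split into bounded chunks with the last one marked `Final`; the `Tag` enum, `for_each_chunk`, and the one-chunk lookahead are illustrative assumptions, not the project's API. In the real pipeline the `emit` callback would presumably be where each chunk is pushed into the secretstream and the BLAKE3 hasher.

```rust
use std::io::{self, Read};

/// Stand-in for the secretstream message tag (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tag {
    Message,
    Final,
}

/// Reads until `buf` is full or the reader hits EOF; returns bytes read.
fn read_full(r: &mut impl Read, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match r.read(&mut buf[filled..]) {
            Ok(0) => break,
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Splits `reader` into chunks of at most `chunk_size` bytes and calls
/// `emit(chunk, tag)` for each one. Only the last chunk gets `Tag::Final`.
pub fn for_each_chunk<R: Read>(
    mut reader: R,
    chunk_size: usize,
    mut emit: impl FnMut(&[u8], Tag),
) -> io::Result<()> {
    assert!(chunk_size > 0, "chunk size must be positive");
    let mut current = vec![0u8; chunk_size];
    let mut next = vec![0u8; chunk_size];
    let mut current_len = read_full(&mut reader, &mut current)?;
    loop {
        // Look one chunk ahead to learn whether `current` is the last one.
        let next_len = read_full(&mut reader, &mut next)?;
        if next_len == 0 {
            emit(&current[..current_len], Tag::Final);
            return Ok(());
        }
        emit(&current[..current_len], Tag::Message);
        std::mem::swap(&mut current, &mut next);
        current_len = next_len;
    }
}
```

The lookahead means an input whose length is an exact multiple of the chunk size still ends with a full `Final` chunk, and an empty input produces a single empty `Final` chunk, so a truncated file can be told apart from a complete one.
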

### Decryption Flow

1. **Unwrap CEK**: The server unwraps the per-file CEK using the master key.
2. **Init Stream**: `DecryptStream` is initialized with the stored secretstream header.
3. **Stream**: Ciphertext is read from disk in ~1 MiB chunks, decrypted, and pushed to the HTTP response body via a Tokio channel.
4. **Verify**: If decryption fails (tampered or truncated data), the stream aborts and the client receives an incomplete response.

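
Steps 3 and 4 follow a producer/consumer pattern that can be sketched with the standard library alone. Here a bounded `std::sync::mpsc` channel and a thread stand in for the Tokio channel and task, and the `decrypt` closure stands in for the secretstream pull; `stream_decrypted` and `DecryptError` are hypothetical names, and the real server code may be structured differently.

```rust
use std::sync::mpsc;
use std::thread;

/// Sent to the consumer when a chunk fails authentication (illustrative).
#[derive(Debug, PartialEq, Eq)]
pub struct DecryptError;

/// Spawns a producer that decrypts chunks in order and forwards each result
/// through a bounded channel. It stops at the first failure, so the consumer
/// never receives data that follows a tampered or truncated chunk.
pub fn stream_decrypted<F>(
    chunks: Vec<Vec<u8>>,
    mut decrypt: F,
) -> mpsc::Receiver<Result<Vec<u8>, DecryptError>>
where
    F: FnMut(&[u8]) -> Result<Vec<u8>, DecryptError> + Send + 'static,
{
    // A small buffer gives backpressure: the producer waits for the consumer
    // instead of decrypting the whole file into memory.
    let (tx, rx) = mpsc::sync_channel(2);
    thread::spawn(move || {
        for chunk in chunks {
            let item = decrypt(&chunk);
            let failed = item.is_err();
            // Stop if the receiver went away or this chunk failed.
            if tx.send(item).is_err() || failed {
                break;
            }
        }
    });
    rx
}
```

Because the HTTP response has already started by the time a later chunk fails, the error can only surface as an aborted body, which matches the behavior described in step 4.
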

### Password Protection

- Passwords are hashed with **Argon2id** and stored in the `contents` table.
- On successful verification, the server issues an `__Host-pw` cookie containing a base64-encoded `cxid:MAC` pair.
- The MAC is computed via **HMAC-SHA256** over the content ID using a server-side secret (derived from the master key).
- Cookie attributes: `Secure`, `HttpOnly`, `SameSite=Strict`, `Max-Age=3600`. The `__Host-` prefix additionally requires `Path=/` and no `Domain` attribute.

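
As a rough illustration of the cookie handling above, the sketch below formats the `Set-Cookie` value and compares two MACs without an early exit. Both helpers are hypothetical and use only the standard library; computing the HMAC and the base64 encoding are left out. A hand-rolled comparison like this is shown only to make the idea concrete: production code would more likely rely on a vetted routine such as the `subtle` crate or the tag verification built into the `hmac` crate.

```rust
/// Compares two MACs without returning early at the first mismatching byte.
/// Only the length check is allowed to short-circuit.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y; // accumulate differences instead of branching
    }
    diff == 0
}

/// Formats the `Set-Cookie` header value for an already-encoded
/// `cxid:MAC` pair. `Path=/` is included because browsers require it
/// for cookies that use the `__Host-` prefix.
pub fn password_cookie(encoded_value: &str) -> String {
    format!(
        "__Host-pw={encoded_value}; Path=/; Secure; HttpOnly; SameSite=Strict; Max-Age=3600"
    )
}
```

On each request the server would recompute the MAC for the requested content ID and accept the cookie only if the comparison succeeds.
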

### Master Key Handling

- The master key is a 256-bit value loaded from either an environment variable (`CGCX_AES_MASTER_KEY`) or a file.
- If loaded from a file, the key is expected as 64 hex characters.
- On Unix systems, newly generated key files are automatically chmodded to `0o600`.
- The key fingerprint (first 8 bytes of BLAKE3 hash) is logged at startup for audit purposes; the full key is never logged.

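
Parsing the 64-character hex form into a 256-bit key could look roughly like the sketch below. `parse_master_key` and `KeyError` are hypothetical names, and trimming surrounding whitespace is an assumption made so that key files ending in a newline are accepted; the fingerprint step is omitted because it needs the `blake3` crate.

```rust
/// Reasons a master key string can be rejected (illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum KeyError {
    /// The trimmed input was not exactly 64 characters long.
    BadLength(usize),
    /// The input contained a character outside `[0-9a-fA-F]`.
    BadHex,
}

/// Decodes 64 hex characters into a 32-byte (256-bit) key.
pub fn parse_master_key(text: &str) -> Result<[u8; 32], KeyError> {
    let hex = text.trim(); // tolerate a trailing newline in key files
    if hex.len() != 64 {
        return Err(KeyError::BadLength(hex.len()));
    }
    let mut key = [0u8; 32];
    for (i, pair) in hex.as_bytes().chunks(2).enumerate() {
        // Each pair of hex digits becomes one key byte.
        let hi = (pair[0] as char).to_digit(16).ok_or(KeyError::BadHex)?;
        let lo = (pair[1] as char).to_digit(16).ok_or(KeyError::BadHex)?;
        key[i] = (hi * 16 + lo) as u8;
    }
    Ok(key)
}
```
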

---

## Tech Stack

| Layer | Technology |
| ----------------- | ----------------------------------------------------------- |
| **Backend** | Rust (edition 2021), Tokio async runtime |
| **Web Server** | Axum 0.7, Tower HTTP middleware |
| **Telegram Bot** | Teloxide 0.13 |
| **Frontend** | Svelte 5, Vite 5 |
| **Database** | SQLite 3 (WAL mode), `rusqlite` + `rusqlite_migration` |
| **Cryptography** | libsodium (via `sodiumoxide`), `aes-kw`, `blake3`, `argon2`, `hmac`, `sha2` |
| **Serialization** | `serde`, `serde_json` |
| **Observability** | `tracing` + `tracing-subscriber` |

---

## Prerequisites

- **Rust** toolchain (stable 1.78 or newer; nightly also works)
- **Node.js** 20+ and `npm` (for the frontend)
- **SQLite 3** (bundled via `rusqlite`, but the CLI is useful for inspection)
- A **Telegram Bot Token** from [@BotFather](https://t.me/botfather)
- A 256-bit master key (64 hex characters) for encryption

---

## Building

### Rust Workspace

Build all crates (library + binaries):

```bash
cargo build --workspace
```

Build optimized release binaries:

```bash
cargo build --workspace --release
```

The release profile enables thin LTO, a single codegen unit, and binary stripping for minimal size.

### Frontend

```bash
cd frontend
npm install
npm run build
```

The static assets are emitted to `frontend/dist/` and served by `cgcx-server` at runtime.

---

## Configuration

cg.cx uses a layered configuration system:

1. `config/default.toml`: committed defaults
2. `config/local.toml`: local overrides (gitignored)
3. `CGCX_*` environment variables: runtime overrides

Environment variables use double-underscore as a separator, e.g.:

```bash
export CGCX_SERVER__PORT=3000
export CGCX_TELEGRAM__BOT_TOKEN="your_token_here"
export CGCX_CRYPTO__AES_MASTER_KEY_SOURCE__TYPE="env"
export CGCX_CRYPTO__AES_MASTER_KEY_SOURCE__VAR="CGCX_AES_MASTER_KEY"
export CGCX_AES_MASTER_KEY="aabbccdd..." # 64 hex chars
```
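
To make the naming scheme concrete, here is a small sketch of how a `CGCX_*` variable name could map onto a config path, and how layers could be merged so that later sources win. Both functions are illustrative assumptions written against the standard library only; the loader in `cgcx-config` may work differently, for example by delegating to a configuration crate.

```rust
use std::collections::BTreeMap;

/// Maps an environment variable name to a config path, e.g.
/// `CGCX_SERVER__PORT` becomes `["server", "port"]`.
/// Returns `None` for names without the `CGCX_` prefix or with empty segments.
pub fn env_to_config_path(var: &str) -> Option<Vec<String>> {
    let rest = var.strip_prefix("CGCX_")?;
    let parts: Vec<String> = rest
        .split("__") // double underscore separates nesting levels
        .map(|p| p.to_ascii_lowercase())
        .collect();
    if parts.iter().any(|p| p.is_empty()) {
        return None;
    }
    Some(parts)
}

/// Merges flat key/value layers in precedence order: later layers override
/// earlier ones (defaults, then local file, then environment).
pub fn merge_layers(layers: &[BTreeMap<String, String>]) -> BTreeMap<String, String> {
    let mut merged = BTreeMap::new();
    for layer in layers {
        for (key, value) in layer {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}
```

Single underscores stay inside a segment, which is why `AES_MASTER_KEY_SOURCE` survives as one key while `__` marks a new nesting level.
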

### Config Sections

| Section | Description |
| ----------------------------- | ----------------------------------------------------------------------------------------------------- |
| `[content]` | Auto-destruct behavior (`keep_content`, `share_mode`, `default_allow_download`, `default_max_views`). |
| `[crypto]` | Master key source (`env` or `file`). |
| `[telegram]` | Bot token and optional custom API URL. |
| `[groups]` | `admin_group_ids` and `review_group_ids` (Telegram chat IDs). |
| `[storage]` | Filesystem paths for `media`, `documents`, `text`, `temp`, and the streaming chunk size. |
| `[upload_limits]` | `max_batch_size`, `max_file_size_bytes`, `max_total_batch_bytes`. |
| `[server]` | `base_url`, `bind_address`, `port`. |
| `[rate_limiting]` | Per-minute request limits, burst capacity, and password-attempt limits. |
| `[logging]` | `level` (e.g., `info`, `debug`). |
| `[frontend.behavior_toggles]` | Feature flags for retro animations and particles. |

### Validating Config

Both binaries validate configuration on startup. Key checks include:

- Chunk size between 8 MiB and 256 MiB
- Bot token is set and not the placeholder
- Upload and rate-limiting values are non-zero
- Master key source is fully specified

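
The checks listed above might be expressed along these lines. Everything in this sketch is an assumption for illustration: the `Settings` struct covers only a subset of fields with made-up names, the placeholder token value is borrowed from the environment example rather than from the real defaults, and the error type is a plain list of messages.

```rust
const MIB: u64 = 1024 * 1024;
/// Assumed placeholder value; the real default may differ.
const PLACEHOLDER_TOKEN: &str = "your_token_here";

/// Hypothetical mirror of the `[crypto]` master key source.
pub enum MasterKeySource {
    Env { var: String },
    File { path: String },
}

/// Hypothetical subset of the loaded configuration.
pub struct Settings {
    pub chunk_size_bytes: u64,
    pub bot_token: String,
    pub max_batch_size: u32,
    pub max_file_size_bytes: u64,
    pub max_total_batch_bytes: u64,
    pub requests_per_minute: u32,
    pub master_key_source: MasterKeySource,
}

/// Collects every violated rule instead of stopping at the first one,
/// so an operator can fix a broken config in a single pass.
pub fn validate(s: &Settings) -> Result<(), Vec<&'static str>> {
    let mut errors = Vec::new();
    if !(8 * MIB..=256 * MIB).contains(&s.chunk_size_bytes) {
        errors.push("storage chunk size must be between 8 MiB and 256 MiB");
    }
    if s.bot_token.trim().is_empty() || s.bot_token == PLACEHOLDER_TOKEN {
        errors.push("telegram bot token is missing or still the placeholder");
    }
    if s.max_batch_size == 0 || s.max_file_size_bytes == 0 || s.max_total_batch_bytes == 0 {
        errors.push("upload limits must be non-zero");
    }
    if s.requests_per_minute == 0 {
        errors.push("rate limits must be non-zero");
    }
    let source_ok = match &s.master_key_source {
        MasterKeySource::Env { var } => !var.is_empty(),
        MasterKeySource::File { path } => !path.is_empty(),
    };
    if !source_ok {
        errors.push("master key source is not fully specified");
    }
    if errors.is_empty() {
        Ok(())
    } else {
        Err(errors)
    }
}
```
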

---

## Running

### Run the Web Server

```bash
cargo run -p cgcx-server
```

The server binds to `127.0.0.1:8080` by default and serves:

- `/`: Svelte frontend
- `/api/health`: health check
- `/api/content/:cxid`: metadata JSON
- `/api/content/:cxid/verify-password`: password verification
- `/api/content/:cxid/file/:file_idx`: streamed decrypted file
- `/assets/*`: static frontend assets

### Run the Telegram Bot

```bash
cargo run -p cgcx-bot
```

The bot processes updates from Telegram, handles user dialogues, and triggers the file pipeline for uploads.

### Run Both Simultaneously

Because the bot and server are separate binaries, they can run side-by-side sharing the same SQLite database and data directories:

```bash
# Terminal 1
cargo run -p cgcx-server

# Terminal 2
cargo run -p cgcx-bot
```

Ensure both processes point to the same database path and storage directories via shared configuration.

---

## Database Migrations

Migrations are managed by `rusqlite_migration` and embedded into the `cgcx-db` crate at compile time.

- `migrations/001_init.sql`: Creates `users`, `contents`, `content_files`, `reports`, and `admin_actions` tables.
- `migrations/002_indexes.sql`: Adds performance indexes on foreign keys, status columns, and report state.

On startup, both the bot and server call `db.run_migrations()`, which applies any pending migrations automatically. The database is opened with:

- `PRAGMA journal_mode = WAL;`
- `PRAGMA foreign_keys = ON;`
- `PRAGMA busy_timeout = 5000;`

### Manual Inspection

```bash
sqlite3 data/db.sqlite ".schema"
sqlite3 data/db.sqlite ".indexes"
```

---

## Deployment

### systemd Service

Create `/etc/systemd/system/cgcx-server.service`:

```ini
[Unit]
Description=cg.cx Web Server
After=network.target

[Service]
Type=simple
User=cgcx
Group=cgcx
WorkingDirectory=/opt/cgcx
Environment="RUST_LOG=info"
Environment="CGCX_AES_MASTER_KEY=<64-hex-chars>"
ExecStart=/opt/cgcx/cgcx-server
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Create a similar service for `cgcx-bot`. Reload and enable:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now cgcx-server cgcx-bot
```

### Reverse Proxy (nginx)

```nginx
server {
    listen 443 ssl http2;
    server_name cg.cx;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Support streaming
        proxy_buffering off;
        proxy_request_buffering off;
    }
}
```

### TLS

Use Let's Encrypt (certbot) or a managed TLS terminator. The `__Host-pw` cookie requires HTTPS (`Secure` flag).

### File Permissions

- The master key file (if used instead of env) **must** be readable only by the service user:
  ```bash
  chmod 600 /opt/cgcx/master.key
  chown cgcx:cgcx /opt/cgcx/master.key
  ```
- Data directories (`data/media`, `data/documents`, `data/text`, `data/temp`) should be owned by the service user.

---

## Administration

### Admin Commands

Admin commands are restricted to users in configured `admin_group_ids` who also have the `admin` role in the database.

| Command | Usage | Description |
| ---------------- | -------------------------- | ---------------------------------------------------------------------------------------------- |
| `/reload` | `/reload` | Reloads moderation lists from disk (`data/blacklisted_ids.json`, `data/whitelisted_ids.json`). |
| `/blacklist_uid` | `/blacklist_uid <user_id>` | Blacklists a Telegram user ID and sets their role to `banned`. |
| `/whitelist_uid` | `/whitelist_uid <user_id>` | Whitelists a Telegram user ID (relevant in whitelist mode). |

### Review Groups

Reports submitted by users are forwarded to all configured `review_group_ids` with an inline keyboard:

- **🗑 Delete**: Sets content status to `deleted`.
- **⛔ Blacklist User**: Blacklists the uploader and bans them.
- **📝 Ignore**: Dismisses the report.

### Moderation Modes

- **Blocklist mode (`share_mode = "b"`)**: Everyone can upload except blacklisted IDs.
- **Allowlist mode (`share_mode = "w"`)**: Only whitelisted IDs can upload.

Moderation lists are hot-reloaded every 30 seconds by a background task, or immediately via `/reload`.

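
The two share modes reduce to a simple membership check, sketched below with the standard library. `ShareMode`, `ModerationLists`, and `may_upload` are hypothetical names, and the sketch follows the two bullets literally: in allowlist mode it consults only the whitelist, so how the real `cgcx-moderation` crate treats an ID that appears on both lists is not something this example settles.

```rust
use std::collections::HashSet;

/// The two values accepted by `share_mode` in the `[content]` section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShareMode {
    Blocklist,
    Allowlist,
}

impl ShareMode {
    /// Parses the config value: `"b"` or `"w"`.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "b" => Some(ShareMode::Blocklist),
            "w" => Some(ShareMode::Allowlist),
            _ => None,
        }
    }
}

/// In-memory view of the two JSON lists (illustrative).
pub struct ModerationLists {
    pub blacklisted: HashSet<i64>,
    pub whitelisted: HashSet<i64>,
}

impl ModerationLists {
    /// Decides whether a Telegram user ID may upload under the given mode.
    pub fn may_upload(&self, mode: ShareMode, user_id: i64) -> bool {
        match mode {
            // Everyone except blacklisted IDs.
            ShareMode::Blocklist => !self.blacklisted.contains(&user_id),
            // Only whitelisted IDs.
            ShareMode::Allowlist => self.whitelisted.contains(&user_id),
        }
    }
}
```
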

---

## Development

### Dev Mode (Frontend)

```bash
cd frontend
npm install
npm run dev
```

Vite dev server runs separately; point `config/local.toml` `server.base_url` to your local frontend proxy if needed.

### Dev Mode (Backend)

```bash
# Server with tracing
cargo run -p cgcx-server

# Bot with tracing
cargo run -p cgcx-bot
```

Set `RUST_LOG=debug` for verbose output:

```bash
RUST_LOG=debug cargo run -p cgcx-server
```

### Testing

Run workspace tests:

```bash
cargo test --workspace
```

Individual crate tests:

```bash
cargo test -p cgcx-core
cargo test -p cgcx-crypto
cargo test -p cgcx-content-typing
```

### Useful Debug Tips

- Inspect SQLite directly: `sqlite3 data/db.sqlite "SELECT * FROM contents;"`
- Check moderation lists: `cat data/blacklisted_ids.json`
- Verify master key fingerprint in logs on startup.

---

## License

MIT License; see [LICENSE](LICENSE) for details.

---

## Security Disclosure

If you discover a security vulnerability, please do not open a public issue. Contact the maintainers directly through the admin channels configured in the bot.