# Stratum
> Self-optimizing, S3-compatible object storage with autonomous intelligent tiering — built for EU data sovereignty.
---
## What Is Stratum?
Stratum is an open-source, S3-compatible object storage server written in Rust. Unlike other self-hosted S3-compatible servers, Stratum is designed to move objects between storage tiers (hot/warm/cold) autonomously, based on observed access patterns, with zero configuration required.
Point any S3-compatible client at it. It gets smarter over time.
---
## Why Stratum?
| Feature | AWS S3 Intelligent-Tiering | MinIO | Garage | **Stratum** |
|---|---|---|---|---|
| S3 compatible | ✅ | ✅ | ✅ | ✅ |
| Autonomous tiering | ✅ (black box) | ❌ | ❌ | ✅ (transparent) |
| EU sovereign | ❌ (CLOUD Act) | ❌ | ✅ | ✅ |
| Open source | ❌ | ☠️ Archived | ✅ | ✅ |
| Transparent tier reasoning | ❌ | ❌ | ❌ | ✅ |
| Self-hosted | ❌ | ✅ | ✅ | ✅ |
MinIO was archived in February 2026. RustFS is alpha. Garage targets geo-distribution only. **The space for a production-ready, intelligent, EU-sovereign S3 server is open.**
---
## Architecture
Stratum is a Cargo workspace split into focused crates:
```
stratum/
├── src/
│ ├── stratum/ → binary — wires everything together
│ ├── stratum-api-s3/ → S3 API layer (routes, handlers, auth)
│ ├── stratum-storage/ → volume management, tier logic, shard I/O
│ ├── stratum-metadata/ → bucket/key → volume mapping (sled)
│ ├── stratum-tiering/ → tier decision engine
│ ├── stratum-auth/ → AWS Signature V4 validation
│ └── stratum-core/ → shared types and config
```
### Storage Model
Objects are not stored directly by key. Keys point to **volumes**. Volumes hold the actual data and can live on any tier:
```
bucket/key → volume_id → Volume {
tier: Hot | Warm | Cold
location: Local(path) | Remote(url)
size, checksum
last_accessed, access_count ← tiering signals
}
```
When tiering promotes or demotes an object, only the volume location changes. The key never moves. Clients never know.
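The indirection above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the type names (`Metadata`, `Volume`, `Location`) and the in-memory `HashMap` are hypothetical stand-ins for the sled-backed store, and the real crates may be organized differently.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier {
    Hot,
    Warm,
    Cold,
}

#[derive(Debug, Clone, PartialEq)]
enum Location {
    Local(String),  // filesystem path
    Remote(String), // URL of a remote S3 endpoint
}

#[derive(Debug, Clone)]
struct Volume {
    tier: Tier,
    location: Location,
    size: u64,
    access_count: u64, // tiering signal
}

// Hypothetical in-memory stand-in for the metadata store.
#[derive(Default)]
struct Metadata {
    keys: HashMap<(String, String), u64>, // (bucket, key) -> volume_id
    volumes: HashMap<u64, Volume>,        // volume_id -> volume record
}

impl Metadata {
    fn put(&mut self, bucket: &str, key: &str, volume_id: u64, volume: Volume) {
        self.keys.insert((bucket.to_string(), key.to_string()), volume_id);
        self.volumes.insert(volume_id, volume);
    }

    // Clients resolve by bucket/key and never see the volume id.
    fn resolve(&self, bucket: &str, key: &str) -> Option<&Volume> {
        let id = self.keys.get(&(bucket.to_string(), key.to_string()))?;
        self.volumes.get(id)
    }

    // Retiering rewrites only the volume record; the key mapping is untouched.
    fn retier(&mut self, volume_id: u64, tier: Tier, location: Location) -> bool {
        match self.volumes.get_mut(&volume_id) {
            Some(volume) => {
                volume.tier = tier;
                volume.location = location;
                true
            }
            None => false,
        }
    }
}
```

Because `retier` never touches `keys`, a lookup by `bucket/key` returns the same volume before and after a move, which is what keeps tier changes invisible to clients.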
### Storage Tiers
```
Hot → NVMe/SSD — frequently accessed objects, lowest latency
Warm → HDD — infrequently accessed, medium cost
Cold → Remote S3 — rarely accessed, cheapest (B2, R2, AWS, Garage...)
```
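As a rough illustration of how the two tiering signals (`last_accessed`, `access_count`) could map onto these tiers, here is a small sketch. The function name `choose_tier` and every threshold below are assumptions made for the example, not values taken from the tier decision engine.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier {
    Hot,
    Warm,
    Cold,
}

// Hypothetical thresholds, chosen only for illustration.
const WARM_AFTER: Duration = Duration::from_secs(7 * 24 * 3600); // idle for 7 days
const COLD_AFTER: Duration = Duration::from_secs(90 * 24 * 3600); // idle for 90 days
const HOT_MIN_ACCESSES: u64 = 10;

// `idle` is the time elapsed since the object was last accessed.
fn choose_tier(idle: Duration, access_count: u64) -> Tier {
    if idle >= COLD_AFTER {
        // Long-idle objects go cold regardless of past popularity.
        Tier::Cold
    } else if idle >= WARM_AFTER && access_count < HOT_MIN_ACCESSES {
        Tier::Warm
    } else {
        Tier::Hot
    }
}
```

A rule this simple is easy to explain back to an operator, which fits the transparent tier reasoning goal; a real engine would likely weigh more signals.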
### Erasure Coding
Stratum uses Reed-Solomon erasure coding (4 data + 2 parity shards) instead of replication. This gives:
```
3x replication: 3.0x storage overhead, survives loss of any 2 copies
4+2 erasure:    1.5x storage overhead, survives loss of any 2 shards
```
Each object is split into shards. Shards are distributed across nodes/disks. Loss of any 2 shards is fully recoverable.
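The overhead figures follow from (data shards + parity shards) / data shards. A short sketch of that arithmetic, with hypothetical helper names (`storage_overhead`, `shard_size`) that are not part of Stratum's API:

```rust
/// Raw bytes stored per byte of user data for a k+m erasure code.
fn storage_overhead(data_shards: u32, parity_shards: u32) -> f64 {
    f64::from(data_shards + parity_shards) / f64::from(data_shards)
}

/// Size of each shard, rounded up so the data shards cover the whole object.
fn shard_size(object_len: u64, data_shards: u64) -> u64 {
    (object_len + data_shards - 1) / data_shards
}
```

For the 4+2 layout, `storage_overhead(4, 2)` evaluates to 1.5. Treating 3x replication as one data copy plus two extra copies, `storage_overhead(1, 2)` evaluates to 3.0.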
---
## S3 API Coverage
### Implemented (routing layer)
All routes are defined and return `501 Not Implemented` until handlers are built.
| Operation | Method | Status |
|---|---|---|
| ListBuckets | GET / | 🔲 Stub |
| CreateBucket | PUT /{bucket} | 🔲 Stub |
| DeleteBucket | DELETE /{bucket} | 🔲 Stub |
| HeadBucket | HEAD /{bucket} | 🔲 Stub |
| ListObjectsV2 | GET /{bucket} | 🔲 Stub |
| GetObject | GET /{bucket}/{*key} | 🔲 Stub |
| PutObject | PUT /{bucket}/{*key} | 🔲 Stub |
| DeleteObject | DELETE /{bucket}/{*key} | 🔲 Stub |
| HeadObject | HEAD /{bucket}/{*key} | 🔲 Stub |
| CreateMultipartUpload | POST /{bucket}/{*key}?uploads | 🔲 Stub |
| UploadPart | PUT /{bucket}/{*key}?partNumber&uploadId | 🔲 Stub |
| CompleteMultipartUpload | POST /{bucket}/{*key}?uploadId | 🔲 Stub |
| AbortMultipartUpload | DELETE /{bucket}/{*key}?uploadId | 🔲 Stub |
### Endpoint Parser
All S3 endpoints are parsed from raw HTTP requests into typed `Endpoint` enum variants before reaching handlers. Query parameters disambiguate operations sharing the same route (e.g. `UploadPart` vs `PutObject`).
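A reduced sketch of that disambiguation is shown below. It covers only a handful of operations, skips percent-decoding, and uses assumed names (`parse`, `query_param`, the `Unsupported` variant); the actual `Endpoint` enum and parser signature may differ.

```rust
#[derive(Debug, PartialEq)]
enum Endpoint {
    ListBuckets,
    GetObject { bucket: String, key: String },
    PutObject { bucket: String, key: String },
    UploadPart { bucket: String, key: String, part_number: u32, upload_id: String },
    Unsupported, // placeholder for routes left out of this sketch
}

// Returns the value of `name` from a raw query string such as
// "partNumber=1&uploadId=abc". No percent-decoding is performed.
fn query_param(query: &str, name: &str) -> Option<String> {
    query.split('&').find_map(|pair| {
        let (k, v) = pair.split_once('=').unwrap_or((pair, ""));
        (k == name).then(|| v.to_string())
    })
}

fn parse(method: &str, path: &str, query: &str) -> Endpoint {
    let trimmed = path.trim_start_matches('/');
    if trimmed.is_empty() {
        return if method == "GET" { Endpoint::ListBuckets } else { Endpoint::Unsupported };
    }
    // The first segment is the bucket; everything after the first '/' is the key.
    let (bucket, key) = match trimmed.split_once('/') {
        Some((b, k)) if !k.is_empty() => (b.to_string(), k.to_string()),
        _ => return Endpoint::Unsupported, // bucket-level routes omitted here
    };
    match method {
        "GET" => Endpoint::GetObject { bucket, key },
        "PUT" => {
            // Same route as PutObject; the query parameters decide the operation.
            let part = query_param(query, "partNumber").and_then(|p| p.parse::<u32>().ok());
            match (part, query_param(query, "uploadId")) {
                (Some(part_number), Some(upload_id)) => {
                    Endpoint::UploadPart { bucket, key, part_number, upload_id }
                }
                _ => Endpoint::PutObject { bucket, key },
            }
        }
        _ => Endpoint::Unsupported,
    }
}
```

Resolving the operation once, up front, means handlers receive a typed variant and never have to re-inspect the query string.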
### Error Handling
The following S3-compatible error types are defined, each mapped to an HTTP status code:
- `BucketNotFound` → 404
- `ObjectNotFound` → 404
- `BucketAlreadyExists` → 409
- `InvalidArgument` → 400
- `InvalidBucketName` → 400
- `AuthorizationFailed` → 403
- `MissingAuthHeader` → 401
- `InternalError` → 500
- `NotImplemented` → 501
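The mapping above fits naturally into a plain `match`. The enum name `S3Error` and the method `status_code` are assumptions for this sketch; only the variant names and status codes come from the list.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum S3Error {
    BucketNotFound,
    ObjectNotFound,
    BucketAlreadyExists,
    InvalidArgument,
    InvalidBucketName,
    AuthorizationFailed,
    MissingAuthHeader,
    InternalError,
    NotImplemented,
}

impl S3Error {
    // HTTP status code returned to the client for each error.
    fn status_code(self) -> u16 {
        match self {
            S3Error::BucketNotFound | S3Error::ObjectNotFound => 404,
            S3Error::BucketAlreadyExists => 409,
            S3Error::InvalidArgument | S3Error::InvalidBucketName => 400,
            S3Error::AuthorizationFailed => 403,
            S3Error::MissingAuthHeader => 401,
            S3Error::InternalError => 500,
            S3Error::NotImplemented => 501,
        }
    }
}
```

Because the `match` is exhaustive, adding a new variant without assigning it a status code fails to compile.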
---
## Design Principles
- **KISS** — no macros where plain match arms work
- **Bottom-up** — storage layer before API layer
- **TDD** — tests written before implementation
- **One concern per file** — enum definitions separate from parsing logic
- **No lifetime annotations** — owned types throughout for maintainability
- **`cargo fmt` always** — enforced formatting
---
## Testing
```bash
# run all tests
cargo test
# run specific crate
cargo test -p stratum-api-s3
# coverage report (requires cargo-tarpaulin: cargo install cargo-tarpaulin)
cargo tarpaulin -p stratum-api-s3 --out Html
```
### Test Layers
```
Unit tests → endpoint parser, individual functions
Integration tests → axum routes, full HTTP request/response
E2E tests → awscli + rclone against running server (planned)
```
---
## Development Setup
```bash
git clone https://github.com/gsh-digital/stratum
cd stratum
cargo build
cargo test
```
### Requirements
- Rust 1.75+
- cargo
### Tested On
- Linux x86_64
- Linux aarch64 (Raspberry Pi 4) ← primary dev/test bench
---
## Roadmap
### Phase 1 — Core S3 Server (current)
> Goal: pass MinIO s3-tests suite at >95%, work with awscli and rclone out of the box
### Phase 2 — Geo Distribution
> Goal: multi-node replication across geographic regions with Raft consensus
### Phase 3 — Intelligent Tiering
> Goal: autonomous object movement between hot/warm/cold based on access patterns
### Phase 4 — Managed Service
> Goal: GSH Digital Services hosted offering with Grafana monitoring
---
## License
Apache 2.0 — see LICENSE
---
## By
**GSH Digital Services**
Author: [Soliman, Ramez](mailto:r.soliman@gsh-services.com)
Building EU-sovereign infrastructure that doesn't cost like AWS and doesn't require a PhD to operate.