# Stratum — TODO
## Immediate Next Session
### 1. `stratum-storage` — Volume Layer
- [ ] `config.rs` — StorageConfig with hot/warm/cold paths
- [x] `tier.rs` — StorageTier enum (Hot, Warm, Cold)
- [x] `location.rs` — Location + ShardLocation enums (Local/Remote/Mixed)
- [x] `manifest.rs` — Volume struct with access tracking fields
- [ ] `store.rs` — VolumeStore (in-memory HashMap for now)
- [ ] `shard.rs` — async read/write/delete shard files via tokio::fs
- [ ] Tests for all of the above
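
A minimal sketch of how the in-memory `VolumeStore` could look, using only the standard library. `StorageTier` and its variants come from the list above; the `Volume` fields (`id`, `size_bytes`, `access_count`) and every method name are assumptions for illustration, not the actual contents of `manifest.rs` or `store.rs`.

```rust
use std::collections::HashMap;

/// Storage tiers as listed for `tier.rs`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageTier {
    Hot,
    Warm,
    Cold,
}

/// Hypothetical volume record; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Volume {
    pub id: u64,
    pub tier: StorageTier,
    pub size_bytes: u64,
    pub access_count: u64,
}

/// In-memory store keyed by volume ID.
#[derive(Debug, Default)]
pub struct VolumeStore {
    volumes: HashMap<u64, Volume>,
}

impl VolumeStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Inserts or replaces a volume and returns the previous entry, if any.
    pub fn put(&mut self, volume: Volume) -> Option<Volume> {
        self.volumes.insert(volume.id, volume)
    }

    pub fn get(&self, id: u64) -> Option<&Volume> {
        self.volumes.get(&id)
    }

    /// Records one access and returns the updated count,
    /// or `None` if the volume does not exist.
    pub fn record_access(&mut self, id: u64) -> Option<u64> {
        let volume = self.volumes.get_mut(&id)?;
        volume.access_count += 1;
        Some(volume.access_count)
    }

    pub fn delete(&mut self, id: u64) -> Option<Volume> {
        self.volumes.remove(&id)
    }
}
```

Keeping the map behind a small method surface like this should make it easier to swap in a persistent backend later without touching callers.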
### 2. `stratum-metadata` — Bucket + Key Mapping
- [ ] Sled-backed metadata store
- [ ] Bucket operations (create, delete, exists, list)
- [ ] Key → Volume ID mapping (put, get, delete, list)
- [ ] Tests for all of the above
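
One possible layout for the key → volume ID mapping is a single ordered keyspace with composite keys of the form `bucket\0object_key`, which turns prefix listing into a range scan. The sketch below uses a `BTreeMap` as a stand-in for an ordered on-disk tree; it does not call sled, and `MetadataStore`, `composite_key`, and the method names are assumptions. It relies on bucket names never containing a NUL byte.

```rust
use std::collections::BTreeMap;

/// Builds the composite key `bucket\0object_key`.
fn composite_key(bucket: &str, key: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(bucket.len() + 1 + key.len());
    out.extend_from_slice(bucket.as_bytes());
    out.push(0);
    out.extend_from_slice(key.as_bytes());
    out
}

/// Hypothetical in-memory stand-in for the metadata store.
#[derive(Default)]
pub struct MetadataStore {
    objects: BTreeMap<Vec<u8>, u64>,
}

impl MetadataStore {
    pub fn put(&mut self, bucket: &str, key: &str, volume_id: u64) {
        self.objects.insert(composite_key(bucket, key), volume_id);
    }

    pub fn get(&self, bucket: &str, key: &str) -> Option<u64> {
        self.objects.get(&composite_key(bucket, key)).copied()
    }

    pub fn delete(&mut self, bucket: &str, key: &str) -> Option<u64> {
        self.objects.remove(&composite_key(bucket, key))
    }

    /// Lists object keys in `bucket` that start with `prefix`,
    /// in lexicographic order.
    pub fn list(&self, bucket: &str, prefix: &str) -> Vec<String> {
        let start = composite_key(bucket, prefix);
        let skip = bucket.len() + 1; // bucket bytes plus the NUL separator
        self.objects
            .range(start.clone()..)
            .take_while(|(k, _)| k.starts_with(&start))
            .map(|(k, _)| String::from_utf8_lossy(&k[skip..]).into_owned())
            .collect()
    }
}
```

Because the separator sorts below every other byte, all keys of one bucket are contiguous, so `list` stops at the first key belonging to a different bucket.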
### 3. Wire Storage Into API Handlers (bottom-up)
- [ ] `CreateBucket` → 200 (create metadata entry)
- [ ] `ListBuckets` → 200 + XML response
- [ ] `PutObject` → 200 (write shard, create volume, store mapping)
- [ ] `GetObject` → 200 + stream bytes (read shard via volume location)
- [ ] `DeleteObject` → 204 (delete shard + metadata)
- [ ] `HeadObject` → 200 + metadata headers only
- [ ] `ListObjectsV2` → 200 + XML response
- [ ] Multipart (last, most complex)
### 4. XML Responses
- [ ] `xml/responses.rs` — ListBuckets XML
- [ ] `xml/responses.rs` — ListObjectsV2 XML
- [ ] `xml/responses.rs` — Error XML (replace current plain text)
- [ ] `xml/responses.rs` — InitiateMultipartUploadResult XML
- [ ] `xml/responses.rs` — CompleteMultipartUploadResult XML
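
A rough sketch of the Error XML item, built with plain string formatting. The element names (`Error`, `Code`, `Message`, `Resource`, `RequestId`) follow the S3 error body; the function names `xml_escape` and `error_xml` are hypothetical, and a real implementation may prefer an XML serialization crate over hand-built strings.

```rust
/// Escapes the five XML special characters in text content.
fn xml_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            _ => out.push(c),
        }
    }
    out
}

/// Renders an S3-style error body. All arguments are escaped,
/// since object keys may contain characters such as `&` or `<`.
pub fn error_xml(code: &str, message: &str, resource: &str, request_id: &str) -> String {
    format!(
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\
         <Error><Code>{}</Code><Message>{}</Message>\
         <Resource>{}</Resource><RequestId>{}</RequestId></Error>",
        xml_escape(code),
        xml_escape(message),
        xml_escape(resource),
        xml_escape(request_id)
    )
}
```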
---
## Backlog (Implement After Core Works)
### S3 Compatibility
- [ ] AWS Signature V4 validation (`stratum-auth`)
- [ ] ETag generation (MD5 for single part; for multipart, MD5 of the concatenated part MD5 digests plus a `-<part count>` suffix)
- [ ] Content-MD5 header validation on PUT
- [ ] Bucket naming validation (3-63 chars; lowercase letters, digits, hyphens, dots; no underscores)
- [ ] `GetBucketLocation` endpoint
- [ ] `CopyObject` endpoint
- [ ] Virtual-hosted style URLs (bucket.host/key)
- [ ] Range request support (critical for video streaming)
- [ ] Conditional requests (If-None-Match, If-Modified-Since)
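
The bucket naming item could start from something like the sketch below. It covers only a subset of the S3 rules (length, allowed characters, alphanumeric first and last character, no adjacent dots) and deliberately leaves out others, such as rejecting names formatted like IP addresses. The function name `is_valid_bucket_name` is an assumption.

```rust
/// Checks a subset of the S3 bucket naming rules:
/// 3-63 bytes; lowercase letters, digits, hyphens, and dots only;
/// first and last character alphanumeric; no adjacent dots.
pub fn is_valid_bucket_name(name: &str) -> bool {
    let bytes = name.as_bytes();
    if bytes.len() < 3 || bytes.len() > 63 {
        return false;
    }

    let is_alnum = |b: u8| b.is_ascii_lowercase() || b.is_ascii_digit();

    // Length was checked above, so indexing first and last is safe.
    if !is_alnum(bytes[0]) || !is_alnum(bytes[bytes.len() - 1]) {
        return false;
    }
    if name.contains("..") {
        return false;
    }

    // Underscores, uppercase letters, and non-ASCII bytes all fail here.
    bytes.iter().all(|&b| is_alnum(b) || b == b'-' || b == b'.')
}
```

Running this check in the `CreateBucket` handler before touching the metadata store would keep invalid names out of the keyspace entirely.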
### Storage
- [ ] Erasure coding integration (reed-solomon-erasure)
- [ ] Shard distribution across multiple disks/directories
- [ ] Checksum verification on read
- [ ] Atomic writes (write to temp, rename to final)
- [ ] Multipart upload temporary shard storage
- [ ] Multipart upload cleanup on abort
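
The atomic write item (write to temp, rename to final) can be sketched with the standard library as below. This uses blocking `std::fs` to stay self-contained; the shard layer is planned around `tokio::fs`, so the async version would differ. The function name `write_atomic` and the `.tmp` suffix are assumptions.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `data` to a temporary sibling file, flushes it to disk,
/// then renames it over `final_path`. Readers see either the old
/// contents or the new contents, never a partially written file.
pub fn write_atomic(final_path: &Path, data: &[u8]) -> io::Result<()> {
    // Keep the temp file in the same directory so the rename
    // stays on one filesystem.
    let mut tmp_name = final_path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp_path)?;
    file.write_all(data)?;
    file.sync_all()?; // flush data and metadata before the rename
    drop(file);

    fs::rename(&tmp_path, final_path)
}
```

A fixed `.tmp` suffix is not safe if two writers target the same shard concurrently; a unique temp name per write would be needed in that case.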
### Testing
- [ ] Run the Ceph s3-tests compliance suite against the server
- [ ] Test with awscli (`--no-sign-request` flag)
- [ ] Test with rclone
- [ ] Test with aws-sdk-rust
- [ ] Coverage report via cargo-tarpaulin
- [ ] Helper function refactor for query param extraction (backlogged from parser)
### Binary (`stratum`)
- [ ] `main.rs` — start axum server
- [ ] Config file loading (toml)
- [ ] CLI args (port, config path, data dir)
- [ ] Graceful shutdown
- [ ] Structured logging via tracing
---
## Phase 2 Backlog — Geo Distribution
- [ ] Node discovery and membership
- [ ] Raft consensus via openraft (metadata only)
- [ ] Consistent hashing for object placement
- [ ] Shard distribution across geographic nodes
- [ ] Node failure detection and recovery
- [ ] Replication lag monitoring
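
For the consistent hashing item, a hash ring with virtual nodes is one common approach. The sketch below is illustrative only: `HashRing` and its methods are hypothetical names, and `DefaultHasher` is used just to stay within the standard library. Its algorithm is not guaranteed stable across Rust releases, so a real cluster would need a fixed, versioned hash function. Hash collisions between virtual nodes are ignored here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hash ring mapping ring positions to node names.
pub struct HashRing {
    ring: BTreeMap<u64, String>,
    vnodes: u32,
}

impl HashRing {
    /// `vnodes` is the number of ring positions per physical node.
    pub fn new(vnodes: u32) -> Self {
        Self { ring: BTreeMap::new(), vnodes }
    }

    pub fn add_node(&mut self, node: &str) {
        for i in 0..self.vnodes {
            self.ring.insert(hash_of(&(node, i)), node.to_string());
        }
    }

    pub fn remove_node(&mut self, node: &str) {
        for i in 0..self.vnodes {
            self.ring.remove(&hash_of(&(node, i)));
        }
    }

    /// Returns the first node at or after the key's ring position,
    /// wrapping around to the start of the ring if necessary.
    pub fn node_for(&self, object_key: &str) -> Option<&str> {
        let position = hash_of(&object_key);
        self.ring
            .range(position..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, node)| node.as_str())
    }
}
```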
---
## Phase 3 Backlog — Intelligent Tiering
- [ ] Access frequency tracking (exponential moving average)
- [ ] Spike detection (sudden 10x access increase → promote immediately)
- [ ] Time-of-day pattern recognition
- [ ] Decay function (not accessed in 48h → demote)
- [ ] MIME type classification (pre-trained ONNX model)
- [ ] Range request pattern detection (video streaming awareness)
- [ ] Tier promotion/demotion engine
- [ ] Warmup period (observe 7 days before making tier decisions)
- [ ] Developer priority hints via object metadata
- [ ] Transparency API (why is this object in this tier?)
- [ ] Prometheus metrics endpoint
---
## Phase 4 Backlog — Managed Service
- [ ] Multi-tenant isolation
- [ ] Grafana dashboard
- [ ] Alerting (disk usage, node health, replication lag)
- [ ] Billing metrics
- [ ] BSI C5 certification process
- [ ] ISO 27001 certification process
- [ ] SLA definition and monitoring
- [ ] Enterprise support tier
---
## Known Issues / Technical Debt
- [ ] `VolumeStore` is currently in-memory only — needs sled persistence
- [ ] Error responses return plain text — should return S3 XML format
- [ ] No auth middleware yet — all requests accepted unsigned
- [ ] `StorageConfig` cold tier credentials need secure storage solution
- [ ] Query param helper functions (opt_string, opt_parse) backlogged from parser refactor