Files
Stratum/todo.md

4.4 KiB

Stratum — TODO

Immediate Next Session

1. stratum-storage — Volume Layer

  • config.rs — StorageConfig with hot/warm/cold paths
  • tier.rs — StorageTier enum (Hot, Warm, Cold)
  • location.rs — Location + ShardLocation enums (Local/Remote/Mixed)
  • volume.rs — Volume struct with access tracking fields
  • store.rs — VolumeStore (in-memory HashMap for now)
  • shard.rs — async read/write/delete shard files via tokio::fs
  • Tests for all of the above

2. stratum-metadata — Bucket + Key Mapping

  • Sled-backed metadata store
  • Bucket operations (create, delete, exists, list)
  • Key → Volume ID mapping (put, get, delete, list)
  • Tests for all of the above

3. Wire Storage Into API Handlers (bottom-up)

  • CreateBucket → 200 (create metadata entry)
  • ListBuckets → 200 + XML response
  • PutObject → 200 (write shard, create volume, store mapping)
  • GetObject → 200 + stream bytes (read shard via volume location)
  • DeleteObject → 204 (delete shard + metadata)
  • HeadObject → 200 + metadata headers only
  • ListObjectsV2 → 200 + XML response
  • Multipart (last, most complex)

4. XML Responses

  • xml/responses.rs — ListBuckets XML
  • xml/responses.rs — ListObjectsV2 XML
  • xml/responses.rs — Error XML (replace current plain text)
  • xml/responses.rs — InitiateMultipartUploadResult XML
  • xml/responses.rs — CompleteMultipartUploadResult XML

Backlog (Implement After Core Works)

S3 Compatibility

  • AWS Signature V4 validation (stratum-auth)
  • ETag generation (MD5 for single part, MD5-of-MD5s for multipart)
  • Content-MD5 header validation on PUT
  • Bucket naming validation (3-63 chars, lowercase, no underscores)
  • GetBucketLocation endpoint
  • CopyObject endpoint
  • Virtual-hosted style URLs (bucket.host/key)
  • Range request support (critical for video streaming)
  • Conditional requests (If-None-Match, If-Modified-Since)

Storage

  • Erasure coding integration (reed-solomon-erasure)
  • Shard distribution across multiple disks/directories
  • Checksum verification on read
  • Atomic writes (write to temp, rename to final)
  • Multipart upload temporary shard storage
  • Multipart upload cleanup on abort

Testing

  • Run MinIO s3-tests compliance suite against server
  • Test with awscli (--no-sign-request flag)
  • Test with rclone
  • Test with aws-sdk-rust
  • Coverage report via cargo-tarpaulin
  • Helper function refactor for query param extraction (backlogged from parser)

Binary (stratum)

  • main.rs — start axum server
  • Config file loading (toml)
  • CLI args (port, config path, data dir)
  • Graceful shutdown
  • Structured logging via tracing

Phase 2 Backlog — Geo Distribution

  • Node discovery and membership
  • Raft consensus via openraft (metadata only)
  • Consistent hashing for object placement
  • Shard distribution across geographic nodes
  • Node failure detection and recovery
  • Replication lag monitoring

Phase 3 Backlog — Intelligent Tiering

  • Access frequency tracking (exponential moving average)
  • Spike detection (sudden 10x access increase → promote immediately)
  • Time-of-day pattern recognition
  • Decay function (not accessed in 48h → demote)
  • MIME type classification (pre-trained ONNX model)
  • Range request pattern detection (video streaming awareness)
  • Tier promotion/demotion engine
  • Warmup period (observe 7 days before making tier decisions)
  • Developer priority hints via object metadata
  • Transparency API (why is this object in this tier?)
  • Prometheus metrics endpoint

Phase 4 Backlog — Managed Service

  • Multi-tenant isolation
  • Grafana dashboard
  • Alerting (disk usage, node health, replication lag)
  • Billing metrics
  • BSI C5 certification process
  • ISO 27001 certification process
  • SLA definition and monitoring
  • Enterprise support tier

Known Issues / Technical Debt

  • VolumeStore is currently in-memory only — needs sled persistence
  • Error responses return plain text — should return S3 XML format
  • No auth middleware yet — all requests accepted unsigned
  • StorageConfig cold tier credentials need secure storage solution
  • Query param helper functions (opt_string, opt_parse) backlogged from parser refactor