* filer: add StreamMutateEntry bidi streaming RPC

  Add a bidirectional streaming RPC that carries all filer mutation types
  (create, update, delete, rename) over a single ordered stream. This
  eliminates per-request connection overhead for pipelined operations and
  guarantees mutation ordering within a stream.

  The server handler delegates each request to the existing unary handlers
  (CreateEntry, UpdateEntry, DeleteEntry) and uses a proxy stream adapter
  for rename operations to reuse StreamRenameEntry logic. The is_last field
  signals completion for multi-response operations (rename sends multiple
  events per request; create/update/delete always send exactly one response
  with is_last=true).

* mount: add streaming mutation multiplexer (streamMutateMux)

  Implement a client-side multiplexer that routes all filer mutation RPCs
  (create, update, delete, rename) over a single bidirectional gRPC stream.
  Multiple goroutines submit requests through a send channel; a dedicated
  sendLoop serializes them on the stream; a recvLoop dispatches responses
  to waiting callers via per-request channels.

  Key features:
  - Lazy stream opening on first use
  - Automatic reconnection on stream failure
  - Permanent fallback to unary RPCs if the filer returns Unimplemented
  - Monotonic request_id for response correlation
  - Multi-response support for rename operations (is_last signaling)

  The mux is initialized on WFS and closed during unmount cleanup. No call
  sites use it yet; wiring comes in subsequent commits.

* mount: route CreateEntry and UpdateEntry through streaming mux

  Wire all CreateEntry call sites to use wfs.streamCreateEntry(), which
  routes through the StreamMutateEntry stream when available, falling back
  to unary RPCs otherwise. Also wire Link's UpdateEntry calls through
  wfs.streamUpdateEntry().

  Updated call sites:
  - flushMetadataToFiler (file flush after write)
  - Mkdir (directory creation)
  - Symlink (symbolic link creation)
  - createRegularFile non-deferred path (Mknod)
  - flushFileMetadata (periodic metadata flush)
  - Link (hard link: update source + create link + rollback)

* mount: route UpdateEntry and DeleteEntry through streaming mux

  Wire the remaining mutation call sites through the streaming mux:
  - saveEntry (Setattr/chmod/chown/utimes) → streamUpdateEntry
  - Unlink → streamDeleteEntry (replaces RemoveWithResponse)
  - Rmdir → streamDeleteEntry (replaces RemoveWithResponse)

  All filer mutations except Rename now go through StreamMutateEntry when
  the filer supports it, with automatic unary RPC fallback.

* mount: route Rename through streaming mux

  Wire Rename to use streamMutate.Rename() when available, with fallback to
  the existing StreamRenameEntry unary stream. The streaming mux sends
  rename as a StreamRenameEntryRequest oneof variant. The server processes
  it through the existing rename logic and sends multiple
  StreamRenameEntryResponse events (one per moved entry), with is_last=true
  on the final response. All filer mutations now go through a single
  ordered stream.

* mount: fix stream mux connection ownership

  WithGrpcClient(streamingMode=true) closes the gRPC connection when the
  callback returns, destroying the stream. Own the connection directly via
  pb.GrpcDial so it stays alive for the stream's lifetime. Close it
  explicitly in recvLoop on stream failure and in Close on shutdown.

* mount: fix rename failure for deferred-create files

  Three fixes for rename operations over the streaming mux:

  1. lookupEntry: fall back to the local metadata store when the filer
     returns "not found" for entries in uncached directories. Files created
     with deferFilerCreate=true exist only in the local leveldb store until
     flushed; lookupEntry skipped the local store when the parent directory
     had never been readdir'd, causing rename to fail with ENOENT.
  2. Rename: wait for pending async flushes and force a synchronous flush
     of dirty metadata before sending rename to the filer. Covers the
     writebackCache case where close() defers the flush to a background
     worker that may not complete before rename fires.
  3. StreamMutateEntry: propagate rename errors from server to client. Add
     error/errno fields to StreamMutateEntryResponse so the mount can map
     filer errors to correct FUSE status codes instead of silently
     returning OK. Also fix the existing Rename error handler, which could
     return fuse.OK on unrecognized errors.

* mount: fix streaming mux error handling, sendLoop lifecycle, and fallback

  Address PR review comments:

  1. Server: populate top-level Error/Errno on StreamMutateEntryResponse
     for create/update/delete errors, not just rename. Previously update
     errors were silently dropped, and create/delete errors were only in
     nested response fields that the client didn't check.
  2. Client: check nested error fields in CreateEntry (ErrorCode, Error)
     and DeleteEntry (Error) responses, matching CreateEntryWithResponse
     behavior.
  3. Fix sendLoop lifecycle: give each stream generation a stopSend
     channel. recvLoop closes it on error to stop the paired sendLoop.
     Previously a reconnect left the old sendLoop draining sendCh, breaking
     ordering.
  4. Transparent fallback: stream helpers and doRename fall back to unary
     RPCs on transport errors (ErrStreamTransport), including the first
     Unimplemented from ensureStream. Previously the first call failed
     instead of degrading.
  5. Filer rotation in openStream: try all filer addresses on dial failure,
     matching WithFilerClient behavior. Stop early on Unimplemented.
  6. Pass a metadata-bearing context to the StreamMutateEntry RPC call so
     the sw-client-id header is actually sent.
  7. Gate the lookupEntry local-cache fallback on an open dirty handle or
     pending async flush to avoid resurrecting deleted/renamed entries.
  8. Remove dead code in flushFileMetadata (err=nil followed by
     if err!=nil).
  9. Use string matching for the rename error-to-errno mapping in the mount
     to stay portable across Linux/macOS (numeric errno values differ).

* mount: make failAllPending idempotent with delete-before-close

  Change failAllPending to collect pending entries into a local slice
  (deleting from the sync.Map first) before closing channels. This prevents
  double-close panics if called concurrently. Also remove the unused err
  parameter.

* mount: add stream generation tracking and teardownStream

  Introduce a generation counter on streamMutateMux that increments each
  time a new stream is created. Requests carry the generation they were
  enqueued for so sendLoop can reject stale requests after reconnect.

  Add teardownStream(gen), which is idempotent (it only acts when gen
  matches the current generation and the stream is non-nil). Both sendLoop
  and recvLoop call it on error, replacing the inline cleanup in recvLoop.
  sendLoop now actively triggers teardown on send errors instead of
  silently exiting.

  ensureStream waits for the prior generation's recvDone before creating a
  new stream, ensuring all old pending waiters are failed before reconnect.
  recvLoop now takes the stream, generation, and recvDone channel as
  parameters to avoid accessing shared fields without the lock.

* mount: harden Close to prevent races with teardownStream

  Nil out stream, cancel, and grpcConn under the lock so that any
  concurrent teardownStream call from recvLoop/sendLoop becomes a no-op.
  Call failAllPending before closing sendCh to unblock waiters promptly.
  Guard recvDone with a nil check for the case where Close is called before
  any stream was ever opened.

* mount: make errCh receive ctx-aware in doUnary and Rename

  Replace the blocking <-sendReq.errCh with a select that also observes
  ctx.Done(). If sendLoop exits via stopSend without consuming a buffered
  request, the caller now returns ctx.Err() instead of blocking forever.
  The buffered errCh (capacity 1) ensures late acknowledgements from
  sendLoop don't block the sender.
* mount: fix sendLoop/Close race and recvLoop/teardown pending channel race

  Three related fixes:

  1. Stop closing sendCh in Close(). Closing the shared producer channel
     races with callers who passed ensureStream() but haven't sent yet,
     causing send-on-closed-channel panics. sendCh is now left open;
     ensureStream checks m.closed to reject new callers.
  2. Drain buffered sendCh items on shutdown. sendLoop defers drainSendCh()
     on exit so buffered requests get an ErrStreamTransport on their errCh
     instead of blocking forever. Close() drains again for any stragglers
     enqueued between sendLoop's drain and the final shutdown.
  3. Move failAllPending from teardownStream into recvLoop's defer.
     teardownStream (called from sendLoop on send error) was closing
     pending response channels while recvLoop could be between pending.Load
     and the channel send, a send-on-closed-channel panic. recvLoop is now
     the sole closer of pending channels, eliminating the race. Close()
     waits on recvDone (with cancel() to guarantee Recv unblocks) so
     pending cleanup always completes.

* filer/mount: add debug logging for hardlink lifecycle

  Add V(0) logging at every point where a HardLinkId is created, stored,
  read, or deleted to trace orphaned hardlink references. Logging covers:
  - gRPC server: CreateEntry/UpdateEntry when the request carries
    HardLinkId
  - FilerStoreWrapper: InsertEntry/UpdateEntry when the entry has
    HardLinkId
  - handleUpdateToHardLinks: entry path, HardLinkId, counter, chunk count
  - setHardLink: KvPut with blob size
  - maybeReadHardLink: V(1) on read attempt and successful decode
  - DeleteHardLink: counter decrement/deletion events
  - Mount Link(): when NewHardLinkId is generated and the link is created

  This helps diagnose how a git pack .rev file ended up with a HardLinkId
  during a clone (no hard links should be involved).

* test: add git clone/pull integration test for FUSE mount

  Shell script that exercises git operations on a SeaweedFS mount:

  1. Creates a bare repo on the mount
  2. Clones locally, makes 3 commits, pushes to the mount
  3. Clones from the mount bare repo into an on-mount working dir
  4. Verifies clone integrity (files, content, commit hashes)
  5. Pushes 2 more commits with renames and deletes
  6. Checks out an older revision on the mount clone
  7. Returns to the branch and pulls with real changes
  8. Verifies file content, renames, and deletes after pull
  9. Checks git log integrity and clean status

  27 assertions covering file existence, content, commit hashes, file
  counts, renames, deletes, and git status. Run against any existing mount:

    bash test-git-on-mount.sh /path/to/mount

* test: add git clone/pull FUSE integration test to CI suite

  Add TestGitOperations to the existing fuse_integration test framework.
  The test exercises git's full file operation surface on the mount:

  1. Creates a bare repo on the mount (acts as remote)
  2. Clones locally, makes 3 commits (files, bulk data, renames), pushes
  3. Clones from the mount bare repo into an on-mount working dir
  4. Verifies clone integrity (content, commit hash, file count)
  5. Pushes 2 more commits with new files, renames, and deletes
  6. Checks out an older revision on the mount clone
  7. Returns to the branch and pulls with real fast-forward changes
  8. Verifies post-pull state: content, renames, deletes, file counts
  9. Checks git log integrity (5 commits) and clean status

  Runs automatically in the existing fuse-integration.yml CI workflow.

* mount: fix permission check with uid/gid mapping

  The permission checks in createRegularFile() and Access() compared the
  caller's local uid/gid against the entry's filer-side uid/gid without
  applying the uid/gid mapper. With -map.uid 501:0, a directory created as
  uid 0 on the filer would not match the local caller uid 501, causing
  hasAccess() to fall through to "other" permission bits and reject write
  access (0755: other has r-x, no w).

  Fix: map the entry uid/gid from filer-space to local-space before the
  hasAccess() call so both sides are in the same namespace.

  This fixes rsync -a failing with "Permission denied" on mkstempat when
  using uid/gid mapping.

* mount: fix Mkdir/Symlink returning filer-side uid/gid to kernel

  Mkdir and Symlink used `defer wfs.mapPbIdFromFilerToLocal(entry)` to
  restore local uid/gid, but `outputPbEntry` writes the kernel response
  before the function returns, so the kernel received filer-side uid/gid
  (e.g., 0:0). macFUSE then caches these and rejects subsequent child
  operations (mkdir, create) because the caller uid (501) doesn't match the
  directory owner (0), and "other" bits (0755 → r-x) lack write permission.

  Fix: replace the defer with an explicit call to mapPbIdFromFilerToLocal
  before outputPbEntry, so the kernel gets local uid/gid. Also add nil
  guards for UidGidMapper in Access and createRegularFile to prevent panics
  in tests that don't configure a mapper.

  This fixes rsync -a "Permission denied" on mkpathat for nested
  directories when using uid/gid mapping.

* mount: fix Link outputting filer-side uid/gid to kernel, add nil guards

  Link had the same defer-before-outputPbEntry bug as Mkdir and Symlink:
  the kernel received filer-side uid/gid because the defer hadn't run yet
  when outputPbEntry wrote the response. Also add nil guards for
  UidGidMapper in Access and createRegularFile so tests without a mapper
  don't panic.

  Audit of all outputPbEntry/outputFilerEntry call sites:
  - Mkdir: fixed in prior commit (explicit map before output)
  - Symlink: fixed in prior commit (explicit map before output)
  - Link: fixed here (explicit map before output)
  - Create (existing file): entry from maybeLoadEntry (already mapped)
  - Create (deferred): entry has local uid/gid (never mapped to filer)
  - Create (non-deferred): createRegularFile defer runs before return
  - Mknod: createRegularFile defer runs before return
  - Lookup: entry from lookupEntry (already mapped)
  - GetAttr: entry from maybeReadEntry/maybeLoadEntry (already mapped)
  - readdir: entry from cache (mapIdFromFilerToLocal) or filer (mapped)
  - saveEntry: no kernel output
  - flushMetadataToFiler: no kernel output
  - flushFileMetadata: no kernel output

* test: fix git test for same-filesystem FUSE clone

  When both the bare repo and the working clone live on the same FUSE
  mount, git's local transport uses hardlinks and cross-repo stat calls
  that fail on FUSE. Fix:
  - Use --no-local on clone to disable local transport optimizations
  - Use reset --hard instead of checkout to stay on the branch
  - Use fetch + reset --hard origin/<branch> instead of git pull to avoid
    local transport stat failures during fetch

* adjust logging

* test: use plain git clone/pull to exercise real FUSE behavior

  Remove the --no-local and fetch+reset workarounds. The test should use
  the same git commands users run (clone, reset --hard, pull) so it reveals
  real FUSE issues rather than hiding them.
* test: enable V(1) logging for filer/mount and collect logs on failure

  - Run filer and mount with -v=1 so hardlink lifecycle logs (V(0):
    create/delete/insert, V(1): read attempts) are captured
  - On test failure, automatically dump the last 16KB of all process logs
    (master, volume, filer, mount) to the test output
  - Copy process logs to /tmp/seaweedfs-fuse-logs/ for CI artifact upload
  - Update the CI workflow to upload SeaweedFS process logs alongside test
    output

* mount: clone entry for filer flush to prevent uid/gid race

  flushMetadataToFiler and flushFileMetadata used entry.GetEntry(), which
  returns the file handle's live proto entry pointer, then mutated it
  in-place via mapPbIdFromLocalToFiler. During the gRPC call window, a
  concurrent Lookup (which takes entryLock.RLock but NOT fhLockTable) could
  observe filer-side uid/gid (e.g., 0:0) on the file handle entry and
  return it to the kernel. The kernel caches these attributes, so
  subsequent opens by the local user (uid 501) fail with EACCES.

  Fix: proto.Clone the entry before mapping uid/gid for the filer request.
  The file handle's live entry is never mutated, so a concurrent Lookup
  always sees local uid/gid.

  This fixes the intermittent "Permission denied" on .git/FETCH_HEAD after
  the first git pull on a mount with uid/gid mapping.

* mount: add debug logging for stale lock file investigation

  Add V(0) logging to trace the HEAD.lock recreation issue:
  - Create: log when O_EXCL fails (file already exists) with uid/gid/mode
  - completeAsyncFlush: log resolved path, saved path, dirtyMetadata, and
    isDeleted at entry to trace whether the async flush fires after rename
  - flushMetadataToFiler: log the dir/name/fullpath being flushed

  This will show whether the async flush is recreating the lock file after
  git renames HEAD.lock → HEAD.
* mount: prevent async flush from recreating renamed .lock files

  When git renames HEAD.lock → HEAD, the async flush from the prior close()
  can run AFTER the rename and re-insert HEAD.lock into the meta cache via
  its CreateEntryRequest response event. The next git pull then sees
  HEAD.lock and fails with "File exists".

  Fix: add an isRenamed flag on FileHandle, set by Rename before waiting
  for the pending async flush. The async flush checks this flag and skips
  the metadata flush for renamed files (the same pattern as isDeleted for
  unlinked files). The data pages still flush normally.

  The Rename handler flushes deferred metadata synchronously (Case 1)
  before setting isRenamed, ensuring the entry exists on the filer for the
  rename to proceed. For already-released handles (Case 2), the entry was
  created by a prior flush.

* mount: also mark renamed inodes via entry.Attributes.Inode fallback

  When GetInode fails (Forget already removed the inode mapping), the
  Rename handler couldn't find the pending async flush to set isRenamed.
  The async flush then recreated the .lock file on the filer.

  Fix: fall back to oldEntry.Attributes.Inode to find the pending async
  flush when the inode-to-path mapping is gone. Also extract
  MarkInodeRenamed into a method on FileHandleToInode for clarity.

* mount: skip async metadata flush when saved path no longer maps to inode

  The isRenamed flag approach failed for refs/remotes/origin/HEAD.lock
  because neither GetInode nor oldEntry.Attributes.Inode could find the
  inode (Forget had already evicted the mapping, and the entry's stored
  inode was 0).

  Add a direct check in completeAsyncFlush: before flushing metadata,
  verify that the saved path still maps to this inode in the inode-to-path
  table. If the path was renamed or removed (inode mismatch or not found),
  skip the metadata flush to avoid recreating a stale entry. This catches
  all rename cases regardless of whether the Rename handler could set the
  isRenamed flag.
* mount: wait for pending async flush in Unlink before filer delete

  Unlink was deleting the filer entry first, then marking the draining
  async-flush handle as deleted. The async flush worker could race between
  these two operations and recreate the just-unlinked entry on the filer.
  This caused git's .lock files (e.g. refs/remotes/origin/HEAD.lock) to
  persist after git pull, breaking subsequent git operations.

  Move the isDeleted marking and add waitForPendingAsyncFlush() before the
  filer delete so any in-flight flush completes first. Even if the worker
  raced past the isDeleted check, the wait ensures it finishes before the
  filer delete cleans up any recreated entry.

* mount: reduce async flush and metadata flush log verbosity

  Raise the completeAsyncFlush entry log, the saved-path-mismatch skip log,
  and the flushMetadataToFiler entry log from V(0) to V(3)/V(4). These fire
  for every file close with writebackCache and are too noisy for normal
  use.

* filer: reduce hardlink debug log verbosity from V(0) to V(4)

  HardLinkId logs in filerstore_wrapper, filerstore_hardlink, and
  filer_grpc_server fire on every hardlinked file operation (git pack files
  use hardlinks extensively) and produce excessive noise.

* mount/filer: reduce noisy V(0) logs for link, rmdir, and empty folder
  check

  - weedfs_link.go: hardlink creation logs V(0) → V(4)
  - weedfs_dir_mkrm.go: non-empty folder rmdir error V(0) → V(1)
  - empty_folder_cleaner.go: "not empty" check log V(0) → V(4)

* filer: handle missing hardlink KV as expected, not error

  A "kv: not found" on hardlink read is normal when the link blob was
  already cleaned up but a stale entry still references it. Log at V(1) for
  not-found; keep Error level for actual KV failures.

* test: add waitForDir before git pull in FUSE git operations test

  After git reset --hard, the FUSE mount's metadata cache may need a moment
  to settle on slow CI. The git pull subprocess (unpack-objects) could fail
  to stat the working directory. Poll for up to 5s.
* Update git_operations_test.go

* wait

* test: simplify FUSE test framework to use weed mini

  Replace the 4-process setup (master + volume + filer + mount) with 2
  processes: "weed mini" (all-in-one) + "weed mount". This simplifies
  startup, reduces port allocation, and is faster on CI.

* test: fix mini flag -admin → -admin.ui
package empty_folder_cleanup

import (
	"context"
	"strings"
	"sync"
	"time"

	"github.com/seaweedfs/seaweedfs/weed/cluster/lock_manager"
	"github.com/seaweedfs/seaweedfs/weed/glog"
	"github.com/seaweedfs/seaweedfs/weed/pb"
	"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
	"github.com/seaweedfs/seaweedfs/weed/s3api/s3_constants"
	"github.com/seaweedfs/seaweedfs/weed/util"
)

const (
	DefaultMaxCountCheck  = 1000
	DefaultCacheExpiry    = 5 * time.Minute
	DefaultQueueMaxSize   = 1000
	DefaultQueueMaxAge    = 5 * time.Second
	DefaultProcessorSleep = 10 * time.Second // How often to check queue
)
// FilerOperations defines the filer operations needed by EmptyFolderCleaner
type FilerOperations interface {
	CountDirectoryEntries(ctx context.Context, dirPath util.FullPath, limit int) (count int, err error)
	DeleteEntryMetaAndData(ctx context.Context, p util.FullPath, isRecursive, ignoreRecursiveError, shouldDeleteChunks, isFromOtherCluster bool, signatures []int32, ifNotModifiedAfter int64) error
	GetEntryAttributes(ctx context.Context, p util.FullPath) (attributes map[string][]byte, err error)
}

// folderState tracks the state of a folder for empty folder cleanup
type folderState struct {
	roughCount  int       // Cached rough count (up to maxCountCheck)
	lastAddTime time.Time // Last time an item was added
	lastDelTime time.Time // Last time an item was deleted
	lastCheck   time.Time // Last time we checked the actual count
}

type bucketCleanupPolicyState struct {
	autoRemove bool
	attrValue  string
	lastCheck  time.Time
}

// EmptyFolderCleaner handles asynchronous cleanup of empty folders.
// Each filer owns specific folders via consistent hashing based on the peer filer list.
type EmptyFolderCleaner struct {
	filer    FilerOperations
	lockRing *lock_manager.LockRing
	host     pb.ServerAddress

	// Folder state tracking
	mu                    sync.RWMutex
	folderCounts          map[string]*folderState              // Rough count cache
	bucketCleanupPolicies map[string]*bucketCleanupPolicyState // bucket path -> cleanup policy cache

	// Cleanup queue (thread-safe, has its own lock)
	cleanupQueue *CleanupQueue

	// Configuration
	maxCountCheck  int           // Max items to count (1000)
	cacheExpiry    time.Duration // How long to keep cache entries
	processorSleep time.Duration // How often processor checks queue
	bucketPath     string        // e.g., "/buckets"

	// Control
	enabled bool
	stopCh  chan struct{}
}
// NewEmptyFolderCleaner creates a new EmptyFolderCleaner
func NewEmptyFolderCleaner(filer FilerOperations, lockRing *lock_manager.LockRing, host pb.ServerAddress, bucketPath string) *EmptyFolderCleaner {
	efc := &EmptyFolderCleaner{
		filer:                 filer,
		lockRing:              lockRing,
		host:                  host,
		folderCounts:          make(map[string]*folderState),
		bucketCleanupPolicies: make(map[string]*bucketCleanupPolicyState),
		cleanupQueue:          NewCleanupQueue(DefaultQueueMaxSize, DefaultQueueMaxAge),
		maxCountCheck:         DefaultMaxCountCheck,
		cacheExpiry:           DefaultCacheExpiry,
		processorSleep:        DefaultProcessorSleep,
		bucketPath:            bucketPath,
		enabled:               true,
		stopCh:                make(chan struct{}),
	}
	go efc.cacheEvictionLoop()
	go efc.cleanupProcessor()
	return efc
}
// SetEnabled enables or disables the cleaner
func (efc *EmptyFolderCleaner) SetEnabled(enabled bool) {
	efc.mu.Lock()
	defer efc.mu.Unlock()
	efc.enabled = enabled
}

// IsEnabled returns whether the cleaner is enabled
func (efc *EmptyFolderCleaner) IsEnabled() bool {
	efc.mu.RLock()
	defer efc.mu.RUnlock()
	return efc.enabled
}
// ownsFolder checks if this filer owns the folder via consistent hashing
func (efc *EmptyFolderCleaner) ownsFolder(folder string) bool {
	servers := efc.lockRing.GetSnapshot()
	if len(servers) <= 1 {
		return true // Single filer case
	}
	return efc.hashKeyToServer(folder, servers) == efc.host
}

// hashKeyToServer uses consistent hashing to map a folder to a server
func (efc *EmptyFolderCleaner) hashKeyToServer(key string, servers []pb.ServerAddress) pb.ServerAddress {
	if len(servers) == 0 {
		return ""
	}
	x := util.HashStringToLong(key)
	if x < 0 {
		x = -x
	}
	x = x % int64(len(servers))
	return servers[x]
}
// OnDeleteEvent is called when a file or directory is deleted.
// Both file and directory deletions count towards making the parent folder empty.
// eventTime is the time when the delete event occurred (for proper ordering).
func (efc *EmptyFolderCleaner) OnDeleteEvent(directory string, entryName string, isDirectory bool, eventTime time.Time) {
	// Skip if not under bucket path (must be at least /buckets/<bucket>/...)
	if efc.bucketPath != "" && !isUnderBucketPath(directory, efc.bucketPath) {
		return
	}

	// Check if we own this folder
	if !efc.ownsFolder(directory) {
		glog.V(4).Infof("EmptyFolderCleaner: not owner of %s, skipping", directory)
		return
	}

	efc.mu.Lock()
	defer efc.mu.Unlock()

	// Check enabled inside lock to avoid race with Stop()
	if !efc.enabled {
		return
	}

	glog.V(3).Infof("EmptyFolderCleaner: delete event in %s/%s (isDir=%v)", directory, entryName, isDirectory)

	// Update cached count (create entry if needed)
	state, exists := efc.folderCounts[directory]
	if !exists {
		state = &folderState{}
		efc.folderCounts[directory] = state
	}
	if state.roughCount > 0 {
		state.roughCount--
	}
	state.lastDelTime = eventTime

	// Only add to cleanup queue if roughCount suggests the folder might be empty
	if state.roughCount > 0 {
		glog.V(3).Infof("EmptyFolderCleaner: skipping queue for %s, roughCount=%d", directory, state.roughCount)
		return
	}

	// Add to cleanup queue with event time (handles out-of-order events)
	if efc.cleanupQueue.Add(directory, entryName, eventTime) {
		glog.V(3).Infof("EmptyFolderCleaner: queued %s for cleanup (triggered by %s)", directory, entryName)
	}
}
// OnCreateEvent is called when a file or directory is created.
// Both file and directory creations cancel pending cleanup for the parent folder.
func (efc *EmptyFolderCleaner) OnCreateEvent(directory string, entryName string, isDirectory bool) {
	// Skip if not under bucket path (must be at least /buckets/<bucket>/...)
	if efc.bucketPath != "" && !isUnderBucketPath(directory, efc.bucketPath) {
		return
	}

	efc.mu.Lock()
	defer efc.mu.Unlock()

	// Check enabled inside lock to avoid race with Stop()
	if !efc.enabled {
		return
	}

	// Update cached count only if already tracked (no need to track new folders)
	if state, exists := efc.folderCounts[directory]; exists {
		state.roughCount++
		state.lastAddTime = time.Now()
	}

	// Remove from cleanup queue (cancel pending cleanup)
	if efc.cleanupQueue.Remove(directory) {
		glog.V(3).Infof("EmptyFolderCleaner: cancelled cleanup for %s due to new entry", directory)
	}
}
// cleanupProcessor runs in the background and processes the cleanup queue
func (efc *EmptyFolderCleaner) cleanupProcessor() {
	ticker := time.NewTicker(efc.processorSleep)
	defer ticker.Stop()

	for {
		select {
		case <-efc.stopCh:
			return
		case <-ticker.C:
			efc.processCleanupQueue()
		}
	}
}
// processCleanupQueue processes items from the cleanup queue
func (efc *EmptyFolderCleaner) processCleanupQueue() {
	// Check if we should process
	if !efc.cleanupQueue.ShouldProcess() {
		if efc.cleanupQueue.Len() > 0 {
			glog.Infof("EmptyFolderCleaner: pending queue not processed yet (len=%d, oldest_age=%v, max_size=%d, max_age=%v)",
				efc.cleanupQueue.Len(), efc.cleanupQueue.OldestAge(), efc.cleanupQueue.maxSize, efc.cleanupQueue.maxAge)
		}
		return
	}

	glog.V(3).Infof("EmptyFolderCleaner: processing cleanup queue (len=%d, age=%v)",
		efc.cleanupQueue.Len(), efc.cleanupQueue.OldestAge())

	// Process all items that are ready
	for efc.cleanupQueue.Len() > 0 {
		// Check if still enabled
		if !efc.IsEnabled() {
			return
		}

		// Pop the oldest item
		folder, triggeredBy, ok := efc.cleanupQueue.Pop()
		if !ok {
			break
		}

		// Execute cleanup for this folder
		efc.executeCleanup(folder, triggeredBy)
	}
}
// executeCleanup performs the actual cleanup of an empty folder
func (efc *EmptyFolderCleaner) executeCleanup(folder string, triggeredBy string) {
	efc.mu.Lock()

	// Quick check: if we have a cached count and it's > 0, skip
	if state, exists := efc.folderCounts[folder]; exists {
		if state.roughCount > 0 {
			glog.V(3).Infof("EmptyFolderCleaner: skipping %s (triggered by %s), cached count=%d", folder, triggeredBy, state.roughCount)
			efc.mu.Unlock()
			return
		}
		// If there was an add after our delete, skip
		if !state.lastAddTime.IsZero() && state.lastAddTime.After(state.lastDelTime) {
			glog.V(3).Infof("EmptyFolderCleaner: skipping %s (triggered by %s), add happened after delete", folder, triggeredBy)
			efc.mu.Unlock()
			return
		}
	}
	efc.mu.Unlock()

	// Re-check ownership (topology might have changed)
	if !efc.ownsFolder(folder) {
		glog.V(3).Infof("EmptyFolderCleaner: no longer owner of %s (triggered by %s), skipping", folder, triggeredBy)
		return
	}

	ctx := context.Background()
	bucketPath, autoRemove, source, attrValue, err := efc.getBucketCleanupPolicy(ctx, folder)
	if err != nil {
		if err == filer_pb.ErrNotFound {
			return
		}
		glog.V(2).Infof("EmptyFolderCleaner: failed to load bucket cleanup policy for folder %s (triggered by %s): %v", folder, triggeredBy, err)
		return
	}

	if !autoRemove {
		glog.V(3).Infof("EmptyFolderCleaner: skipping folder %s (triggered by %s), bucket %s auto-remove-empty-folders disabled (source=%s attr=%s)",
			folder, triggeredBy, bucketPath, source, attrValue)
		return
	}

	// Check if the folder is actually empty (count up to maxCountCheck)
	count, err := efc.countItems(ctx, folder)
	if err != nil {
		glog.V(2).Infof("EmptyFolderCleaner: error counting items in %s: %v", folder, err)
		return
	}

	efc.mu.Lock()
	// Update cache
	if _, exists := efc.folderCounts[folder]; !exists {
		efc.folderCounts[folder] = &folderState{}
	}
	efc.folderCounts[folder].roughCount = count
	efc.folderCounts[folder].lastCheck = time.Now()
	efc.mu.Unlock()

	if count > 0 {
		glog.V(4).Infof("EmptyFolderCleaner: folder %s (triggered by %s) has %d items, not empty", folder, triggeredBy, count)
		return
	}

	// Delete the empty folder
	glog.Infof("EmptyFolderCleaner: deleting empty folder %s (triggered by %s)", folder, triggeredBy)
	if err := efc.deleteFolder(ctx, folder); err != nil {
		glog.V(2).Infof("EmptyFolderCleaner: failed to delete empty folder %s (triggered by %s): %v", folder, triggeredBy, err)
		return
	}

	// Clean up the cache entry
	efc.mu.Lock()
	delete(efc.folderCounts, folder)
	efc.mu.Unlock()

	// Note: No need to recursively check the parent folder here.
	// The deletion of this folder will generate a metadata event,
	// which will trigger OnDeleteEvent for the parent folder.
}

// countItems counts items in a folder (up to maxCountCheck)
func (efc *EmptyFolderCleaner) countItems(ctx context.Context, folder string) (int, error) {
	return efc.filer.CountDirectoryEntries(ctx, util.FullPath(folder), efc.maxCountCheck)
}

// deleteFolder deletes an empty folder
func (efc *EmptyFolderCleaner) deleteFolder(ctx context.Context, folder string) error {
	return efc.filer.DeleteEntryMetaAndData(ctx, util.FullPath(folder), false, false, false, false, nil, 0)
}

// getBucketCleanupPolicy resolves the auto-remove-empty-folders policy for the bucket
// containing folder, consulting a time-bounded cache before falling back to the
// bucket's attributes on the filer. The source return value ("default", "cache",
// or "filer") records where the decision came from, for logging.
func (efc *EmptyFolderCleaner) getBucketCleanupPolicy(ctx context.Context, folder string) (bucketPath string, autoRemove bool, source string, attrValue string, err error) {
	bucketPath, ok := util.ExtractBucketPath(efc.bucketPath, folder, true)
	if !ok {
		return "", true, "default", "<not_bucket_path>", nil
	}

	now := time.Now()

	efc.mu.RLock()
	if state, found := efc.bucketCleanupPolicies[bucketPath]; found && now.Sub(state.lastCheck) <= efc.cacheExpiry {
		efc.mu.RUnlock()
		return bucketPath, state.autoRemove, "cache", state.attrValue, nil
	}
	efc.mu.RUnlock()

	attrs, err := efc.filer.GetEntryAttributes(ctx, util.FullPath(bucketPath))
	if err != nil {
		return "", true, "", "", err
	}

	autoRemove, attrValue = autoRemoveEmptyFoldersEnabled(attrs)

	efc.mu.Lock()
	if efc.bucketCleanupPolicies == nil {
		efc.bucketCleanupPolicies = make(map[string]*bucketCleanupPolicyState)
	}
	efc.bucketCleanupPolicies[bucketPath] = &bucketCleanupPolicyState{
		autoRemove: autoRemove,
		attrValue:  attrValue,
		lastCheck:  now,
	}
	efc.mu.Unlock()

	return bucketPath, autoRemove, "filer", attrValue, nil
}

// autoRemoveEmptyFoldersEnabled reports whether empty folders should be auto-removed
// for a bucket, based on its "allow empty folders" extended attribute. It also returns
// the raw attribute value (or a placeholder) for logging. Missing or empty attributes
// default to auto-remove enabled.
func autoRemoveEmptyFoldersEnabled(attrs map[string][]byte) (bool, string) {
	if attrs == nil {
		return true, "<no_attrs>"
	}

	value, found := attrs[s3_constants.ExtAllowEmptyFolders]
	if !found {
		return true, "<missing>"
	}

	text := strings.TrimSpace(string(value))
	if text == "" {
		return true, "<empty>"
	}

	return !strings.EqualFold(text, "true"), text
}
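
// Illustrative examples (not part of the original source): the attribute stores
// "allow empty folders", so a value of "true" disables auto-removal and any other
// non-empty value enables it (comparison is case-insensitive):
//
//	autoRemoveEmptyFoldersEnabled(nil)                                                               // (true, "<no_attrs>")
//	autoRemoveEmptyFoldersEnabled(map[string][]byte{s3_constants.ExtAllowEmptyFolders: []byte("true")})  // (false, "true")
//	autoRemoveEmptyFoldersEnabled(map[string][]byte{s3_constants.ExtAllowEmptyFolders: []byte("FALSE")}) // (true, "FALSE")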

// isUnderPath checks if child is under parent path
func isUnderPath(child, parent string) bool {
	if parent == "" || parent == "/" {
		return true
	}
	// Ensure parent ends without a slash for proper prefix matching
	if len(parent) > 0 && parent[len(parent)-1] == '/' {
		parent = parent[:len(parent)-1]
	}
	// Child must start with parent and then have a '/' or be exactly parent
	if len(child) < len(parent) {
		return false
	}
	if child[:len(parent)] != parent {
		return false
	}
	if len(child) == len(parent) {
		return true
	}
	return child[len(parent)] == '/'
}

// isUnderBucketPath checks if directory is inside a bucket (under /buckets/<bucket>/...).
// This ensures we only clean up folders inside buckets, not the buckets themselves.
func isUnderBucketPath(directory, bucketPath string) bool {
	if bucketPath == "" {
		return true
	}
	// Ensure bucketPath ends without a slash
	if len(bucketPath) > 0 && bucketPath[len(bucketPath)-1] == '/' {
		bucketPath = bucketPath[:len(bucketPath)-1]
	}
	// Directory must be under bucketPath
	if !isUnderPath(directory, bucketPath) {
		return false
	}
	// Directory must be at least /buckets/<bucket>/<something>,
	// i.e., depth must be at least bucketPath depth + 2.
	// For /buckets (depth 1), we need at least /buckets/mybucket/folder (depth 3).
	bucketPathDepth := strings.Count(bucketPath, "/")
	directoryDepth := strings.Count(directory, "/")
	return directoryDepth >= bucketPathDepth+2
}
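
// Illustrative examples (not part of the original source), assuming the bucket
// root is "/buckets":
//
//	isUnderBucketPath("/buckets/mybucket", "/buckets")        // false: the bucket itself is never cleaned
//	isUnderBucketPath("/buckets/mybucket/folder", "/buckets") // true: a folder inside a bucket
//	isUnderBucketPath("/other/path", "/buckets")              // false: outside the bucket root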

// cacheEvictionLoop periodically removes stale entries from folderCounts
func (efc *EmptyFolderCleaner) cacheEvictionLoop() {
	ticker := time.NewTicker(efc.cacheExpiry)
	defer ticker.Stop()

	for {
		select {
		case <-efc.stopCh:
			return
		case <-ticker.C:
			efc.evictStaleCacheEntries()
		}
	}
}

// evictStaleCacheEntries removes cache entries that haven't seen activity recently
func (efc *EmptyFolderCleaner) evictStaleCacheEntries() {
	efc.mu.Lock()
	defer efc.mu.Unlock()

	now := time.Now()
	expiredCount := 0
	for folder, state := range efc.folderCounts {
		// Skip if folder is in the cleanup queue
		if efc.cleanupQueue.Contains(folder) {
			continue
		}

		// Find the most recent activity time for this folder
		lastActivity := state.lastCheck
		if state.lastAddTime.After(lastActivity) {
			lastActivity = state.lastAddTime
		}
		if state.lastDelTime.After(lastActivity) {
			lastActivity = state.lastDelTime
		}

		// Evict if no activity within the cache expiry period
		if now.Sub(lastActivity) > efc.cacheExpiry {
			delete(efc.folderCounts, folder)
			expiredCount++
		}
	}

	for bucketPath, state := range efc.bucketCleanupPolicies {
		if now.Sub(state.lastCheck) > efc.cacheExpiry {
			delete(efc.bucketCleanupPolicies, bucketPath)
		}
	}

	if expiredCount > 0 {
		glog.V(3).Infof("EmptyFolderCleaner: evicted %d stale cache entries", expiredCount)
	}
}

// Stop stops the cleaner and cancels all pending tasks
func (efc *EmptyFolderCleaner) Stop() {
	close(efc.stopCh)

	efc.mu.Lock()
	defer efc.mu.Unlock()

	efc.enabled = false
	efc.cleanupQueue.Clear()
	efc.folderCounts = make(map[string]*folderState) // Clear cache on stop
	efc.bucketCleanupPolicies = make(map[string]*bucketCleanupPolicyState)
}

// GetPendingCleanupCount returns the number of pending cleanup tasks (for testing)
func (efc *EmptyFolderCleaner) GetPendingCleanupCount() int {
	return efc.cleanupQueue.Len()
}

// GetCachedFolderCount returns the cached count for a folder (for testing)
func (efc *EmptyFolderCleaner) GetCachedFolderCount(folder string) (int, bool) {
	efc.mu.RLock()
	defer efc.mu.RUnlock()
	if state, exists := efc.folderCounts[folder]; exists {
		return state.roughCount, true
	}
	return 0, false
}