22 Commits

Author SHA1 Message Date
Chris Lu
0cd9f34177 mount: improve EnsureVisited performance with dedup, parallelism, and batching (#7697)
* mount: add singleflight to deduplicate concurrent EnsureVisited calls

When multiple goroutines access the same uncached directory simultaneously,
they would all make redundant network requests to the filer. This change
uses singleflight.Group to ensure only one goroutine fetches the directory
entries while others wait for the result.

This fixes a race condition where concurrent lookups or readdir operations
on the same uncached directory would:
1. Make duplicate network requests to the filer
2. Insert duplicate entries into LevelDB cache
3. Waste CPU and network bandwidth
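
The deduplication described above can be sketched as follows. This is a minimal illustration, not the code from this commit: it uses a small hand-rolled `dedup` type built on the standard library as a stand-in for `singleflight.Group`, and the names `dedup`, `waiting`, and `countFetches` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call is one in-flight fetch that later callers can wait on.
type call struct {
	done    chan struct{}
	err     error
	waiters int
}

// dedup is a hand-rolled, stdlib-only stand-in for singleflight.Group.
type dedup struct {
	mu       sync.Mutex
	inflight map[string]*call
}

// Do runs fn once per key at a time; concurrent callers share the result.
func (d *dedup) Do(key string, fn func() error) error {
	d.mu.Lock()
	if d.inflight == nil {
		d.inflight = make(map[string]*call)
	}
	if c, ok := d.inflight[key]; ok {
		c.waiters++
		d.mu.Unlock()
		<-c.done // wait for the leader instead of fetching again
		return c.err
	}
	c := &call{done: make(chan struct{})}
	d.inflight[key] = c
	d.mu.Unlock()

	c.err = fn()

	d.mu.Lock()
	delete(d.inflight, key)
	d.mu.Unlock()
	close(c.done)
	return c.err
}

// waiting reports how many callers are blocked on the in-flight call for key.
func (d *dedup) waiting(key string) int {
	d.mu.Lock()
	defer d.mu.Unlock()
	if c, ok := d.inflight[key]; ok {
		return c.waiters
	}
	return 0
}

// countFetches lets n (n >= 1) goroutines visit the same directory
// and returns how many fetches actually ran.
func countFetches(n int) int {
	var d dedup
	var fetches int32
	started := make(chan struct{})
	release := make(chan struct{})
	fetch := func() error {
		atomic.AddInt32(&fetches, 1)
		close(started)
		<-release // simulate a slow filer round trip
		return nil
	}
	var wg sync.WaitGroup
	visit := func() {
		defer wg.Done()
		_ = d.Do("/a/b", fetch)
	}
	wg.Add(n)
	go visit() // the leader performs the fetch
	<-started
	for i := 1; i < n; i++ {
		go visit() // followers join the in-flight call
	}
	for d.waiting("/a/b") < n-1 {
		time.Sleep(time.Millisecond)
	}
	close(release)
	wg.Wait()
	return int(atomic.LoadInt32(&fetches))
}

func main() {
	fmt.Println(countFetches(8)) // prints 1
}
```

Eight concurrent visits to the same directory result in a single fetch; the other seven callers wait for the leader's result.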

* mount: fetch parent directories in parallel during EnsureVisited

Previously, when accessing a deep path like /a/b/c/d, the parent directories
were fetched serially from target to root. This change:

1. Collects all uncached directories from target to root first
2. Fetches them all in parallel using errgroup
3. Relies on singleflight (from previous commit) for deduplication

This reduces latency when accessing deep uncached paths, especially in
high-latency network environments where parallel requests can significantly
improve performance.
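
A rough sketch of the collect-then-fetch-in-parallel flow, under stated assumptions: it uses `sync.WaitGroup` and `sync.Once` from the standard library in place of errgroup, it treats "cached" as a simple map lookup, and the helpers `uncachedDirs` and `fetchAll` are hypothetical names rather than functions from this change.

```go
package main

import (
	"fmt"
	"path"
	"sync"
)

// uncachedDirs walks from target up to "/" and returns every
// directory that is not marked as cached.
func uncachedDirs(target string, cached map[string]bool) []string {
	var dirs []string
	p := path.Clean("/" + target)
	for {
		if !cached[p] {
			dirs = append(dirs, p)
		}
		if p == "/" {
			break
		}
		p = path.Dir(p)
	}
	return dirs
}

// fetchAll runs fetch for every directory concurrently and
// returns the first error that was recorded, if any.
func fetchAll(dirs []string, fetch func(dir string) error) error {
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, dir := range dirs {
		wg.Add(1)
		go func(dir string) {
			defer wg.Done()
			if err := fetch(dir); err != nil {
				once.Do(func() { firstErr = err })
			}
		}(dir)
	}
	wg.Wait()
	return firstErr
}

func main() {
	dirs := uncachedDirs("/a/b/c/d", map[string]bool{"/": true})
	fmt.Println(dirs) // prints [/a/b/c/d /a/b/c /a/b /a]
	err := fetchAll(dirs, func(dir string) error { return nil })
	fmt.Println(err) // prints <nil>
}
```

With serial fetching the total latency is roughly the sum of the round trips; with the parallel form it is roughly the slowest single round trip.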

* mount: add batch inserts for LevelDB meta cache

When populating the meta cache from filer, entries were inserted one-by-one
into LevelDB. This change:

1. Adds BatchInsertEntries method to LevelDBStore that uses LevelDB's
   native batch write API
2. Updates MetaCache to keep a direct reference to the LevelDB store
   for batch operations
3. Modifies doEnsureVisited to collect entries and insert them in
   batches of 100 entries

Batch writes are more efficient because they:
- Reduce the number of individual write operations
- Reduce disk syncs
- Improve throughput for large directories
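
To illustrate the effect on write counts, here is a small sketch that uses an in-memory store as a stand-in for LevelDB. The `memStore` type, its `BatchInsert` signature, and the `insertInBatches` helper are assumptions made for illustration; they are not the `BatchInsertEntries` method added here.

```go
package main

import "fmt"

type entry struct{ key, value string }

// memStore is an in-memory stand-in for a key-value store.
// writes counts how many write operations were issued.
type memStore struct {
	data   map[string]string
	writes int
}

// BatchInsert applies all entries in a single write operation
// (hypothetical signature, for illustration only).
func (s *memStore) BatchInsert(entries []entry) error {
	for _, e := range entries {
		s.data[e.key] = e.value
	}
	s.writes++
	return nil
}

// insertInBatches writes entries in chunks of at most size (size > 0).
func insertInBatches(s *memStore, entries []entry, size int) error {
	for start := 0; start < len(entries); start += size {
		end := start + size
		if end > len(entries) {
			end = len(entries)
		}
		if err := s.BatchInsert(entries[start:end]); err != nil {
			return fmt.Errorf("batch insert entries %d-%d: %w", start, end, err)
		}
	}
	return nil
}

// makeEntries builds n placeholder entries.
func makeEntries(n int) []entry {
	entries := make([]entry, 0, n)
	for i := 0; i < n; i++ {
		entries = append(entries, entry{key: fmt.Sprintf("/dir/file-%d", i), value: "meta"})
	}
	return entries
}

func main() {
	s := &memStore{data: make(map[string]string)}
	if err := insertInBatches(s, makeEntries(250), 100); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(s.data), s.writes) // prints 250 3
}
```

In this sketch 250 entries cost 3 write operations instead of 250.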

* mount: fix potential nil dereference in MarkChildrenCached

Add missing check for inode existence in inode2path map before accessing
the InodeEntry. This prevents a potential nil pointer dereference if the
inode exists in path2inode but not in inode2path (which could happen due
to race conditions or bugs).

This follows the same pattern used in IsChildrenCached which properly
checks for existence before accessing the entry.
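
The guard can be sketched like this. The struct layout and method below are simplified assumptions modeled on the names in this message (path2inode, inode2path, InodeEntry), not the actual types in the mount package.

```go
package main

import "fmt"

// inodeEntry is a simplified, hypothetical stand-in for InodeEntry.
type inodeEntry struct {
	isChildrenCached bool
}

// inodeToPath keeps two maps that are expected to stay in sync.
type inodeToPath struct {
	path2inode map[string]uint64
	inode2path map[uint64]*inodeEntry
}

// markChildrenCached reports whether the flag could be set.
// It checks both maps before touching the entry, so an inode that is
// present in path2inode but missing from inode2path does not panic.
func (m *inodeToPath) markChildrenCached(p string) bool {
	inode, found := m.path2inode[p]
	if !found {
		return false
	}
	entry, found := m.inode2path[inode]
	if !found || entry == nil {
		return false
	}
	entry.isChildrenCached = true
	return true
}

func main() {
	m := &inodeToPath{
		path2inode: map[string]uint64{"/a": 1, "/b": 2},
		inode2path: map[uint64]*inodeEntry{1: {}}, // inode 2 is missing
	}
	fmt.Println(m.markChildrenCached("/a")) // prints true
	fmt.Println(m.markChildrenCached("/b")) // prints false
}
```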

* mount: fix batch flush when last entry is hidden

The previous batch insert implementation relied on the isLast flag to flush
remaining entries. However, if the last entry is a hidden system entry
(like 'topics' or 'etc' in root), the callback returns early and the
remaining entries in the batch are never flushed.

Fix by:
1. Only flush when batch reaches threshold inside the callback
2. Flush any remaining entries after ReadDirAllEntries completes
3. Use error wrapping instead of logging+returning to avoid duplicate logs
4. Create new slice after flush to allow GC of flushed entries
5. Add documentation for batchInsertSize constant

This ensures all entries are properly inserted regardless of whether
the last entry is hidden, and prevents memory retention issues.
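
A minimal sketch of the corrected flush logic, assuming a plain slice of names in place of ReadDirAllEntries and a caller-supplied `flush` function in place of the batch insert. The `cacheEntries` and `isHidden` helpers are hypothetical; only the batch size of 100 and the hidden names 'topics' and 'etc' come from this message.

```go
package main

import "fmt"

// batchInsertSize is the number of entries collected before a flush.
const batchInsertSize = 100

// isHidden marks system entries that are skipped (assumed rule).
func isHidden(name string) bool {
	return name == "topics" || name == "etc"
}

// cacheEntries feeds names through a per-entry callback and flushes
// in batches. The final flush happens after the listing completes,
// so it does not depend on what the last entry is.
func cacheEntries(names []string, flush func(batch []string) error) error {
	batch := make([]string, 0, batchInsertSize)
	each := func(name string) error {
		if isHidden(name) {
			return nil // early return: nothing is flushed here
		}
		batch = append(batch, name)
		if len(batch) < batchInsertSize {
			return nil
		}
		if err := flush(batch); err != nil {
			return fmt.Errorf("batch insert: %w", err)
		}
		// New slice so the flushed entries can be garbage collected.
		batch = make([]string, 0, batchInsertSize)
		return nil
	}
	for _, name := range names {
		if err := each(name); err != nil {
			return err
		}
	}
	if len(batch) > 0 {
		if err := flush(batch); err != nil {
			return fmt.Errorf("batch insert remaining entries: %w", err)
		}
	}
	return nil
}

func main() {
	total := 0
	err := cacheEntries([]string{"a", "b", "etc"}, func(b []string) error {
		total += len(b)
		return nil
	})
	fmt.Println(total, err) // prints 2 <nil>
}
```

Here the last entry ('etc') is hidden, yet both visible entries are still flushed.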

* mount: add context support for cancellation in EnsureVisited

Thread context.Context through the batch insert call chain to enable
proper cancellation and timeout support:

1. Use errgroup.WithContext() so if one fetch fails, others are cancelled
2. Add context parameter to BatchInsertEntries for consistency with InsertEntry
3. Pass context to ReadDirAllEntries for cancellation during network calls
4. Check context cancellation before starting work in doEnsureVisited
5. Use %w for error wrapping to preserve error types for inspection

This prevents unnecessary work when one directory fetch fails and makes
the batch operations consistent with the existing context-aware APIs.
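
The cancellation check and %w wrapping can be sketched as below, using only the standard library. The `ensureVisited` signature and the `cancelledBeforeStart` helper are assumptions for illustration and do not reproduce doEnsureVisited or the errgroup wiring.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ensureVisited checks for cancellation before doing any work and
// wraps errors with %w so callers can still inspect the cause.
func ensureVisited(ctx context.Context, dir string,
	fetch func(ctx context.Context, dir string) error) error {
	if err := ctx.Err(); err != nil {
		return fmt.Errorf("ensure visited %s: %w", dir, err)
	}
	if err := fetch(ctx, dir); err != nil {
		return fmt.Errorf("fetch %s: %w", dir, err)
	}
	return nil
}

// cancelledBeforeStart calls ensureVisited with an already cancelled
// context. It reports whether the fetch ran and whether the returned
// error still matches context.Canceled after wrapping.
func cancelledBeforeStart() (fetched bool, canceled bool) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	err := ensureVisited(ctx, "/a/b", func(context.Context, string) error {
		fetched = true
		return nil
	})
	return fetched, errors.Is(err, context.Canceled)
}

func main() {
	fetched, canceled := cancelledBeforeStart()
	fmt.Println(fetched, canceled) // prints false true
}
```

Because the error is wrapped with %w, `errors.Is` still recognizes `context.Canceled` through the added message.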
2025-12-09 23:44:15 -08:00
tam-i13
b669607fcd Add error list each entry func (#7485)
* added error return in type ListEachEntryFunc

* return error if errClose

* fix fmt.Errorf

* fix return errClose

* use %w fmt.Errorf

* added entry in error message

* add callbackErr in ListDirectoryEntries

* fix error

* add log

* clear err when the scanner stops on io.EOF, so returning err doesn’t surface EOF as a failure.

* more info in error

* add ctx to logs, error handling

* fix return eachEntryFunc

* fix

* fix log

* fix return

* fix foundationdb tests

* fix eachEntryFunc

* fix return resEachEntryFuncErr

* Update weed/filer/filer.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update weed/filer/elastic/v7/elastic_store.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update weed/filer/hbase/hbase_store.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update weed/filer/foundationdb/foundationdb_store.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update weed/filer/ydb/ydb_store.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix

* add scanErr

---------

Co-authored-by: Roman Tamarov <r.tamarov@kryptonite.ru>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-25 19:35:19 -08:00
Aleksey Kosov
4511c2cc1f Changes logging function (#6919)
* updated logging methods for stores

* updated logging methods for stores

* updated logging methods for filer

* updated logging methods for uploader and http_util

* updated logging methods for weed server

---------

Co-authored-by: akosov <a.kosov@kryptonite.ru>
2025-06-24 08:44:06 -07:00
chrislu
69fcdd0840 adjust logging 2024-09-10 10:28:49 -07:00
chrislu
70a4c98b00 refactor filer_pb.Entry and filer.Entry to use GetChunks()
for later locking on reading chunks
2022-11-15 06:33:36 -08:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
Konstantin Lebedev
21033ff4c3 refactor use const CountEntryChunksForGzip 2022-05-01 22:28:55 +05:00
chrislu
10ecf80ca1 add a debug capability to list all metadata keys 2022-01-11 23:25:04 -08:00
joshuafc
5654d0d60d leveldb: use the default value for CompactionTableSizeMultiplier. #2325
Improves leveldb key lookup performance in large directories (millions of files) that use UUIDs as filenames.
2021-09-09 10:42:34 +08:00
zhoub
6a7ed1bd0e add bloom filter to leveldb_store to improve fuse performance. 2021-09-07 21:09:10 +08:00
Chris Lu
182288f860 filer: fix mysql, postgres batch delete error 2021-07-22 08:23:20 -07:00
Konstantin Lebedev
6aa1a56ec8 avoid crashes on Galera Cluster
https://github.com/chrislusf/seaweedfs/issues/2125
2021-06-15 18:12:39 +05:00
Chris Lu
2d4c2db81d filer: leveldb, rocksdb auto create store directory
fix https://github.com/chrislusf/seaweedfs/issues/1901
2021-03-14 13:20:14 -07:00
Chris Lu
4be51c0701 filer: leveldb and hbase may miss files when listing large directories with more than 1024 entries
fix https://github.com/chrislusf/seaweedfs/issues/1768
2021-01-31 20:11:44 -08:00
Chris Lu
a4063a5437 add stream list directory entries 2021-01-15 23:56:24 -08:00
Chris Lu
f002e668de change limit to int64 in case of overflow 2021-01-14 23:10:37 -08:00
Chris Lu
b5ceffe188 implement leveldb changes 2021-01-14 22:33:05 -08:00
Chris Lu
361043e6c1 filer store: leveldb2 fix nil entry error if not found 2021-01-12 02:28:57 -08:00
Chris Lu
2c3c2c27d7 separate prefix from namePattern
fix https://github.com/chrislusf/seaweedfs/issues/1722
2021-01-01 20:23:23 -08:00
Chris Lu
eab53ea80d filer leveldb store: a bit more efficient directory listing with prefix 2020-11-22 21:10:41 -08:00
Chris Lu
b8f32bcab9 filer: compress stored metadata 2020-09-03 11:00:20 -07:00
Chris Lu
eb7929a971 rename filer2 to filer 2020-09-01 00:21:19 -07:00