Files
seaweedFS/other/java/hdfs3/README.md
dependabot[bot] c14e513964 chore(deps): bump org.apache.hadoop:hadoop-common from 3.2.4 to 3.4.0 in /other/java/hdfs3 (#7512)
* chore(deps): bump org.apache.hadoop:hadoop-common in /other/java/hdfs3

Bumps org.apache.hadoop:hadoop-common from 3.2.4 to 3.4.0.

---
updated-dependencies:
- dependency-name: org.apache.hadoop:hadoop-common
  dependency-version: 3.4.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* add java client unit tests

* Update dependency-reduced-pom.xml

* add java integration tests

* fix

* fix buffer

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
2025-11-19 21:22:18 -08:00

191 lines
4.1 KiB
Markdown

# SeaweedFS Hadoop3 Client
Hadoop FileSystem implementation for SeaweedFS, compatible with Hadoop 3.x.
## Building
```bash
mvn clean install
```
## Testing
This project includes two types of tests:
### 1. Configuration Tests (No SeaweedFS Required)
These tests verify configuration handling and initialization logic without requiring a running SeaweedFS instance:
```bash
mvn test -Dtest=SeaweedFileSystemConfigTest
```
### 2. Integration Tests (Requires SeaweedFS)
These tests verify actual FileSystem operations against a running SeaweedFS instance.
#### Prerequisites
1. Start SeaweedFS with default ports:
```bash
# Terminal 1: Start master
weed master
# Terminal 2: Start volume server
weed volume -mserver=localhost:9333
# Terminal 3: Start filer
weed filer -master=localhost:9333
```
2. Verify services are running:
- Master: http://localhost:9333
- Filer HTTP: http://localhost:8888
- Filer gRPC: localhost:18888
#### Running Integration Tests
```bash
# Enable integration tests
export SEAWEEDFS_TEST_ENABLED=true
# Run all tests
mvn test
# Run specific test
mvn test -Dtest=SeaweedFileSystemTest
```
### Test Configuration
Integration tests can be configured via environment variables or system properties:
- `SEAWEEDFS_TEST_ENABLED`: Set to `true` to enable integration tests (default: false)
- Tests use these default connection settings:
- Filer Host: localhost
- Filer HTTP Port: 8888
- Filer gRPC Port: 18888
### Running Tests with Custom Configuration
To test against a different SeaweedFS instance, modify the test code or use Hadoop configuration:
```java
conf.set("fs.seaweed.filer.host", "your-host");
conf.setInt("fs.seaweed.filer.port", 8888);
conf.setInt("fs.seaweed.filer.port.grpc", 18888);
```
## Test Coverage
The test suite covers:
- **Configuration & Initialization**
- URI parsing and configuration
- Default values
- Configuration overrides
- Working directory management
- **File Operations**
- Create files
- Read files
- Write files
- Append to files
- Delete files
- **Directory Operations**
- Create directories
- List directory contents
- Delete directories (recursive and non-recursive)
- **Metadata Operations**
- Get file status
- Set permissions
- Set owner/group
- Rename files and directories
## Usage in Hadoop
1. Copy the built JAR to your Hadoop classpath:
```bash
cp target/seaweedfs-hadoop3-client-*.jar $HADOOP_HOME/share/hadoop/common/lib/
```
2. Configure `core-site.xml`:
```xml
<configuration>
<property>
<name>fs.seaweedfs.impl</name>
<value>seaweed.hdfs.SeaweedFileSystem</value>
</property>
<property>
<name>fs.seaweed.filer.host</name>
<value>localhost</value>
</property>
<property>
<name>fs.seaweed.filer.port</name>
<value>8888</value>
</property>
<property>
<name>fs.seaweed.filer.port.grpc</name>
<value>18888</value>
</property>
</configuration>
```
3. Use SeaweedFS with Hadoop commands:
```bash
hadoop fs -ls seaweedfs://localhost:8888/
hadoop fs -mkdir seaweedfs://localhost:8888/test
hadoop fs -put local.txt seaweedfs://localhost:8888/test/
```
## Continuous Integration
For CI environments, tests can be run in two modes:
1. **Configuration Tests Only** (default, no SeaweedFS required):
```bash
mvn test -Dtest=SeaweedFileSystemConfigTest
```
2. **Full Integration Tests** (requires SeaweedFS):
```bash
# Start SeaweedFS in CI environment
# Then run:
export SEAWEEDFS_TEST_ENABLED=true
mvn test
```
## Troubleshooting
### Tests are skipped
If you see "Skipping test - SEAWEEDFS_TEST_ENABLED not set":
```bash
export SEAWEEDFS_TEST_ENABLED=true
```
### Connection refused errors
Ensure SeaweedFS is running and accessible:
```bash
curl http://localhost:8888/
```
### gRPC errors
Verify the gRPC port is accessible:
```bash
# Should show the port is listening
netstat -an | grep 18888
```
## Contributing
When adding new features, please include:
1. Configuration tests (no SeaweedFS required)
2. Integration tests (with SEAWEEDFS_TEST_ENABLED guard)
3. Documentation updates