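1. Overview

This article shows how to handle multipart uploads to Amazon S3 with the AWS SDK for Java.

Put simply, a multipart upload splits a large file into smaller parts and uploads each part separately. Once every part has been uploaded and the upload is completed, S3 assembles the parts into a single object.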

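Multipart uploads offer several advantages:

  • Higher throughput: multiple parts can be uploaded in parallel
  • Simpler error recovery: only the failed parts need to be uploaded again
  • Pause and resume: parts can be uploaded at any time, so the whole process can be paused and continued later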

⚠️ Note: Amazon S3 requires every part except the last one to be at least 5 MB.
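
The size constraint above determines how a file can be divided. As a rough illustration, the following sketch computes part boundaries for a given file size. PartPlanner and its members are hypothetical names and are not part of the AWS SDK; the 10,000-part cap is the S3 limit on parts per upload, which this article does not otherwise cover.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an SDK class) that plans the byte ranges of a multipart upload. */
final class PartPlanner {
    static final long MIN_PART_SIZE = 5L * 1024 * 1024; // minimum for every part except the last
    static final int MAX_PARTS = 10_000;                // S3 limit on parts per upload

    /** One part: its 1-based number, its offset in the file, and its length in bytes. */
    static final class PartRange {
        final int partNumber;
        final long offset;
        final long length;

        PartRange(int partNumber, long offset, long length) {
            this.partNumber = partNumber;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Smallest part size of at least 5 MB that keeps the upload within MAX_PARTS. */
    static long choosePartSize(long fileSize) {
        long needed = (fileSize + MAX_PARTS - 1) / MAX_PARTS; // ceiling division
        return Math.max(MIN_PART_SIZE, needed);
    }

    /** Splits [0, fileSize) into consecutive ranges of partSize bytes; the last may be shorter. */
    static List<PartRange> plan(long fileSize, long partSize) {
        if (fileSize <= 0) {
            throw new IllegalArgumentException("fileSize must be positive");
        }
        if (partSize < MIN_PART_SIZE) {
            throw new IllegalArgumentException("partSize is below the 5 MB minimum");
        }
        long count = (fileSize + partSize - 1) / partSize;
        if (count > MAX_PARTS) {
            throw new IllegalArgumentException("too many parts: " + count);
        }
        List<PartRange> parts = new ArrayList<>();
        int partNumber = 1;
        for (long offset = 0; offset < fileSize; offset += partSize) {
            long length = Math.min(partSize, fileSize - offset);
            parts.add(new PartRange(partNumber++, offset, length));
        }
        return parts;
    }
}
```

For a 12 MB file and 5 MB parts, plan() yields three parts of 5 MB, 5 MB, and 2 MB. Only the last part falls below the minimum, which S3 permits.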

2. Maven Dependency

First, add the AWS SDK dependency:

<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <version>2.24.9</version>
</dependency>

The latest version can be found on Maven Central.

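3. Performing a Multipart Upload

3.1 Creating the Amazon S3 Client

First, create a client for accessing S3. The dependency above is version 2 of the SDK, so the client is built with S3Client.builder():

S3Client s3 = S3Client.builder()
    .credentialsProvider(DefaultCredentialsProvider.create())
    .region(Region.US_EAST_1)
    .build();

This uses the default credentials provider chain to obtain AWS credentials; the official documentation explains how the chain works. Region.US_EAST_1 is only an example here. Replace it with the region of your bucket (for instance, Region.US_WEST_2).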

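3.2 Initiating the Upload

Create a CreateMultipartUploadRequest that specifies the bucket and key, then call s3.createMultipartUpload() to initiate the upload. The response contains the uploadId that identifies this upload in all later calls:

// Initiate the multipart upload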
CreateMultipartUploadRequest createRequest = CreateMultipartUploadRequest.builder()
    .bucket(existingBucketName)
    .key(keyName)
    .build();

CreateMultipartUploadResponse createResponse = s3.createMultipartUpload(createRequest);

String uploadId = createResponse.uploadId();

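3.3 Preparing and Uploading Each Part

A 5 MB ByteBuffer holds the content of each part. For every part, build an UploadPartRequest with the bucket, key, uploadId, part number, and content length, and call uploadPart() to upload it. The ETag from each UploadPartResponse is then added to the list of completed parts:

// Prepare the parts to upload
List<CompletedPart> completedParts = new ArrayList<>();
int partNumber = 1;
ByteBuffer buffer = ByteBuffer.allocate(5 * 1024 * 1024); // part size (5 MB in this example)

// Read the file and upload each part
try (RandomAccessFile file = new RandomAccessFile(filePath, "r")) {
    long fileSize = file.length();
    long position = 0;

    while (position < fileSize) {
        // A single read() may fill only part of the buffer, so keep reading
        // until the buffer is full or the end of the file is reached.
        // Otherwise a part other than the last could end up smaller than 5 MB.
        while (buffer.hasRemaining() && file.getChannel().read(buffer) != -1) {
            // keep filling the buffer
        }

        buffer.flip();
        int partSize = buffer.remaining();

        UploadPartRequest uploadPartRequest = UploadPartRequest.builder()
            .bucket(existingBucketName)
            .key(keyName)
            .uploadId(uploadId)
            .partNumber(partNumber)
            .contentLength((long) partSize)
            .build();

        UploadPartResponse response = s3.uploadPart(uploadPartRequest, RequestBody.fromByteBuffer(buffer));

        // Record the part number and ETag; both are needed to complete the upload
        completedParts.add(CompletedPart.builder()
            .partNumber(partNumber)
            .eTag(response.eTag())
            .build());

        buffer.clear();
        position += partSize;
        partNumber++;
    }
} catch (IOException e) {
    // Fail fast: completing the upload with missing parts would leave an incomplete object
    throw new UncheckedIOException(e);
}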
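
The loop above uploads parts one at a time. To illustrate the parallelism and per-part error recovery mentioned in the overview, here is a minimal sketch that uses only the JDK. ParallelPartUploader and the PartUploader interface are hypothetical names standing in for the real upload call; in this article's code, an implementation of upload() would read the bytes for the given part into its own buffer, call s3.uploadPart(), and return response.eTag().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: uploads parts in parallel and retries only the parts that fail. */
final class ParallelPartUploader {

    /** Stand-in for the real upload call (not an SDK type); returns the ETag of the part. */
    interface PartUploader {
        String upload(int partNumber) throws Exception;
    }

    /** Returns the ETags in part order: index 0 holds the ETag of part 1, and so on. */
    static List<String> uploadAll(int partCount, int threads, int maxAttempts, PartUploader uploader) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 1; i <= partCount; i++) {
                final int partNumber = i;
                Callable<String> task = () -> uploadWithRetry(partNumber, maxAttempts, uploader);
                futures.add(pool.submit(task));
            }
            List<String> eTags = new ArrayList<>();
            for (Future<String> future : futures) {
                eTags.add(future.get()); // waits for the part; fails if it used up its attempts
            }
            return eTags;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while uploading parts", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a part failed after " + maxAttempts + " attempts", e.getCause());
        } finally {
            pool.shutdownNow(); // stop any remaining work once we are done or have failed
        }
    }

    /** Retries a single part; the other parts are unaffected by its failures. */
    private static String uploadWithRetry(int partNumber, int maxAttempts, PartUploader uploader)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return uploader.upload(partNumber);
            } catch (Exception e) {
                last = e;
            }
        }
        throw last;
    }
}
```

Because the futures are collected in submission order, the resulting ETags line up with part numbers 1..N regardless of which part finishes first, which is the order the completed-parts list needs.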

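3.4 Completing the Multipart Upload

Once all parts have been uploaded, build a CompletedMultipartUpload from the list of completed parts, pass it to a CompleteMultipartUploadRequest, and call s3.completeMultipartUpload() to finish the upload:

// Complete the multipart upload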
CompletedMultipartUpload completedUpload = CompletedMultipartUpload.builder()
    .parts(completedParts)
    .build();

CompleteMultipartUploadRequest completeRequest = CompleteMultipartUploadRequest.builder()
    .bucket(existingBucketName)
    .key(keyName)
    .uploadId(uploadId)
    .multipartUpload(completedUpload)
    .build();

CompleteMultipartUploadResponse completeResponse = s3.completeMultipartUpload(completeRequest);

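3.5 Checking the Result

Finally, generate the URL of the uploaded object. Note that getUrl() builds the URL locally without contacting S3, so it shows where the object should be found but does not by itself confirm that the object exists: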

String objectUrl = s3.utilities().getUrl(GetUrlRequest.builder()
        .bucket(existingBucketName)
        .key(keyName)
        .build())
    .toExternalForm();

System.out.println("Uploaded object URL: " + objectUrl);

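4. Conclusion

This article covered the core steps of an S3 multipart upload with the AWS SDK for Java: creating the client, initiating the upload, uploading each part, and completing the upload. The complete code example is available on GitHub.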


Original title: Multipart Uploads in Amazon S3 with Java