2. 理解批量处理
当应用与数据库建立连接后,我们可以一次性执行多条SQL语句,而不是逐条发送。这种方式能显著降低通信开销。Spring JDBC API正是实现这一目标的有效工具。
2.1. 数据库支持
虽然JDBC API提供了批量功能,但并非所有JDBC驱动都真正实现了这些API。Spring提供了JdbcUtils.supportsBatchUpdates()
工具方法来检查连接是否支持批量操作。不过使用JdbcTemplate
时,Spring通常已自动处理兼容性检查,不支持时会降级为常规操作。
2.2. 影响性能的关键因素
插入大量数据时需重点关注:
- 数据库连接数量
- 目标表结构
- 执行逻辑任务所需的数据库请求数
解决连接问题的首选方案是连接池,通过复用连接避免重复创建开销。目标表结构同样关键:索引列越多性能越差,因为每次插入后数据库都需要更新索引。批量处理可大幅减少插入时的网络往返次数,但需注意不同数据库的批量操作效率差异(如MySQL需额外配置才能获得明显提升)。
3. Spring JDBC批量插入实战
以Postgres 14为数据库,首先添加依赖:
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<scope>runtime</scope>
</dependency>
再添加Spring JDBC抽象层:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
我们将对比两种插入方式:常规单条插入和批量插入。先创建测试表:
CREATE TABLE product (
id SERIAL PRIMARY KEY,
title VARCHAR(40),
created_ts timestamp without time zone,
price numeric
);
对应实体类:
public class Product {
private long id;
private String title;
private LocalDateTime createdTs;
private BigDecimal price;
// 标准getter/setter
}
3.1. 数据源配置
在application.properties
中配置:
spring.datasource.url=jdbc:postgresql://localhost:5432/sample-baeldung-db
spring.datasource.username=postgres
spring.datasource.password=root
spring.datasource.driver-class-name=org.postgresql.Driver
3.2. 常规插入实现
创建仓库接口:
public interface ProductRepository {
void saveAll(List<Product> products);
}
常规插入实现(单条循环插入):
@Repository
public class SimpleProductRepository implements ProductRepository {
private JdbcTemplate jdbcTemplate;
public SimpleProductRepository(JdbcTemplate jdbcTemplate) {
this.jdbcTemplate = jdbcTemplate;
}
@Override
@Transactional
public void saveAll(List<Product> products) {
for (Product product : products) {
jdbcTemplate.update("INSERT INTO PRODUCT (TITLE, CREATED_TS, PRICE) " +
"VALUES (?, ?, ?)",
product.getTitle(),
Timestamp.valueOf(product.getCreatedTs()),
product.getPrice());
}
}
}
创建服务类生成测试数据:
public class ProductService {
private ProductRepository productRepository;
private Random random;
private Clock clock;
// 构造函数注入依赖
private List<Product> generate(int count) {
final String[] titles = { "car", "plane", "house", "yacht" };
final BigDecimal[] prices = {
new BigDecimal("12483.12"),
new BigDecimal("8539.99"),
new BigDecimal("88894"),
new BigDecimal("458694")
};
final List<Product> products = new ArrayList<>(count);
for (int i = 0; i < count; i++) {
Product product = new Product();
product.setCreatedTs(LocalDateTime.now(clock));
product.setPrice(prices[random.nextInt(4)]);
product.setTitle(titles[random.nextInt(4)]);
products.add(product);
}
return products;
}
@Transactional
public long createProducts(int count) {
List<Product> products = generate(count);
long startTime = clock.millis();
productRepository.saveAll(products);
return clock.millis() - startTime;
}
}
配置Spring Bean:
@Configuration
public class AppConfig {
@Bean
public ProductService simpleProductService(SimpleProductRepository simpleProductRepository) {
return new ProductService(simpleProductRepository, new Random(), Clock.systemUTC());
}
}
3.3. 批量插入实现
创建批量插入仓库:
@Repository
public class BatchProductRepository implements ProductRepository {
private JdbcTemplate jdbcTemplate;
public BatchProductRepository(JdbcTemplate jdbcTemplate) {
this.jdbcTemplate = jdbcTemplate;
}
@Override
@Transactional
public void saveAll(List<Product> products) {
jdbcTemplate.batchUpdate("INSERT INTO PRODUCT (TITLE, CREATED_TS, PRICE) " +
"VALUES (?, ?, ?)",
products,
100,
(PreparedStatement ps, Product product) -> {
ps.setString(1, product.getTitle());
ps.setTimestamp(2, Timestamp.valueOf(product.getCreatedTs()));
ps.setBigDecimal(3, product.getPrice());
});
}
}
关键点: 批量大小设为100,意味着每100条记录合并一次网络请求。推荐批量大小50-100,但需根据数据库配置调整(如MySQL的max_allowed_packet
限制)。
添加批量服务Bean:
@Bean
public ProductService batchProductService(BatchProductRepository batchProductRepository) {
return new ProductService(batchProductRepository, new Random(), Clock.systemUTC());
}
4. 性能对比
通过Spring Boot命令行应用测试:
@SpringBootApplication
public class SpringJdbcBatchPerformanceApplication implements CommandLineRunner {
@Autowired
@Qualifier("batchProductService")
private ProductService batchProductService;
@Autowired
@Qualifier("simpleProductService")
private ProductService simpleProductService;
public static void main(String[] args) {
SpringApplication.run(SpringJdbcBatchPerformanceApplication.class, args);
}
@Override
public void run(String... args) throws Exception {
int[] recordCounts = {1, 10, 100, 1000, 10_000, 100_000, 1000_000};
for (int recordCount : recordCounts) {
long regularElapsedTime = simpleProductService.createProducts(recordCount);
long batchElapsedTime = batchProductService.createProducts(recordCount);
System.out.println(String.join("", Collections.nCopies(50, "-")));
System.out.format("%-20s%-5s%-10s%-5s%8sms\n", "常规插入", "|", recordCount, "|", regularElapsedTime);
System.out.format("%-20s%-5s%-10s%-5s%8sms\n", "批量插入", "|", recordCount, "|", batchElapsedTime);
System.out.printf("性能提升: %d %s\n", calculateGainInPercent(regularElapsedTime, batchElapsedTime), "%");
}
}
int calculateGainInPercent(long before, long after) {
return (int) Math.floor(100D * (before - after) / before);
}
}
基础测试结果:
--------------------------------------------------
常规插入 | 1 | 14ms
批量插入 | 1 | 8ms
性能提升: 42 %
--------------------------------------------------
常规插入 | 10 | 4ms
批量插入 | 10 | 1ms
性能提升: 75 %
--------------------------------------------------
常规插入 | 100 | 29ms
批量插入 | 100 | 6ms
性能提升: 79 %
--------------------------------------------------
常规插入 | 1000 | 175ms
批量插入 | 1000 | 24ms
性能提升: 86 %
--------------------------------------------------
常规插入 | 10000 | 861ms
批量插入 | 10000 | 128ms
性能提升: 85 %
--------------------------------------------------
常规插入 | 100000 | 5098ms
批量插入 | 100000 | 1126ms
性能提升: 77 %
--------------------------------------------------
常规插入 | 1000000 | 47738ms
批量插入 | 1000000 | 13066ms
性能提升: 72 %
--------------------------------------------------
进阶优化: 启用Postgres的多值插入特性,在application.properties
添加:
spring.datasource.hikari.data-source-properties.reWriteBatchedInserts=true
优化后测试结果:
--------------------------------------------------
常规插入 | 1 | 15ms
批量插入 | 1 | 10ms
性能提升: 33 %
--------------------------------------------------
常规插入 | 10 | 3ms
批量插入 | 10 | 2ms
性能提升: 33 %
--------------------------------------------------
常规插入 | 100 | 42ms
批量插入 | 100 | 10ms
性能提升: 76 %
--------------------------------------------------
常规插入 | 1000 | 141ms
批量插入 | 1000 | 19ms
性能提升: 86 %
--------------------------------------------------
常规插入 | 10000 | 827ms
批量插入 | 10000 | 104ms
性能提升: 87 %
--------------------------------------------------
常规插入 | 100000 | 5093ms
批量插入 | 100000 | 981ms
性能提升: 80 %
--------------------------------------------------
常规插入 | 1000000 | 50482ms
批量插入 | 1000000 | 9821ms
性能提升: 80 %
--------------------------------------------------
关键发现: 多值插入在数据量较大时(>1000条)能进一步提升性能,但需注意不同数据库的配置差异。
5. 结论
本文通过实战对比验证了Spring JDBC批量插入的性能优势:
- 批量操作相比单条插入普遍有70%-90%的性能提升
- 批量大小建议设为50-100,需结合数据库配置调整
- Postgres等多值插入特性可进一步优化性能(大数量集时提升10%左右)
生产环境建议:
- 优先检查JDBC驱动的批量支持程度
- 根据数据集大小选择合适的批量策略
- 针对特定数据库启用优化特性(如Postgres的
reWriteBatchedInserts
)
完整示例代码见GitHub仓库