2. 查找并修复性能问题

2.1 生产环境记录SQL语句

第一个性能问题极其容易被发现,却常常被忽视——在生产环境记录SQL语句

记录日志听起来无伤大雅,但实际应用中这招是性能杀手。尤其当Hibernate的show_sql参数设为true时,它会通过System.out.println输出SQL,效率极低:

Hibernate: select 
    order0_.id as id1_2_, 
    order0_.orderNumber as orderNum2_2_, 
    order0_.version as version3_2_ 
  from purchaseOrder order0_
Hibernate: select 
    items0_.order_id as order_id4_0_0_, 
    items0_.id as id1_0_0_, 
    items0_.id as id1_0_1_, 
    items0_.order_id as order_id4_0_1_, 
    items0_.product_id as product_5_0_1_, 
    items0_.quantity as quantity2_0_1_, 
    items0_.version as version3_0_1_ 
  from OrderItem items0_ 
  where items0_.order_id=?
Hibernate: select 
    items0_.order_id as order_id4_0_0_, 
    items0_.id as id1_0_0_, 
    items0_.id as id1_0_1_, 
    items0_.order_id as order_id4_0_1_, 
    items0_.product_id as product_5_0_1_, 
    items0_.quantity as quantity2_0_1_, 
    items0_.version as version3_0_1_ 
  from OrderItem items0_ 
  where items0_.order_id=?
Hibernate: select 
    items0_.order_id as order_id4_0_0_, 
    items0_.id as id1_0_0_, 
    items0_.id as id1_0_1_, 
    items0_.order_id as order_id4_0_1_, 
    items0_.product_id as product_5_0_1_, 
    items0_.quantity as quantity2_0_1_, 
    items0_.version as version3_0_1_ 
  from OrderItem items0_ 
  where items0_.order_id=?

我在一个项目中,仅将show_sql设为false,几分钟内性能就提升了20%。这种立竿见影的优化,下次站会汇报绝对亮眼。

修复方案

  • 直接在配置文件(如persistence.xml)中将show_sql设为false。生产环境根本不需要这些信息
  • 开发环境需要SQL日志?别用两套配置!应该为开发和生产环境分别配置日志:

开发环境配置
开发配置要提供尽可能多的调试信息,至少需要记录SQL语句。通过设置日志级别实现:

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p [%c] - %m%n
log4j.rootLogger=info, stdout

# 基础日志级别
log4j.logger.org.hibernate=info

# SQL语句和参数
log4j.logger.org.hibernate.SQL=debug
log4j.logger.org.hibernate.type.descriptor.sql=trace

这样Hibernate会输出详细日志,包含SQL和绑定参数:

23:03:22,246 DEBUG SQL:92 - select 
    order0_.id as id1_2_, 
    order0_.orderNumber as orderNum2_2_, 
    order0_.version as version3_2_ 
  from purchaseOrder order0_ 
  where order0_.id=1
23:03:22,254 TRACE BasicExtractor:61 - extracted value ([id1_2_] : [BIGINT]) - [1]
23:03:22,261 TRACE BasicExtractor:61 - extracted value ([orderNum2_2_] : [VARCHAR]) - [order1]
23:03:22,263 TRACE BasicExtractor:61 - extracted value ([version3_2_] : [INTEGER]) - [0]

踩坑提示:开发环境可开启Hibernate统计(hibernate.generate_statistics=true),但生产环境绝对禁止!统计本身会拖垮性能。

统计日志示例:

23:04:12,123 INFO StatisticalLoggingSessionEventListener:258 - Session Metrics {
 23793 nanoseconds spent acquiring 1 JDBC connections;
 0 nanoseconds spent releasing 0 JDBC connections;
 394686 nanoseconds spent preparing 4 JDBC statements;
 2528603 nanoseconds spent executing 4 JDBC statements;
 0 nanoseconds spent executing 0 JDBC batches;
 0 nanoseconds spent performing 0 L2C puts;
 0 nanoseconds spent performing 0 L2C hits;
 0 nanoseconds spent performing 0 L2C misses;
 9700599 nanoseconds spent executing 1 flushes (flushing a total of 9 entities and 3 collections);
 42921 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}

重点关注:

  • JDBC语句数量(第2-5行):远超预期?大概率是N+1查询问题
  • 二级缓存交互(第7-9行):监控缓存命中率,避免无效查询

生产环境配置
生产配置要极致简洁,只记录错误:

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p [%c] - %m%n
log4j.rootLogger=info, stdout

# 只记录错误
log4j.logger.org.hibernate=error

2.2 N+1查询问题

N+1查询是Hibernate最常见的性能陷阱。虽然ORM概念本身有责任,但开发者理解懒惰加载机制才能彻底避免。

问题根源
实体映射时,Hibernate支持延迟加载(FetchType.LAZY):

@Entity
@Table(name = "purchaseOrder")
public class Order implements Serializable {

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private Set<OrderItem> items = new HashSet<OrderItem>();

    ...
}

这样加载Order时,不会立即查询关联的OrderItem。看似节省资源,但遍历订单列表时,每个订单都会触发额外查询:

List<Order> orders = em.createQuery("SELECT o FROM Order o").getResultList();
for (Order order : orders) {
    log.info("Order: " + order.getOrderNumber());
    log.info("Number of items: " + order.getItems().size()); // 触发N+1查询
}

实际执行时会产生N+1条SQL:

22:47:30,065 DEBUG SQL:92 - select 
    order0_.id as id1_2_, 
    order0_.orderNumber as orderNum2_2_, 
    order0_.version as version3_2_ 
  from purchaseOrder order0_  -- 第1条查询所有Order
22:47:30,136 INFO NamedEntityGraphTest:58 - Order: order1
22:47:30,140 DEBUG SQL:92 - select ... from OrderItem items0_ where items0_.order_id=? -- 第2条查询order1的items
22:47:30,171 INFO NamedEntityGraphTest:59 - Number of items: 2
22:47:30,171 INFO NamedEntityGraphTest:58 - Order: order2
22:47:30,172 DEBUG SQL:92 - select ... from OrderItem items0_ where items0_.order_id=? -- 第3条查询order2的items
22:47:30,174 INFO NamedEntityGraphTest:59 - Number of items: 2
... -- 后续每个Order触发一次查询

致命陷阱:小规模测试数据库可能看不出问题,但生产环境几千条订单时,查询量会爆炸式增长!

解决方案:用@NamedEntityGraph按需初始化关联关系
定义实体图,指定需要加载的属性:

@Entity
@Table(name = "purchase_order")
@NamedEntityGraph(
  name = "graph.Order.items", 
  attributeNodes = @NamedAttributeNode("items"))
public class Order implements Serializable {

    ...
}

查询时传入实体图提示:

EntityGraph graph = this.em.getEntityGraph("graph.Order.items");

Map hints = new HashMap();
hints.put("javax.persistence.fetchgraph", graph);

return this.em.find(Order.class, 1L, hints);

Hibernate会生成JOIN查询一次性加载数据:

select order0_.id as id1_2_0_, 
  order0_.orderNumber as orderNum2_2_0_, 
  items1_.order_id as order_id4_2_1_, 
  ... 
from purchase_order order0_ 
  left outer join OrderItem items1_ on order0_.id=items1_.order_id 
  left outer join Product product2_ on items1_.product_id=product2_.id 
where order0_.id=?

复杂关联处理
使用@NamedSubGraph定义多级加载:

@Entity
@Table(name = "purchase_order")
@NamedEntityGraph(
  name = "graph.Order.items", 
  attributeNodes = @NamedAttributeNode(value = "items", subgraph = "items"), 
  subgraphs = @NamedSubgraph(name = "items", attributeNodes = @NamedAttributeNode("product")))
public class Order implements Serializable {

    ...
}

这样能同时加载Order、OrderItem及关联的Product,彻底消灭N+1问题。

2.3 逐条更新实体

面向对象思维下,逐条更新实体很自然:

// 先查询
List<Product> products = em.createQuery("SELECT p FROM Product p").getResultList();
// 逐条修改
for (Product p : products) {
    p.setPrice(p.getPrice() * 1.1); // 涨价10%
}
// 提交时自动生成多条UPDATE

但日志中会出现大量重复的UPDATE语句:

22:58:05,829 DEBUG SQL:92 - select ... from Product product0_  -- 1次查询
22:58:05,883 DEBUG SQL:92 - update Product set name=?, price=?, version=? where id=? and version=?  -- 第1条更新
22:58:05,889 DEBUG SQL:92 - update Product set name=?, price=?, version=? where id=? and version=?  -- 第2条更新
22:58:05,891 DEBUG SQL:92 - update Product set name=?, price=?, version=? where id=? and version=?  -- 第3条更新
...  -- 后续N条更新

高效方案:批量操作
用JPQL/CriteriaUpdate/native SQL执行批量更新:

// 单条SQL更新所有产品
em.createQuery("UPDATE Product p SET p.price = p.price * 1.1").executeUpdate();

注意事项:批量操作会绕过Hibernate持久化上下文,需确保:

  1. 同一Session内不要混合使用批量操作和实体更新
  2. 必要时手动清除缓存(em.clear()

3. 总结

本文揭示了3种可通过日志定位的Hibernate性能问题

  1. 生产环境记录SQL

    • 症状:大量System.out.println输出的SQL
    • 修复:关闭show_sql,分离开发/生产日志配置
  2. N+1查询问题

    • 症状:1条查询+N条关联查询
    • 修复:用@NamedEntityGraph按需加载关联关系
  3. 逐条更新实体

    • 症状:大量结构相同的UPDATE语句
    • 修复:改用JPQL/CriteriaUpdate批量操作

核心原则:Hibernate隐藏了数据库交互细节,但性能优化必须直面SQL。每次修改持久层代码后,务必检查日志中的SQL数量和模式——这是避免生产环境踩坑的最有效手段。


原始标题:3 Common Hibernate Performance Issues in Your Log