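1. Overview
The Apache Commons CSV library provides a rich set of features for creating and reading CSV files. This article walks through a simple example to show how to use the library effectively.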
2. Maven Dependency
To start, add the library to the project as a Maven dependency:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.10.0</version>
</dependency>
✅ The latest version can be looked up in the Maven repository.
3. Reading a CSV File
Suppose we have a file named book.csv that holds information about books:
author,title
Dan Simmons,Hyperion
Douglas Adams,The Hitchhiker's Guide to the Galaxy
The following test reads it:
// Expected author -> title pairs, mirroring the contents of book.csv
Map<String, String> AUTHOR_BOOK_MAP = new HashMap<String, String>() {
    {
        put("Dan Simmons", "Hyperion");
        put("Douglas Adams", "The Hitchhiker's Guide to the Galaxy");
    }
};
String[] HEADERS = { "author", "title" };
@Test
void givenCSVFile_whenRead_thenContentsAsExpected() throws IOException {
    CSVFormat csvFormat = CSVFormat.DEFAULT.builder()
        .setHeader(HEADERS)
        .setSkipHeaderRecord(true) // the first line holds column names, not data
        .build();

    // try-with-resources closes the reader even if an assertion fails
    try (Reader in = new FileReader("src/test/resources/book.csv")) {
        Iterable<CSVRecord> records = csvFormat.parse(in);
        for (CSVRecord record : records) {
            String author = record.get("author");
            String title = record.get("title");
            assertEquals(AUTHOR_BOOK_MAP.get(author), title);
        }
    }
}
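Key points:
- The first line is skipped because it is the header record (setSkipHeaderRecord(true)).
- CSVFormat defines the format of the file.
- Later sections show more format configuration options.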
4. Creating a CSV File
Next, let's generate a CSV file with the same structure:
@Test
void givenAuthorBookMap_whenWrittenToStream_thenOutputStreamAsExpected() throws IOException {
    StringWriter sw = new StringWriter();
    CSVFormat csvFormat = CSVFormat.DEFAULT.builder()
        .setHeader(HEADERS) // written as the first line of the output
        .build();

    try (final CSVPrinter printer = new CSVPrinter(sw, csvFormat)) {
        AUTHOR_BOOK_MAP.forEach((author, title) -> {
            try {
                printer.printRecord(author, title);
            } catch (IOException e) {
                // fail the test instead of only logging the error
                throw new UncheckedIOException(e);
            }
        });
    }
    // EXPECTED_FILESTREAM is a constant defined elsewhere in the test class
    assertEquals(EXPECTED_FILESTREAM, sw.toString().trim());
}
⚠️ Note: the try-with-resources statement ensures that the CSVPrinter is closed properly.
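The test above compares against an EXPECTED_FILESTREAM constant that is not shown in this article, and it iterates over a HashMap, whose order is not guaranteed. The sketch below rests on two assumptions: the expected text is the header line followed by the two records, joined by the \r\n record separator used by CSVFormat.DEFAULT, and a LinkedHashMap stands in for the HashMap so that records are written in insertion order. The names WriteSketch, EXPECTED, and toCsv are illustrative and do not come from the original test class.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;

class WriteSketch {

    // Assumed expected output; the article's own constant is not shown.
    static final String EXPECTED =
        "author,title\r\n"
        + "Dan Simmons,Hyperion\r\n"
        + "Douglas Adams,The Hitchhiker's Guide to the Galaxy";

    // Writes one record per map entry, preceded by the header line.
    static String toCsv(Map<String, String> authorToTitle) throws IOException {
        StringWriter sw = new StringWriter();
        CSVFormat format = CSVFormat.DEFAULT.builder()
            .setHeader("author", "title")
            .build();
        // The header is printed as soon as the printer is created.
        try (CSVPrinter printer = new CSVPrinter(sw, format)) {
            // A plain loop lets IOException propagate without a try/catch in a lambda.
            for (Map.Entry<String, String> entry : authorToTitle.entrySet()) {
                printer.printRecord(entry.getKey(), entry.getValue());
            }
        }
        return sw.toString();
    }

    public static void main(String[] args) throws IOException {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<String, String> books = new LinkedHashMap<>();
        books.put("Dan Simmons", "Hyperion");
        books.put("Douglas Adams", "The Hitchhiker's Guide to the Galaxy");
        System.out.println(toCsv(books).trim().equals(EXPECTED));
    }
}
```

Trimming the result removes the record separator that follows the last record, which is why the assumed constant has no trailing line break.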
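5. Headers and Reading Columns
5.1. Accessing Columns by Index
This is the most basic way to read columns, and it suits files that have no header row:
// csvFormat can be any CSVFormat here, for example CSVFormat.DEFAULT
Reader in = new FileReader("book.csv");
Iterable<CSVRecord> records = csvFormat.parse(in);
for (CSVRecord record : records) {
    String columnOne = record.get(0); // indexes are zero-based
    String columnTwo = record.get(1);
}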
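5.2. Accessing Columns by Predefined Headers
Accessing columns by name is more intuitive than using an index:
// csvFormat must declare the header names, for example via setHeader(HEADERS)
Iterable<CSVRecord> records = csvFormat.parse(in);
for (CSVRecord record : records) {
    String author = record.get("author");
    String title = record.get("title");
}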
5.3. Using Enums as Headers
Using an enum instead of hard-coded strings makes the code more robust:
enum BookHeaders {
    author, title
}

CSVFormat csvFormat = CSVFormat.DEFAULT.builder()
    .setHeader(BookHeaders.class) // column names are taken from the enum constants
    .setSkipHeaderRecord(true)
    .build();
Iterable<CSVRecord> records = csvFormat.parse(in);
for (CSVRecord record : records) {
    String author = record.get(BookHeaders.author);
    String title = record.get(BookHeaders.title);
    assertEquals(AUTHOR_BOOK_MAP.get(author), title);
}
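5.4. Skipping the Header Line
CSV files usually carry their header on the first line, so it should be skipped when reading the data. Calling setHeader() with no arguments makes the parser take the column names from that first line, which is what allows the name-based lookups below:
CSVFormat csvFormat = CSVFormat.DEFAULT.builder()
    .setHeader() // no names given: read them from the first record
    .setSkipHeaderRecord(true)
    .build();
Iterable<CSVRecord> records = csvFormat.parse(in);
for (CSVRecord record : records) {
    String author = record.get("author");
    String title = record.get("title");
}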
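As a self-contained illustration of reading column names from the first line, the sketch below parses an in-memory string rather than a file. It assumes commons-csv 1.9 or later for the builder API; the class HeaderDetectionSketch and its methods are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

class HeaderDetectionSketch {

    static final String CSV =
        "author,title\r\n"
        + "Dan Simmons,Hyperion\r\n"
        + "Douglas Adams,The Hitchhiker's Guide to the Galaxy\r\n";

    // An empty setHeader() call tells the parser to use the first record as the header.
    static CSVFormat firstRowAsHeader() {
        return CSVFormat.DEFAULT.builder()
            .setHeader()
            .setSkipHeaderRecord(true)
            .build();
    }

    // Returns the column names the parser detected.
    static List<String> detectHeaders(String csv) throws IOException {
        try (CSVParser parser = firstRowAsHeader().parse(new StringReader(csv))) {
            return new ArrayList<>(parser.getHeaderNames());
        }
    }

    // Returns the "title" column of every data record, looked up by name.
    static List<String> readTitles(String csv) throws IOException {
        List<String> titles = new ArrayList<>();
        try (CSVParser parser = firstRowAsHeader().parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                titles.add(record.get("title"));
            }
        }
        return titles;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(detectHeaders(CSV));
        System.out.println(readTitles(CSV));
    }
}
```

The header line never shows up among the data records, because the parser consumes it while building the name-to-column mapping.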
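5.5. Creating a File With Headers
Likewise, the header line can be added automatically when a file is generated:
FileWriter out = new FileWriter("book_new.csv");
// The header line is written when the printer is created, provided csvFormat declares a header
CSVPrinter printer = csvFormat.print(out);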
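A fuller sketch of the same idea follows: it declares the header on the format, writes one record to a file, and reads the lines back to confirm that the header comes first. The class name HeaderFileSketch, the method writeAndReadBack, and the use of a temporary file are assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;

class HeaderFileSketch {

    // Writes a header line plus one record to target, then returns the file's lines.
    static List<String> writeAndReadBack(Path target) throws IOException {
        CSVFormat format = CSVFormat.DEFAULT.builder()
            .setHeader("author", "title")
            .build();
        // Both resources are closed on exit, so the content is flushed to disk.
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8);
             CSVPrinter printer = format.print(out)) {
            printer.printRecord("Dan Simmons", "Hyperion");
        }
        return Files.readAllLines(target, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("book_new", ".csv");
        try {
            System.out.println(writeAndReadBack(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

No explicit call prints the header here; it appears because the format declares one and does not skip the header record.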
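6. Conclusion
This article demonstrated the core features of Apache Commons CSV through examples. For more advanced usage, see the library's documentation.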
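💡 Pitfall: keep an eye on memory consumption when processing large files. Iterate over the records as they are parsed instead of loading them all at once, for example with CSVParser.getRecords(), which collects every record into a list.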
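A minimal sketch of the streaming approach, assuming the two-column book layout used throughout this article: the loop handles one record at a time, so memory use does not grow with the number of rows. The class StreamingSketch and the method countLongTitles are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

class StreamingSketch {

    // Counts records whose title is longer than minLength without keeping them in memory.
    static long countLongTitles(Reader in, int minLength) throws IOException {
        CSVFormat format = CSVFormat.DEFAULT.builder()
            .setHeader()               // take column names from the first line
            .setSkipHeaderRecord(true)
            .build();
        long count = 0;
        try (CSVParser parser = format.parse(in)) {
            // Records are parsed lazily as the loop advances.
            for (CSVRecord record : parser) {
                if (record.get("title").length() > minLength) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String csv =
            "author,title\r\n"
            + "Dan Simmons,Hyperion\r\n"
            + "Douglas Adams,The Hitchhiker's Guide to the Galaxy\r\n";
        System.out.println(countLongTitles(new StringReader(csv), 10));
    }
}
```

For a real file, the StringReader would be replaced by a buffered file reader; the loop itself stays the same.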