Java 中生成组合的几种方式

1. 简介

在本教程中，我们将讨论如何在 Java 中解决 k-combinations 问题。

首先，我们会介绍并实现递归和迭代两种算法来生成指定大小的所有组合。随后，我们还会回顾一些使用常见 Java 库的解决方案。

2. 组合概述

简单来说，组合是从给定集合中选取若干元素的子集。

与排列不同的是，组合不关心元素的顺序，只关心某个元素是否被选中。

举个例子，在扑克牌游戏中，从一副 52 张牌中发 5 张牌给玩家，我们并不关心这 5 张牌的发牌顺序，而只关心最终手牌的内容。

有些问题需要我们遍历所有可能的组合。为了实现这一点，我们需要枚举出所有不同的组合情况。

从包含 “n” 个元素的集合中选择 “r” 个元素的不同方式数可以用以下数学公式表示：

因此，在最坏情况下，组合的数量会呈指数级增长。这意味着当集合较大时，可能无法枚举所有组合。

在这种情况下，我们可以随机选取部分代表性组合进行采样（sampling）。

接下来，我们将介绍生成组合的各种算法。

3. 递归算法生成组合

递归算法通常将问题拆解为更小的同类子问题，直到达到终止条件（即基础情况），然后直接求解基础情况。

我们将介绍两种划分问题的方式：

第一种方式基于集合中的元素进行划分；
第二种方式则通过追踪已选元素来划分任务。

3.1. 基于整个集合中的元素划分

我们可以通过逐一检查集合中的每个元素，决定是否将其包含在组合中。

对于集合中的每个元素，我们可以选择包含它或排除它。

✅ 如果我们选择包含第一个元素，则需要从剩余的 “n – 1” 个元素中选出 “r – 1” 个元素。
❌ 如果我们排除第一个元素，则需要从剩余的 “n – 1” 个元素中选出 “r” 个元素。

这可以用数学方式表达为：

下面是该方法的递归实现：

private void helper(List<int[]> combinations, int data[], int start, int end, int index) {
    if (index == data.length) {
        int[] combination = data.clone();
        combinations.add(combination);
    } else if (start <= end) {
        data[index] = start;
        helper(combinations, data, start + 1, end, index + 1);
        helper(combinations, data, start + 1, end, index);
    }
}

helper 方法会进行两次递归调用：一次包含当前元素，一次排除当前元素。

接下来，我们编写一个组合生成器来调用这个 helper 方法：

public List<int[]> generate(int n, int r) {
    List<int[]> combinations = new ArrayList<>();
    helper(combinations, new int[r], 0, n-1, 0);
    return combinations;
}

然后我们调用该方法生成组合：

List<int[]> combinations = generate(N, R);
for (int[] combination : combinations) {
    System.out.println(Arrays.toString(combination));
}
System.out.printf("generated %d combinations of %d items from %d ", combinations.size(), R, N);

执行后输出如下：

[0, 1]
[0, 2]
[0, 3]
[0, 4]
[1, 2]
[1, 3]
[1, 4]
[2, 3]
[2, 4]
[3, 4]
generated 10 combinations of 2 items from 5

最后，编写测试用例：

@Test
public void givenSetAndSelectionSize_whenCalculatedUsingSetRecursiveAlgorithm_thenExpectedCount() {
    SetRecursiveCombinationGenerator generator = new SetRecursiveCombinationGenerator();
    List<int[]> selection = generator.generate(N, R);
    assertEquals(nCr, selection.size());
}

⚠️ 注意：该方法的调用栈深度等于集合中元素的个数。如果集合很大，可能会导致 StackOverflowError。

因此，这种方法不适合处理大规模输入。

3.2. 基于组合中元素的划分

我们不再追踪输入集合中的元素，而是通过追踪已选元素来划分任务。

假设我们将输入集合的元素编号为 1 到 n。我们可以从第 1 到第 “n – r + 1” 个元素中选择第一个元素。

假设我们选择了第 k 个元素，那么我们需要从第 “k + 1” 到第 “n” 个元素中选择 “r – 1” 个元素。

这个过程的数学表达如下：

下面是实现该方法的递归代码：

private void helper(List<int[]> combinations, int data[], int start, int end, int index) {
    if (index == data.length) {
        int[] combination = data.clone();
        combinations.add(combination);
    } else {
        int max = Math.min(end, end + 1 - data.length + index);
        for (int i = start; i <= max; i++) {
            data[index] = i;
            helper(combinations, data, i + 1, end, index + 1);
        }
    }
}

在上面的代码中，for 循环负责选择下一个元素，然后递归调用 helper 方法继续选择剩余元素。

接着我们使用 helper 方法生成组合：

public List<int[]> generate(int n, int r) {
    List<int[]> combinations = new ArrayList<>();
    helper(combinations, new int[r], 0, n - 1, 0);
    return combinations;
}

最后，编写测试用例：

@Test
public void givenSetAndSelectionSize_whenCalculatedUsingSelectionRecursiveAlgorithm_thenExpectedCount() {
    SelectionRecursiveCombinationGenerator generator = new SelectionRecursiveCombinationGenerator();
    List<int[]> selection = generator.generate(N, R);
    assertEquals(nCr, selection.size());
}

✅ 该方法的调用栈深度等于组合中元素的个数。因此，只要组合大小不超过最大调用栈深度，该方法适用于大规模输入。

但如果组合本身也很大，该方法依然会失效。

4. 迭代算法

在迭代方法中，我们从一个初始组合开始，不断生成下一个组合，直到所有组合都被生成完毕。

我们按照字典序生成组合。从字典序最小的组合开始。

为了生成下一个组合，我们找到当前组合中可以递增的最右边的位置。然后将其递增，并在该位置右侧生成字典序最小的组合。

下面是实现代码：

public List<int[]> generate(int n, int r) {
    List<int[]> combinations = new ArrayList<>();
    int[] combination = new int[r];

    // 初始化为字典序最小的组合
    for (int i = 0; i < r; i++) {
        combination[i] = i;
    }

    while (combination[r - 1] < n) {
        combinations.add(combination.clone());

        // 生成下一个字典序组合
        int t = r - 1;
        while (t != 0 && combination[t] == n - r + t) {
            t--;
        }
        combination[t]++;
        for (int i = t + 1; i < r; i++) {
            combination[i] = combination[i - 1] + 1;
        }
    }

    return combinations;
}

接着编写测试用例：

@Test
public void givenSetAndSelectionSize_whenCalculatedUsingIterativeAlgorithm_thenExpectedCount() {
    IterativeCombinationGenerator generator = new IterativeCombinationGenerator();
    List<int[]> selection = generator.generate(N, R);
    assertEquals(nCr, selection.size());
}

5. 使用 Java 库实现组合

在可能的情况下，我们应该优先使用现有的库实现，而不是自己造轮子。下面我们将介绍几个常用的 Java 库。

5.1. Apache Commons

Apache Commons 的 [CombinatoricsUtils](https://commons.apache.org/proper/commons-math/javadocs/api-3.6/org/apache/commons/math3/util/CombinatoricsUtils.html) 类提供了很多组合相关的工具方法。其中的 combinationsIterator 方法返回一个按字典序生成组合的迭代器。

首先添加 Maven 依赖：

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-math3</artifactId>
    <version>3.6.1</version>
</dependency>

然后使用 combinationsIterator 方法打印组合：

public static void generate(int n, int r) {
    Iterator<int[]> iterator = CombinatoricsUtils.combinationsIterator(n, r);
    while (iterator.hasNext()) {
        final int[] combination = iterator.next();
        System.out.println(Arrays.toString(combination));
    }
}

5.2. Google Guava

Guava 的 [Sets](https://google.github.io/guava/releases/snapshot/api/docs/com/google/common/collect/Sets.html) 类提供了许多集合相关的工具方法。其中的 combinations 方法可以返回指定大小的所有子集。

添加 Maven 依赖：

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>27.0.1-jre</version>
</dependency>

使用 combinations 方法生成组合：

Set<Set<Integer>> combinations = Sets.combinations(ImmutableSet.of(0, 1, 2, 3, 4, 5), 3);

5.3. CombinatoricsLib

CombinatoricsLib 是一个轻量级库，支持排列、组合、子集、整数划分和笛卡尔积等操作。

添加 Maven 依赖：

<dependency>
    <groupId>com.github.dpaukov</groupId>
    <artifactId>combinatoricslib3</artifactId>
    <version>3.3.0</version>
</dependency>

使用示例：

Generator.combination(0, 1, 2, 3, 4, 5)
  .simple(3)
  .stream()
  .forEach(System.out::println);

执行后输出如下：

[0, 1, 2]
[0, 1, 3]
[0, 1, 4]
[0, 1, 5]
[0, 2, 3]
[0, 2, 4]
[0, 2, 5]
[0, 3, 4]
[0, 3, 5]
[0, 4, 5]
[1, 2, 3]
[1, 2, 4]
[1, 2, 5]
[1, 3, 4]
[1, 3, 5]
[1, 4, 5]
[2, 3, 4]
[2, 3, 5]
[2, 4, 5]
[3, 4, 5]

更多示例请参考 combinatoricslib3-example。

6. 总结

本文中我们实现了几种生成组合的算法，并介绍了几个常见的 Java 库实现。

✅ 在实际开发中，推荐优先使用成熟的库，而不是自己实现算法。

完整代码可以在 GitHub 上找到。

Persistence

REST

Security