528. Random Pick with Weight (M)

https://leetcode.com/problems/random-pick-with-weight/

You are given a 0-indexed array of positive integers w where w[i] describes the weight of the ith index.

You need to implement the function pickIndex(), which randomly picks an index in the range [0, w.length - 1] (inclusive) and returns it. The probability of picking an index i is w[i] / sum(w).

For example, if w = [1, 3], the probability of picking index 0 is 1 / (1 + 3) = 0.25 (i.e., 25%), and the probability of picking index 1 is 3 / (1 + 3) = 0.75 (i.e., 75%).

Example 1:

Input
["Solution","pickIndex"]
[[[1]],[]]
Output
[null,0]

Explanation
Solution solution = new Solution([1]);
solution.pickIndex(); // return 0. The only option is to return 0 since there is only one element in w.

Example 2:

Input
["Solution","pickIndex","pickIndex","pickIndex","pickIndex","pickIndex"]
[[[1,3]],[],[],[],[],[]]
Output
[null,1,1,1,1,0]

Explanation
Solution solution = new Solution([1, 3]);
solution.pickIndex(); // return 1. It is returning the second element (index = 1) that has a probability of 3/4.
solution.pickIndex(); // return 1
solution.pickIndex(); // return 1
solution.pickIndex(); // return 1
solution.pickIndex(); // return 0. It is returning the first element (index = 0) that has a probability of 1/4.

Since this is a randomization problem, multiple answers are allowed.
All of the following outputs can be considered correct:
[null,1,1,1,1,0]
[null,1,1,1,1,1]
[null,1,1,1,0,0]
[null,1,1,1,0,1]
[null,1,0,1,0,0]
......
and so on.

Constraints:

1 <= w.length <= 10^4
1 <= w[i] <= 10^5
pickIndex will be called at most 10^4 times.

Solution:

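First, a quick recap of our earlier articles on randomized algorithms:

The earlier article on designing a data structure with random deletion was mainly about choosing the right data structure: moving an element to the end of the array before deleting it avoids shifting data.

The earlier article on picking a random element from an infinite sequence covered the classic reservoir sampling algorithm, which uses simple arithmetic to select elements from an unbounded sequence with equal probability.

In the earlier article on tricks for coding tests, I also shared a score-grabbing trick that uses probability to maximize the number of test cases passed.

None of those articles solves the problem posed here, though. Instead, it is the prefix sum technique combined with binary search that solves weighted random selection.

What do prefix sums and binary search have to do with a randomized algorithm? Let me explain.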

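Suppose the input weight array is w = [1,3,2,1]. To make the probabilities match the weights, we can abstract the problem as a colored line segment of total length 7, split into four pieces of lengths 1, 3, 2 and 1, one color per index.

If I drop a stone at a random point on the segment and choose the index whose color it lands on, isn't the probability of choosing each index tied to its weight?

Now look at the colored segment again. What does it resemble? The boundaries of its pieces are exactly the prefix sum array, preSum = [0, 1, 4, 6, 7].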
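To make the segment concrete, here is a minimal sketch that builds the prefix sum array for w = [1,3,2,1]. The class name `PrefixSumDemo` and the helper `buildPrefixSum` are illustrative names, not part of the problem's API.

```java
import java.util.Arrays;

class PrefixSumDemo {
    // preSum[i] holds the sum of w[0..i-1]; preSum[0] = 0 is a placeholder.
    static int[] buildPrefixSum(int[] w) {
        int[] preSum = new int[w.length + 1];
        for (int i = 1; i <= w.length; i++) {
            preSum[i] = preSum[i - 1] + w[i - 1];
        }
        return preSum;
    }

    public static void main(String[] args) {
        int[] preSum = buildPrefixSum(new int[]{1, 3, 2, 1});
        System.out.println(Arrays.toString(preSum)); // prints [0, 1, 4, 6, 7]
    }
}
```

Each adjacent pair (preSum[i - 1], preSum[i]] marks one colored piece, and its length is w[i - 1].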

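Next, how do we simulate dropping a stone on the segment?

With a random number, of course. For the prefix sum array preSum above, the valid range is [1, 7]. Generating a random number in that range, say target = 5, is like dropping a stone at a random point on the segment.

There is one more issue: preSum does not contain the element 5. We should choose the smallest element that is greater than or equal to 5, which is 6, at index 3 of preSum.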
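The following sketch may help show why this rule yields the right probabilities. It deliberately uses a plain linear scan rather than binary search, and simply tries every possible target from 1 to 7. The names `StoneDropDemo` and `pick` are hypothetical.

```java
import java.util.Arrays;

class StoneDropDemo {
    // Linear scan for the first preSum entry >= target; returns the matching index into w.
    static int pick(int[] preSum, int target) {
        for (int i = 1; i < preSum.length; i++) {
            if (preSum[i] >= target) {
                return i - 1; // undo the one-slot offset of preSum
            }
        }
        throw new IllegalArgumentException("target out of range: " + target);
    }

    public static void main(String[] args) {
        int[] preSum = {0, 1, 4, 6, 7}; // prefix sums of w = [1, 3, 2, 1]
        int[] hits = new int[4];
        for (int target = 1; target <= 7; target++) {
            hits[pick(preSum, target)]++;
        }
        // Each index is hit exactly w[i] times out of 7 equally likely targets.
        System.out.println(Arrays.toString(hits)); // prints [1, 3, 2, 1]
    }
}
```

Since the 7 targets are equally likely, index i is picked with probability w[i] / 7, which is what the problem asks for.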

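How do we quickly find the smallest element in an array that is greater than or equal to a target value? This is where binary search comes in, specifically the binary search for the left boundary.

That is the core idea of this problem. It breaks down into a few steps:

1. Build the prefix sum array preSum from the weight array w.

2. Generate a random number within the range covered by preSum, then use binary search to find the index of the smallest element greater than or equal to that number.

3. Subtract one from that index (the prefix sum array is offset by one position) to get an index into the weight array, which is the final answer.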

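Solution code

The idea above should not be hard to follow, but writing the code is full of pitfalls.

Problems that involve open and closed intervals, index offsets, and binary search demand very precise control over the details; otherwise you end up with bugs that are hard to track down.

Let's dig into the details, continuing with the earlier example, preSum = [0, 1, 4, 6, 7].

Take this preSum array: what range should the random number target be drawn from? The closed interval [0, 7], or the half-open interval [0, 7)?

Neither. It should be drawn from the closed interval [1, 7], because the 0 in the prefix sum array is really just a placeholder: each integer from 1 to 7 stands for one unit of weight on the segment, while 0 belongs to no piece at all.

So the code should be written like this: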

int n = preSum.length;
// target is drawn from the closed interval [1, preSum[n - 1]]
int target = rand.nextInt(preSum[n - 1]) + 1;
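
The `+ 1` matters because `Random.nextInt(bound)` returns a value in the half-open range [0, bound). As a rough sanity check, the sketch below draws many targets for a total weight of 7 and records the smallest and largest values seen. The class name `TargetRangeDemo`, the helper `observedRange`, and the fixed seed are assumptions made for illustration.

```java
import java.util.Random;

class TargetRangeDemo {
    // Draws `samples` targets the same way as above and returns {min, max} observed.
    static int[] observedRange(int total, int samples, long seed) {
        Random rand = new Random(seed);
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < samples; i++) {
            // nextInt(total) is uniform over [0, total); adding 1 shifts it to [1, total]
            int target = rand.nextInt(total) + 1;
            min = Math.min(min, target);
            max = Math.max(max, target);
        }
        return new int[]{min, max};
    }

    public static void main(String[] args) {
        int[] range = observedRange(7, 10_000, 42L);
        // Both values are guaranteed to lie within [1, 7].
        System.out.println("min=" + range[0] + ", max=" + range[1]);
    }
}
```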

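Next, to find the index of the smallest element in preSum that is greater than or equal to target, which flavor of binary search should we use: the one that searches for the left boundary, or the one that searches for the right boundary?

We should use the binary search for the left boundary: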

// Binary search for the left boundary
int left_bound(int[] nums, int target) {
    if (nums.length == 0) return -1;
    int left = 0, right = nums.length;
    while (left < right) {
        int mid = left + (right - left) / 2;
        if (nums[mid] == target) {
            right = mid;
        } else if (nums[mid] < target) {
            left = mid + 1;
        } else if (nums[mid] > target) {
            right = mid;
        }
    }
    return left;
}

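The earlier article on binary search focused on the case where the target appears several times in the array, and did not go into detail on the case where the target is absent.

When target does not exist in nums, the return value of the left-boundary binary search can be read in several ways:

1. The returned value is the index of the smallest element in nums that is greater than or equal to target.

2. The returned value is the index at which target should be inserted into nums.

3. The returned value is the number of elements in nums that are smaller than target.

For example, searching for target = 4 in the sorted array nums = [2,3,5,7], the left-boundary binary search returns 2. Plug that into each statement above and they all hold.

So these three readings are equivalent, and you can use whichever fits the problem at hand. Here we clearly need the first one.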
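
The three readings can be checked on the nums = [2,3,5,7] example with a short sketch. `LeftBoundDemo`, `leftBound`, and the brute-force helper `countLess` are illustrative names; the search itself is the same left-boundary loop with the two non-advancing branches merged.

```java
class LeftBoundDemo {
    // Left-boundary binary search over the half-open range [left, right).
    static int leftBound(int[] nums, int target) {
        int left = 0, right = nums.length;
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] < target) {
                left = mid + 1;
            } else {
                right = mid;
            }
        }
        return left;
    }

    // Counts elements strictly smaller than target by brute force.
    static int countLess(int[] nums, int target) {
        int count = 0;
        for (int x : nums) {
            if (x < target) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] nums = {2, 3, 5, 7};
        int idx = leftBound(nums, 4);
        System.out.println(idx);                // prints 2
        System.out.println(nums[idx]);          // prints 5, the smallest element >= 4
        System.out.println(countLess(nums, 4)); // prints 2
        System.out.println(leftBound(nums, 8)); // prints 4, i.e. nums.length
    }
}
```

The last call shows the one caveat: if target exceeds every element, the result equals nums.length and no such element exists. That cannot happen in this problem, because target never exceeds the last entry of preSum.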

Putting it all together, we can write the final solution:

import java.util.Random;

class Solution {
    // Prefix sum array
    private int[] preSum;
    private Random rand = new Random();

    public Solution(int[] w) {
        int n = w.length;
        // Build the prefix sum array, offset by one slot to leave room for preSum[0]
        preSum = new int[n + 1];
        preSum[0] = 0;
        // preSum[i] = sum(w[0..i-1])
        for (int i = 1; i <= n; i++) {
            preSum[i] = preSum[i - 1] + w[i - 1];
        }
    }

    public int pickIndex() {
        int n = preSum.length;
        // Pick a random number in the closed interval [1, preSum[n - 1]]
        int target = rand.nextInt(preSum[n - 1]) + 1;
        // Find the index of target in the prefix sum array preSum
        // using the binary search for the left boundary
        int left = 0, right = n;
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (preSum[mid] < target) {
                left = mid + 1;
            } else {
                right = mid;
            }
        }
        // preSum is offset by one slot; convert back to an index into w
        return left - 1;
    }
}


Java version (alternative binary search template):
import java.util.Random;

class Solution {

    private Random random;
    private int[] preSum;

    public Solution(int[] w) {
        random = new Random();
        // preSum[i] = sum(w[0..i-1]); preSum[0] = 0 is a placeholder
        preSum = new int[w.length + 1];
        preSum[0] = 0;
        for (int i = 1; i < w.length + 1; i++) {
            preSum[i] = preSum[i - 1] + w[i - 1];
        }
    }

    public int pickIndex() {
        int len = preSum.length;
        // target is drawn from the closed interval [1, preSum[len - 1]]
        int target = random.nextInt(preSum[len - 1]) + 1;

        // Narrow the closed range [left, right] until the two pointers are adjacent
        int left = 1;
        int right = preSum.length - 1;
        while (left + 1 < right) {
            int mid = left + (right - left) / 2;
            if (preSum[mid] >= target) {
                right = mid;
            } else {
                left = mid;
            }
        }

        // Check the two remaining candidates, smaller index first
        if (preSum[left] >= target) return left - 1;
        if (preSum[right] >= target) return right - 1;
        // Not reached, since target <= preSum[len - 1]; kept so the method compiles
        return left - 1;
    }
}

With the groundwork laid above, you should be able to follow this code completely, and that solves the weighted random pick problem.
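
Because the output is random, a fixed expected sequence cannot be asserted. One way to gain some confidence is to compare observed frequencies against w[i] / sum(w) over many draws. The sketch below does that under a few assumptions: the class `WeightedPickCheck`, its seeded constructor, the helper `frequencies`, and the trial count are all chosen for illustration, and the picking logic is a compact copy of the prefix sum plus left-boundary search described above.

```java
import java.util.Random;

class WeightedPickCheck {
    private final int[] preSum;
    private final Random rand;

    // A seed is taken only so that repeated runs of this check behave the same.
    WeightedPickCheck(int[] w, long seed) {
        preSum = new int[w.length + 1];
        for (int i = 1; i <= w.length; i++) {
            preSum[i] = preSum[i - 1] + w[i - 1];
        }
        rand = new Random(seed);
    }

    int pickIndex() {
        // target is uniform over [1, total weight]
        int target = rand.nextInt(preSum[preSum.length - 1]) + 1;
        int left = 0, right = preSum.length;
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (preSum[mid] < target) {
                left = mid + 1;
            } else {
                right = mid;
            }
        }
        return left - 1;
    }

    // Returns the fraction of draws that landed on each index.
    static double[] frequencies(int[] w, int trials, long seed) {
        WeightedPickCheck picker = new WeightedPickCheck(w, seed);
        int[] counts = new int[w.length];
        for (int i = 0; i < trials; i++) {
            counts[picker.pickIndex()]++;
        }
        double[] freq = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            freq[i] = (double) counts[i] / trials;
        }
        return freq;
    }

    public static void main(String[] args) {
        int[] w = {1, 3, 2, 1};
        double[] freq = frequencies(w, 100_000, 7L);
        for (int i = 0; i < w.length; i++) {
            System.out.printf("index %d: observed %.3f, expected %.3f%n", i, freq[i], w[i] / 7.0);
        }
    }
}
```

With enough trials the observed and expected columns should agree to roughly two decimal places; a large gap would point to an off-by-one in the target range or the index offset.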
