原文链接: https://leetcode-cn.com/problems/kth-largest-element-in-a-stream

英文原文

Design a class to find the k^th largest element in a stream. Note that it is the k^th largest element in the sorted order, not the k^th distinct element.

Implement KthLargest class:

KthLargest(int k, int[] nums) Initializes the object with the integer k and the stream of integers nums.
int add(int val) Appends the integer val to the stream and returns the element representing the k^th largest element in the stream.

Example 1:

Input
["KthLargest", "add", "add", "add", "add", "add"]
[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]
Output
[null, 4, 5, 5, 8, 8]

Explanation
KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
kthLargest.add(3);   // return 4
kthLargest.add(5);   // return 5
kthLargest.add(10);  // return 5
kthLargest.add(9);   // return 8
kthLargest.add(4);   // return 8

Constraints:

1 <= k <= 10⁴
0 <= nums.length <= 10⁴
-10⁴ <= nums[i] <= 10⁴
-10⁴ <= val <= 10⁴
At most 10⁴ calls will be made to add.
It is guaranteed that there will be at least k elements in the array when you search for the k^th element.

中文题目

设计一个找到数据流中第 k 大元素的类（class）。注意是排序后的第 k 大元素，不是第 k 个不同的元素。

请实现 KthLargest 类：

KthLargest(int k, int[] nums) 使用整数 k 和整数流 nums 初始化对象。
int add(int val) 将 val 插入数据流 nums 后，返回当前数据流中第 k 大的元素。

示例：

输入：
["KthLargest", "add", "add", "add", "add", "add"]
[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]
输出：
[null, 4, 5, 5, 8, 8]

解释：
KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
kthLargest.add(3);   // return 4
kthLargest.add(5);   // return 5
kthLargest.add(10);  // return 5
kthLargest.add(9);   // return 8
kthLargest.add(4);   // return 8

提示：

1 <= k <= 10⁴
0 <= nums.length <= 10⁴
-10⁴ <= nums[i] <= 10⁴
-10⁴ <= val <= 10⁴
最多调用 add 方法 10⁴ 次
题目数据保证，在查找第 k 大元素时，数组中至少有 k 个元素

通过代码

高赞题解

各位题友大家除夕好！ 今天是 @负雪明烛 坚持日更的第 18 天。今天力扣上的每日一题是「703. 数据流中的第 K 大元素」。

解题思路

首先，面试题警告：

本题「数据流中的 TopK」是我参加亚马逊面试遇到过的真题。
另外，实现堆算法是我参加微软面试遇到的真题。
多说一句，TopK 算法在面试中常问，推荐阅读「拜托，面试别再问我TopK了！！！」

本题是我们求在一个数据流中的第 $K$ 大元素。所谓数据流，即是说我们写的算法需要支持 add() 函数；在力扣后台评测程序中会多次调用add()函数，每次调用都会向我们写的算法中添加一个元素。而题目要求的就是在每次 add() 之后，整个数据流（包括初始化的元素和所有 add 进来的元素）中的第 $K$ 大元素。

先说一个最暴力的解法：我们底层数据结构使用数组实现，当每次调用 add() 函数时，向数组中添加一个元素，然后调用 sort() 函数进行排序，返回排序后数组的第 $K$ 个数字。该做法在每次调用 add() 函数时的时间复杂度为 $O(K*log(K))$ ，该时间复杂度太高，当 $K$ 很大 / add()调用次数太多的时候，一定会超时。

从上面的分析中，我们已经看出来了，使用数组的核心问题是：数组自身不带排序功能，只能用 sort() 函数，导致时间复杂度过高。

因此我们考虑使用自带排序功能的数据结构——堆。

在大根堆（图一）中，父节点的值比每一个子节点的值都要大。在小根堆（图二）中，父节点的值比每一个子节点的值都要小。

image.png

本题的操作步骤如下：

使用大小为 $K$ 的小根堆，在初始化的时候，保证堆中的元素个数不超过 $K$ 。
在每次 add() 的时候，将新元素 push() 到堆中，如果此时堆中的元素超过了 $K$，那么需要把堆中的最小元素（堆顶）pop() 出来。
此时堆中的最小元素（堆顶）就是整个数据流中的第 $K$ 大元素。

问答：

为什么使用小根堆？

因为我们需要在堆中保留数据流中的前 $K$ 大元素，使用小根堆能保证每次调用堆的 pop() 函数时，从堆中删除的是堆中的最小的元素（堆顶）。

为什么能保证堆顶元素是第 $K$ 大元素？

因为小根堆中保留的一直是堆中的前 $K$ 大的元素，堆的大小是 $K$，所以堆顶元素是第 $K$ 大元素。

每次 add() 的时间复杂度是多少？

每次 add() 时，调用了堆的 push() 和 pop() 方法，两个操作的时间复杂度都是 $log(K)$.

代码

使用 Python2 写的代码如下。

class KthLargest(object):

    def __init__(self, k, nums):
        """
        :type k: int
        :type nums: List[int]
        """
        self.k = k
        self.que = nums
        heapq.heapify(self.que)

    def add(self, val):
        """
        :type val: int
        :rtype: int
        """
        heapq.heappush(self.que, val)
        while len(self.que) > self.k:
            heapq.heappop(self.que)
        return self.que[0]

# Your KthLargest object will be instantiated and called as such:
# obj = KthLargest(k, nums)
# param_1 = obj.add(val)

刷题心得

本题是堆的经典运用，在面试中可能会遇到，请认真对待本题，包括「TopK问题」；
数据流的题目还是很有意思的，力扣上有其他数据流题目，建议也做一下。

参考资料：

OK，以上就是 @负雪明烛写的今天题解的全部内容了，如果你觉得有帮助的话，求赞、求关注、求收藏。如果有疑问的话，请在下面评论，我会及时解答。

关注我，你将不会错过我的精彩动画题解、面试题分享、组队刷题活动，进入主页 @负雪明烛右侧有刷题组织，从此刷题不再孤单。

祝大家过年好！我们明天再见！

统计信息

通过次数	提交次数	AC比率
62989	122688	51.3%

提交历史

提交时间	提交结果	执行时间	内存消耗	语言

相似题目

题目	难度
数组中的第K个最大元素	中等

凤起丶宛南

从来没有真正的绝境，只有心灵的迷途

英文原文

中文题目

通过代码

高赞题解

解题思路

代码

刷题心得

统计信息

提交历史

相似题目

你的赏识是我前进的动力

英文原文

中文题目

通过代码

高赞题解

解题思路

代码

刷题心得

统计信息

提交历史

相似题目

你的赏识是我前进的动力

微信扫一扫：分享