TopK：
LC215 Kth Largest Element in an Array
LC973 K Closest Points to Origin
LC378 Kth Smallest Element in a Sorted Matrix [pq存序号，访问后标记]

中位数：
LC4 排序数组的中位数 [只需[first, last)区间的条件]
LC295 数据流中的中位数 [两个pq]

二叉堆(heap)

见二叉堆

划分(partition)

Hoare划分(推荐)

更适合C++ STL左闭右开区间的一种划分方式，而且相比Lomuto划分，交换次数要少很多。

选择数组第一个元素作为枢纽元P(假设其指针为first)。初始化两个指针(L,R)在数组first+1位置和last位置
开始循环(while(true))
- 开始循环(while(L < end && !(P < *L)))
  - ++L
- 开始循环(while(P < *R))
  - --R
- 如果(R < L) 结束循环
- swap(*L, *R)
返回L-1
下面这个是SGI STL的partition实现。来自《STL源码剖析》侯捷。这个算法不能用作其他需要partition算法的子算法，因为其实现不能满足严格弱序关系。注意算法中的!(pred(*first))，当传入一个比较函数时，可能会使得划分一侧区间全为空，而使上一级算法陷入死循环。一个典型的例子是对于一个vector<int>{1,1,1,1,1,1}使用auto comp = less<int>{}做nth_element时(假设该nth_element使用的是这个partition)。

code

template<typename It, typename Pred>
It partition_hoare(It first, It last, Pred pred){
    while(true){
        while(true)
            if(first == last)
                return first;
            else if(!(pred(*first)))
                break;
            else
                ++first;
        --last;
        while(true)
            if(first == last)
                return first;
            else if(pred(*last))
                break;
            else
                --last;
        iter_swap(first, last);
        ++first;
   }
}

Lomuto划分(不推荐)

选择数组a最后一个元素为枢纽元(P)。初始化两个指针(i,j)在数组起始位置。
循环直到 j == a.size() - 1
- 如果a[i] <= P, 交换i,j的内容，++i,++j.
- 如果a[i] > P, ++j.
交换i,j的内容，返回i.

维护的循环不变式：

[begin, i)区间的元素都<=P
[i, j)区间的元素都>P.
j <= a.size() - 1.

Lumuto划分只能保证a[i]<=P部分的元素（即前半部分元素）划分后是稳定的。因为a[i]>P（即后半部分元素）可能是由前面交换过来的。
下面是一个简单的Lomuto划分实现（不保证正确）：

code

template<typename It, typename Less>
It partition_lomuto(It first, It last, Less less){
    if(first == last)     return first;
    auto pivot = last - 1;
    auto P = *pivot;
    auto i = first;
    auto j = first;
    while(j < pivot){
        if(!less(P, *j)){
            swap(*i, *j);
            ++i;++j;
        }else if(less(P, *j)) ++j;
    }
    swap(*i, *pivot);
    return i+1;
}

unguarded_partition（用于其他算法作为子算法）

下面是SGI STL的unguarded_partition实现。同样来自《STL源码剖析》。这个partition的实现，可以防止左侧元素为空的情况，用于其他使用partition作为子算法的算法，比如随机选择和快速排序。但是仍然无法防止因为传入的比较函数不满足严格弱序关系而导致的死循环。

code

template <typename It, typename T, typename Compare>
It unguarded_partition_hoare(It first, It last, T pivot, Compare comp){
    while(true){
        while(comp(*first, pivot)) ++first;
        --last;
        while(comp(pivot,*last)) --last;
        if(!(first < last)) return first;
        iter_swap(first, last);
        ++first;
    }
}

严格弱序关系(Strict weak Ordering)

如果比较函数对相等的值返回了true, 即使是unguarded_partition也会导致一边为空。这会使得后面的随机选择算法random_partition进入死循环。

使比较函数对相等的值返回true还会有很多其他奇妙的事情产生：

set可能存在看起来不等价的元素
在这样的set<int, less_equal<int> >中，即使用<=作为set的比较函数。这会使相等的元素不等价，即!(a<=a) && !(a<=a) == false。在相等的元素不等价的情况下，一个set里就可能会有两个a同时存在。
multiset的equal_range()不会包含所有看起来等价的元素
在这样的multiset<int, less_equal<int> >中，如果你插入两个相同值a，当你使用equal_range时，两个不会都在equal_range的返回区间里，因为两个a不等价。

事实上对于sort, nth_element, 和其他的一些使用等价关系的C++容器和算法，传入的比较函数必须满足严格弱序关系(strict weak ordering)，简单的来讲，需要满足以下性质：

对于所有 a，comp(a,a)==false
若 comp(a,b)==true 则 comp(b,a)==false
若 comp(a,b)==true 且 comp(b,c)==true 则 comp(a,c)==true

LLVM(clang)的nth_element，即使传入一个使相等值返回true的比较函数，也不会进入死循环。原因：TODO

参考：

Effective STL, Scott Meyers, Item 21: Always have comparison functions return false for equal values.
Weak ordering - Wikipedia
C++ 具名要求： Compare

关于iter_swap:

Strictly speaking, iter_swap is redundant. It exists only for technical reasons: in some circumstances, some compilers have difficulty performing the type deduction required to interpret swap(*a, *b).
iter_swap() versus swap() — what’s the difference? - stackoverflow

稳定的划分算法

STL中稳定划分算法：stable_partition有以下复杂度保证：给定 N = last - first，若有充足内存，则准确应用 N 次谓词及交换 O(N) 次。若内存不充足，则至多交换 N log N 次(猜测是直接使用了sort)。

选择

LC215 Kth Largest Element in an Array

最大值与最小值

std::minmax_element和std::minmax 一次遍历，同时找到最大值和最小值。对于N个元素，最多3N/2次比较。同时记录最大值和最小值。在每次比较时，对输入元素成对处理，先比较两个输入元素的大小。将其中较大的元素和最大值比较，较小的元素和最小值比较。

顺序统计量(TopK问题)

优先级队列

时间复杂度为$\Theta(k\log N)$：在给定数组上建一个最大堆make_heap()，后再pop_heap()k次即可。就地算法。不稳定。

make_heap & pop_heap, klogN code

class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        --k;
        make_heap(nums.begin(), nums.end());
        auto end = nums.end();
        while(k)
        {
            pop_heap(nums.begin(), end--);
            --k;
        }
        return *nums.begin();
    }
};

时间复杂度为%\Theta(N\log k)%：建立一个大小为k的最小堆。遍历给定数组，当堆未满时，将元素push_heap()。当堆满了时，比较堆顶元素(即堆中最小的元素)和当前遍历元素，如果当前遍历元素比堆顶元素大，将堆顶元素pop_heap()后push_heap()当前遍历的元素。遍历结束后，堆中元素即为最大的k个元素。

push_heap & pop_heap Nlogk code

class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        priority_queue<int, vector<int>, greater<int> > pq(greater<int>{});
        
        for(auto n : nums)
        {
            if(pq.size() < k)
            {
                pq.push(n);
            }
            else if(n > pq.top())
            {
                pq.pop();
                pq.push(n);
            }
        }

        return pq.top();
    }
};

期望为线性时间的选择

STL中的随机选择算法实现比较巧妙。输入为三个迭代器。first是序列头，last是序列尾。下面来自std::nth_element - cppreference:

nth_element 是部分排序算法，它重排 [first, last) 中元素，使得：

nth 所指向的元素被更改为假如 [first, last) 已排序则该位置会出现的元素。
这个新的 nth 元素前的所有元素小于或等于新的 nth 元素后的所有元素。

如果想要找第k小的元素，应该这么调用nth_element(nums.begin(), nums.begin() + k - 1, nums.end())。

random_select code

class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) { 
        random_select(nums.begin(), nums.begin() + k - 1, nums.end());
        return nums[k - 1];
        return 0;
    }
    template<typename It>
    void random_select(It first, It nth, It last){
        while(last - first > 1){
            auto pivot = *(last - 1);
            auto cut = unguarded_partition_hoare(first, last, pivot, greater<int>{});
            if(cut <= nth) first = cut;
            else           last = cut;
        }
    }
    template <typename It, typename T, typename Compare>
    It unguarded_partition_hoare(It first, It last, T pivot, Compare comp){
        while(true){
            while(comp(*first, pivot)) ++first;
            --last;
            while(comp(pivot,*last)) --last;
            if(!(first < last)) return first;
            iter_swap(first, last);
            ++first;
        }
    }
};

事实上STL中该算法的实现为内省选择(introselect), 来自David Musser

使用median_of_three作为pivot，防止因为传入已经排序的数组导致的性能下降。
因为也存在对median_of_three方法的破解序列，因此当first - last小于某一值时使用插入排序将该区间排序。使该算法平均与std::distance(first, last)成线性。即期望时间复杂度为$O(N)$。

算法原理：

循环不变式：nth永远在区间[first, last)中，因此必须保证[first, last)非空，即last - first > 1。
使用median_of_three作为pivot，用该pivot对区间做划分，将划分点记为cut
如果cut <= nth，为了保证nth仍然落在[first, last)中，将cut的位置赋给first
如果cut > nth，为了保证nth仍然落在[first, last)中，将cut的位置赋给last

未使用insertion_sort的introselect

#include <ext/algorithm>
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        introselect(nums.begin(), nums.begin() + k - 1, nums.end());
        return *(nums.begin() + k - 1);
    }

    template<typename It>
    void introselect(It first, It nth, It last)
    {
        while(last - first > 1)
        {
            auto pivot = __gnu_cxx::__median(*first, *(first + (last - first) / 2), *(last - 1));
            auto cut = unguarded_partition(first, last, pivot);

            // invariance: nth in [first, last)
            if(cut <= nth)
            {
                first = cut;
            }
            else // cut > nth
            {
                last = cut;
            }
        }
    }

    template<typename It>
    It unguarded_partition(It first, It last, typename  iterator_traits<It>::value_type pivot)
    {
        while(true)
        {
            while(*first > pivot) ++first;
            --last;
            while(pivot > *last ) --last;
            if(!(first < last)) return first;
            iter_swap(first, last);
            ++first;
        }
    }
};

TopK问题

数组上的TopK

LC973 K Closest Points to Origin

class Solution {
public:
    vector<vector<int>> kClosest(vector<vector<int>>& points, int K) {
        auto cmp = [](const vector<int>& lhs, const vector<int>& rhs){
            return lhs[0]*lhs[0]+lhs[1]*lhs[1] < rhs[0]*rhs[0]+rhs[1]*rhs[1];
        };
        random_select(points.begin(), points.begin() + K - 1, points.end(), cmp);
        return vector<vector<int>>(points.begin(), points.begin() + K);
    }
    
    template<typename It, typename Comp>
    void random_select(It first, It nth, It last, Comp comp){
        while(last - first > 1){
            auto cut = unguarded_partition_hoare(first, last, *(last - 1), comp);
            if(cut <= nth) first = cut;
            else           last  = cut;
        }
    }
    
    template<typename It, typename T, typename Comp>
    It unguarded_partition_hoare(It first, It last, T pivot, Comp comp){
        while(true){
            while(comp(*first, pivot)) ++first;
            --last;
            while(comp(pivot, *last)) --last;
            if(!(first < last)) return first;
            iter_swap(first, last);
            ++first;
        }
    }
};

矩阵上的TopK

在一个$m \times n$的杨氏矩阵中，每一行的数据都是从左到右排序的，每一列的数据都是从上到下排序的。不存在的元素会被填充为$\infty$。我们容易知道以下性质：

$Y(1,1) = \infty$ 代表矩阵为空。
$Y(m,n) < \infty$ 代表矩阵为满。

[CLRS]6-3, LC378 Kth Smallest Element in a Sorted Matrix

维持一个最小堆。将左上角的元素放入堆中并标记。每次从堆顶中取出一个元素，并尝试将其右边和下方的未标记元素放入堆中。将放入堆中的元素做标记。直到从堆中取出了k个元素。该元素即为第k小的元素。

复杂度分析：
最外层循环进行k次，每次循环有一次pop，最多两次push。因此队列长度会不断增长，最大长度为k。所以运行时间为
$T = 3 \times(\log 1 + \log 2 + \log 3 + \cdots + \log k) = O(\log(k!)) = \Theta(k\log k)$

Feb/19/2020 code

struct pos
{
    size_t x;
    size_t y;
    int  val;
    pos(size_t x, size_t y, int val)
    :x(x)
    ,y(y)
    ,val(val)
    {}

    friend bool operator<(const pos& lhs, const pos& rhs)
    {
        return rhs.val < lhs.val;
    }
};

class Solution {
private:
    constexpr static int EXPLORED = INT_MAX;
public:
    int kthSmallest(vector<vector<int>>& matrix, int k) {
        auto row = matrix.size();
        auto col = matrix.front().size();
        priority_queue<pos> pq{};
        // we start from the top left corner of the matrix
        pq.emplace(0, 0, matrix[0][0]);
        int res = 0;
        while(k)
        {
            auto [x, y, val] = pq.top();
            --k;
            res = val;
            pq.pop();

            if(x < row - 1 && matrix[x + 1][y] != EXPLORED)
            {
                pq.emplace(x + 1, y, matrix[x + 1][y]);
                matrix[x + 1][y] = EXPLORED;
            }

            if(y < col - 1 && matrix[x][y + 1] != EXPLORED)
            {
                pq.emplace(x, y + 1, matrix[x][y + 1]);
                matrix[x][y + 1] = EXPLORED;
            }
        }
        return res;
    }
};

code

class Solution {
public:
    int kthSmallest(vector<vector<int>>& matrix, int k) {
        using t = std::tuple<size_t, size_t, int>; 
        auto cmp = [](const t& lhs, const t& rhs){return get<2>(lhs) > get<2>(rhs);};
        auto pq = priority_queue<t, vector<t>, decltype(cmp) >{cmp};
        auto len = matrix.size();
        int result = 0;
        pq.push(make_tuple(0, 0, matrix[0][0]));
        matrix[0][0] = INT_MAX;
        while(k--){
            auto pv = pq.top();
            pq.pop();
            result = get<2>(pv);
            auto x = get<0>(pv);
            auto y = get<1>(pv);
            if(x < len - 1 && y < len && matrix[x + 1][y] != INT_MAX){
                pq.push(make_tuple(x + 1, y, matrix[x + 1][y]));
                matrix[x + 1][y] = INT_MAX;
            }
            if(x < len && y < len - 1 && matrix[x][y + 1] != INT_MAX){
                pq.push(make_tuple(x, y + 1, matrix[x][y + 1]));
                matrix[x][y + 1] = INT_MAX;
            }
                
        }
        return result;
    }
};

中位数

排序数组的中位数

LC4 排序数组的中位数

见二分查找#排序数组的中位数

数据流中的中位数

LC295 数据流中的中位数

维护两个二叉堆，一个是最大堆maxHeap，另一个是最小堆minHeap，保证最小堆的元素都比最大堆的元素大，两个堆的元素个数不能相差超过1。这样我们只需要取最小堆和最大堆的top就知道中位数了。
设两个堆中的元素总数为size，每当一个新的数据num到来时，如果size是偶数，我们就给他放进最大堆。如果size是奇数，我们就给他放进最小堆。但是有可能当size是偶数时，到来的数据比最小堆的top要大，如果我们此时还给他到最大堆里，就破坏了最小堆的元素都比最大堆的元素大的假设。所以这个时候我们要把这个元素扔到最小堆里，然后再从最小堆里拿出来一个最小的（即top）给他过继给最大堆，这样就维护了平衡。当size是奇数时同理。

奇偶下标放入不同队列 code

class MedianFinder {    
public:
    /** initialize your data structure here. */
    MedianFinder()
    :maxHeap{}
    ,minHeap{}
    ,size{0}
    {}
    
    ~MedianFinder() = default;
        
    void addNum(int num) {
        
        if((size & 1) == 0){
            if(!minHeap.empty() && num > minHeap.top()){
                minHeap.push(num);
                maxHeap.push(minHeap.top());
                minHeap.pop();
            }else{
                maxHeap.push(num);
            }    
        }else{
            if(!maxHeap.empty() && num < maxHeap.top()){
                maxHeap.push(num);
                minHeap.push(maxHeap.top());
                maxHeap.pop();
            }else{
                minHeap.push(num);
            }
        }
        ++size;
    }
    
    double findMedian() {
       if((size & 1) == 0){
           return maxHeap.top() + (minHeap.top() - maxHeap.top()) / 2.0;
       }else{
           return maxHeap.top();
       }
    }
private:
    priority_queue<double, vector<double>, less<double> > maxHeap;
    priority_queue<double, vector<double>, greater<double> > minHeap;
    size_t size;
};

事实上根本就不需要奇偶下标放入不同的容器。偶数下标优先放入最大堆，奇数下标优先放入最小堆只是为了保证在数据总数为奇数时，最大堆的大小大于最小堆的大小。这样我们在找中位数的时候，直接返回最大堆的堆顶即可。我们完全可以减少addNum函数中的大量判断奇偶下标的流程控制，不分奇偶：

如果最大堆是空的，或者该数字不能放进最小堆里(num < minq.top())，我们给他放进最大堆。放入后检查最大堆大小和最小堆大小的差是否超过了1。如果超过了，就把最大堆的最大值过继给最小堆。
最小堆也是这么处理。
当总数是奇数时，我们返回两个堆中元素比较多的那一个的堆顶即可。

Feb/20/2020 code

class MedianFinder {
private:
    priority_queue<int> maxq;
    priority_queue<int, vector<int>, greater<int>> minq;
public:
    /** initialize your data structure here. */
    MedianFinder()
    :maxq{}
    ,minq{}
    {}
    
    void addNum(int num) {
        // invariance: abs(maxq.size() - minq.size()) < 2
        // invariance: maxq.top() <= minq.top()
        if(maxq.empty() || num < maxq.top())
        {
            
            maxq.push(num);
            // check size
            if(maxq.size() - minq.size()  > 1)
            {
                minq.push(maxq.top());
                maxq.pop();
            }
        }
        else
        {
            minq.push(num);
            if(minq.size() - maxq.size() > 1)
            {
                maxq.push(minq.top());
                minq.pop();
            }
        }
    }
    
    double findMedian() {
        // size is odd
        if(((maxq.size() + minq.size()) & 1))
        {
            return maxq.size() > minq.size() ? maxq.top() : minq.top();
        }
        else
        {
            return (maxq.top() + minq.top()) / 2.;
        }
    }
};

滑动窗口中位数

LC480 滑动窗口中位数

仍然使用一个最大堆和一个最小堆
每次更新滑动窗口时，会丢弃一个元素，并加入一个元素
- 如果窗口大小为奇数$2k+1$，设最大堆中元素个数为$k + 1$，最小堆中元素个数为$k$，记为$(k + 1, k)$
  - 丢弃元素在最大堆，加入元素在最小堆：元素个数变为(k, k + 1)
  - 丢弃元素在最小堆，加入元素在最大堆：元素个数变为(k + 2, k - 1)，需要平衡，最大堆向最小堆过继一个元素，元素个数变为(k + 1, k)
  - 丢弃和加入在一侧，不改变。
- 如果窗口大小为奇数$2k+1$，设最大堆中元素个数为$k$，最小堆中元素个数为$k + 1$，记为$(k, k + 1)$
  - 丢弃元素在最大堆，加入元素在最小堆：元素个数变为(k - 1, k + 2)，需要平衡，最小堆向最大堆过继一个元素，元素个数变为(k, k + 1)
  - 丢弃元素在最小堆，加入元素在最大堆：元素个数变为(k + 1, k)
  - 丢弃和加入在一侧，不改变。