Bucket Sort

This article provides a beginner-friendly overview of bucket sort, a useful sorting algorithm known for its efficiency when sorting data that spans a large range but is uniformly distributed.

Definition

Bucket sort is a sorting algorithm designed to work efficiently when the input is evenly distributed over a large range. The fundamental idea is to divide the input into several groups (buckets), sort each bucket individually, and then combine the sorted buckets into a final sorted sequence.

Process

Bucket sort follows these steps:

Create a fixed number of empty buckets.
Iterate over the original sequence, and place each element into a corresponding bucket based on its value.
Sort the elements within each bucket using a simple sorting method (such as insertion sort).
Merge the sorted buckets back into the original sequence.

Properties

Stability

If a stable sorting algorithm is used to sort each bucket, and the process of placing items into buckets does not change the relative order of equivalent items, then bucket sort will be stable. Because each bucket typically contains only a few elements, a simple algorithm like insertion sort is usually sufficient, ensuring stability.

Time Complexity

The average time complexity of bucket sort is

n + \frac{n^2}{k} + k

where the data is divided into approximately $n$ buckets, the cost of sorting within each bucket, and merging the buckets. When the number of buckets $k$ is roughly equal to $n$ , the average time complexity becomes $O(n)$ .

However, in the worst-case scenario, bucket sort can degenerate to a time complexity of $O(n^2)$ .

Implementation

Below are example implementations of bucket sort in both C++ and Python.

constexpr int N = 100010;
 
int n, w, a[N];
vector<int> bucket[N];
 
void insertion_sort(vector<int>& A) {
  for (int i = 1; i < A.size(); ++i) {
    int key = A[i];
    int j = i - 1;
    while (j >= 0 && A[j] > key) {
      A[j + 1] = A[j];
      --j;
    }
    A[j + 1] = key;
  }
}
 
void bucket_sort() {
  int bucket_size = w / n + 1;
  for (int i = 0; i < n; ++i) {
    bucket[i].clear();
  }
  for (int i = 1; i <= n; ++i) {
    bucket[a[i] / bucket_size].push_back(a[i]);
  }
  int p = 0;
  for (int i = 0; i < n; ++i) {
    insertion_sort(bucket[i]);
    for (int j = 0; j < bucket[i].size(); ++j) {
      a[++p] = bucket[i][j];
    }
  }
}

N = 100010
w = n = 0
a = [0] * N
bucket = [[] for i in range(N)]
 
 
def insertion_sort(A):
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key
 
 
def bucket_sort():
    bucket_size = int(w / n + 1)
    for i in range(0, n):
        bucket[i].clear()
    for i in range(1, n + 1):
        bucket[int(a[i] / bucket_size)].append(a[i])
    p = 0
    for i in range(0, n):
        insertion_sort(bucket[i])
        for j in range(0, len(bucket[i])):
            a[p] = bucket[i][j]
            p += 1

This tutorial has provided an introduction to bucket sort along with insights into its properties and strategies for implementation. With these foundations, you’re ready to explore further and apply bucket sort in various scenarios.

Bubble Sort Counting Sort