TimSort

TimSort is a hybrid sorting algorithm that combines the best features of insertion sort and merge sort. Designed to take advantage of pre-existing order within data, TimSort excels when a dataset contains many ordered subsequences. Its stability ensures that equal elements retain their original order after sorting.

Introduction

TimSort was introduced in 2002 by a core Python developer and has since been the default sorting algorithm in Python. By smartly identifying and merging sorted segments—known as runs—TimSort can approach linear time performance when the input is nearly sorted. Its blend of strategies has also made it popular in other programming languages and environments.

How TimSort Works

The efficiency of TimSort comes primarily from its ability to detect ordered sequences and merge them optimally. The algorithm consists of three main phases:

Discovering Runs:
The algorithm scans the list from left to right, grouping consecutive elements in order (either ascending or descending). A descending run is reversed so that every run becomes ascending.
Extending Runs:
To boost performance on small segments, each identified run is extended to a minimum length—known as MIN_RUN—using insertion sort. This helps ensure that merge operations are more efficient later on.
Merging Runs:
TimSort maintains a stack of runs and merges them according to a set of specific rules. These rules are designed to keep the merging balanced and maintain the overall stability of the sort.

Discovering Runs

Initially, TimSort identifies a "run" by sequentially checking the list:

For an ascending run, it continues as long as each element is not smaller than its predecessor.
For a descending run, it continues until an element is larger than the previous one; the run is then reversed to convert it into ascending order.

Extending Runs

To ensure each run has a sufficient length for efficient merging, a minimum run length (MIN_RUN) is defined. This value is generally computed dynamically based on the total number of elements, typically falling between 32 and 64. The process works as follows:

If a discovered run already meets or exceeds MIN_RUN, it is pushed onto the stack directly.
If it is shorter than MIN_RUN, the algorithm uses binary insertion sort to extend the run until it reaches the required length, then pushes it onto the stack.

Merging Runs

Merging sorted runs is the heart of TimSort. The algorithm uses a stack to store runs, merging them according to specific conditions to maintain balance and stability.

Merge Conditions

TimSort ensures that only adjacent runs are merged. This approach preserves the relative order of equal elements, a key property of stable sorting. When considering the top three runs on the stack—which we can call X, Y, and Z (with Z being the bottom-most)—the algorithm checks:

$\text{len(Z)} > \text{len(Y)} + \text{len(X)}$
$\text{len(Y)} > \text{len(X)}$

If these conditions are not met, the algorithm merges the smaller of the two adjacent runs (either Y with X or Y with Z) and rechecks the conditions. This process maintains a balanced merge tree and limits the total work required.

Merge Optimization

TimSort employs several strategies to optimize merging:

Binary Search for Merge Boundaries:
Instead of always comparing elements one by one, the algorithm uses binary search to quickly locate the insertion points where elements from one run should be merged into the other. This reduces the number of comparisons and narrows the scope of the merge.
Temporary Buffer:
To minimize the data movement overhead, especially when merging runs of different sizes, TimSort copies the smaller run into a temporary buffer and then merges it back into the main array.

Consider two example runs:

Run A: [1, 2, 3, 6, 10]
Run B: [4, 5, 7, 9, 12, 14, 17]

By using binary search:

The first element of Run B (4) fits into Run A at the fourth position.
The last element of Run A (10) fits into Run B at the appropriate insertion point.

Only the overlapping portions, [6, 10] from Run A and [4, 5, 7, 9] from Run B, need to be merged. This targeted merging minimizes unnecessary comparisons.

Galloping Mode

To further speed up merging when one run has many consecutive elements that are smaller (or larger) than the corresponding elements in the other run, TimSort switches to a special mode called Galloping Mode. Here’s how it works:

Exponential Search:
Starting from the current position, the algorithm exponentially increases its search step (1, 2, 4, 8, …) to quickly identify a block where the ordering changes.
Binary Search:
Once a suitable range is found, binary search pinpoints the exact position for merging.

A threshold parameter, Min_Gallop (usually set to 7), controls when the algorithm enters Galloping Mode. If one run repeatedly wins comparisons, the mode is activated. The algorithm also dynamically adjusts this threshold based on performance—reducing it when Galloping Mode is effective and increasing it when it isn’t.

Galloping Mode can considerably accelerate the merging process for nearly sorted data by bypassing redundant comparisons. However, the algorithm will revert to standard merging for random data to maintain the overall time complexity of $O(n \log n)$ .

Time and Space Complexity

TimSort’s performance is directly tied to the existing order in the input data:

Best Case:
$O(n)$
When the data is already sorted or nearly sorted, long runs reduce the number of required merge operations.
Worst Case:
$O(n \log n)$
For completely unsorted data, many small runs require multiple merging steps, leading to the worst-case complexity.

Breaking It Down:

Run Identification and Extension:
Both the detection of runs and the insertion sort used for extension require a single pass through the array, resulting in $O(n)$ time.
Merging Runs:
In the worst case, the array is divided into about $n / \text{MIN_RUN}$ runs. Merging these runs involves $O(\log n)$ merge operations, each taking up to $O(n)$ time, ultimately giving $O(n \log n)$ .

The space complexity of TimSort is about $O(n)$ , since it requires additional storage for managing the run stack and a temporary buffer during merging.

Pseudocode Implementation

Below is a high-level pseudocode representation of TimSort’s process:

1 nRemaining ← length of array 2 minRun ← determine an appropriate MIN_RUN based on nRemaining 3 startIndex ← 0 4 while nRemaining > 0 do 5 runLength ← identify a run in array starting at startIndex with next nRemaining elements 6 if runLength < minRun then 7 extendLength ← min(minRun, nRemaining) 8 perform binary insertion sort on the interval [startIndex, startIndex + extendLength - 1] 9 runLength ← extendLength 10 end if 11 push the run (startIndex, runLength) onto the stack 12 call mergeCollapse(stack) to check and merge runs in the stack according to merge rules 13 startIndex ← startIndex + runLength 14 nRemaining ← nRemaining - runLength 15 end while 16 call mergeForceCollapse(stack) to merge all remaining runs in the stack

This pseudocode outlines the general flow of TimSort from run identification to the final merging steps.

This guide provides a beginner-friendly overview of TimSort’s workings, emphasizing its innovative use of pre-sorted segments and merge optimizations. By combining techniques from insertion sort and merge sort and incorporating intelligent strategies such as Galloping Mode, TimSort achieves efficient and stable sorting in a wide variety of scenarios.

Sorting Algorithms in C and C++ Standard Library Tournament Sort