Sorting Algorithms in C and C++ Standard Library

qsort

qsort is a C standard library function that implements the quicksort algorithm. In C, it is defined in the header <stdlib.h> (or <cstdlib> in C++), and its use can be found on many reference sites.

The function has four parameters:

The array to sort,
The number of elements in the array,
The size (in bytes) of each element, and
A comparison function that controls the sort order.

The comparison function is crucial—it receives two pointers to constant data and must return an integer:

A positive number indicates that the first element is greater than the second.
A negative number indicates that the first element is less than the second.
Zero indicates that the elements are equivalent in terms of the sort order.

Below is an example of a correct comparison function for sorting an array of integers:

int compare(const void *p1, const void *p2)  // Comparison function for int arrays
{
  int *a = (int *)p1;
  int *b = (int *)p2;
  if (*a > *b)
    return 1;  // Returns a positive number: a is greater than b
  else if (*a < *b)
    return -1;  // Returns a negative number: a is less than b
  else
    return 0;  // Returns 0: a and b are equivalent
}

Note that using a simple subtraction (e.g. return *a - *b;) is a common mistake, which can lead to overflow errors.

For example, if you have a custom structure as below:

struct eg  // Sample structure
{
  int e;
  int g;
};
 
int compare(const void *p1, const void *p2)  // Comparison function for arrays of struct eg; sorts by member e
{
  struct eg *a = (struct eg *)p1;
  struct eg *b = (struct eg *)p2;
  if (a->e > b->e)
    return 1;  // a is greater than b
  else if (a->e < b->e)
    return -1;  // a is less than b
  else
    return 0;  // a and b are equivalent within this comparison rule
}

In this example, “equivalent” does not mean equal in every aspect—it only means that, under the given sorting rule, no element is considered larger or smaller than the other.

std::sort

The C++ Standard Library provides the function std::sort which is more commonly used in C++ programming. You can use it to sort an array or sequence in place. Its basic usage is as follows:

// Let a[0] ... a[n - 1] be the elements to be sorted in increasing order.
std::sort(a, a + n);
 
// Alternatively, you can pass a custom comparison function 'cmp':
std::sort(a, a + n, cmp);

A key difference to note is that the comparison function passed to std::sort returns a boolean value (true or false), indicating the ordering between two elements. This is in contrast to the three-valued comparison used by qsort. When adapting qsort-style functions to std::sort, if you want to maintain the same overall order, you would need to map a true result to an equivalent negative value and false to a positive value.

According to C++ standards, older versions only required the average time complexity of std::sort to be $O(n\log n)$ . Beginning with C++11, however, the worst-case complexity is also guaranteed to be $O(n\log n)$ . While the standard does not mandate a specific algorithm, many implementations (including popular ones such as libstdc++ and libc++) use introsort, which is a hybrid method combining quicksort and heapsort.

std::nth_element

std::nth_element rearranges the elements in the range [first, last) so that the element at the nth position is the one that would appear in that position if the range were fully sorted. After this operation, every element preceding this new element is less than or equal to it, and every element following it is greater than or equal to it. Usage is as follows:

std::nth_element(first, nth, last);
std::nth_element(first, nth, last, cmp);

The average time complexity for this function is $O(n)$ , where $n$ is the distance between first and last. It is commonly used for algorithms like constructing K-D Trees.

std::stable_sort

std::stable_sort provides a stable sorting algorithm, meaning that equivalent elements maintain their original relative order post-sorting. Its usage is illustrated below:

std::stable_sort(first, last);
std::stable_sort(first, last, cmp);

If memory is constrained, its time complexity is generally $O(n\log^2 n)$ . However, when additional memory is available, the operation can run in $O(n\log n)$ time.

std::partial_sort

std::partial_sort is used when you only need the first k elements of a sequence to be sorted. It rearranges the sequence so that the first k elements (where mid = first + k) are the smallest (or largest, depending on the comparator), in sorted order, while the order of the remaining elements is unspecified. The usage is as follows:

// Here, mid is defined as first + k
std::partial_sort(first, mid, last);
std::partial_sort(first, mid, last, cmp);

The approximate number of comparisons made is $(last-first) \log(mid-first)$ .

The algorithm works by first building a max heap for the first k elements using make_heap(). Then, for each element in the range [mid, last), it compares the element to the maximum element in the heap (located at the beginning). If the new element is smaller, it replaces the maximum element and the heap is adjusted to maintain its property. Finally, a heap sort (sort_heap()) is applied to the first k elements to arrange them in increasing order (remember, heap order is different from sorted order).

Custom Comparison Functions

Both built-in types (like int) and user-defined structures can be sorted using custom comparators. For user-defined structures, it is essential to either overload at least one relational operator (commonly operator<) or provide a comparator function when calling the sorting function.

For example:

int a[1009], n = 10;
// ...
std::sort(a + 1, a + 1 + n);                   // Sorts in increasing order
std::sort(a + 1, a + 1 + n, greater<int>());   // Sorts in decreasing order

And for a custom structure:

struct data {
  int a, b;
 
  bool operator<(const data rhs) const {
    return (a == rhs.a) ? (b < rhs.b) : (a < rhs.a);
  }
} da[1009];
 
bool cmp(const data u1, const data u2) {
  return (u1.a == u2.a) ? (u1.b > u2.b) : (u1.a > u2.a);
}
 
// ...
std::sort(da + 1, da + 1 + 10);    // Uses the overloaded < operator for increasing order
std::sort(da + 1, da + 1 + 10, cmp); // Uses the custom cmp function for decreasing order

When writing custom comparators, it is important to adhere to the principles of strict weak ordering.

Strict Weak Ordering

For any sorting algorithm to work correctly, the comparison operation must define a strict weak ordering. This means that the comparator must satisfy certain properties (such as transitivity and consistency) to ensure predictable behavior. Common pitfalls include:

Defining the comparator using <= instead of <.
Using external values that may change during sorting (a situation that can arise in algorithms like the shortest path).
Combining multiple comparisons (for example, finding the maximum or minimum among several values) without ensuring the proper relational properties.

Keeping these guidelines in mind helps in avoiding unexpected errors during the sort.

This overview provides a basic understanding of the sorting functions available in C and C++ standard libraries, the differences in their comparison functions, and important considerations when writing custom comparators.

Shell Sort TimSort