Radix Sort: MSD and LSD Approaches
Radix sort is a non-comparison-based sorting algorithm that was originally invented to sort playing cards. Instead of comparing entire numbers or strings directly, radix sort decomposes each element into multiple parts (known as keys or digits) and sorts the elements one key at a time. In this article, we’ll discuss the basic concepts behind radix sort, explain the two major variants—MSD (Most Significant Digit first) and LSD (Least Significant Digit first)—and present Python examples that illustrate how the algorithms work.
Definition
Radix sort works by splitting each element into keys (for example, each digit of an integer or each character in a string). After that, the algorithm sorts the elements by examining these keys sequentially. When comparing two elements:
- First, compare their 1st keys. If they differ, the order is determined.
- If the 1st keys are identical, proceed to compare the 2nd keys.
- Continue this process through the -th key until the elements differ or are deemed equal.
For instance:
- When sorting natural numbers, you can align numbers by their least significant end (padding with zeros on the left) so that each digit represents a key from left to right.
- When sorting strings lexicographically, each character (from left to right) can be treated as a key.
MSD Radix Sort
Overview
MSD radix sort (Most Significant Digit first) sorts the list by examining the keys starting from the most significant one. The idea is that by first sorting all elements according to their first key, you can partially order the list. For those elements that share the same first key, you then sort by the second key, and so on.
How It Works
- Decompose Elements: Break each element into keys.
- Sort First Key: Use a stable sorting algorithm (often counting sort) to order elements by their first key.
- Recursively Sort Buckets: For groups of elements with identical keys, apply the same approach to the next key.
- Complete Sorting: Continue recursively (or iteratively) until all key positions have been processed.
For MSD sort, stability in the inner sorting step is crucial. This guarantees that when keys are the same, the original order is maintained, ensuring the final sorted order is correct.
Example: Sorting Natural Numbers with MSD Radix Sort
Below is a Python implementation of MSD radix sort for non-negative integers. In this example, we first convert numbers to fixed-length strings (padding with zeros), then recursively sort the list based on each digit from left to right.
def msd_radix_sort(nums):
# Return immediately if the input list is empty.
if not nums:
return nums
# Determine the maximum number of digits among all numbers.
max_len = len(str(max(nums)))
# Convert all numbers to strings of equal length by padding with zeros.
str_nums = [str(num).zfill(max_len) for num in nums]
def sort_helper(arr, index):
# Base case: if the array has one or no elements, or all digits are processed.
if len(arr) <= 1 or index >= len(arr[0]):
return arr
# Create buckets for digits 0 through 9.
buckets = [[] for _ in range(10)]
for s in arr:
buckets[int(s[index])].append(s)
# Recursively sort each bucket and combine the results.
result = []
for bucket in buckets:
if bucket:
result.extend(sort_helper(bucket, index + 1))
return result
# Recursively sort based on digits starting from index 0.
sorted_strs = sort_helper(str_nums, 0)
# Convert the sorted strings back to integers.
return [int(x) for x in sorted_strs]
# Example usage:
if __name__ == "__main__":
arr = [170, 45, 75, 90, 802, 24, 2, 66]
print(msd_radix_sort(arr))
Example: Sorting Strings with MSD Radix Sort
The following Python example demonstrates an MSD style sort on strings. In this recursive approach, we use the character at the current index (if available) as the key. Missing characters (when the string is shorter) are treated as an empty string, which naturally sorts before any letter.
def msd_radix_sort_strings(strings, index=0):
if len(strings) <= 1:
return strings
# Group strings based on the current character.
buckets = {}
for s in strings:
key = s[index] if index < len(s) else ""
buckets.setdefault(key, []).append(s)
# Process buckets in sorted key order (empty strings come first).
sorted_strings = []
for key in sorted(buckets.keys()):
sorted_strings.extend(msd_radix_sort_strings(buckets[key], index + 1))
return sorted_strings
# Example usage:
if __name__ == "__main__":
arr = ["banana", "apple", "orange", "apricot", "grape"]
print(msd_radix_sort_strings(arr))
LSD Radix Sort
Overview
LSD radix sort (Least Significant Digit first) reverses the order in which keys are processed. It begins by sorting the list by the least significant key and then moves toward the most significant key. As long as the sorting algorithm used for each digit is stable, this method guarantees the overall sorting is also stable.
How It Works
- Decompose Elements: Split each element into its keys.
- Sort by Least Significant Key: First, sort all elements by the last key (using a stable sort such as counting sort).
- Iterative Sorting: Move to the next key from right to left and sort the entire list again stably.
- Finish: When you have processed the most significant key, the list is fully sorted.
A helpful way to visualize LSD radix sort is to imagine that you maintain the order produced at every digit pass so that earlier (less significant) order is preserved when processing a higher significant digit.
Pseudocode
Below is the pseudocode that illustrates the overall method of LSD radix sort:
Example: Sorting k-Key Elements with LSD Radix Sort
Suppose each element is represented as a list of keys. The following Python code demonstrates LSD radix sort on such multi-key elements. It processes the keys from the last to the first using counting sort as the stable subroutine.
def lsd_radix_sort(elements, key_count, key_ranges):
"""
elements: list of elements, where each element is a list or tuple of keys.
key_count: number of keys in each element.
key_ranges: a list of integers representing the maximum possible value for each key.
"""
for index in range(key_count - 1, -1, -1):
base = key_ranges[index] + 1 # possible digit values from 0 to key_ranges[index]
count = [0] * base
output = [None] * len(elements)
# Count frequency of each digit at current key position.
for element in elements:
count[element[index]] += 1
# Transform count into prefix sum array.
for i in range(1, len(count)):
count[i] += count[i - 1]
# Place elements into output array in a stable manner.
for element in reversed(elements):
digit = element[index]
count[digit] -= 1
output[count[digit]] = element
elements = output
return elements
# Example usage:
if __name__ == "__main__":
# Each element is a list of three keys (digits between 0 and 9).
arr = [
[3, 5, 1],
[3, 2, 9],
[1, 5, 6],
[3, 5, 0]
]
# Here, each key can be any digit from 0 to 9.
sorted_arr = lsd_radix_sort(arr, 3, [9, 9, 9])
print(sorted_arr)
Example: Sorting Positive Integers with LSD Radix Sort
In many cases, LSD radix sort is used to sort non-negative integers by processing groups of bits at a time. The following implementation uses a base (for example, 256) and applies counting sort on each digit (or group of bits):
def radix_sort_integers(arr):
if not arr:
return arr
max_val = max(arr)
exp = 1
base = 256 # Using 256 allows easy handling of 8-bit groups.
while max_val // exp > 0:
output = [0] * len(arr)
count = [0] * base
# Count the number of occurrences for each digit.
for num in arr:
count[(num // exp) % base] += 1
# Compute prefix sums.
for i in range(1, base):
count[i] += count[i - 1]
# Build the output array in stable order.
for num in reversed(arr):
digit = (num // exp) % base
count[digit] -= 1
output[count[digit]] = num
arr = output
exp *= base
return arr
# Example usage:
if __name__ == "__main__":
arr = [170, 45, 75, 90, 802, 24, 2, 66]
print(radix_sort_integers(arr))
Properties of Radix Sort
Stability
Radix sort is stable as long as the inner sorting algorithm used for each digit is stable. Stability means that if two elements have the same key, their relative order remains unchanged after sorting. Both MSD and LSD radix sort are stable under this condition.
Time Complexity
Radix sort can be very efficient compared to comparison-based methods (like quicksort) when the number of digits (or keys) is not too large. When each key's range is limited, counting sort can be used as the inner stable sort. In that case, the overall time complexity is on the order of
where is the number of elements, is the number of keys per element, and is the range of the -th key.
If the range of any key is enormous, it might be more practical to use a comparison-based sort.
Space Complexity
Both MSD and LSD radix sorts generally require additional space. The space complexity is typically due to the need for temporary arrays used during the stable sorting steps.
By understanding both MSD and LSD radix sort approaches, you now have insight into how elements can be sorted on a per-digit basis rather than by overall comparison. These techniques can be especially useful when dealing with large datasets of numbers or strings where each element can be naturally decomposed into fixed keys. Happy coding!