Amortized Analysis
Amortized analysis is a technique used to study the performance of algorithms and dynamic data structures. Instead of focusing solely on the cost of a single operation, this method examines a sequence of operations and averages their cost, providing a more accurate picture of overall performance. It does not involve randomness but rather guarantees that even in the worst-case scenario, the average cost per operation remains within acceptable bounds. Essentially, expensive operations are balanced out by many cheaper ones so that, on average, each operation is efficient.
There are three common approaches to amortized analysis: Aggregate Analysis, Accounting Method, and Potential Method. While each has a slightly different focus, they all aim to balance out high-cost operations by distributing their expense over many inexpensive ones.
Amortized Analysis of a Dynamic Array
Consider a dynamic array (similar to C++’s vector) that starts with an initial capacity of . When inserting an element, if the array isn't full, the cost is constant. However, when the array is full, it doubles its capacity, and all current elements are copied to the new memory space before inserting the new element. Let’s see how we can analyze the cost of such operations.
Aggregate Analysis
In aggregate analysis, we calculate the total cost of a sequence of operations and then average it over the number of operations.
For dynamic array insertions, there are two types of costs:
- If the array is not full, inserting an element costs .
- If the array is full, the cost is dominated by the expansion step, which requires copying elements and hence costs .
Now, when performing insertions:
- The cost for each individual insertion (ignoring resizing) is , so for insertions the total is .
- The cost of resizing happens when the array sizes progress as (with being the largest power of 2 less than or equal to ). The cost per resizing step is , and the sum of these costs is which is .
Thus, the total cost for all insertions is , leading to an average (amortized) cost of per insertion.
Accounting Method
The accounting method assigns a fixed cost to every operation so that the overall cost is covered, even when occasional expensive operations occur. Think of it like prepaying a little extra during inexpensive operations to cover the cost of future expensive ones.
For our dynamic array:
- Suppose each insertion actually costs , but we charge an amortized cost of per insertion.
- unit pays for the current insertion.
- The extra units are saved as a “credit” to help pay for future array expansion.
- When the array is full and resizes (with a cost of copying all elements), the credits saved from previous insertions are sufficient to cover the extra cost.
For example, imagine an array with elements and corresponding saved credits:
Initial State:
arr = [1, 2, 3, 4] // The array contains elements.
credit = [2, 2, 2, 2] // Each element contributed 2 extra credits.
When the array is full, the saved credits are used to pay for the copying needed during expansion. This strategy ensures that even expensive expansion steps are prepaid, and thus every insertion maintains an amortized cost of .
Potential Method
The potential method uses a potential function, usually denoted as , to represent the “stored energy” (or prepaid work) in the data structure that can be used to pay for future operations. The change in potential helps to balance the cost over a sequence of operations.
Basics of the Potential Method
- Define the state of the data structure (for instance, the number of elements, capacity, etc.) as . Let the initial state be .
- Choose a potential function , which should satisfy:
- (initial potential is zero),
- for any state .
For any operation, the amortized cost is defined as:
where is the actual cost of the operation, and and denote the state before and after the operation, respectively.
If we perform a series of operations leading from state to , with actual costs for each step, the total actual cost is:
Since the potential is never negative, the sum of the amortized costs provides an upper bound on the true total cost. If each is , then is an upper bound for the amortized cost.
Example with a Dynamic Array
Let’s use the following potential function for our dynamic array:
where is the number of elements in the array and is its current capacity. The potential function reflects the difference between the capacity and the actual number of elements.
- For an insertion that does NOT trigger expansion:
- Actual cost: .
- After insertion, increases by 1, so the potential increases by :
- Amortized cost: .
- For an insertion that TRIGGERS expansion (when ):
- The array’s capacity doubles to , and the copying costs .
- With careful design, the reduction in potential (due to increased capacity) offsets the high copying cost. For example, suppose the potential decreases by during the expansion; then the high actual cost (copying all items plus inserting the new element) is balanced:
Even though the resizing operation is expensive, the potential function ensures that its cost is pre-accumulated from previous cheaper operations. Thus, on average, every insertion maintains an amortized cost of .
Extended Example: Stack Operations
Stack operations are another classic scenario where amortized analysis is useful. Imagine a stack data structure, S, that supports the following operations:
Operation | Description | Actual Cost |
---|---|---|
S.push(x) | Push x onto the stack | |
S.pop() | Pop the top element | |
S.multi-pop(k) | Pop the top k elements |
We will analyze these operations using all three methods.
1. Aggregate Analysis
- For push operations, each costs , so the total is .
- For pop operations, each costs , so the total is .
- For multi-pop operations, although the worst-case cost for a single operation is , the total number of pop actions cannot exceed the number of push operations (each pushed element can be popped only once).
Since the total number of operations is at most proportional to the number of push operations, the overall cost is , and the amortized cost per operation is .
2. Accounting Method
In the accounting method, we assign a “charge” to each push operation to cover the eventual cost of pop operations.
- For each S.push(x) operation, assume we charge units:
- unit pays for the push.
- unit is saved as credit, to later offset the cost of a pop (or multi-pop) operation.
- Since each pop operation actually costs but is pre-paid for by the earlier push, every pop and multi-pop operation effectively incurs an amortized cost of .
Thus, every operation maintains an amortized cost of .
3. Potential Method
For the potential method, define a potential function as:
which is simply the number of elements currently in the stack. Each element contributes 1 unit of potential.
- For S.push(x):
- Actual cost: .
- The potential increases by 1.
- Amortized cost: .
- For S.pop():
- Actual cost: .
- The potential decreases by 1.
- Amortized cost: .
- For S.multi-pop(k):
- Each of the pops costs , but since the potential decreases by overall, the amortized cost is .
Hence, using this potential function, the amortized cost for any stack operation remains .
In summary, regardless of whether we analyze a dynamic array or stack operations, amortized analysis—through aggregate, accounting, or potential methods—shows that expensive operations (like array resizing or multi-pop) can be effectively “spread out” over a sequence of cheaper operations. As a result, even though some single steps might be costly, the average cost per operation remains constant, which is crucial for designing efficient algorithms and data structures.