Amortized Analysis

Amortized analysis is a technique used to study the performance of algorithms and dynamic data structures. Instead of focusing solely on the cost of a single operation, this method examines a sequence of operations and averages their cost, providing a more accurate picture of overall performance. It does not involve randomness but rather guarantees that even in the worst-case scenario, the average cost per operation remains within acceptable bounds. Essentially, expensive operations are balanced out by many cheaper ones so that, on average, each operation is efficient.

There are three common approaches to amortized analysis: Aggregate Analysis, Accounting Method, and Potential Method. While each has a slightly different focus, they all aim to balance out high-cost operations by distributing their expense over many inexpensive ones.

Amortized Analysis of a Dynamic Array

Consider a dynamic array (similar to C++’s vector) that starts with an initial capacity of $m = 1$ . When inserting an element, if the array isn't full, the cost is constant. However, when the array is full, it doubles its capacity, and all current elements are copied to the new memory space before inserting the new element. Let’s see how we can analyze the cost of such operations.

Aggregate Analysis

In aggregate analysis, we calculate the total cost of a sequence of operations and then average it over the number of operations.

For dynamic array insertions, there are two types of costs:

If the array is not full, inserting an element costs $O(1)$ .
If the array is full, the cost is dominated by the expansion step, which requires copying $m$ elements and hence costs $O(m)$ .

Now, when performing $n$ insertions:

The cost for each individual insertion (ignoring resizing) is $O(1)$ , so for $n$ insertions the total is $O(n)$ .
The cost of resizing happens when the array sizes progress as $1, 2, 4, \dots, 2^k$ (with $2^k$ being the largest power of 2 less than or equal to $n$ ). The cost per resizing step is $1, 2, 4, \dots, 2^{k-1}$ , and the sum of these costs is $1 + 2 + 4 + \dots + 2^{k-1} = 2^k - 1,$ which is $O(n)$ .

Thus, the total cost for all insertions is $O(n)$ , leading to an average (amortized) cost of $O(1)$ per insertion.

Accounting Method

The accounting method assigns a fixed cost to every operation so that the overall cost is covered, even when occasional expensive operations occur. Think of it like prepaying a little extra during inexpensive operations to cover the cost of future expensive ones.

For our dynamic array:

Suppose each insertion actually costs $1$ $1$ , but we charge an amortized cost of $3$ $3$ per insertion.
- $1$ unit pays for the current insertion.
- The extra $2$ units are saved as a “credit” to help pay for future array expansion.
When the array is full and resizes (with a cost of copying all elements), the credits saved from previous insertions are sufficient to cover the extra cost.

For example, imagine an array with elements and corresponding saved credits:

Initial State:
arr    = [1, 2, 3, 4]   // The array contains elements.
credit = [2, 2, 2, 2]   // Each element contributed 2 extra credits.

When the array is full, the saved credits are used to pay for the copying needed during expansion. This strategy ensures that even expensive expansion steps are prepaid, and thus every insertion maintains an amortized cost of $O(1)$ .

Potential Method

The potential method uses a potential function, usually denoted as $\Phi$ , to represent the “stored energy” (or prepaid work) in the data structure that can be used to pay for future operations. The change in potential helps to balance the cost over a sequence of operations.

Basics of the Potential Method

Define the state of the data structure (for instance, the number of elements, capacity, etc.) as $S$ . Let the initial state be $S_0$ .
Choose a potential function $\Phi(S)$ $Φ (S)$ , which should satisfy:
- $\Phi(S_0) = 0$ (initial potential is zero),
- $\Phi(S) \geq 0$ for any state $S$ .

For any operation, the amortized cost $\hat{c}$ is defined as:

\hat{c} = c + \Phi(S') - \Phi(S)

where $c$ is the actual cost of the operation, and $S$ and $S'$ denote the state before and after the operation, respectively.

If we perform a series of operations leading from state $S_0$ to $S_m$ , with actual costs $c_i$ for each step, the total actual cost is:

\sum_{i=1}^m c_i = \sum_{i=1}^m \hat{c_i} + \Phi(S_0) - \Phi(S_m)

Since the potential is never negative, the sum of the amortized costs provides an upper bound on the true total cost. If each $\hat{c_i}$ is $O(T(n))$ , then $O(T(n))$ is an upper bound for the amortized cost.

Example with a Dynamic Array

Let’s use the following potential function for our dynamic array:

\Phi(h) = 2n - m,

where $n$ is the number of elements in the array and $m$ is its current capacity. The potential function reflects the difference between the capacity and the actual number of elements.

For an insertion that does NOT trigger expansion:
- Actual cost: $O(1)$ .
- After insertion, $n$ increases by 1, so the potential increases by $2$ : $\Phi(h') - \Phi(h) = 2.$
- Amortized cost: $1 + 2 = 3$ .
For an insertion that TRIGGERS expansion (when $m = n$ $m = n$ ):
- The array’s capacity doubles to $2n$ , and the copying costs $O(n)$ .
- With careful design, the reduction in potential (due to increased capacity) offsets the high copying cost. For example, suppose the potential decreases by $n - 2$ during the expansion; then the high actual cost $n + 1$ (copying all $n$ items plus inserting the new element) is balanced: $n + 1 + (2 - n) = 3.$

Even though the resizing operation is expensive, the potential function ensures that its cost is pre-accumulated from previous cheaper operations. Thus, on average, every insertion maintains an amortized cost of $O(1)$ .

Extended Example: Stack Operations

Stack operations are another classic scenario where amortized analysis is useful. Imagine a stack data structure, S, that supports the following operations:

Operation	Description	Actual Cost $c_i$
S.push(x)	Push x onto the stack	$1$
S.pop()	Pop the top element	$1$
S.multi-pop(k)	Pop the top k elements	$O(\min(\lvert S\rvert, k))$

We will analyze these operations using all three methods.

1. Aggregate Analysis

For $n_{push}$ push operations, each costs $O(1)$ , so the total is $O(n_{push})$ .
For $n_{pop}$ pop operations, each costs $O(1)$ , so the total is $O(n_{pop})$ .
For $n_{multi-pop}$ multi-pop operations, although the worst-case cost for a single operation is $O(\min(\lvert S\rvert, k))$ , the total number of pop actions cannot exceed the number of push operations (each pushed element can be popped only once).

Since the total number of operations is at most proportional to the number of push operations, the overall cost is $O(n)$ , and the amortized cost per operation is $O(1)$ .

2. Accounting Method

In the accounting method, we assign a “charge” to each push operation to cover the eventual cost of pop operations.

For each S.push(x) operation, assume we charge $2$ $2$ units:
- $1$ unit pays for the push.
- $1$ unit is saved as credit, to later offset the cost of a pop (or multi-pop) operation.
Since each pop operation actually costs $1$ but is pre-paid for by the earlier push, every pop and multi-pop operation effectively incurs an amortized cost of $0$ .

Thus, every operation maintains an amortized cost of $O(1)$ .

3. Potential Method

For the potential method, define a potential function as:

\Phi(S) = \lvert S \rvert,

which is simply the number of elements currently in the stack. Each element contributes 1 unit of potential.

For S.push(x):
- Actual cost: $1$ .
- The potential increases by 1.
- Amortized cost: $1 + 1 = 2$ .
For S.pop():
- Actual cost: $1$ .
- The potential decreases by 1.
- Amortized cost: $1 - 1 = 0$ .
For S.multi-pop(k):
- Each of the $k$ pops costs $1$ , but since the potential decreases by $k$ overall, the amortized cost is $k - k = 0$ .

Hence, using this potential function, the amortized cost for any stack operation remains $O(1)$ .

In summary, regardless of whether we analyze a dynamic array or stack operations, amortized analysis—through aggregate, accounting, or potential methods—shows that expensive operations (like array resizing or multi-pop) can be effectively “spread out” over a sequence of cheaper operations. As a result, even though some single steps might be costly, the average cost per operation remains constant, which is crucial for designing efficient algorithms and data structures.

Introduction Time and Space Complexity