Amortized analysis

{{Short description|Method for algorithm analysis in computer science}} {{Redirect|Amortized|other uses|Amortization (disambiguation){{!}}Amortization}} In computer science, '''amortized analysis''' is a method for analyzing a given algorithm's complexity, or how much of a resource, especially time or memory, it takes to execute. The motivation for amortized analysis is that looking at the worst-case run time can be too pessimistic. Instead, amortized analysis averages the running times of operations in a sequence over that sequence.<ref name="tarjan"/>{{rp|306}} As a conclusion: "Amortized analysis is a useful tool that complements other techniques such as worst-case and average-case analysis."<ref name="fiebrink"/>{{rp|14}}<ref>{{cite web|quote="[Amortized analysis] is different from what is commonly referred to as average case analysis, because amortized analysis does not make any assumption about the distribution of the data values, whereas average case analysis assumes the data are not "bad" (e.g., some sorting algorithms do well on "average" over all input orderings but very badly on certain input orderings). That is, amortized analysis is a worst case analysis, but for a sequence of operations, rather than for individual operations." |title=Lecture 18: Amortized Algorithms |work=CS312 -Data Structures and Functional Programming |date=2006 |publisher=Cornell University |url=https://www.cs.cornell.edu/courses/cs312/2006sp/lectures/lec18.html }}</ref>

For a given operation of an algorithm, certain situations (e.g., input parametrizations or data structure contents) may imply a significant cost in resources, whereas other situations may not be as costly. The amortized analysis considers both the costly and less costly operations together over the whole sequence of operations. This may include accounting for different types of input, length of the input, and other factors that affect its performance.<ref name="fiebrink">{{Citation |url=http://www.cs.princeton.edu/~fiebrink/423/AmortizedAnalysisExplained_Fiebrink.pdf |title=Amortized Analysis Explained |author=Rebecca Fiebrink |year=2007 |access-date=2011-05-03 |archive-url=https://web.archive.org/web/20131020020356/http://www.cs.princeton.edu/~fiebrink/423/AmortizedAnalysisExplained_Fiebrink.pdf |archive-date=20 October 2013 |url-status=dead }}</ref>

==History== Amortized analysis initially emerged from a method called aggregate analysis, which is now subsumed by amortized analysis. The technique was first formally introduced by Robert Tarjan in his 1985 paper ''Amortized Computational Complexity'',<ref name="tarjan">{{cite journal |last=Tarjan |first=Robert Endre |author-link=Robert Tarjan |title=Amortized Computational Complexity |journal=SIAM Journal on Algebraic and Discrete Methods |date=April 1985 |volume=6 |issue=2 |pages=306–318 |doi=10.1137/0606031 |url=http://www.cs.duke.edu/courses/fall11/cps234/reading/Tarjan85_AmortizedComplexity.pdf |url-status=live |access-date=2024-06-09 |archive-url=https://web.archive.org/web/20150226051958/http://www.cs.duke.edu/courses/fall11/cps234/reading/Tarjan85_AmortizedComplexity.pdf |archive-date=2015-02-26 }}</ref> which addressed the need for a more useful form of analysis than the common probabilistic methods used. Amortization was initially used for very specific types of algorithms, particularly those involving binary trees and union operations. However, it is now ubiquitous and comes into play when analyzing many other algorithms as well.<ref name="fiebrink"/>

==Method== {{main|Accounting method (computer science)|Potential method}} Amortized analysis requires knowledge of which series of operations are possible. This is most commonly the case with data structures, which have a state that persists between operations. The basic idea is that a worst-case operation can alter the state in such a way that the worst case cannot occur again for a long time, thus "amortizing" its cost.

There are generally three methods for performing amortized analysis: the aggregate method, the accounting method, and the potential method. All of these give correct answers; the choice of which to use depends on which is most convenient for a particular situation.<ref name="Lecture 20">{{cite web|url=https://www.cs.cornell.edu/courses/cs3110/2011sp/lectures/lec20-amortized/amortized.htm|title=CS 3110 Lecture 20: Amortized Analysis|last=Kozen|first=Dexter|date=Spring 2011|publisher=Cornell University|access-date=14 March 2015}}</ref>

*Aggregate analysis determines the upper bound ''T''(''n'') on the total cost of a sequence of ''n'' operations, then calculates the amortized cost to be ''T''(''n'') / ''n''.<ref name="Lecture 20"/> *The accounting method is a form of aggregate analysis which assigns to each operation an ''amortized cost'' which may differ from its actual cost. Early operations have an amortized cost higher than their actual cost, which accumulates a saved "credit" that pays for later operations having an amortized cost lower than their actual cost. Because the credit begins at zero, the actual cost of a sequence of operations equals the amortized cost minus the accumulated credit. Because the credit is required to be non-negative, the amortized cost is an upper bound on the actual cost. Usually, many short-running operations accumulate such credit in small increments, while rare long-running operations decrease it drastically.<ref name="Lecture 20"/> *The potential method is a form of the accounting method where the saved credit is computed as a function (the "potential") of the state of the data structure. The amortized cost is the immediate cost plus the change in potential.<ref name="Lecture 20"/>

==Examples== ===Dynamic array=== thumb|right|270px|Amortized analysis of the push operation for a dynamic array Consider a dynamic array that grows in size as more elements are added to it, such as {{code|ArrayList}} in Java or {{code|std::vector}} in C++. If we started out with a dynamic array of size 4, we could push 4 elements onto it, and each operation would take constant time. Yet pushing a fifth element onto that array would take longer as the array would have to create a new array of a scaled size, copy the old elements onto the new array, then add the new element. The next few push operations would similarly take constant time, then the subsequent addition would require another slow scaling of the array size.

In general, for an arbitrary number <math>n</math> of pushes to an array of any initial size, the times for steps that scale the array add in a geometric series to <math>O(n)</math>, while the constant times for each remaining push also add to <math>O(n)</math>. Therefore the average time per push operation is <math>O(n)/n = O(1)</math>. This reasoning can be formalized and generalized to more complicated data structures using amortized analysis.<ref name="Lecture 20" />

===Queue=== Shown is a Python implementation of a queue, a FIFO data structure: <syntaxhighlight lang="python"> class Queue: """Represents a first-in, first-out collection.""" # Initialize the queue with two empty lists def __init__(self): self.input = [] # Stores elements that are enqueued self.output = [] # Stores elements that are dequeued

def enqueue(self, element): """Add an object to the end of the queue.""" self.input.append(element) # Append the element to the input list

def dequeue(self): """Remove and return the object at the beginning of the queue.""" if not self.output: # If the output list is empty # Transfer all elements from the input list to the output list, reversing the order while self.input: # While the input list is not empty self.output.append( self.input.pop() ) # Pop the last element from the input list and append it to the output list

return self.output.pop() # Pop and return the last element from the output list </syntaxhighlight>

The enqueue operation just pushes an element onto the input array; this operation does not depend on the lengths of either input or output and therefore runs in constant time.

However the dequeue operation is more complicated. If the output array already has some elements in it, then dequeue runs in constant time; otherwise, dequeue takes {{tmath|O(n)}} time to add all the elements onto the output array from the input array, where ''n'' is the current length of the input array. After copying ''n'' elements from input, we can perform ''n'' dequeue operations, each taking constant time, before the output array is empty again. Thus, we can perform a sequence of ''n'' dequeue operations in only {{tmath|O(n)}} time, which implies that the amortized time of each dequeue operation is {{tmath|O(1)}}.<ref name=Grossman>{{cite web|last1=Grossman|first1=Dan|title=CSE332: Data Abstractions|url=http://courses.cs.washington.edu/courses/cse332/10sp/lectures/lecture21.pdf|website=cs.washington.edu|access-date=14 March 2015}}</ref>

Alternatively, we can charge the cost of copying any item from the input array to the output array to the earlier enqueue operation for that item. This charging scheme doubles the amortized time for enqueue but reduces the amortized time for dequeue to {{tmath|O(1)}}.

== Common use == * In common usage, an "amortized algorithm" is one that an amortized analysis has shown to perform well. * Online algorithms commonly use amortized analysis.

== References == {{Reflist}}

== Literature == * {{cite web |url=https://www.cs.cmu.edu/afs/cs/academic/class/15451-s07/www/lecture_notes/lect0206.pdf|title=Lecture 7: Amortized Analysis|publisher=Carnegie Mellon University |access-date=14 March 2015}} * {{cite book |author= Allan Borodin and Ran El-Yaniv |url=https://www.cs.technion.ac.il/~rani/book.html |title= Online Computation and Competitive Analysis |date= 1998 |pages= 20,141 }}

Category:Analysis of algorithms Category:Articles with example Ruby code Category:Amortized data structures