algorithm

Topics related to algorithm:

Getting started with algorithm

Introduction to Algorithms

Algorithms are ubiquitous in Computer Science and Software Engineering. Selecting appropriate algorithms and data structures improves the efficiency of our programs in both time and cost.

What is an algorithm? Informally, an algorithm is a procedure to accomplish a specific task. [Skiena:2008:ADM:1410219] Specifically, an algorithm is a well-defined computational procedure, which takes some value (or set of values) as input and produces some value, or a set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output. Cormen et al. do not explicitly remark that an algorithm does not necessarily require an input. [Cormen:2001:IA:580470]

Formally, an algorithm must satisfy five features: [Knuth:1997:ACP:260999]

  1. Finiteness. An algorithm must always terminate after a finite number of steps.
  2. Definiteness. Each step of an algorithm must be precisely defined; the actions to be carried out must be rigorously specified. It is this quality that [Cormen:2001:IA:580470] refers to with the term "well-defined".
  3. Input. An algorithm has zero or more inputs. These are quantities that are given to the algorithm initially before it begins or dynamically as it runs.
  4. Output. An algorithm has one or more outputs. These are quantities that have a specified relation to the inputs. We expect that an algorithm produces the same output when given the same input over and over again.
  5. Effectiveness. An algorithm is also generally expected to be effective. Its operations must be sufficiently basic that they can be done exactly in principle and in a finite length of time by someone using pencil and paper.

A procedure that lacks finiteness but satisfies all other characteristics of an algorithm may be called a computational method. [Knuth:1997:ACP:260999]

Sorting

Bubble Sort

Algorithm Complexity

An algorithm is a list of steps to solve a problem. Each step depends either on some set of previous steps or on the start of the algorithm. A small problem might look like the following:

[Figure: a small directed acyclic graph of dependent steps]

This structure is called a directed acyclic graph, or DAG for short. The links between each node in the graph represent dependencies in the order of operations, and there are no cycles in the graph.

How do dependencies happen? Take for example the following code:

total = 0
for(i = 1; i < 10; i++)
    total = total + i

In this pseudocode, each iteration of the for loop depends on the result of the previous iteration, because the value calculated in the previous iteration is used in the next one. The DAG for this code might look like this:

[Figure: DAG of the loop iterations, each depending on the previous one]

If you understand this representation of algorithms, you can use it to understand algorithm complexity in terms of work and span.

Work

Work is the actual number of operations that need to be executed in order to achieve the goal of the algorithm for a given input size n.

Span

Span is sometimes referred to as the critical path: it is the fewest number of steps an algorithm must perform sequentially in order to achieve its goal, even with unlimited parallelism.

The following image highlights the graph to show the differences between work and span on our sample DAG.

[Figure: the sample DAG shown twice, with work (all nodes) highlighted on the left and span (the critical path) highlighted on the right]

The work is the number of nodes in the graph as a whole. This is represented by the graph on the left above. The span is the critical path, or longest path from the start to the end. When work can be done in parallel, the yellow highlighted nodes on the right represent span, the fewest number of steps required. When work must be done serially, the span is the same as the work.

Both work and span can be evaluated independently in terms of analysis. The speed of an algorithm is determined by the span. The amount of computational power required is determined by the work.
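
For example, the loop shown earlier has span equal to its work, because every iteration depends on the previous one. By contrast, a pairwise (tree) reduction does the same total work with a much smaller span. A minimal sketch, in Python (the language is an arbitrary choice here):

def tree_sum(values):
    # Pairwise (tree) reduction: the work is still about n additions, but the
    # span is only about log2(n) levels, because the additions on each level
    # are independent of one another and could run in parallel.
    while len(values) > 1:
        values = [values[i] + values[i + 1] if i + 1 < len(values) else values[i]
                  for i in range(0, len(values), 2)]
    return values[0]

print(tree_sum(list(range(1, 10))))  # 45, the same result as the sequential loop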

Graph

A graph is a mathematical structure that models a set of objects which may or may not be connected to one another by edges or links.

A graph can be described through two different sets of mathematical objects:

  • A set of vertices.
  • A set of edges that connect pairs of vertices.

Graphs can be either directed or undirected.

  • Directed graphs contain edges that "connect" in only one direction.
  • Undirected graphs contain edges that connect two vertices in both directions.
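
As a small illustration, here is one common way (an adjacency "list", sketched in Python as a dict of vertex -> set of neighbours) to store a directed and an undirected graph:

directed = {           # the edge A -> B is recorded only at A
    'A': {'B'},
    'B': {'C'},
    'C': set(),
}

undirected = {         # every edge is recorded in both directions
    'A': {'B'},
    'B': {'A', 'C'},
    'C': {'B'},
}

print('B' in directed['A'], 'A' in directed['B'])      # True False
print('B' in undirected['A'], 'A' in undirected['B'])  # True True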

Dynamic Programming

Dynamic Programming is an improvement on Brute Force; see this example to understand how one can obtain a Dynamic Programming solution from Brute Force.

A Dynamic Programming Solution has 2 main requirements:

  1. Overlapping Subproblems

  2. Optimal Substructure

Overlapping Subproblems means that the results of smaller versions of the problem are reused multiple times in order to arrive at the solution to the original problem.

Optimal Substructure means that the (optimal) solution to a problem can be calculated from the (optimal) solutions to its subproblems.

A Dynamic Programming Solution has 2 main components: the State and the Transition.

The State refers to a subproblem of the original problem.

The Transition is the method to solve a problem based on its subproblems.

The time taken by a Dynamic Programming Solution can be calculated as No. of States * Transition Time. Thus if a solution has N^2 states and the transition is O(N), then the solution would take roughly O(N^3) time.
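
As a small sketch (the classic minimum-coin-change problem, in Python), the state is the amount to be paid, the transition tries every coin, and the running time is the number of states times the transition time:

def min_coins(amount, coins):
    # State: dp[a] = fewest coins that sum to a  (amount + 1 states)
    # Transition: try every coin, O(len(coins)) work per state
    # Total time: O(amount * len(coins)) = number of states * transition time
    INF = float('inf')
    dp = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and dp[a - c] + 1 < dp[a]:
                dp[a] = dp[a - c] + 1
    return dp[amount] if dp[amount] != INF else -1

print(min_coins(11, [1, 2, 5]))   # 3  (5 + 5 + 1)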

Kruskal's Algorithm

Kruskal's Algorithm is a greedy algorithm used to find the Minimum Spanning Tree (MST) of a graph. A minimum spanning tree is a tree which connects all the vertices of the graph and has the minimum total edge weight.

Kruskal's algorithm does so by repeatedly picking out the edge with minimum weight (among those not already considered) and adding it to the final result if the two vertices connected by that edge are not yet connected in the MST; otherwise it skips that edge. A Union-Find data structure can be used to check whether two vertices are already connected in the MST or not (a short code sketch follows the list below). A few properties of an MST are as follows:

  1. An MST of a graph with n vertices will have exactly n-1 edges.
  2. There exists a unique path in the MST from each vertex to every other vertex.
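
A minimal sketch of Kruskal's algorithm in Python, using a simple Union-Find with path compression (no union by rank, for brevity):

def kruskal(n, edges):
    # n vertices numbered 0..n-1; edges is a list of (weight, u, v) tuples
    parent = list(range(n))

    def find(x):                       # Union-Find root lookup with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):      # consider edges in order of increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # u and v are not yet connected in the MST
            parent[ru] = rv            # union the two components
            mst.append((u, v, w))
            total += w
    return mst, total

example = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, example))   # 3 edges (= n-1), total weight 6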

Greedy Algorithms

A greedy algorithm is an algorithm in which at each step we choose the most beneficial option, without looking into the future. The choice depends only on the current profit.

The greedy approach is usually a good approach when a profit can be picked up at every step, so that no choice blocks another one.
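
A classic example is activity (interval) selection: at every step take the activity that finishes earliest among those compatible with what has already been chosen. A short sketch in Python:

def max_activities(intervals):
    # Greedy choice: always take the compatible activity with the earliest finish time.
    chosen, last_end = [], float('-inf')
    for start, end in sorted(intervals, key=lambda iv: iv[1]):   # sort by finish time
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen

print(max_activities([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]))
# [(1, 4), (5, 7), (8, 11)]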

Searching

Big-O Notation

Definition

The Big-O notation is at its heart a mathematical notation, used to compare the rate of convergence of functions. Let n -> f(n) and n -> g(n) be functions defined over the natural numbers. Then we say that f = O(g) if and only if f(n)/g(n) is bounded when n approaches infinity. In other words, f = O(g) if and only if there exists a constant A, such that for all n, f(n)/g(n) <= A.

Actually the scope of the Big-O notation is a bit wider in mathematics, but for simplicity I have narrowed it to what is used in algorithm complexity analysis: functions defined on the natural numbers that have non-zero values, and the case of n growing to infinity.

What does it mean?

Let's take the case of f(n) = 100n^2 + 10n + 1 and g(n) = n^2. It is quite clear that both of these functions tend to infinity as n tends to infinity. But sometimes knowing the limit is not enough, and we also want to know the speed at which the functions approach their limit. Notions like Big-O help compare and classify functions by their speed of convergence.

Let's find out if f = O(g) by applying the definition. We have f(n)/g(n) = 100 + 10/n + 1/n^2. Since 10/n is 10 when n is 1 and is decreasing, and since 1/n^2 is 1 when n is 1 and is also decreasing, we have f(n)/g(n) <= 100 + 10 + 1 = 111. The definition is satisfied because we have found a bound of f(n)/g(n) (111) and so f = O(g) (we say that f is a Big-O of n^2).

This means that f tends to infinity at approximately the same speed as g. Now this may seem like a strange thing to say, because what we have found is that f is at most 111 times bigger than g, or in other words when g grows by 1, f grows by at most 111. It may seem that growing 111 times faster is not "approximately the same speed". And indeed the Big-O notation is not a very precise way to classify function convergence speed, which is why in mathematics we use the equivalence relationship when we want a precise estimation of speed. But for the purposes of separating algorithms in large speed classes, Big-O is enough. We don't need to separate functions that grow a fixed number of times faster than each other, but only functions that grow infinitely faster than each other. For instance if we take h(n) = n^2*log(n), we see that h(n)/g(n) = log(n) which tends to infinity with n so h is not O(n^2), because h grows infinitely faster than n^2.

Now I need to make a side note: you might have noticed that if f = O(g) and g = O(h), then f = O(h). For instance in our case, we have f = O(n^3), and f = O(n^4)... In algorithm complexity analysis, we frequently say f = O(g) to mean that f = O(g) and g = O(f), which can be understood as "g is the smallest Big-O for f". In mathematics we say that such functions are Big-Thetas of each other.

How is it used?

When comparing algorithm performance, we are interested in the number of operations that an algorithm performs. This is called time complexity. In this model, we consider that each basic operation (addition, multiplication, comparison, assignment, etc.) takes a fixed amount of time, and we count the number of such operations. We can usually express this number as a function of the size of the input, which we call n. And sadly, this number usually grows to infinity with n (if it doesn't, we say that the algorithm is O(1)). We separate our algorithms in big speed classes defined by Big-O : when we speak about a "O(n^2) algorithm", we mean that the number of operations it performs, expressed as a function of n, is a O(n^2). This says that our algorithm is approximately as fast as an algorithm that would do a number of operations equal to the square of the size of its input, or faster. The "or faster" part is there because I used Big-O instead of Big-Theta, but usually people will say Big-O to mean Big-Theta.
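
For example, the following sketch performs roughly n * n basic operations for an input of size n, so we would call it an O(n^2) algorithm:

def count_pairs_with_sum(values, target):
    # two nested loops over n values: about n * n comparisons
    count = 0
    for x in values:           # runs n times
        for y in values:       # runs n times for every x
            if x + y == target:
                count += 1
    return count

print(count_pairs_with_sum([1, 2, 3, 4], 5))   # 4 ordered pairs: (1,4), (2,3), (3,2), (4,1)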

When counting operations, we usually consider the worst case: for instance if we have a loop that can run at most n times and that contains 5 operations, the number of operations we count is 5n. It is also possible to consider the average case complexity.

Quick note: a fast algorithm is one that performs few operations, so if the number of operations grows to infinity faster, then the algorithm is slower: O(n) is better than O(n^2).

We are also sometimes interested in the space complexity of our algorithm. For this we consider the number of bytes in memory occupied by the algorithm as a function of the size of the input, and use Big-O the same way.

Bellman–Ford Algorithm

Given a directed graph G, we often want to find the shortest distance from a given node A to the rest of the nodes in the graph. Dijkstra's algorithm is the most famous algorithm for finding shortest paths, but it works only if the edge weights of the given graph are non-negative. Bellman-Ford, however, aims to find the shortest path from a given node (if one exists) even if some of the weights are negative. Note that a shortest distance may not exist if a negative cycle is present in the graph (in which case we can go around the cycle repeatedly, resulting in an arbitrarily small total distance). Bellman-Ford additionally allows us to determine the presence of such a cycle.

The total complexity of the algorithm is O(V*E), where V is the number of vertices and E the number of edges.
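
A minimal sketch in Python (vertices numbered 0..n-1, edges given as (u, v, weight) triples; returns None if a negative cycle is reachable):

def bellman_ford(n, edges, source):
    INF = float('inf')
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):                 # relax every edge n-1 times: O(V*E)
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:                  # one extra pass detects a negative cycle
        if dist[u] + w < dist[v]:
            return None
    return dist

example = [(0, 1, 4), (0, 2, 5), (1, 2, -3), (2, 3, 4)]
print(bellman_ford(4, example, 0))   # [0, 4, 1, 5]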

Merge Sort

Binary Search Trees

Trees

Trees are a sub-category or sub-type of node-edge graphs. They are ubiquitous within computer science because of their prevalence as a model for many different algorithmic structures that are, in turn, applied in many different algorithms.

Insertion Sort

When we analyze the performance of a sorting algorithm, we are interested primarily in the number of comparisons and exchanges.

Average Exchange

Let E_N be the average total number of exchanges needed to sort an array of N elements. E_1 = 0 (no exchange is needed for a one-element array). The average number of exchanges to sort an N-element array is the average number of exchanges to sort an (N-1)-element array plus the average number of exchanges to insert the last element into the sorted (N-1)-element array.

E_N = E_(N-1) + (1/N) * (0 + 1 + 2 + ... + (N-1))

Simplify the summation (arithmetic series)

E_N = E_(N-1) + (N-1)/2

Expand the term

E_N = 1/2 * (1 + 2 + ... + (N-1))

Simplify the summation again (arithmetic series)

E_N = N(N-1)/4

Average Comparison

Let C_N be the average total number of comparisons needed to sort an array of N elements. C_1 = 0 (no comparison is needed for a one-element array). The average number of comparisons to sort an N-element array is the average number of comparisons to sort an (N-1)-element array plus the average number of comparisons to insert the last element into the sorted (N-1)-element array. If the last element is the largest element we need only one comparison; if it is the second smallest element we need N-1 comparisons. However, if the last element is the smallest element we do not need N comparisons, we still need only N-1. That is why we subtract 1/N in the equation below.

C_N = C_(N-1) + (1/N) * (1 + 2 + ... + N) - 1/N

Simplify the summation (arithmetic series)

C_N = C_(N-1) + (N+1)/2 - 1/N

Expand the term

C_N = (3/2 - 1/2) + (4/2 - 1/3) + ... + ((N+1)/2 - 1/N)

Simplify the summation again (arithmetic series and harmonic number)

C_N = (N^2 + 3N)/4 - H_N, where H_N = 1 + 1/2 + ... + 1/N is the N-th harmonic number
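
These formulas can be checked empirically. Below is a small sketch in Python of an insertion sort that counts comparisons and exchanges on uniformly random input and compares the averages with N(N-1)/4 and (N^2 + 3N)/4 - H_N:

import random

def insertion_sort_counts(a):
    comparisons = exchanges = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]   # exchange
                exchanges += 1
                j -= 1
            else:
                break
    return comparisons, exchanges

N, trials = 10, 20000
avg_c = avg_e = 0.0
for _ in range(trials):
    c, e = insertion_sort_counts([random.random() for _ in range(N)])
    avg_c += c / trials
    avg_e += e / trials
H = sum(1.0 / i for i in range(1, N + 1))
print(avg_e, N * (N - 1) / 4)            # both close to 22.5
print(avg_c, (N * N + 3 * N) / 4 - H)    # both close to 29.57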

Hash Functions

Travelling Salesman

The Travelling Salesman Problem is the problem of finding the minimum cost of travelling through N vertices, visiting each vertex exactly once. There is a cost cost[i][j] to travel from vertex i to vertex j.

There are 2 types of algorithms to solve this problem: Exact Algorithms and Approximation Algorithms.

Exact Algorithms

  1. Brute Force Algorithm
  2. Dynamic Programming Algorithm
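
As a minimal sketch of the brute-force exact approach (Python; the tour is assumed to start and end at vertex 0, which fixes the cyclic symmetry):

from itertools import permutations

def tsp_brute_force(cost):
    # Try every order of the remaining vertices; O(N!) time, practical only for small N.
    n = len(cost)
    best = float('inf')
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        total = sum(cost[tour[i]][tour[i + 1]] for i in range(n))
        best = min(best, total)
    return best

cost = [[0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0]]
print(tsp_brute_force(cost))   # 80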

Approximation Algorithms

To be added

Substring Search

Dijkstra’s Algorithm

Floyd-Warshall Algorithm

Breadth-First Search

Bucket Sort

Quicksort

Quicksort is sometimes also known as partition-exchange sort.
Auxiliary space: O(n)
Time complexity: worst case O(n²), best case O(n log n)
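
A short, non-in-place quicksort sketch in Python (this simple variant clearly uses O(n) auxiliary space, matching the note above; in-place partitioning schemes such as Lomuto or Hoare avoid most of that cost):

def quicksort(a):
    if len(a) <= 1:
        return a
    pivot = a[len(a) // 2]                      # partition around a pivot ...
    smaller = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    larger = [x for x in a if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)   # ... then recurse on both sides

print(quicksort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]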

Depth First Search

Knapsack Problem

Counting Sort

Cycle Sort

Heap Sort

Prim's Algorithm

Matrix Exponentiation

Pigeonhole Sort

Radix Sort

Equation Solving

Odd-Even Sort

Pseudocode

Pseudocode is by definition informal. This topic is meant to describe ways to translate language-specific code into something everyone with a programming background can understand.

Pseudocode is an important way to describe an algorithm and is more neutral than giving a language-specific implementation. Wikipedia often uses some form of pseudocode when describing an algorithm.

Some things, like if-else type conditions, are quite easy to write down informally. But other things, js-style callbacks for instance, may be hard to turn into pseudocode for some people.

This is why these examples may prove useful.

Catalan Number Algorithm

Integer Partition Algorithm

A* Pathfinding

Shell Sort

Selection Sort

Pancake Sort

Longest Common Subsequence

Longest Increasing Subsequence

Maximum Path Sum Algorithm

Maximum Subarray Algorithm

Dynamic Time Warping

Pascal's Triangle

Line Algorithm

Shortest Common Supersequence Problem

Sliding Window Algorithm

Applications of Greedy technique

Sources

  1. The examples above are from lecture notes of a lecture taught in 2008 in Bonn, Germany. They are in turn based on the book Algorithm Design by Jon Kleinberg and Éva Tardos.

Online algorithms

Theory

Definition 1: An optimization problem Π consists of a set of instances ΣΠ. For every instance σ∈ΣΠ there is a set Ζσ of solutions and an objective function fσ : Ζσ → ℜ≥0 which assigns a non-negative real value to every solution.
We say OPT(σ) is the value of an optimal solution, A(σ) is the solution computed by an algorithm A for the problem Π, and wA(σ) = fσ(A(σ)) is its value.

Definition 2: An online algorithm A for a minimization problem Π has a competitive ratio of r ≥ 1 if there is a constant τ∈ℜ with

wA(σ) = fσ(A(σ)) ≤ r ⋅ OPT(σ) + τ

for all instances σ∈ΣΠ. A is called an r-competitive online algorithm. If even

wA(σ) ≤ r ⋅ OPT(σ)

holds for all instances σ∈ΣΠ, then A is called a strictly r-competitive online algorithm.

Proposition 1.3: LRU and FWF are marking algorithms.

Proof: At the beginning of each phase (except for the first one) FWF has a cache miss and clears the cache. That means we have k empty pages. In every phase at most k different pages are requested, so there will be no eviction during the phase. So FWF is a marking algorithm.
Let us assume LRU is not a marking algorithm. Then there is an instance σ where LRU evicts a marked page x in phase i. Let σt be the request in phase i at which x is evicted. Since x is marked there has to be an earlier request σt* for x in the same phase, so t* < t. After t*, x is the newest page in the cache, so for it to be evicted at t the sequence σt*+1,...,σt has to request at least k pages different from x. That implies that phase i requests at least k+1 different pages, which contradicts the phase definition. So LRU has to be a marking algorithm.

Proposition 1.4: Every marking algorithm is strictly k-competitive.

Proof: Let σ be an instance for the paging problem and l the number of phases for σ. If l = 1 then every marking algorithm is optimal and the optimal offline algorithm cannot be better.
We assume l ≥ 2. The cost of every marking algorithm for instance σ is bounded from above by l ⋅ k, because in every phase a marking algorithm cannot evict more than k pages without evicting one marked page.
Now we try to show that the optimal offline algorithm evicts at least k+l-2 pages for σ: k in the first phase and at least one in every following phase except for the last one. For the proof let us define l-2 disjoint subsequences of σ. Subsequence i ∈ {1,...,l-2} starts at the second position of phase i+1 and ends with the first position of phase i+2.
Let x be the first page of phase i+1. At the beginning of subsequence i there is page x and at most k-1 different pages in the optimal offline algorithm's cache. In subsequence i there are k page requests different from x, so the optimal offline algorithm has to evict at least one page for every subsequence. Since at the beginning of phase 1 the cache is still empty, the optimal offline algorithm causes k evictions during the first phase. That shows that

wA(σ) ≤ l⋅k ≤ (k+l-2)k ≤ OPT(σ) ⋅ k

Corollary 1.5: LRU and FWF are strictly k-competitive.

If there is no constant r for which an online algorithm A is r-competitive, we call A not competitive.

Proposition 1.6: LFU and LIFO are not competitive.

Proof: Let l ≥ 2 be a constant and k ≥ 2 the cache size. The different cache pages are numbered 1,...,k+1. We look at the following sequence:

σ = 1^l, 2^l, ..., (k-1)^l, (k, k+1)^(l-1)

First page 1 is requested l times, then page 2, and so on. At the end there are (l-1) alternating requests for pages k and k+1.
LFU and LIFO fill their cache with pages 1-k. When page k+1 is requested, page k is evicted and vice versa. That means every request of the subsequence (k,k+1)^(l-1) evicts one page. In addition there are k-1 cache misses for the first-time use of pages 1-(k-1). So LFU and LIFO evict exactly k-1+2(l-1) pages.
Now we must show that for every constant τ∈ℜ and every constant r ≥ 1 there exists an l so that

k - 1 + 2(l-1) > r ⋅ OPT(σ) + τ

which is equivalent to

l > (r ⋅ OPT(σ) + τ - k + 3) / 2

Since OPT(σ) = k + 1 here and does not depend on l, this inequality can be satisfied simply by choosing l sufficiently big. So LFU and LIFO are not competitive.
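
The counting argument can be reproduced with a small simulation (a sketch in Python; it implements LIFO as described above, plus LRU for contrast):

from collections import OrderedDict

def lru_faults(requests, k):
    # LRU: on a miss with a full cache, evict the least recently used page.
    cache, faults = OrderedDict(), 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)            # page becomes the most recently used
        else:
            faults += 1
            if len(cache) >= k:
                cache.popitem(last=False)      # evict the least recently used page
            cache[page] = True
    return faults

def lifo_faults(requests, k):
    # LIFO: on a miss with a full cache, evict the page brought in most recently.
    cache, stack, faults = set(), [], 0
    for page in requests:
        if page not in cache:
            faults += 1
            if len(cache) >= k:
                cache.remove(stack.pop())
            cache.add(page)
            stack.append(page)
    return faults

k, l = 4, 10
# the sequence from the proof: 1^l, 2^l, ..., (k-1)^l, then (k, k+1) repeated l-1 times
sigma = [p for p in range(1, k) for _ in range(l)] + [p for _ in range(l - 1) for p in (k, k + 1)]
print(lifo_faults(sigma, k))   # 21 = k - 1 + 2(l - 1)
print(lru_faults(sigma, k))    # 5  = k + 1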

Proposition 1.7: There is no r-competitive deterministic online algorithm for paging with r < k.

Sources

Basic Material

  1. Script Online Algorithms (German), Heiko Roeglin, University of Bonn
  2. Page replacement algorithm

Further Reading

  1. Online Computation and Competitive Analysis by Allan Borodin and Ran El-Yaniv

Source Code

  1. Source code for offline caching
  2. Source code for adversary game

Fast Fourier Transform

A* Pathfinding Algorithm

Check if a tree is BST or not

Binary Tree traversals

Lowest common ancestor of a Binary Tree

Graph Traversals

polynomial-time bounded algorithm for Minimum Vertex Cover

The first thing to do in this algorithm is to sort all of the vertices of the graph in descending order of degree.

After that, iterate over them and add to the final vertex set each vertex that does not have any adjacent vertex already in that set.

In the final stage, iterate over the final vertex set and remove every vertex that has one of its adjacent vertices in that set.
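
As an illustration, here is a sketch (in Python) of a closely related and commonly used greedy heuristic: repeatedly take a vertex of maximum remaining degree, add it to the cover and delete its incident edges. Like the steps above, it runs in polynomial time but is not guaranteed to find a minimum cover:

def greedy_vertex_cover(adj):
    # adj: dict mapping every vertex to the set of its neighbours
    remaining = {v: set(ns) for v, ns in adj.items()}
    cover = set()
    while any(remaining[v] for v in remaining):              # while edges are left
        v = max(remaining, key=lambda u: len(remaining[u]))  # highest remaining degree
        cover.add(v)
        for u in list(remaining[v]):                         # delete edges incident to v
            remaining[u].discard(v)
        remaining[v].clear()
    return cover

graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(greedy_vertex_cover(graph))   # {0, 2}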

Multithreaded Algorithms

Algo: Print an m*n matrix square-wise

Check whether two strings are anagrams

Edit Distance Dynamic Algorithm

Applications of Dynamic Programming

Definitions

Memoization - an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.

Dynamic Programming - a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions.
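
A minimal sketch of memoization in Python, using the naive Fibonacci recursion as the expensive function:

from functools import lru_cache

@lru_cache(maxsize=None)          # memoization: results of earlier calls are cached
def fib(n):
    # exponentially many calls without the cache, only n + 1 distinct calls with it
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(50))   # 12586269025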

Knuth Morris Pratt (KMP) Algorithm