# What are partition-based algorithms

## Programming 2 (Java)

Transcript

1 Programming 2 (Java) Summary v2.0 Kälin Thomas, Department I SS 06 Updated and supplemented by Raphael Horber, Department I FS 2013 Kälin Thomas, Abbot I 1 /

2 1.

3 14.1.

4 21.5.

5 29.

6 39.1. BFS Algorithm Applications DFS vs. BFS Directed Graph / Digraph Strong Connectivity Transitive Termination Floyd-Warshall's Algorithm Example Implementation Topological Sorting DAG Shortest Path Dijkstra's Algorithm Example Implementation Bellman-Ford Algorithm DAG-based Algorithm Minimum Spanning Tree Kruskal's Algorithm Example implementation Data structure for the algorithm Representation of a partition Partition-based implementation Prim-Jarnik s algorithm Baruvka s algorithm XML (Extensible Markup Language) Example attributes or child elements? Names CDATA Well-formed XML / Validity XML: Document Type Definition (DTD) Example (internal DTD) Example (external DTD) Specials Kälin Thomas, Abbot I 6 /

7 1. compareto compare (a, b): a return <0 a = b => return = 0 a> b => return> 0 2. Recursive method values ​​of the current parameters for which no recursive call was made are referred to as base cases or anchors. Every possible recursive call chain must reach a base case. Any recursive call should be defined to move execution towards the anchor. Example: factorial public static int factor (int n) {if (n == 0) {return 1; // anchoring else {return n * factor (n-1); 3. Iterative / Recursive / Explicit Iterative Recursive Explicit lin (n) = sum (i = 1; n; i) lin (1) = 1 lin (n) = n + lin (n-1) lin (n) = n * (n + 1) / 2 sqr (n) = sum (i = 1; n; i ^ 2) sqr (1) = 1 sqr (n) = n ^ 2 + sum (n-1) sqr (n) = n * (n + 1) * (2n + 1) / example series sequence: 1, 5, 9, 13, 17, ... Iterative: an = 1 + sum (i = 1; n-1; 4) Recursive: a 1 = 1; a n = 4 + a n -1; Explicit: a n = 4n-3 row: 1, 6, 15, 28, ... (sum of the sequence 1!) Iterative: s n = sum (i = 1; n; 4n-3); Recursive: s 1 = 1; s n = 4n-3 + s n-1; Explicitly: s n = n * (a1 + an) / 2 = n * (1 + 4n-3) / 2; // Use explicit formula // Use explicit formula // Explicit formula used in relation: recursion / induction The anchoring corresponds to the termination condition in the recursion and the induction step corresponds to a recursive call. One proves n + 1 by n and the induction step. This is also given in the recursion by calculating n + 1 from n. 4. Design Patterns A design pattern is a solution to a typical software design problem, which can be used in very different situations Structure of a design pattern Pattern name: Name of the pattern Problem section: Description of when to use the pattern Solution section: Description of the elements that make up the pattern. No implementation, just a general concept. Consequences section: List of advantages and disadvantages when using pattern 4.2. Object adapter / class adapter Kälin Thomas, Dept. I 7 / object adapter: class adapter: work with original object with class extend original class

8 5. Analysis of algorithms An algorithm is a step-by-step approach to solving a problem that takes a finite amount of time. Can be easily compared with a cooking recipe. Runtime behavior Most algorithms transfer inputs into outputs. The runtime of an algorithm typically increases with the size of the input. When analyzing the runtime behavior, we concentrate on the worst case, as this is comparatively easy to analyze. Experimental studies We write a program that implements the algorithm and run it with different, representative inputs. We record the runtimes with System.currentTimeMillis (), for example, and compare the results. Problem: The algorithm has to be implemented in full (expenditure of time!), The selection of the inputs is very critical and can lead to wrong conclusions. Of course, tests must always be carried out on the same hardware and software. Theoretical analysis Here, a high-level description of the algorithm (pseudocode) is used, no implementation. You are looking for a functional description (as a function of the input). This means that all possible inputs are taken into account, and the method is independent of HW and SW Important functions constant currentmax) then {currentmax <- A [i] increment counter i; return currentmax 1 -> n + 2 (n-1) + 2 (n-1) + 2 (n-1) + 1 = 7n-2 best case: conditions do not count; Worst case: Conditions count Asymptotic Algorithm Analysis 2 // Access = 1, Assignment = 1 1 + n // Assignment = 1, Compare = n, 2 (n-1) // Indexing and Comparison 2 (n-1) / / In the stupidest case, Zuw. + Index. 2 (n-1) // i = i + 1 !! This means determining the runtime behavior in the big-oh notation. To do this, we first look for the worst-case behavior as a function of primitive operations (7n-2). We describe this function in the big-oh notation (arraymay runs in O (n) time). Kälin Thomas, Abbot I 8 /

9 5.4. Recursive estimation 1. Assumption: T is quadratic 2. Insert parameter (part in brackets after T) with a * n 2 + b * n + c 3. Multiply 4. Divide into n 2, n 1, n 0 (parts with n 2 to a, with n 1 to b, with n 0 to c) 5. Calculate a using the anchoring Example T (n) = 16 * T (n / 4) 3n 15; T (4) = 21 2 .: 16 (an 2/16 + bn / 4 + c) 3n 15 3 .: 16an 2 / bn / 4 + 16c 3n 15 = an 2 + 4bn 3n + 16c 15 4 .: n 2: a = an 1: b = 4b 3 => 3b = -3 => b = 1 n 0: c = 16c 15 => -15c = -15 => c = 1 5 .: T (4) = 21 // n = 4 21 = an 2 + 4bn 3n + 16c = 16a = 16a = 16a => a = Big-Oh / Big-Omega / Big-Theta f (n) is O (g (n)) if a c > 0 and a no> = 1 exists such that: f (n) <= c * g (n) for n> = n 0 f (n) is Ω (g (n)) if a c> 0 and a no> = 1 exists such that: f (n)> = c * g (n) for n> = n 0 f (n) is Θ (g (n)) if a c> 0 and c 0 and a no > = 1 exists, so that: c * g (n) <= f (n) <= c * g (n) n> = n 0 The Big-Oh notation allows the specification of an upper limit for the growth function. The statement f (n) is O (g (n)) means: The growth function f (n) is limited above by g (n). f (n) is Ω (g (n)) if f (n) is asymptotically greater than or equal to g (n) (limited by below). f (n) is Θ (g (n)) if f (n) is asymptotically equal to g (n) (bounded above and below) Big-Oh rules omit all lower powers omit all constants omit the coefficient of the highest power way examples 7n-2 is O (n)> c = 7, no = 1 3n 3 + 20n 2 +5 is O (n 3)> c = 4, no = 21 3 * log (n) + 5 is O ( log (n))> c = 8, no = 2 5n 2 is Ω (n 2)> c = 5, no = 1 5n 2 is Θ (n 2)> c = 5, c = 5, no = required math log b (x * y) = log b (x) + log b (y) a (b + c) = ab * ac log b (x / y) = log b (x) log b (y) a (bc ) = ab / ac log b (x ^ a) = a * log b (x) a bc = (ab) c log b (a) = log x (a) / log x (b) b = a loga (b ) bc = a * c * loga (b) Kälin Thomas, Abbot I 9 /

10 5.6. Analysis log (n) Programming 2 In an interval of length n, n values ​​are saved. The interval is halved. There are now only n / 2 values ​​available. Now a sub-interval is halved again, so only n / 4 values ​​are available. Abort if there is only one value left. This behavior is typical for logarithms with O (log (n))! 6. Recursion Checking the anchoring (base cases). Every recursive call chain must ultimately lead to an anchoring. Each recursive call will typically lead in the direction of a base case. When programming recursive methods, it is important to define the methods so that the recursion becomes easy. End recursion End recursion occurs when the last statement in a linear recursive method is the recursive call. They can be transformed into a non-recursive algorithm. int sum (int n) {if (n> 0) return n + sum (n - 1); else return 0; int sum (int n, int s) {if (n == 0) return s; else return sum (n 1, n + s); 6.2. Binary recursion Binary recursion always occurs when two recursive calls are made in all non-terminal calls (ex: Fibonacci). These algorithms typically have a behavior of O (2 n)! 6.3. PuzzleSolve 7. Position ADT A position represents the concept of space in a data structure. A position can save an element, e.g. an array element or a node of a list Operation (exactly one) Object element (): returns the element of the position Kälin Thomas, Abbot I 10 /

11 8. Entry ADT An entry consists of a key-value pair of operations 9. Stack key (): returns the key value (): returns the value A stack stores any objects. Insertion and deletion are carried out according to the LIFO scheme. Use for: History in the web browser, undo function, method calls in the JVM Operations push (object): Insert an element Object pop (): Remove and return the topmost element. Object top (): Returns the top element without removing it. int size (): Returns the number of stored elements bool isempty (): Indicates whether there are any elements in the stack. Exceptions EmptyStackException is thrown in the case of an empty stack. The index of the top element is saved in a variable. In the event that the array is full, a FullStackException must be programmed (implementation-dependent exception!). Each operation takes O (1) time ParenthesesMatcher (Character [] par) {Stack stack = new Stack <> (); for (Character c: par) {// insert the opening bracket if (c.equals ('(') c.equals ('[') c.equals ('{')) {stack.push (c); / / closing bracket else {if (c.equals (')') c.equals (']') c.equals ('')) {if (stack.isempty ()) {System.err.println (number .. do not match!); return false; Character tmp = stack.pop (); if (tmp.equals ('(') &&! c.equals (')') tmp.equals ('[') &&! c.equals (']') tmp.equals ('{') &&! c. equals ('')) {System.err.println (chamber type does not match!); return false; if (stack.isempty ()) {return true; else {return false; Kälin Thomas, Abbot I 11 /

14 14. Sequence Summarizes the array and node lists. Access possible via index or position. Generic replacement for stack, queue, array or node list operations int size (), bool isempty (): Generic methods // Operations of the array list (e get (i), set (i, e),) // Operations of the node list (Node first (), Node last (), Node getprev (),) // Bridge operations: Node atindex (i), i indexof (node) Implementation with linked list For a graphic of the implementation, see Node List Nodes implement position and save element, link to predecessor / successor. There is a special header and trailer node. Index-based accesses always search (counting the steps) from the beginning / end of the list and are therefore linear in time (O (n)). Array-based implementation Each array entry contains a reference to a position. These contain the index and a reference to the element. Variables F and L save the first and last occupied position. 15. Iterator An iterator abstracts the process of scanning over a collection of elements. Typically, an iterator is used in conjunction with another ADT. The ADT in question is supplemented by an Iterator iterator () method. Implementation: with snapshot or dynamic operations bool hasnext (): Still other elements available? e next (): Returns the next element For / Iterator For is used with arrays, an iterator for complex data structures (list).Here's another example of a For-Each loop: String arrvalues ​​[] = {That's, crap. ; for (String strvalue: arrvalues) {// strvalue contains in sequence all elements of arrvalues. 16. Trees / Trees In computer science, trees represent abstract, hierarchical data structures Tree Terminology Root: Node without parent node (A) Internal node: Node with at least one child (A, B, F, C) External node (leaf) : Knot without a child. (E, I, J, K, G, H, D) Predecessor: Node of a higher level (B is the predecessor of E) Successor: Node of a lower level (E is the successor / child of B) Depth of a node: Number of predecessors ( Depth A: 0, B: 1, E: 2, K: 3) Height of a node: Maximum number of successors down to one leaf (height B: 2) (leaf: 0, internal node: 1 + max (height of successors )) Height of the tree: height of the root (height A: 3) Subtree: tree made up of a node and its successors Sibling: sibling node Kälin Thomas, Abbot I 14 /

15 16.2. Operations of the Tree ADT Programming 2 int size (): Number of nodes in the tree bool isempty (): Nodes contained? Iterator iterator (): Iterator over the elements (contents) IterableCollection positions (): Iterator over the nodes Node root (): Returns the root node Node parent (node): Returns the parent node of a node IterableCollection children (node ): Iterator over all successors of a Kn bool isinternal (knoten): Is this node internal? bool isexternal (node): is this node a leaf? bool isroot (node): is this node the root? E replace (node, e): Replace element of a node. Returns old E! Traversing Preorder In the preorder traversal, a node is visited BEFORE its successors. Application: Printing a structured document. Example: A, B, E, F, I, J, K, C, G, H, D (italics: leaves) visit (v); for each child w of v preorder (w); Post order In a post order traversal, a node is visited AFTER its successors. Application: Specification of the used memory in a directory. Example: E, I, J, K, F, B, G, H, C, D, A (italics: leaves) for each child w of v postorder (w); visit (v); Inorder (binary trees) In an inorder traversal, a node is visited AFTER its left subtree and BEFORE its right subtree. ONLY works with binary trees. Application: Representation of binary trees. Example WITHOUT sheet J and D: E, B, I, F, K, A, G, C, H (read from left to right). if hasleft (v) inorder (left (v)); visit (v); if hasright (v) inorder (right (v)); Euler Tour Generic traversal of BINARY trees. Each node is visited 3 times: once from the left (preorder), once from below (inorder), once from the right (postorder). Example WITHOUT sheet J and D: LA, LB, LE, UE, RE, UB, LF, LI, UI, RI, UF, LK, UK, RK, RF, RB, UA, LC Binary trees A binary tree is a tree with the following properties: Each node has a maximum of two successors o Exactly two successors are referred to as a REAL binary tree The children of a node are an ordered pair (left child / right child) Use: o Arithmetic expressions (internal nodes are operators, external nodes are the Operands) o Decision-making processes (internal nodes are questions, external nodes are the decisions yes / no) Kälin Thomas, Abbot I 15 /

16 Additional Operations Position left (v), Position right (v) // Position = node bool hasleft (v), bool hasright (v) Properties of real binary trees n = node e = i + 1 e <= 2 he = external node n = 2e 1 h> = log 2 (e) i = internal nodes h <= ih> = log 2 (n + 1) 1 h = height h <= (n 1) / implementations of binary trees linked list Each node contains a reference on the element (position ADT), the parent node and the two child nodes. A general storage method for normal trees can easily be derived from this node by replacing the two references to the child accounts with a reference to a sequence of children. Array-based The nodes are stored in an array. Index 0 is left empty! Root receives index 1. Left child = 2 * index of the parent element Right child = 2 * index of the parent element Template Method Pattern Intention: Define a body (skeleton) of an algorithm, whereby some steps are only specified in subclasses. The Template Method pattern allows subclasses to refine parts of the algorithm without changing the structure of the algorithm. Problem: Two different components are very similar, but have no interface and no implementation in common. If a change is necessary, the effort would have to be made several times. Solution: The component designer has to decide which parts of the algorithm are immutable and which can be customized. The common parts are implemented in an abstract class, the variable parts in a default implementation or not at all. Description of binary trees Only one node type. Disadvantage: Inner and outer nodes can only be differentiated using left / right queries. Two types of nodes. Disadvantage: tree is not modeled optimally. Safe Composite Pattern: A node interface. Two implementing nodes (leaf and inner nodes). Allows review at compile time. Transparent Composite Pattern: An abstract base knot. Two implementing nodes (leaf and inner nodes). Needs exceptions. Kälin Thomas, Abbot I 16 /

17 17. Priority Queue ADT A priority queue stores a collection of entries. Each entry consists of a key-value pair, whereby the same key can appear several times. The key can be any object on which an order relation is defined Operations void insert (k, v): Add a new entry with key k and value v E removemin (): Returns & removes the entry with the smallest key E min (): Returns the entry with the smallest key without removing it int size (): Number of elements in the PQ bool isempty (): Elements available in the PQ? Implementation using unsorted list Insert takes O (1) time, since it is inserted at the end. Queries require O (n) time, since the whole sequence has to be run through to find the minimum. Selection Sort The PQ contains the elements in an unsorted sequence. With every removemin () operation we have to look for the smallest key. Therefore the running time is O (n 2)! Implementation using sorted list Insert takes O (n) time, since the insertion point has to be searched for. Queries required O (1) time for this, since the smallest key is always at the beginning of the list. Insertion Sort The PQ contains the elements in a sorted sequence. Each time insert () is called, the entire PQ must be searched in order to find the corresponding position. The running time is therefore O (n 2) here too! In-Place Insertion Sort Here, all elements are first inserted unsorted into the PQ and only then sorted by swapping the elements directly in the PQ until the order relation is correct. Also O (n 2)! In-Place Insertion-Sort insertionsort (t [] a) {int i, j, t; for (i = 1; i = 0; j--) {if (a [j]> a [(j + 1)]) {// swap (a [j + 1], a [j]) t = a [(j + 1)]; a [(j + 1)] = a [j]; a [j] = t; return a; Bubble sort bubblesort (t [] a) {boolean swapped = false; do {swapped = false; for (int i = 1; i a [i]) {swap (a [i-1], a [i]); swapped = true; while (swapped); Kälin Thomas, Abbot I 17 /

18 18. Adaptable Priority Queue The adaptable PQ extends the normal PQ by the ability to access specific entries of the PQ. With the classic PQ, access is only possible to the smallest entry (min ()) Operations E remove (e): Remove and return entry E k replacekey (e, k): Replaces key of an entry. Returns the old key. v replacevalue (e, v): Replaces the value of an entry. Returns the old value Localization of Entries In order to access specific entries, the entire data structure must be searched every time. 19. Heap ADT A heap is a binary tree that stores keys in its nodes. In addition, the condition must be met that a child's key is always greater than or the same as the key of its parent node. That means the deeper in the tree the value is, the higher it is. Only the minimum or the maximum is relevant. Complete binary tree A heap is a complete binary tree. This means that new nodes are always appended from left to right and lines that have been started are filled first before a new one is started. From this it follows that at the depths i = 0..h-1 there are always all nodes (2 ^ i). There is a maximum of one node with only one child that must be a left child. The last node of a complete binary tree is thus the rightmost node at level (depth) h. Height of the heap: h = lb (n). (n = number of nodes, x = floor = round off) Implementation of a priority queue using a heap Using a heap, we can implement a priority queue. We store an entry (key, value) in each node of the heap. Due to the condition that the lowest key must always be in the root element, access can be made using min () / removemin () in O (1). Inserting an element When inserting, the last node must be found first. The new element is inserted there. The heap property may then have to be created using Upheap. findinsertposition (w) {// w: last position if (isexternal (root)) {z = root; else {u = w; while (u! = root &&! isleftchild (u)) {u = u.parent; if (isleftchild (u)) {u = siblingof (u); while (! isexternal (u)) {u = u.leftchild; z = u; return z; Kälin Thomas, Abbot I 18 /

19 Upheap procedure (bubbeling) Programming 2 After inserting a new key k, the heap ordering property could be violated. The Upheap algorithm restores the order property by swapping the new key k along a path from the insertion node in the direction of the root with the node above it. Upheap is terminated as soon as the new key k has either reached the root or if the order relation is established. Because of the height of the heap, the method requires O (log (n)) time. upheapbubble (z) {z.setelement (key-element); // pair to be inserted done = false; while (! done) {if (z == root) {done = true; else {u = parent (z); if (key (z)

20 19.3. Implementation using a vector / array A heap with n nodes can be implemented with a vector of length n + 1. Left child is saved at position 2 * i, right child at 2 * i + 1. Insert () corresponds to inserting at position n + 1. removemin () corresponds to the removal at position 1. Sorting can be implemented using the usual methods (In-Place Sort) Construction of a heap Bottom-Up The bottom-up process is carried out in several phases, with the heap from the bottom (leaves) to the top (Root) is built. It takes O (n) time. 1) Select and place half of the elements (rounded). 2) Select half of the elements (rounded) and place them above the elements from step 1. 3) With the binary trees created in step 2, the heap property must now be restored using the downheap method. Repeat steps 2 and 3 until there are no more elements. 20. Map ADT A map models a searchable collection of key-value entries. Each key can only appear once. Use, for example, in an address book Operations v get (k): Returns value v for key k. Key not available: null. v put (k, v): sets value v for key k. Returns the old value at Key. if new entry: null v remove (k): Returns and removes value v for key k. if not available, null int size (): number of elements in the map bool isempty (): elements in the map? C keys (): Delivers iterable collection with all keys C values ​​(): Delivers iterable collection with all values ​​C entries (): Delivers iterable collection with all entries Implementation by means of linked list A MAP can be implemented efficiently using an unsorted list. The entries of the map can be saved in a list (or doubly linked list) in any order. When retrieving or inserting entries, the entire list is iterated and the key is searched for. Put (), get () and remove () therefore always take O (n) time. Sentinel trick With the Sentinel trick, the node you are looking for is ALWAYS inserted at the end of the list. We can therefore save the hasnext () query in the iteration loop, since we will surely find the value we are looking for at some point. After finding the value, we just need to check whether the node in question is the Sentinel or a real entry. 21. Dictionary ADT The dictionary ADT describes an unordered collection, consisting of key-value entries. In contrast to the map, the key can appear several times. Use, for example, for word definition pairs or DNS lists. Kälin Thomas, Abbot I 20 /

21 21.1. Operations Programming 2 E find (k): Returns the first entry with key k. If the key does not exist: null. C findall (k): Returns iterable collection of entries with the key k. E insert (k, v): Adds a new entry with key k / value v. Delivers new entry. E remove (e): Removes and returns Entry E. C entries (): Returns iterable collection of all entries. int size (): number of elements in the dictionary bool isempty (): elements in the dictionary? Implementation using a linked list We can implement a dictionary with an unsorted (double) linked list. Insert () needs O (1) time because we simply add new entries to the end of the list. Find (), remove () and findall () always require O (n) time, since the complete list has to be searched. Implementation by means of a map Entries of the type are stored in a map. All values ​​belonging to a key are stored in the set. Implementation by means of hash table Separate-Chaining: use one list per entry (hash entry) Binary Search If the dictionary is implemented using a sorted array, we can set the search time to O (log (n)) reduce. 1. Read out the value in the middle of the array. 2. Compare the search value with the value from step 1. If the search value is smaller, it continues with the left half of the array, otherwise with the right side. 3. We start again at step 1 with the rest of the array. Ordered dictionaries The keys follow a complete ordering relation: km <= kn first (): returns the first entry of the dictionary order last (): returns the last entry of the dictionary order successors (k): returns iterators with entries with keys greater than or equal to k, non-decreasing order predecessors (k): returns iterator over entries with keys less than or equal to k, non-decreasing order Search Table A search table is a dictionary, which is implemented using a sorted array. The entries of the dictionary are stored in an array-based sequence, sorted by key. They are only effective if the dictionary is small and special searches are carried out. find: O (log (n)), if binary search is used insert: O (n), since all entries may have to be shifted by one remove: O (n), like insert 22. Hash table implementation like Map ADT / Interface . Modulo before the line! In the hash tables, a key is mapped to an integer in a fixed interval by means of a hash function h. This integer (= hash value of the key) is used as an index in an array (table). Kälin Thomas, Abbot I 21 /

23 Programming changes with linear probing 2 We can define an object of the type AVAILABLE as a placeholder for deleted elements. remove (k): search for key k; if Entry (k, v) is found, replace it with AVAILABLE and return v, otherwise return zero put (k, v): if the table is full, throw an exception; start at cell i = h (k); Check all cells until: i is empty or contains AVAILABLE and store (k, v) in i, or N cells were searched unsuccessfully Double hashing (open addressing) The probing function (open addressing) is a double hash function whose value is linear to the first hash -Function is added hybrid (mixture of open and closed) defectors are stored in the hash table, but chained with references Each sector contains a data record and a hash address that points to a defector. Performance load factor (a = occupied cells / total Cells) determines time behavior. In practice, hashing is very fast if the load factor is not too high (a <0.8). Worst case: O (n) time to search, insert and delete. Optimal access in O (1) -> index! 23. Skip list Definition of a skip list A skip list consists of a series of lists (S 0 -S H). Each of these lists contains artificial starting (+) and ending (-) nodes. The bottom list (S 0) contains all keys in non-descending order, the top list (S H) only the start and end nodes Perfect Skip List Each list contains nodes in the middle of the intervals of the successor list. The perfect skip list is usually not used because the administrative effort is too great. Random Skip List Unlike the perfect skip list, the height for the individual list elements is determined randomly. The attempt is made to distribute the heights evenly and randomly across the list. The further up the list is, the fewer nodes it should contain. Java code SkipListNode A node in a skiplist represents a tower of the structure. Height can be read from the next array. class SkipListNode {int key; // Contains the key of the node SkipListNode [] next; // References to the next node of each level next: for 31 a SkipListeNode array with references to 34, 56 and - Kälin Thomas, Abbot I 23 /