What are partition-based algorithms

Programming 2 (Java)

Transcript

1 Programming 2 (Java) Summary v2.0 Kälin Thomas, Department I SS 06 Updated and supplemented by Raphael Horber, Department I FS 2013 Kälin Thomas, Abbot I 1 /

2 1.

3 14.1.

4 21.5.

5 29.

6 39.1. BFS Algorithm Applications DFS vs. BFS Directed Graph / Digraph Strong Connectivity Transitive Termination Floyd-Warshall's Algorithm Example Implementation Topological Sorting DAG Shortest Path Dijkstra's Algorithm Example Implementation Bellman-Ford Algorithm DAG-based Algorithm Minimum Spanning Tree Kruskal's Algorithm Example implementation Data structure for the algorithm Representation of a partition Partition-based implementation Prim-Jarnik s algorithm Baruvka s algorithm XML (Extensible Markup Language) Example attributes or child elements? Names CDATA Well-formed XML / Validity XML: Document Type Definition (DTD) Example (internal DTD) Example (external DTD) Specials Kälin Thomas, Abbot I 6 /

7 1. compareto compare (a, b): a return <0 a = b => return = 0 a> b => return> 0 2. Recursive method values ​​of the current parameters for which no recursive call was made are referred to as base cases or anchors. Every possible recursive call chain must reach a base case. Any recursive call should be defined to move execution towards the anchor. Example: factorial public static int factor (int n) {if (n == 0) {return 1; // anchoring else {return n * factor (n-1); 3. Iterative / Recursive / Explicit Iterative Recursive Explicit lin (n) = sum (i = 1; n; i) lin (1) = 1 lin (n) = n + lin (n-1) lin (n) = n * (n + 1) / 2 sqr (n) = sum (i = 1; n; i ^ 2) sqr (1) = 1 sqr (n) = n ^ 2 + sum (n-1) sqr (n) = n * (n + 1) * (2n + 1) / example series sequence: 1, 5, 9, 13, 17, ... Iterative: an = 1 + sum (i = 1; n-1; 4) Recursive: a 1 = 1; a n = 4 + a n -1; Explicit: a n = 4n-3 row: 1, 6, 15, 28, ... (sum of the sequence 1!) Iterative: s n = sum (i = 1; n; 4n-3); Recursive: s 1 = 1; s n = 4n-3 + s n-1; Explicitly: s n = n * (a1 + an) / 2 = n * (1 + 4n-3) / 2; // Use explicit formula // Use explicit formula // Explicit formula used in relation: recursion / induction The anchoring corresponds to the termination condition in the recursion and the induction step corresponds to a recursive call. One proves n + 1 by n and the induction step. This is also given in the recursion by calculating n + 1 from n. 4. Design Patterns A design pattern is a solution to a typical software design problem, which can be used in very different situations Structure of a design pattern Pattern name: Name of the pattern Problem section: Description of when to use the pattern Solution section: Description of the elements that make up the pattern. No implementation, just a general concept. Consequences section: List of advantages and disadvantages when using pattern 4.2. Object adapter / class adapter Kälin Thomas, Dept. I 7 / object adapter: class adapter: work with original object with class extend original class

8 5. Analysis of algorithms An algorithm is a step-by-step approach to solving a problem that takes a finite amount of time. Can be easily compared with a cooking recipe. Runtime behavior Most algorithms transfer inputs into outputs. The runtime of an algorithm typically increases with the size of the input. When analyzing the runtime behavior, we concentrate on the worst case, as this is comparatively easy to analyze. Experimental studies We write a program that implements the algorithm and run it with different, representative inputs. We record the runtimes with System.currentTimeMillis (), for example, and compare the results. Problem: The algorithm has to be implemented in full (expenditure of time!), The selection of the inputs is very critical and can lead to wrong conclusions. Of course, tests must always be carried out on the same hardware and software. Theoretical analysis Here, a high-level description of the algorithm (pseudocode) is used, no implementation. You are looking for a functional description (as a function of the input). This means that all possible inputs are taken into account, and the method is independent of HW and SW Important functions constant currentmax) then {currentmax <- A [i] increment counter i; return currentmax 1 -> n + 2 (n-1) + 2 (n-1) + 2 (n-1) + 1 = 7n-2 best case: conditions do not count; Worst case: Conditions count Asymptotic Algorithm Analysis 2 // Access = 1, Assignment = 1 1 + n // Assignment = 1, Compare = n, 2 (n-1) // Indexing and Comparison 2 (n-1) / / In the stupidest case, Zuw. + Index. 2 (n-1) // i = i + 1 !! This means determining the runtime behavior in the big-oh notation. To do this, we first look for the worst-case behavior as a function of primitive operations (7n-2). We describe this function in the big-oh notation (arraymay runs in O (n) time). Kälin Thomas, Abbot I 8 /

9 5.4. Recursive estimation 1. Assumption: T is quadratic 2. Insert parameter (part in brackets after T) with a * n 2 + b * n + c 3. Multiply 4. Divide into n 2, n 1, n 0 (parts with n 2 to a, with n 1 to b, with n 0 to c) 5. Calculate a using the anchoring Example T (n) = 16 * T (n / 4) 3n 15; T (4) = 21 2 .: 16 (an 2/16 + bn / 4 + c) 3n 15 3 .: 16an 2 / bn / 4 + 16c 3n 15 = an 2 + 4bn 3n + 16c 15 4 .: n 2: a = an 1: b = 4b 3 => 3b = -3 => b = 1 n 0: c = 16c 15 => -15c = -15 => c = 1 5 .: T (4) = 21 // n = 4 21 = an 2 + 4bn 3n + 16c = 16a = 16a = 16a => a = Big-Oh / Big-Omega / Big-Theta f (n) is O (g (n)) if a c > 0 and a no> = 1 exists such that: f (n) <= c * g (n) for n> = n 0 f (n) is Ω (g (n)) if a c> 0 and a no> = 1 exists such that: f (n)> = c * g (n) for n> = n 0 f (n) is Θ (g (n)) if a c> 0 and c 0 and a no > = 1 exists, so that: c * g (n) <= f (n) <= c * g (n) n> = n 0 The Big-Oh notation allows the specification of an upper limit for the growth function. The statement f (n) is O (g (n)) means: The growth function f (n) is limited above by g (n). f (n) is Ω (g (n)) if f (n) is asymptotically greater than or equal to g (n) (limited by below). f (n) is Θ (g (n)) if f (n) is asymptotically equal to g (n) (bounded above and below) Big-Oh rules omit all lower powers omit all constants omit the coefficient of the highest power way examples 7n-2 is O (n)> c = 7, no = 1 3n 3 + 20n 2 +5 is O (n 3)> c = 4, no = 21 3 * log (n) + 5 is O ( log (n))> c = 8, no = 2 5n 2 is Ω (n 2)> c = 5, no = 1 5n 2 is Θ (n 2)> c = 5, c = 5, no = required math log b (x * y) = log b (x) + log b (y) a (b + c) = ab * ac log b (x / y) = log b (x) log b (y) a (bc ) = ab / ac log b (x ^ a) = a * log b (x) a bc = (ab) c log b (a) = log x (a) / log x (b) b = a loga (b ) bc = a * c * loga (b) Kälin Thomas, Abbot I 9 /

10 5.6. Analysis log (n) Programming 2 In an interval of length n, n values ​​are saved. The interval is halved. There are now only n / 2 values ​​available. Now a sub-interval is halved again, so only n / 4 values ​​are available. Abort if there is only one value left. This behavior is typical for logarithms with O (log (n))! 6. Recursion Checking the anchoring (base cases). Every recursive call chain must ultimately lead to an anchoring. Each recursive call will typically lead in the direction of a base case. When programming recursive methods, it is important to define the methods so that the recursion becomes easy. End recursion End recursion occurs when the last statement in a linear recursive method is the recursive call. They can be transformed into a non-recursive algorithm. int sum (int n) {if (n> 0) return n + sum (n - 1); else return 0; int sum (int n, int s) {if (n == 0) return s; else return sum (n 1, n + s); 6.2. Binary recursion Binary recursion always occurs when two recursive calls are made in all non-terminal calls (ex: Fibonacci). These algorithms typically have a behavior of O (2 n)! 6.3. PuzzleSolve 7. Position ADT A position represents the concept of space in a data structure. A position can save an element, e.g. an array element or a node of a list Operation (exactly one) Object element (): returns the element of the position Kälin Thomas, Abbot I 10 /

11 8. Entry ADT An entry consists of a key-value pair of operations 9. Stack key (): returns the key value (): returns the value A stack stores any objects. Insertion and deletion are carried out according to the LIFO scheme. Use for: History in the web browser, undo function, method calls in the JVM Operations push (object): Insert an element Object pop (): Remove and return the topmost element. Object top (): Returns the top element without removing it. int size (): Returns the number of stored elements bool isempty (): Indicates whether there are any elements in the stack. Exceptions EmptyStackException is thrown in the case of an empty stack. The index of the top element is saved in a variable. In the event that the array is full, a FullStackException must be programmed (implementation-dependent exception!). Each operation takes O (1) time ParenthesesMatcher (Character [] par) {Stack stack = new Stack <> (); for (Character c: par) {// insert the opening bracket if (c.equals ('(') c.equals ('[') c.equals ('{')) {stack.push (c); / / closing bracket else {if (c.equals (')') c.equals (']') c.equals ('')) {if (stack.isempty ()) {System.err.println (number .. do not match!); return false; Character tmp = stack.pop (); if (tmp.equals ('(') &&! c.equals (')') tmp.equals ('[') &&! c.equals (']') tmp.equals ('{') &&! c. equals ('')) {System.err.println (chamber type does not match!); return false; if (stack.isempty ()) {return true; else {return false; Kälin Thomas, Abbot I 11 /

12 9.5. Span / Visibility: linear algorithm 10. Queue Spans (int [] x, int n) {int [] s = new int [n]; StackInterface a = new ArrayStack <> (n); Programming 2 for (int i = 0; i (10-3 + 5)% 10 = 2 elements available bool isempty (): (f == r) 11. Linked List A single linked list consists of a sequence of nodes. Each node has an element (content) and a link to the next node. The last node points to zero. Alternative: Define special headers and trailer elements, analogous to the node list operations addfirst (object): Insert element at the beginning of the list, create a new instance, points to the old head, head points to the new element. addlast (object): insert element at the end of the list new instance with next = null, tail: next = new, tail points to new element. Object removefirst (): Read element at the beginning of the list Object removelast (): Read element at the end of the list Applications Stack: A stack can be implemented with the help of a simply linked list. The top () element corresponds to the first node in the list. Memory consumption goes with O (n), each stack operation requires O (1) time. Queue: Can also be set up using a simply linked list. The first element of the queue (header) corresponds to the first node in the list. The end element of the queue corresponds to the last node in the list. Memory requirement is O (n), each operation takes O (1) time. Kälin Thomas, Abbot I 12 /

13 12. Array-List The Array-List ADT extends the concept of an array by allowing any objects to be saved as elements. An element can be inserted, deleted or accessed by specifying its index. Operations Element get (int i): Returns the element i without removing it. (O (1)). Element set (int i, element e): Replaces the element at position i with e and returns the old one. (O (1)). add (int i, element e): Inserts a new element at position i without overwriting the existing one. (O (n) because of the displacement of all elements). Element remove (int i): Remove i and return the element. (O (n)) int size (): size of the array list (O (1)) boolean isempty (): container empty? (O (1)) Exceptions An exception is thrown when access is outside the permitted range. Array-based implementation We use an array V of size N. A variable n stores the size of the vector. Operation get (i) can be implemented as an O (1) time operation: V [i]. With add (i, e) we first have to create space and move the elements backwards from position i. In the worst case (i = 0) we need O (n) time. In the operation remove (i) we have to fill in the resulting hole and move the elements forward. We need O (n) time again in the worst case. With circular usage, inserting and deleting at the end / beginning of the array only require O (1) time. Growing the array We can eliminate throwing an exception when inserting by letting the array grow. There are two strategies for this: Incremental strategy: We increase the array size by a constant c. Amortization time of O (n) Doubling strategy: We double the array every time. Amortization time of O (1) 13. Node-List The Node-List is closely related to the Linked-List. The difference is that it is doubly chained, i.e. each node has references to the previous (prev) and subsequent (next) element operations int size (), bool isempty (): Generic methods Node first (), Node last (): Access to first / last node setprev (node), setnext (node): Set previous / next node Node getprev (), Node getnext (): Get previous / next node setelement (e): Fill content of current node with element e e getelement (): get content from current node addbefore (node, e), addafter (node, e): insert element e before / after node. addfirst (e), addlast (e): Insert element at the beginning / end of the node list. remove (node): Remove element from the node list AddAfter / Remove Node addafter (p, e) {Object remove (p) {Node v = new Node (); Object tmp = p.getelement (); v.setelement (e); p.getprev (). setnext (p.getnext ()); v.setprev (p); p.getnext (). setprev (p.getprev ()); v.setnext (p.getnext ()); p.setprev (null); p.getnext (). setprev (v); p.setnext (null); p.setnext (v); return tmp; return v; Kälin Thomas, Abbot I 13 /

14 14. Sequence Summarizes the array and node lists. Access possible via index or position. Generic replacement for stack, queue, array or node list operations int size (), bool isempty (): Generic methods // Operations of the array list (e get (i), set (i, e),) // Operations of the node list (Node first (), Node last (), Node getprev (),) // Bridge operations: Node atindex (i), i indexof (node) Implementation with linked list For a graphic of the implementation, see Node List Nodes implement position and save element, link to predecessor / successor. There is a special header and trailer node. Index-based accesses always search (counting the steps) from the beginning / end of the list and are therefore linear in time (O (n)). Array-based implementation Each array entry contains a reference to a position. These contain the index and a reference to the element. Variables F and L save the first and last occupied position. 15. Iterator An iterator abstracts the process of scanning over a collection of elements. Typically, an iterator is used in conjunction with another ADT. The ADT in question is supplemented by an Iterator iterator () method. Implementation: with snapshot or dynamic operations bool hasnext (): Still other elements available? e next (): Returns the next element For / Iterator For is used with arrays, an iterator for complex data structures (list).Here's another example of a For-Each loop: String arrvalues ​​[] = {That's, crap. ; for (String strvalue: arrvalues) {// strvalue contains in sequence all elements of arrvalues. 16. Trees / Trees In computer science, trees represent abstract, hierarchical data structures Tree Terminology Root: Node without parent node (A) Internal node: Node with at least one child (A, B, F, C) External node (leaf) : Knot without a child. (E, I, J, K, G, H, D) Predecessor: Node of a higher level (B is the predecessor of E) Successor: Node of a lower level (E is the successor / child of B) Depth of a node: Number of predecessors ( Depth A: 0, B: 1, E: 2, K: 3) Height of a node: Maximum number of successors down to one leaf (height B: 2) (leaf: 0, internal node: 1 + max (height of successors )) Height of the tree: height of the root (height A: 3) Subtree: tree made up of a node and its successors Sibling: sibling node Kälin Thomas, Abbot I 14 /

15 16.2. Operations of the Tree ADT Programming 2 int size (): Number of nodes in the tree bool isempty (): Nodes contained? Iterator iterator (): Iterator over the elements (contents) IterableCollection positions (): Iterator over the nodes Node root (): Returns the root node Node parent (node): Returns the parent node of a node IterableCollection children (node ): Iterator over all successors of a Kn bool isinternal (knoten): Is this node internal? bool isexternal (node): is this node a leaf? bool isroot (node): is this node the root? E replace (node, e): Replace element of a node. Returns old E! Traversing Preorder In the preorder traversal, a node is visited BEFORE its successors. Application: Printing a structured document. Example: A, B, E, F, I, J, K, C, G, H, D (italics: leaves) visit (v); for each child w of v preorder (w); Post order In a post order traversal, a node is visited AFTER its successors. Application: Specification of the used memory in a directory. Example: E, I, J, K, F, B, G, H, C, D, A (italics: leaves) for each child w of v postorder (w); visit (v); Inorder (binary trees) In an inorder traversal, a node is visited AFTER its left subtree and BEFORE its right subtree. ONLY works with binary trees. Application: Representation of binary trees. Example WITHOUT sheet J and D: E, B, I, F, K, A, G, C, H (read from left to right). if hasleft (v) inorder (left (v)); visit (v); if hasright (v) inorder (right (v)); Euler Tour Generic traversal of BINARY trees. Each node is visited 3 times: once from the left (preorder), once from below (inorder), once from the right (postorder). Example WITHOUT sheet J and D: LA, LB, LE, UE, RE, UB, LF, LI, UI, RI, UF, LK, UK, RK, RF, RB, UA, LC Binary trees A binary tree is a tree with the following properties: Each node has a maximum of two successors o Exactly two successors are referred to as a REAL binary tree The children of a node are an ordered pair (left child / right child) Use: o Arithmetic expressions (internal nodes are operators, external nodes are the Operands) o Decision-making processes (internal nodes are questions, external nodes are the decisions yes / no) Kälin Thomas, Abbot I 15 /

16 Additional Operations Position left (v), Position right (v) // Position = node bool hasleft (v), bool hasright (v) Properties of real binary trees n = node e = i + 1 e <= 2 he = external node n = 2e 1 h> = log 2 (e) i = internal nodes h <= ih> = log 2 (n + 1) 1 h = height h <= (n 1) / implementations of binary trees linked list Each node contains a reference on the element (position ADT), the parent node and the two child nodes. A general storage method for normal trees can easily be derived from this node by replacing the two references to the child accounts with a reference to a sequence of children. Array-based The nodes are stored in an array. Index 0 is left empty! Root receives index 1. Left child = 2 * index of the parent element Right child = 2 * index of the parent element Template Method Pattern Intention: Define a body (skeleton) of an algorithm, whereby some steps are only specified in subclasses. The Template Method pattern allows subclasses to refine parts of the algorithm without changing the structure of the algorithm. Problem: Two different components are very similar, but have no interface and no implementation in common. If a change is necessary, the effort would have to be made several times. Solution: The component designer has to decide which parts of the algorithm are immutable and which can be customized. The common parts are implemented in an abstract class, the variable parts in a default implementation or not at all. Description of binary trees Only one node type. Disadvantage: Inner and outer nodes can only be differentiated using left / right queries. Two types of nodes. Disadvantage: tree is not modeled optimally. Safe Composite Pattern: A node interface. Two implementing nodes (leaf and inner nodes). Allows review at compile time. Transparent Composite Pattern: An abstract base knot. Two implementing nodes (leaf and inner nodes). Needs exceptions. Kälin Thomas, Abbot I 16 /

17 17. Priority Queue ADT A priority queue stores a collection of entries. Each entry consists of a key-value pair, whereby the same key can appear several times. The key can be any object on which an order relation is defined Operations void insert (k, v): Add a new entry with key k and value v E removemin (): Returns & removes the entry with the smallest key E min (): Returns the entry with the smallest key without removing it int size (): Number of elements in the PQ bool isempty (): Elements available in the PQ? Implementation using unsorted list Insert takes O (1) time, since it is inserted at the end. Queries require O (n) time, since the whole sequence has to be run through to find the minimum. Selection Sort The PQ contains the elements in an unsorted sequence. With every removemin () operation we have to look for the smallest key. Therefore the running time is O (n 2)! Implementation using sorted list Insert takes O (n) time, since the insertion point has to be searched for. Queries required O (1) time for this, since the smallest key is always at the beginning of the list. Insertion Sort The PQ contains the elements in a sorted sequence. Each time insert () is called, the entire PQ must be searched in order to find the corresponding position. The running time is therefore O (n 2) here too! In-Place Insertion Sort Here, all elements are first inserted unsorted into the PQ and only then sorted by swapping the elements directly in the PQ until the order relation is correct. Also O (n 2)! In-Place Insertion-Sort insertionsort (t [] a) {int i, j, t; for (i = 1; i = 0; j--) {if (a [j]> a [(j + 1)]) {// swap (a [j + 1], a [j]) t = a [(j + 1)]; a [(j + 1)] = a [j]; a [j] = t; return a; Bubble sort bubblesort (t [] a) {boolean swapped = false; do {swapped = false; for (int i = 1; i a [i]) {swap (a [i-1], a [i]); swapped = true; while (swapped); Kälin Thomas, Abbot I 17 /

18 18. Adaptable Priority Queue The adaptable PQ extends the normal PQ by the ability to access specific entries of the PQ. With the classic PQ, access is only possible to the smallest entry (min ()) Operations E remove (e): Remove and return entry E k replacekey (e, k): Replaces key of an entry. Returns the old key. v replacevalue (e, v): Replaces the value of an entry. Returns the old value Localization of Entries In order to access specific entries, the entire data structure must be searched every time. 19. Heap ADT A heap is a binary tree that stores keys in its nodes. In addition, the condition must be met that a child's key is always greater than or the same as the key of its parent node. That means the deeper in the tree the value is, the higher it is. Only the minimum or the maximum is relevant. Complete binary tree A heap is a complete binary tree. This means that new nodes are always appended from left to right and lines that have been started are filled first before a new one is started. From this it follows that at the depths i = 0..h-1 there are always all nodes (2 ^ i). There is a maximum of one node with only one child that must be a left child. The last node of a complete binary tree is thus the rightmost node at level (depth) h. Height of the heap: h = lb (n). (n = number of nodes, x = floor = round off) Implementation of a priority queue using a heap Using a heap, we can implement a priority queue. We store an entry (key, value) in each node of the heap. Due to the condition that the lowest key must always be in the root element, access can be made using min () / removemin () in O (1). Inserting an element When inserting, the last node must be found first. The new element is inserted there. The heap property may then have to be created using Upheap. findinsertposition (w) {// w: last position if (isexternal (root)) {z = root; else {u = w; while (u! = root &&! isleftchild (u)) {u = u.parent; if (isleftchild (u)) {u = siblingof (u); while (! isexternal (u)) {u = u.leftchild; z = u; return z; Kälin Thomas, Abbot I 18 /

19 Upheap procedure (bubbeling) Programming 2 After inserting a new key k, the heap ordering property could be violated. The Upheap algorithm restores the order property by swapping the new key k along a path from the insertion node in the direction of the root with the node above it. Upheap is terminated as soon as the new key k has either reached the root or if the order relation is established. Because of the height of the heap, the method requires O (log (n)) time. upheapbubble (z) {z.setelement (key-element); // pair to be inserted done = false; while (! done) {if (z == root) {done = true; else {u = parent (z); if (key (z)

20 19.3. Implementation using a vector / array A heap with n nodes can be implemented with a vector of length n + 1. Left child is saved at position 2 * i, right child at 2 * i + 1. Insert () corresponds to inserting at position n + 1. removemin () corresponds to the removal at position 1. Sorting can be implemented using the usual methods (In-Place Sort) Construction of a heap Bottom-Up The bottom-up process is carried out in several phases, with the heap from the bottom (leaves) to the top (Root) is built. It takes O (n) time. 1) Select and place half of the elements (rounded). 2) Select half of the elements (rounded) and place them above the elements from step 1. 3) With the binary trees created in step 2, the heap property must now be restored using the downheap method. Repeat steps 2 and 3 until there are no more elements. 20. Map ADT A map models a searchable collection of key-value entries. Each key can only appear once. Use, for example, in an address book Operations v get (k): Returns value v for key k. Key not available: null. v put (k, v): sets value v for key k. Returns the old value at Key. if new entry: null v remove (k): Returns and removes value v for key k. if not available, null int size (): number of elements in the map bool isempty (): elements in the map? C keys (): Delivers iterable collection with all keys C values ​​(): Delivers iterable collection with all values ​​C entries (): Delivers iterable collection with all entries Implementation by means of linked list A MAP can be implemented efficiently using an unsorted list. The entries of the map can be saved in a list (or doubly linked list) in any order. When retrieving or inserting entries, the entire list is iterated and the key is searched for. Put (), get () and remove () therefore always take O (n) time. Sentinel trick With the Sentinel trick, the node you are looking for is ALWAYS inserted at the end of the list. We can therefore save the hasnext () query in the iteration loop, since we will surely find the value we are looking for at some point. After finding the value, we just need to check whether the node in question is the Sentinel or a real entry. 21. Dictionary ADT The dictionary ADT describes an unordered collection, consisting of key-value entries. In contrast to the map, the key can appear several times. Use, for example, for word definition pairs or DNS lists. Kälin Thomas, Abbot I 20 /

21 21.1. Operations Programming 2 E find (k): Returns the first entry with key k. If the key does not exist: null. C findall (k): Returns iterable collection of entries with the key k. E insert (k, v): Adds a new entry with key k / value v. Delivers new entry. E remove (e): Removes and returns Entry E. C entries (): Returns iterable collection of all entries. int size (): number of elements in the dictionary bool isempty (): elements in the dictionary? Implementation using a linked list We can implement a dictionary with an unsorted (double) linked list. Insert () needs O (1) time because we simply add new entries to the end of the list. Find (), remove () and findall () always require O (n) time, since the complete list has to be searched. Implementation by means of a map Entries of the type are stored in a map. All values ​​belonging to a key are stored in the set. Implementation by means of hash table Separate-Chaining: use one list per entry (hash entry) Binary Search If the dictionary is implemented using a sorted array, we can set the search time to O (log (n)) reduce. 1. Read out the value in the middle of the array. 2. Compare the search value with the value from step 1. If the search value is smaller, it continues with the left half of the array, otherwise with the right side. 3. We start again at step 1 with the rest of the array. Ordered dictionaries The keys follow a complete ordering relation: km <= kn first (): returns the first entry of the dictionary order last (): returns the last entry of the dictionary order successors (k): returns iterators with entries with keys greater than or equal to k, non-decreasing order predecessors (k): returns iterator over entries with keys less than or equal to k, non-decreasing order Search Table A search table is a dictionary, which is implemented using a sorted array. The entries of the dictionary are stored in an array-based sequence, sorted by key. They are only effective if the dictionary is small and special searches are carried out. find: O (log (n)), if binary search is used insert: O (n), since all entries may have to be shifted by one remove: O (n), like insert 22. Hash table implementation like Map ADT / Interface . Modulo before the line! In the hash tables, a key is mapped to an integer in a fixed interval by means of a hash function h. This integer (= hash value of the key) is used as an index in an array (table). Kälin Thomas, Abbot I 21 /

22 22.1. Programming hash functions 2 A hash function usually consists of two parts, the hash function and the compression function. The aim of the hash function is to distribute the keys as randomly as possible. The task of the compression function is to transform the generated key into a fixed interval. Hash function Basic function: h 1 = Key -> Integer A hash function is called perfect if there are no collisions.Horner scheme: The aim of the Horner scheme is to reduce the number of arithmetic operations. The running time is reduced to O (n). // Basic pattern Horner scheme P (z) = a 0 * z 0 + a 1 * z 1 + a 2 * z 2 + a 3 * z 3 = a 0 + z (a 1 + z (a 2 + z ( a 3))) // Example binary numbers P (1101) = 1 * * * * 2 3 = 13 // 3 Add, 7 Mul = 1 + 2 (0 + 2 (1 + 2 (1))) = 13 / / 2 Add, 3 Mul Compression function Basic function: Example Modulo: h 2 = Integer -> [0, N-1] h 2 = y% N or h 2 = (a * y + b)% N // Mulitply, Add, Divide A typical compression function is modulo division. A prime number is often chosen as the size of the hash table, as this allows the hash function to be selected more optimally. Collision handling Closed addressing (open hash method) With closed addressing, each cell in the table points to a list. This is also called separate chaining. The advantage is that the chaining is easy. On the other hand, the disadvantage is that the concatenation requires storage space outside the table. Open addressing (closed hash method) With open addressing, a location near the occupied cell is sought for colliding elements (defectors) using a probing function. Each cell that is inspected is called a probe, hence the name probing. (Modulo before the line!) Linear probing: Search linear forwards for free space => +1, +2, +3, Linear negative probing: Linear backwards search for free space => -1, -2, -3, Square probing: Search square forwards for free space => +1, +4, +9, alternating probing: linear alternating search for free space => -1, +2, -3, alternating quadr. Probing: Squares alternately looking for vacancies => -1, +4, -9, random: with random function Open addressing requires special handling of the delete operations. If a data record is to be deleted, this can interrupt the probing sequence for another data record. Data records to be deleted may therefore not be physically deleted, but only marked as deleted. Kälin Thomas, Abbot I 22 /

23 Programming changes with linear probing 2 We can define an object of the type AVAILABLE as a placeholder for deleted elements. remove (k): search for key k; if Entry (k, v) is found, replace it with AVAILABLE and return v, otherwise return zero put (k, v): if the table is full, throw an exception; start at cell i = h (k); Check all cells until: i is empty or contains AVAILABLE and store (k, v) in i, or N cells were searched unsuccessfully Double hashing (open addressing) The probing function (open addressing) is a double hash function whose value is linear to the first hash -Function is added hybrid (mixture of open and closed) defectors are stored in the hash table, but chained with references Each sector contains a data record and a hash address that points to a defector. Performance load factor (a = occupied cells / total Cells) determines time behavior. In practice, hashing is very fast if the load factor is not too high (a <0.8). Worst case: O (n) time to search, insert and delete. Optimal access in O (1) -> index! 23. Skip list Definition of a skip list A skip list consists of a series of lists (S 0 -S H). Each of these lists contains artificial starting (+) and ending (-) nodes. The bottom list (S 0) contains all keys in non-descending order, the top list (S H) only the start and end nodes Perfect Skip List Each list contains nodes in the middle of the intervals of the successor list. The perfect skip list is usually not used because the administrative effort is too great. Random Skip List Unlike the perfect skip list, the height for the individual list elements is determined randomly. The attempt is made to distribute the heights evenly and randomly across the list. The further up the list is, the fewer nodes it should contain. Java code SkipListNode A node in a skiplist represents a tower of the structure. Height can be read from the next array. class SkipListNode {int key; // Contains the key of the node SkipListNode [] next; // References to the next node of each level next: for 31 a SkipListeNode array with references to 34, 56 and - Kälin Thomas, Abbot I 23 /

24 SkipList class SkipList {int maxheight; int height; SkipListNode head; SkipListNode tail; SkipListNode [] update; // Maximum height of the list // Current height of the list // Head of the list // End of the list // Auxiliary array, see insert SkipList () {maxheight = 5; // height goes from 0-5! -> 6 levels! height = 0; update = new SkipListNode [maxHeight + 1]; head = new SkipListNode (Integer.MIN_VALUE, maxHeight); tail = new SkipListNode (Integer.MAX_VALUE, 0); // Create basic links for (int i = 0; i <= maxheight; ++ i) {head.next [i] = tail; Search search (int key) {// We start in the head at the highest level SkipListNode p = head; for (int i = height; i> = 0; i--) {// go through individual levels while (p.next [i] .key [] update = ListElement.createArray (newHeight + 1); // Find the update node ListElement aktelement = header; for (int i = newheight; i> = 0; i--) {while (actelement.forward [i] .key MAX_LEVEL) {ListElement oldhead = header; header = new ListElement <> (newHeight, minkey, oldhead.value); Kälin Thomas, Abbot I 24 /

25 for (int i = 0; i <= MAX_LEVEL; i ++) {header.forward [i] = oldhead.forward [i]; for (int i = MAX_LEVEL + 1; i <= newheight; i ++) {header.forward [i] = tail; MAX_LEVEL = newheight; for (int i = highest level; i> = 0 && update [i] == oldhead; i--) {update [i] = header; Programming 2 if (newheight> highestlevel) {for (int i = highestlevel + 1; i <= newheight; i ++) {update [i] = header; highestlevel = newheight; actelement = new ListElement <> (newHeight, key, value); for (int i = 0; i <= newheight; i ++) {actelement.forward [i] = update [i] .forward [i]; update [i] .forward [i] = actelement; return null; Delete public V remove (k key) {ListElement [] update = ListElement.createArray (highestLevel + 1); ListElement actelement = header; for (int i = highestlevel; i> = 0; i--) {while (p.forward [i] .key.compareto (key) <0) {actelement = actelement.forward [i]; update [i] = act element; actelement = actelement.forward [0]; if (aktelement.key! = key) {System.out.println ("Key '" + key + "' not available!"); return null; // remove element from the levels for (int i = 0; i = 0 && header.forward [highestlevel] == tail) {highestlevel--; // if the height has shrunk too much if (4 * highestlevel <= MAX_LEVEL) {MAX_LEVEL = 2 * highestlevel; ListElement oldhead = header; header = new ListElement <> (MAX_LEVEL, minkey, oldhead.value); for (int i = 0; i <= MAX_LEVEL; i ++) {header.forward [i] = oldhead.forward [i]; Kälin Thomas, Abbot I 25 /

26 random number generator int randheight () {// probability is 0.5 i height = 0; while (rand ()% 2 == 0 && height Node.Key) {return TreeSearch (Key, Node.rightChild ()); return node; // Current node is wanted element! Kälin Thomas, Abbot I 26 /

27 24.4. Inserting Programming 2 We proceed in the same way as the search. However, the isexternal () query does not return a zero, but generates a new node (with 2 leaves) matching the key. If it is a dictionary and k is available, search on the left to the sheet and insert. void insert (int k) {root = insert (root, k); Insert node (node ​​p, int k) {if (p == null) {return new node (k); else if (k p.key) {p.rightson = insert (p.rightson, k); return p; Deletion When deleting we have to distinguish 3 cases. The node v with key k is: A node with two leaves (Ex .: 1). Deleting is easy here. The node can be removed and replaced with null. (Create a leaf) A node with a leaf (Ex .: 9). Here the node to be deleted can be replaced by the child node that is not a leaf. A knot without leaves (ex .: 2). This is a little more complex. The node to be deleted must be replaced by the node that would follow next in the inorder traversal (see auxiliary method). The node that takes the place of the node to be deleted must then run through the delete algorithm again. (Delete leaves) remove void (int k) {root = remove (root, k); Remove node (node ​​p, int k) {if (p == null) {return p; if (k p.key) {p.rightson = remove (p.rightson, k); else {if (p.leftson == null) {return p.rightson; if (p.rightson == null) {return p.leftson; Node q = SymNach (p); // see symmetrical successor if (q == p) {p.key = p.rightson.key; q.rightson = q.rightson.rightson; else {p.key = q.leftson.key; q.leftson = q.leftson.rightson; return p; Symetric successor Auxiliary method for finding the inorder successor. Node SymNach (node ​​p) {if (p.rightson.leftson! = Null) {p = p.rightson; while (p.leftson.leftson! = null) {p = p.leftson; return p; Kälin Thomas, Abbot I 27 /

28 June 24. Inorder method The auxiliary method traverses an inorder through void Inorder (node ​​p) {if (p.leftson! = Null) {Inorder (p.leftson); System.out.print ((+ p.getkey () +)); if (p.rightson! = null) {Inorder (p.rightson); 25. AVL tree (balanced binary search tree) An AVL tree is a binary search tree that tries to balance the distribution of the nodes. Therefore, for every internal node v of T, the height of the children of v must differ by at most 1. The following applies to the height of a tree: O (log (n)) Balance B (node) = Height (left) Height (right) If, after inserting a new node, B (K)> = 2, the tree must be restructured. The process of equalization is called rotation height height node: 1 + max (height left, height right); // Leaves do not count Insert Insertion is the same as with the binary search tree. An external node is therefore always expanded. After the insertion, the node must be traversed in the direction of the root. The balance must be recalculated for each predecessor. If the tree is no longer balanced, an adjustment must be made. In this case we hike up the new tree until we find the first node x whose grandparent node z is unbalanced Rotations Single Rotation Left rotation (zag) Right rotation (zig) BinaryNode rotatewithleftchild (binarynode k2) {// Right -Rotation BinaryNode k1 = k2.left; k2.left = k1.right; k1.right = k2; return k1 BinaryNode rotatewithrightchild (binarynode k1) {// left rotation BinaryNode k2 = k1.right; k1.right = k2.left; k2.left = k1; return k2; Kälin Thomas, Abbot I 28 /

29 Double rotations doublerotatewithrightchild doublerotatewithrightchild Cut / Link Restructuring Algorithm The simplest method of balancing is the cut / link algorithm. I will explain this using an example. The tree opposite is obviously unbalanced. Let's say we just inserted 88. 1. Find unbalanced node from (88) towards the root: (44) = Z 2. Child of Z with greater height: (62) = Y 3. Child of Y with greater height: (78) = X 4. T0-T3 define (remaining sub-trees!). Numbering from LnR! 5. Name nodes X-Z with A-C according to the inorder order. We get the tree shown on the left. When programming, these 7 elements are stored in an array according to the following scheme. T0 A T1 B T2 C T3 The new tree can now be drawn from this pattern according to the following pattern: Here we can now see the completely restructured tree. This method leads to the same result as the normal (double) rotations! The restructuring can possibly generate a new imbalance. Therefore it is essential to continue checking to the root! Deletion Deletion is based on the same principle as with the binary search trees. Of course, the balance of the tree has to be checked again. Runtimes A single restructuring takes O (1) time, using a linked binary tree find (), insert () and delete () are O (log (n)), since each Action must first be sought through the tree, which is determined by the height of the tree (log (n)). Kälin Thomas, Abbot I 29 /

30 25.8. Implementation // AVL tree needs a node class public class AVLItem extends Item {int height; AVLItem (Object k, Object e, int h) {super (k, e); height = j; public int height () {return height; public int setheight (int h) {int oldheight = height; height = h; return oldheight; public class AVLTree extends BinarySearchTree implements Dictionary {public SimpleAVLTree (Comparator c) {super (c); T = new RestructurableNodeBinaryTree (); private int height (position p) {if (T.isExternal (p)) {return 0; else {return ((AVLItem) p.element ()). height (); Programming 2 private void setheight (position p) {// called only if p is internal ((AVLItem) p.element (). Setheight (1 + Math.max (height (T.leftChild (p)), height (t. rightchild (p)))); private boolean isbalanced (position p) {// test wheter node p has balanced factor between -1 and 1 int bf = height (t.leftchild (p)) height (t.rightchild (p) ); return ((-1 <= bf) && (bf <= 1)); private position tallerchild (position p) {// return a child of p with height no smaller than that of the other child if (height (t .leftchild (p)> = height. (t.rightchild (p)) {return T.leftChild (p); else {return T.rightChild (p); private void rebalance (position zpos) {// traverse the path of T from zpos to the root; foreach node encountered // recomputed its height and, if it is unbalanced, perform a rotation while (! T.isRoot (zPos)) {zpos = T.parent (zPos); setheight (zpos); if (! isbalanced (zpos)) {// perform rotation position xpos = tallerchild (tallerchild (zpos)); zpos = ((RestructurableNodeBinaryTree) T). restructure (xPos ); setheight (t.leftchild (zpos)); setheight (t.rightchild (zpos)); setheight (zpos); Kälin Thomas, Abbot I 30 /

31