Is DNA a Turing machine

DNA implementation of a Turing machine


1 For the seminar Unconventional Computers the elaboration on the subject of biocomputers with a focus on DNA implementation of a Turing machine by Gerald Winter Technical University of Braunschweig July 2001

2 Contents 1 Overview biocomputer example amorphous computer example DNA computer example experiments with leeches neurons example programming of cells DNA Turing machine Basics of DNA Basics of the Turing machine Class IIS restriction endonucleases Basics operations with class IIS restriction endonucleases Conversion of a Turing machine to DNA Coding of the ribbon symbols Coding of the writing - / read head Coding of the states Coding of the transfer functions Sequence of a Turing step Usable DNA sequences A universal 4 x 7 Turing machine Evaluation Concluding remarks References

3 1 Overview of biocomputers When searching for a biocomputer on the Internet, you quickly discover that there are a large number of ideas about this topic. The following definition of the term biocomputer therefore makes sense: Biocomputer = Computer using biological processes, see [1] The human brain is also referred to in some documents as a biocomputer due to its enormous performance. A few typical research examples are briefly outlined here before the sub-area of ​​the implementation of a Turing machine based on DNA is discussed. 1.1 Example of amorphous computers Amorphous computer systems consist of irregularly placed, asynchronously working, locally interacting, programmable particles. These particles can be biological cells. On the basis of so-called growing points, which are realized, for example, by the outflow of messenger substances, the cells can differentiate and form complex structures. The MIT Artificial Intelligence Laboratory is particularly concerned with this topic, see [2]. The key points are the extremely high reliability, the simple possibility of duplicating programmed Figure 1: Simulation of an amorphous computer, source [2] cells, the high storage potential and the low maintenance and operating costs. Amorphous computers will be covered in this seminar. 1.2 Example of a DNA computer With DNA molecules, some combinatorial tasks can be solved, as already shown in this seminar. Efforts are also underway to develop a universal, i.e. versatile, DNA computer. The focus of this lecture is to be assigned to this aspect. 1.3 Example experiments with leech neurons Professor Ditto, University of Georgia Tech, carried out experiments with leech neurons, see [3]. By independently linking the nerve cells, it was previously possible to carry out simple calculations such as five plus three. The aim is to create a flexible computer that independently searches for a way to solve problems. Figure 2: Sticky end of a DNA strand Figure 3: Experiment with leech neurons, source [3] 3

4 1.4 Example of programming cells Ron Weiss, also an employee at the MIT Artificial Intelligence Laboratory, suggested a way of programming cells, see [4]. Several inverters are implemented in one cell, which can work together to create complex gates. The signals are then implemented by special proteins. Figure 4: Gate generated by an inverter, source [4] 2 DNA Turing machine At the beginning, the basics of the DNA and Turing machine are repeated, and then the possible realization of a DNA Turing machine according to Paul WK Rothemund A DNA and restriction enzyme implementation of Turing machines is discussed, see [5]. This is followed by some information regarding the actual feasibility. 2.1 Basics of DNA The backbone of DNA consists of two ribbons, called the sugar-phosphate part, which are connected by base pairs. Only the base combinations adenine thymine and cytosine guanine exist. One speaks of a plasmid when the DNA molecule is present as a ring. There are numerous ways to manipulate DNA with the help of enzymes. For the model of the Turing machine presented here, for example, a certain class of restriction enzymes play a major role, as will be shown in more detail later. Figure 5: Sketches of the DNA, source [6] 2.2 Basics of the Turing machine There are various definitions of a Turing machine, one of which can be represented formally as a seventh tuple: TM = (Q, Σ, Γ, δ, q 0, B, F), q 0 Q, B Γ, Σ Γ, FQ, δ: (Q x Γ) P (Q x Γ x {L, R}) Q = finite set of states, Σ = alphabet, Γ = ribbon alphabet, δ = finite set of Transitions, q 0 = start state, B = blank symbol, F = set of final states Translated, this is a device that has a tape as a storage medium and a read / write head. In every Turing step the head reads a symbol and depending on the state the head overwrites this symbol, makes a movement to the left or right and changes its state. The behavior is defined in a transfer table. Church's thesis states that every calculable problem can also be calculated using Turing machines. So-called universal Turing machines also exist, which receive another (binary-coded) Turing machine as input, which is then simulated. Such a universal Turin machine will be presented later. 4th

5 2.3 Class IIS restriction endonucleases basics Restriction endonucleases are enzymes that can cut a strand of DNA. They do this at a certain distance from a special nucleotide sequence, which is therefore called a restriction site or recognition region. Figure 6: Sketch of the function of a class IIS restriction endonuclease, source: [5] The cut is not made in a straight line, but always creates an overhang of a certain length. This overhang is called the sticky end because it can easily reconnect with a complementary overhang. The term complementary is used to describe sticky ends that fit together precisely due to the sequence of their bases and when they are joined together again form correct base pairs, see Figure 2. The joining of two DNA strands can be caused by the enzyme ligase. An example is shown in the graphic above, where N {A, C, T, G}. This has the great advantage of being able to produce different sticky ends with just one enzyme of this type, depending on what the N currently stand for Operations by class IIS restriction endonucleases Figure 7: Insertion of an aligned DNA fragment, source: [5] The presented enzymes result in several operations. These are carried out on plasmids. The figure above illustrates the separation of the plasmid by the class IIS restriction enzyme Fok I, which results in the sticky ends mentioned. A directed DNA fragment can now be added if it has the appropriate sticky ends. Ligase recombines the molecules into a plasmid. Figure 8: Deleting a fragment, source: [5] The same properties can also be used to delete a fragment. Here two detection regions indicate the interfaces. 5

6 Figure 9: Replacing a fragment, source: [5] In this example, the DNA fragment C is replaced with D. Note that in this case, after the cut by Fok I, the sticky ends (of the previous plasmid) are not self-complementary and so cannot connect. Only the addition of the fragment D with the corresponding ends can a plasmid arise again with ligase. Figure 10: Deleting a fragment with self-removal, source: [5] The detection regions can also delete themselves, as outlined above. The plasmid is restored without having any recognition regions. Figure 11: Replacing an aligned DNA fragment, source: [5] This operation is a combination of the second and third operations already presented. 6 Figure 12: Movement of a sequence through a DNA strand, source: [5]

7 The last illustration shows the possibility of running a head sequence through a DNA strand and modifying it in the process. The enzyme first removes this head sequence, as it has corresponding recognition regions, including the piece of DNA to be modified. The new DNA fragment again contains a head sequence and the modified DNA sequence. The separation areas are again chosen so that the plasmid cannot combine without a fragment. This already resembles the head movement of a Turing machine. 2.4 Conversion of a Turing machine to DNA The following describes how the methods already mentioned can be used to create a Turing machine. Coding of the ribbon symbols Figure 13: Coding of the ribbon symbols, source: [5] A ribbon symbol consists of two different halves (A ). Each symbol is surrounded by two short sequences called invariants. Since these are not identical, they are denoted by the letters L for left and R for right (B). The tape of the Turing machine is created by stringing such symbols together (C) Coding of the read / write head Figure 14: Coding of a head with the current symbol black, source: [5] As was already shown in the last operation of the class IIS restriction enzymes the head of the Turing machine realized in the band. Two asymmetrical recognition regions standing back to back represent the head. The state cutter, named after the state q1 in Figure 14, points to the current symbol. The second recognition region Inv always refers to an invariant. The two examples shown in Figure 14 represent the same situation, but the head is sometimes to the right and sometimes to the left of the current symbol. This can be explained by the last movement of the head: if the head is to the right of the current symbol, the last movement of the head was to the left. The same also applies vice versa. 7th

8 2.4.3 Coding of the states Figure 15: Coding of different states by varying the length, source: [5] The length of the space between the recognition region of the state cutter (Fok I) and the current symbol determines the state of the Turing machine. Let W be half of a symbol here. By varying the length, the State Cutter Fok I creates sticky ends that differ from one another. A suitable selection of the DNA sequences for the symbols allows unique sticky ends to be generated for each state-symbol combination, as should be clear in Figure 16. Figure 16: There is a unique sticky end for each state-symbol combination, source: [5] Coding of the transfer functions To code the transfer functions, oligonucleotides (short DNA sequences) are used, which are used due to the clear sticky ends described above. Basically, these oligonucleotides can be divided into five areas: 8

9 Figure 17: five areas of a transfer oligonucleotide, source: [5] - Coh is the specific counterpart of a sticky end of a state-symbol combination. This oligonucleotide is therefore only used with the appropriate status and symbol. Coh is generated by the recognition region Em = End Marker. This is made possible by a restriction enzyme that uses the same cutting pattern as the State Cutter, but has a different recognition region. This is necessary to allow gradual reactions to take place. - The new head is again defined by the State Cutter and Inv recognition regions. - Res denotes the new symbol that is to be created by applying this rule, framed with the previously explained invariants L and R. - The recognition regions X stand for symbol excision and are responsible for deleting the old symbol. - Cap is also a recognition region, which creates the counterpart for a divided invariant. The function of the five areas described is illustrated in the next section using an example. Figure 18: Transfer oligonucleotide for the hold state, source: [5] The exception to this construction is the hold rule, which has a special hold sequence. This is used, among other things, to select completed calculations. Since there can be two different head positions for one situation (see Figure 14), two oligonucleotides must be coded per rule. If the future movement of the head is taken into account, two of the following forms are to be selected: Figure 19: Forms of the transfer oligonucleotides, source: [5] 9

10 2.4.5 Sequence of a Turing step Figure 20: Sequence of a Turing step, source: [5] The above sketch should illustrate how the DNA Turing machine now works. - Step 0 is the initial situation, which should already be known. - Step 1 shows the removal of the read / write head using the enzymes indicated in the detection regions. Please note that two sticky ends are formed, with the right one being created by splitting a symbol and thus only fits one transfer oligonucleotide. - In step 2 all oligonucleotides were added and the only suitable transfer rule was bound with ligase to the previously created sticky end of the symbol. Since the sticky end of the split invariant is not unique, the cap sequence of an oligonucleotide protects against previously wrong molecules from binding to the invariant. - Only in step 3 does Cap create the matching counterpart to the invariant - In step 4 the plasmid is restored by ligase, but the old symbol and the X recognition regions still exist. - Step 5 deletes the old symbol and creates complementary ends of the invariant R. - In step 6, the plasmid is restored by ligase and the new band situation is present. It should be noted that this is not a one-pot reaction. Each step is carried out separately. The order in which the enzymes and oligonucleotides are added plays an important role. For example, it must be avoided that oligonucleotides bind prematurely to the invariant that does not match the sticky end of the divided symbol. 10

11 2.4.6 Usable DNA sequences Figure 21: Class IIS restriction endonucleotides, source: [5] Figure 22: Usable symbols, source: [5] Since the symbols and recognition regions were previously shown schematically, the actual sequences are listed here. The same pattern of the Fok I and the End Maker Bbv I can be seen. They differ in their recognition region, which is relevant for controlling the sequence of reactions. There is also an additional State Cutter Sta, which is used when dealing with more than two symbols in order to create even more unequal sticky ends. It should be noted that all of these enzymes are commercially available, but very expensive. 2.5 A universal 4 x 7 Turing machine Figure 21: Transfer table of the UTM by Minsky, source: [5] As already mentioned at the beginning, a universal Turing machine receives an additional Turing machine as input, which is to be simulated. Marvin Minsky has developed a 4 x 7 universal Turing machine that requires 7 states and 4 symbols. Another State Cutter Sta (see above) is used to implement the 4 symbols. 56 transfer oligonucleotides (4x7x2) encode the transfer functions. So far, the function of the universal Turing machine based on DNA has only been proven theoretically. 11

12 2.6 Evaluation For a better assessment of the presented model, some noteworthy aspects are listed below. - Various errors can occur during the reactions described, for example connecting two plasmids, incorrect ligations or failed restrictions. Such errors can be selected or minimized with additional effort. - The enzymes used are usually very expensive, and - It is not a one-pot reaction, all steps must be carried out separately. - In order to minimize the fusing of several DNA bands with each other, the water concentration and the length of the DNA are crucial. According to estimates, the largest meaningful DNA has a length of 4.7 million nucleotides, which corresponds to 80 KB of memory per tape with 4 symbols. - A restriction takes 1 hour, a ligation 30 minutes, which means that a Turing step with solid support takes 4.5 hours. - Depending on the number of known class IIS restriction endonucleases, the realization of a Turing machine with a symbol status product of 60 is considered possible. To clarify: The universal Turing machine from Marvin has a symbol state product of Unfortunately, such universal Turing machines are extremely slow, as the running time increases exponentially with the length of the input tape. A 10x6 universal Turing machine needs steps to simulate a step of a 2x2 Turing machine with 6 symbols. Turing machines designed for special applications may only need small polynomial time in relation to the input variable, perhaps even linear time. Selection algorithms, for example, would be conceivable. Therefore, such Turing machines could possibly be imagined as preprocessors for other DNA computers. 3 Concluding remarks As was shown at the beginning, there are various researches under the umbrella term biocomputer, mostly with the aim of replacing today's computer structures. As should be shown in the lecture The Limits of Moore's Law, some researchers assume that the conventional technology will reach its physical limits in about 15 years. It is unlikely that implementation of the model presented here will ever replace today's home computers.However, alternative efforts are in progress, which are not based on the Turing model, but on the so-called lambda calculus. This is a purely functional stateless calculation model, but it can also calculate all calculable problems. Ralf Zimmer, see [7], points out that the lambda calculus can be better mapped onto DNA technology and sees a greater chance for such universal DNA computers in the future. Figure 22: Calculation models and implementation, source: [7] 12

13 4 References [1] Biocomputing - Computing using biological processes [2] Amorphous Computing Home Page [3] BBC News: Biological computer born [4] Ron Weiss Programming Colonies of Biological Cells [5] Paul WK Rothemund A DNA and restriction enzyme implementation of Turing Machines [6] Rolf Knippers Molecular Genetics Georg Thieme Verlag Stuttgart [7] Ralf Zimmer A universal DNA computer From the chip to the genetic molecules html / sites / 06zimmer.html [8] M. Godisch DNA implementation of Turing machines 13