# 0.1 Molecular biology

[Molecular biology](https://en.wikipedia.org/wiki/Molecular_biology) concerns the molecular basis of life, from composition to activity, including the interactions between [DNA](https://en.wikipedia.org/wiki/DNA), [RNA](https://en.wikipedia.org/wiki/RNA), and [proteins](https://en.wikipedia.org/wiki/Protein), their [biosynthesis](https://en.wikipedia.org/wiki/Protein_biosynthesis), and the regulation of these interactions. It also studies the molecular underpinnings of [replication](https://en.wikipedia.org/wiki/DNA_replication), [transcription](https://en.wikipedia.org/wiki/Transcription_\(genetics\)), [translation](https://en.wikipedia.org/wiki/Translation_\(biology\)), and cell function, which make a good starting point for understanding the field.

To make the material on the 3D genome accessible to a broad spectrum of readers, let's first review some important concepts. Readers who are already familiar with this content should feel free to skip this chapter.

## Restriction enzyme

Restriction enzymes cut DNA at specific nucleotide sequences known as restriction sites. They were first discovered in bacteria, where they **selectively** cut exogenous DNA (such as viral DNA) to protect the host cell.

To cut the DNA, a restriction enzyme makes two incisions, one through each strand of the DNA double helix. These restriction sites are often palindromic (which means the sequence reads the same 5' to 3' on both strands). Over 3000 restriction enzymes (REs) have been identified, and more than 600 of them are commercially available. Details of each enzyme can be found in the [REBASE database](http://rebase.neb.com/rebase/rebase.html) or on commercial websites.
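To make the idea of a palindromic site concrete, here is a minimal sketch that checks whether a sequence equals its own reverse complement. The function names are made up for this illustration, and the sequences are the well-known recognition sites of EcoR I, BamH I and Hind III plus one non-palindromic control.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic_site(seq: str) -> bool:
    """A DNA palindrome reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)


# EcoR I, BamH I, Hind III sites, then a non-palindromic control.
for site in ("GAATTC", "GGATCC", "AAGCTT", "GATTACA"):
    print(site, is_palindromic_site(site))
```

Note that a DNA palindrome is not a plain text palindrome: `GAATTC` reversed is `CTTAAG`, but its reverse complement is `GAATTC` again.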

Naturally occurring restriction endonucleases are categorized into four groups (Types I, II, III, and IV) based on their composition and [enzyme cofactor](https://en.wikipedia.org/wiki/Enzyme_cofactor) requirements, the nature of their target sequence, and the position of their DNA cleavage site relative to the target sequence. Here we will focus on the last criterion.

| Type     |                      Cleavage Site                      |                    Examples                   |
| -------- | :-----------------------------------------------------: | :-------------------------------------------: |
| Type I   | <p>Random</p><p>Far from the recognition site</p>       |    <p>EcoK I</p><p>EcoA I</p><p>CfrA I</p>    |
| Type II  |    <p>Specific</p><p>Within the recognition site</p>    |   <p>EcoR I</p><p>BamH I</p><p>Hind III</p>   |
| Type III | <p>Random</p><p>24-26 bp away from recognition site</p> | <p>EcoP I</p><p>Hinf III  </p><p>EcoP15 I</p> |
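Because Type II enzymes cut at a fixed position within the recognition site, a digest is easy to mimic in code. The sketch below cuts a linear sequence at every EcoR I site (`GAATTC`, cut between G and A on the top strand). The `digest` helper and the toy sequence are hypothetical, and only the top strand is modeled, so sticky-end overhangs are ignored.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear top-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for EcoR I, which cuts G^AATTC).
    """
    seq = seq.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)

    fragments = []
    previous = 0
    for position in cut_positions:
        fragments.append(seq[previous:position])
        previous = position
    fragments.append(seq[previous:])  # remainder after the last cut
    return fragments


toy_sequence = "ATGCGAATTCTTAGGCCGAATTCAAT"  # made-up sequence with two sites
fragments = digest(toy_sequence)
print(fragments)
print([len(fragment) for fragment in fragments])
```

The fragment lengths printed at the end are what a gel would resolve after the digest.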

## Hydrogen bonds

A hydrogen bond is a partially [electrostatic](https://en.wikipedia.org/wiki/Electrostatics) attraction between a [hydrogen](https://en.wikipedia.org/wiki/Hydrogen) (H) atom that is bound to a more [electronegative](https://en.wikipedia.org/wiki/Electronegativity) atom, such as [nitrogen](https://en.wikipedia.org/wiki/Nitrogen) (N), [oxygen](https://en.wikipedia.org/wiki/Oxygen) (O), or [fluorine](https://en.wikipedia.org/wiki/Fluorine) (F), and another adjacent atom bearing a [lone pair](https://en.wikipedia.org/wiki/Lone_pair) of electrons (definition from Wikipedia). In other words, it is an attraction between a slightly positive hydrogen atom and a slightly negative atom.

All life depends on hydrogen bonds, most fundamentally in water. Hydrogen bonds also play an important role in determining the three-dimensional structures and properties of many synthetic and natural macromolecules, including DNA, proteins, and cellulose.
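For DNA specifically, the two strands are held together by hydrogen bonds between paired bases: an A-T pair forms two and a G-C pair forms three. As a small illustration, the sketch below counts the Watson-Crick hydrogen bonds in a fully paired duplex from one strand's sequence. The function name is made up, and the code assumes the input contains only A, C, G and T.

```python
# Hydrogen bonds contributed by the base pair each base takes part in:
# A pairs with T (2 bonds), G pairs with C (3 bonds).
HBONDS_PER_PAIR = {"A": 2, "T": 2, "G": 3, "C": 3}


def watson_crick_hbonds(seq: str) -> int:
    """Count base-pairing hydrogen bonds in a fully paired duplex."""
    return sum(HBONDS_PER_PAIR[base] for base in seq.upper())


print(watson_crick_hbonds("GAATTC"))  # 3 + 2 + 2 + 2 + 2 + 3
print(watson_crick_hbonds("GGGCCC"))  # six G-C pairs
```

This is one reason GC-rich duplexes are harder to separate than AT-rich ones of the same length.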

**In epigenetics**, there are 142 hydrogen bonds between DNA and the histone core in each nucleosome. More than one fifth of the amino acids in each of the core histones are either lysine or arginine, and their positive charges neutralize the negatively charged DNA backbone.

For **protein-DNA recognition**, the mechanism is based on base readout and shape readout. In base readout, the protein recognizes the specific chemical signatures of the different nucleic acid bases, which depends extensively on hydrogen bonds. Base readout provides more specificity in the major groove than in the minor groove because of the pattern of hydrogen bond donors and acceptors available there [\[1\]](https://en.wikibooks.org/wiki/Structural_Biochemistry/Protein_function/DNA_Binding).

## Gel electrophoresis

[Gel electrophoresis](http://en.wikipedia.org/wiki/Gel_electrophoresis) is a method to separate [DNA](http://en.wikipedia.org/wiki/DNA), [RNA](http://en.wikipedia.org/wiki/RNA) or even proteins by size and charge. It is used in [biochemistry](https://en.wikipedia.org/wiki/Biochemistry) and [molecular biology](https://en.wikipedia.org/wiki/Molecular_biology) to separate a mixed population of DNA and RNA fragments by length, to estimate the size of DNA and RNA fragments or to separate proteins by charge.[\[2\]](https://en.wikipedia.org/wiki/Gel_electrophoresis#cite_note-1)

This is achieved by moving negatively charged nucleic acid molecules (like DNA) through an [agarose](http://en.wikipedia.org/wiki/Agarose) matrix with an [electric field](http://en.wikipedia.org/wiki/Electric_field) ([electrophoresis](http://en.wikipedia.org/wiki/Electrophoresis)). The migration speed of a molecule depends on various factors, such as the strength of the electric field, the buffer, and the density of the agarose gel, but the most important factor is the size of the DNA. Shorter molecules move faster and migrate farther than longer ones because they pass more easily through the pores in the gel. Within an agarose gel, the migration distance of linear DNA is approximately inversely proportional to the log10 of its molecular weight.
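The log10 relationship is what makes size estimation from a ladder possible. As a rough sketch, we can fit a straight line of log10(size) against migration distance for the marker bands and then read off the size of an unknown band. The ladder values below are made-up numbers chosen to fall exactly on a line, and the function names are hypothetical. Real gels only follow this relationship over a limited size range.

```python
import math


def fit_ladder(distances_mm: list[float], sizes_bp: list[float]) -> tuple[float, float]:
    """Least-squares fit of log10(size) = slope * distance + intercept."""
    n = len(distances_mm)
    log_sizes = [math.log10(size) for size in sizes_bp]
    mean_x = sum(distances_mm) / n
    mean_y = sum(log_sizes) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, log_sizes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size(distance_mm: float, slope: float, intercept: float) -> float:
    """Convert a migration distance back into an estimated size in bp."""
    return 10 ** (slope * distance_mm + intercept)


# Hypothetical ladder: band sizes (bp) and how far each band migrated (mm).
ladder_sizes = [10000, 1000, 100]
ladder_distances = [10.0, 30.0, 50.0]

slope, intercept = fit_ladder(ladder_distances, ladder_sizes)
print(round(estimate_size(40.0, slope, intercept)))  # unknown band at 40 mm
```

The negative slope of the fit reflects that larger fragments travel a shorter distance.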

Gel electrophoresis has many applications, such as:

* Estimating the size of DNA molecules after restriction enzyme digestion, by comparing them with standard reference markers (whose lengths are known).
* Analysis of [PCR](https://en.wikipedia.org/wiki/PCR) products, e.g. in molecular [genetic diagnosis](https://en.wikipedia.org/wiki/Preimplantation_genetic_diagnosis) or [genetic fingerprinting](https://en.wikipedia.org/wiki/Genetic_fingerprinting).
* Isolating restricted genomic DNA by cutting out the block of gel that contains the sequence of interest.

## PCR

[Polymerase Chain Reaction](https://en.wikipedia.org/wiki/Polymerase_chain_reaction) (PCR) is a method to amplify **a particular piece** of DNA. It can make billions of copies of a target DNA sequence in a few hours. PCR was invented in the 1980s as a way to make numerous copies of DNA fragments in the laboratory. The technique has since been used in an enormous variety of applications and has greatly enriched the field of molecular biology.

We can think of PCR as an [in vitro](https://en.wikipedia.org/wiki/In_vitro) version of in vivo **DNA replication**. DNA replication is semi-conservative, which means each strand of the parental DNA serves as the template for a new strand, so every daughter duplex contains one old strand and one new strand. A PCR reaction needs the following major components:

* **DNA template**: contains the region of interest to be amplified.
* **DNA polymerase**: a heat-resistant enzyme that synthesizes new strands of DNA complementary to the target sequence.
* Two **DNA primers**: short oligonucleotides complementary to the 3' (three prime) ends of the sense and anti-sense strands of the DNA target, from which DNA polymerase starts synthesizing.
* **dNTPs**: deoxynucleoside triphosphates, the building blocks of newly synthesized DNA.
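To see how the two primers relate to the target, here is a minimal sketch that derives a primer pair from the sense strand of a target region. The forward primer copies the 5' end of the sense strand (so it binds the 3' end of the anti-sense strand), and the reverse primer is the reverse complement of the 3' end of the sense strand. The melting temperature uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is only a rough guide for short oligonucleotides. All function names and the toy sequence are hypothetical, and real primer design checks much more (hairpins, dimers, specificity).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(target_sense: str, length: int = 18) -> tuple[str, str]:
    """Return (forward, reverse) primers, both written 5' to 3'."""
    target_sense = target_sense.upper()
    if len(target_sense) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target_sense[:length]
    reverse = reverse_complement(target_sense[-length:])
    return forward, reverse


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature in degrees Celsius (Wallace rule)."""
    primer = primer.upper()
    at_count = primer.count("A") + primer.count("T")
    gc_count = primer.count("G") + primer.count("C")
    return 2 * at_count + 4 * gc_count


toy_target = "ATGCAGGGGGGGGGGCCTGA"  # made-up 20 bp target
forward, reverse = design_primers(toy_target, length=5)
print(forward, wallace_tm(forward))
print(reverse, wallace_tm(reverse))
```

The primers here are only 5 bases long to keep the toy example readable. Practical primers are typically around 18 to 25 bases.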

With these materials in place, three main steps, repeated over many cycles, work the magic of replicating the desired sequence from one copy to millions:

* Step 1: Denature DNA

  At 95°C, the DNA is denatured (i.e. the two strands are separated).
* Step 2: Primers anneal

  At 40°C-65°C, the primers anneal (or bind) to their complementary sequences on the single strands of DNA.
* Step 3: DNA polymerase extends the DNA chain

  At 72°C, DNA polymerase extends the DNA chain by adding nucleotides to the 3’ ends of the primers.

{% embed url="<https://www.youtube.com/watch?v=nHi-3jP6Mvc>" %}

Even though PCR is such a powerful tool, there are still some **caveats** to keep in mind, because they can introduce **biases** in many applications. Reviews of this topic can be found in [\[3\]](https://academic.oup.com/nar/article/43/21/e143/2468099), [\[4\]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1317340/), and [\[5\]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3188800/).

