
1.1 From 2D to 3D nuclear structure

General features of 3D genome organization

  1. 2D cis elements in the genome

  2. Multi-scale folding

    2.1. Chromosome territories

    2.2. A/B Compartments

    2.3. TAD (Topologically Associating Domains)

    2.4. Sub TAD and insulation neighborhoods

    2.5. Chromatin loops

  3. Architectural proteins and RNAs

    3.1. Mediator

    3.2. Cohesin

    3.3. CTCF

    3.4. non-coding RNAs binding

1.1 2D cis elements in the genome

The term cis is derived from the Latin root “cis,” meaning “on the same side as.” In contrast, the term trans comes from the Latin root “trans,” meaning “across from.” In molecular biology, a cis-regulatory element is a region of chromosomal DNA that regulates the transcription or expression of a gene on the same chromosome. A trans-regulatory factor is a soluble protein that binds to the cis-acting element of a gene to control its expression. The gene encoding a trans-acting protein can reside on any chromosome, and is often on a different chromosome from the gene it regulates.

Biochemically active regulatory elements (bound by sequence-specific regulatory TFs):

Figure1. Schematic overview of regulatory elements in eukaryotes.

1.2 Multi-scale folding

The largest chromosomes contain hundreds of millions of base pairs that must fold into a limited space. This leads to multi-scale, hierarchical structures: nucleosomes, chromatin fibres, chromosome domains, compartments and, finally, chromosome territories.

Information resides at all levels, from the histone–DNA interactions at the sub-nucleosomal scale to the chromosome–chromosome and chromosome–lamina interactions in the nuclear space. This multi-level architecture can be regulated and/or exploited by a variety of components such as transcription factors, architectural proteins and non-coding RNAs in order to coordinate gene expression and cell fate.

Recently developed chromosome conformation capture technologies have expanded our knowledge of chromosome structure at each of these levels.

Chromosome territories

A/B Compartments

After normalization and conversion to an observed/expected matrix, Hi-C data display a plaid pattern. When this matrix is analyzed by principal component analysis (PCA), the first principal component (PC1, the direction of maximum variance in the data and therefore its most prominent feature) separates compartments A and B: positive PC1 regions reflect "active/permissive" chromatin and negative PC1 regions indicate "inactive/inert" chromatin.
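The compartment analysis described here can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the HOMER implementation: the toy contact map, the `group` labels and all function names are hypothetical, and the leading eigenvector of the row-correlation matrix (found by power iteration) stands in for PC1.

```python
import math

def observed_over_expected(matrix):
    """Divide each contact by the mean contact at the same genomic distance."""
    n = len(matrix)
    expected = []
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    return [
        [matrix[i][j] / expected[abs(i - j)] if expected[abs(i - j)] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]

def pearson_matrix(rows):
    """Pearson correlation between every pair of rows."""
    centered = []
    for row in rows:
        mean = sum(row) / len(row)
        centered.append([x - mean for x in row])
    norms = [math.sqrt(sum(x * x for x in row)) for row in centered]
    n = len(rows)
    corr = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = norms[i] * norms[j]
            if denom > 0:
                corr[i][j] = sum(a * b for a, b in zip(centered[i], centered[j])) / denom
    return corr

def first_principal_component(sym, iterations=200):
    """Dominant eigenvector of a symmetric matrix, by power iteration."""
    n = len(sym)
    vec = [1.0 + 0.1 * i for i in range(n)]  # fixed, non-uniform start vector
    for _ in range(iterations):
        nxt = [sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    return vec

# Hypothetical toy map: 8 bins, bins 0-3 in one compartment, bins 4-7 in the other.
# Contacts decay with genomic distance and are doubled within a compartment.
group = [0, 0, 0, 0, 1, 1, 1, 1]
toy = [[10.0 / (1 + abs(i - j)) * (2.0 if group[i] == group[j] else 1.0)
        for j in range(8)] for i in range(8)]

pc1 = first_principal_component(pearson_matrix(observed_over_expected(toy)))
# The sign of an eigenvector is arbitrary, so this only splits the bins in two.
same_side_as_first_bin = [value * pc1[0] > 0 for value in pc1]
print(same_side_as_first_bin)
```

Because the sign of an eigenvector is arbitrary, deciding which side is compartment A requires external information such as gene density or active histone marks; the sketch only shows that the two groups of bins fall on opposite sides of zero.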

TAD (Topologically Associating Domains)

  • insulator proteins: CTCF (detected at ~76% of all boundaries)

  • active transcription marks: H3K4me3 and H3K36me3

  • nascent transcripts

  • housekeeping genes (present in ~34% of TAD boundaries)

  • repeat elements

There is also evidence that TADs are conserved between different cell types and across species.

Sub TAD and insulation neighborhoods

Chromatin loops

Nucleosome-nucleosome interactions

1.3 Architectural proteins and RNAs

An important question in chromatin biology is how the structural features of 3D chromatin organization are established. A few architectural proteins have been shown to be essential for chromatin architecture.

Mediator (coactivator)

Cohesin

Cohesin is a protein with multiple functions:

  • It regulates the separation of sister chromatids during cell division.

  • For chromatin architecture: cohesin interacts with both CTCF and Mediator.

CTCF

Non-coding RNAs binding


Promoter: The promoter is a region around the TSS (+1) of a gene that contains several DNA elements facilitating the binding of regulatory proteins. It provides a secure initial binding site for RNA polymerase and for the proteins (transcription factors) that recruit RNA polymerase so that transcription can take place.

Enhancer: Enhancers are cis-regulatory elements (CREs), that is, non-coding DNA that does not encode transcription factors but takes part in regulation. They can be located up to 1 Mbp (1,000,000 bp) away from the gene, upstream or downstream of the start site.

Insulator: An insulator is a genetic boundary element that blocks the interaction between enhancers and promoters. Insulators have been found to cluster at the boundaries of topologically associating domains (TADs) and may have a role in partitioning the genome into "chromosome neighborhoods", genomic regions within which regulation occurs.

Silencer: A silencer is a DNA sequence capable of binding transcription regulation factors that block the binding of RNA polymerase to the DNA, thus preventing genes from being expressed as proteins.

Figure2. Inside the nucleus, euchromatin and heterochromatin give rise to several grades of higher order structures: chromosome loops, Topologically Associating Domains (TADs), Lamina-Associated Domains (LADs) and chromosomal territories. The nucleolus, the “assembly-chain” of ribosomes, also associates with specific DNA regions: the Nucleolar Associated Domains (NADs), which surround the highly transcribed region of the nucleolus, giving rise to another grade of chromatin organization. Figure by Bianchi et al., AIMS Biophysics, 2015, 2(4): 585-612.

At larger scales, chromatin is organized into individual chromosome territories (one for each chromosome), which rarely intermix. This observation, which initially came from FISH studies, was later validated by genome-wide Hi-C data showing that interactions between loci on the same chromosome (in cis) are much more frequent than contacts in trans between different chromosomes.
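The cis versus trans comparison can be illustrated with a short sketch. The record layout `(chrom1, pos1, chrom2, pos2)`, the function name and the toy contacts are assumptions made for illustration only; real Hi-C pipelines store contacts in their own formats.

```python
def cis_trans_fractions(contacts):
    """Return (cis_fraction, trans_fraction) for (chrom1, pos1, chrom2, pos2) records."""
    total = len(contacts)
    if total == 0:
        raise ValueError("no contacts given")
    # A contact is cis when both ends lie on the same chromosome.
    cis = sum(1 for chrom1, _, chrom2, _ in contacts if chrom1 == chrom2)
    return cis / total, (total - cis) / total

# Hypothetical toy data: four cis contacts and one trans contact.
toy_contacts = [
    ("chr1", 1_000_000, "chr1", 1_250_000),
    ("chr1", 5_400_000, "chr1", 9_800_000),
    ("chr2", 300_000, "chr2", 350_000),
    ("chr2", 7_000_000, "chr2", 7_900_000),
    ("chr1", 2_000_000, "chr2", 4_000_000),
]
cis_fraction, trans_fraction = cis_trans_fractions(toy_contacts)
print(f"cis: {cis_fraction:.2f}, trans: {trans_fraction:.2f}")
```

On real data, a high cis fraction is the genome-wide signature of chromosome territories that the text describes.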

Figure3. Compartment identification with PCA. ©HOMER

A topologically associating domain (TAD) is a self-interacting genomic region, meaning that DNA sequences within a TAD physically interact with each other more frequently than with sequences outside the TAD. These three-dimensional chromosome structures are present in animals as well as some plants, fungi, and bacteria. TADs can range in size from thousands to millions of DNA bases (usually hundreds of kb).

TADs typically manifest as contiguous square domains along the diagonal of Hi-C maps. The spatial partitioning of the genome into TADs correlates with many linear genomic features such as histone modifications, coordinated gene expression, association with the lamina, DNA replication timing and enhancer–promoter interactions.
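One way to make the "square domains along the diagonal" concrete is an insulation-style profile: for each candidate boundary, average the contacts that cross it within a small window, and look for minima. This is a simplified sketch of a commonly used heuristic, not a method described in these notes; the window size, function name and toy matrix are assumptions.

```python
def insulation_profile(matrix, window):
    """Mean contact between the `window` bins left and right of each candidate boundary.

    A boundary b lies between bin b-1 and bin b.
    """
    n = len(matrix)
    profile = {}
    for b in range(window, n - window + 1):
        crossing = [matrix[i][j]
                    for i in range(b - window, b)
                    for j in range(b, b + window)]
        profile[b] = sum(crossing) / len(crossing)
    return profile

# Hypothetical toy map: two domains of four bins each (bins 0-3 and 4-7),
# with 10 contacts inside a domain and 1 contact between domains.
domain = [0, 0, 0, 0, 1, 1, 1, 1]
toy = [[10.0 if domain[i] == domain[j] else 1.0 for j in range(8)] for i in range(8)]

profile = insulation_profile(toy, window=2)
boundary = min(profile, key=profile.get)  # position with the fewest crossing contacts
print(profile)
print("boundary between bins", boundary - 1, "and", boundary)
```

In the toy map the profile is lowest at the position separating the two domains, which is where few contacts cross from one side to the other.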

TAD boundaries are enriched for several features: the insulator protein CTCF, active transcription marks, nascent transcripts, housekeeping genes and repeat elements.

The positioning of TADs is relatively stable across cell types and appears to be independent of tissue-specific gene expression or histone modifications. During ESC differentiation, genome-wide switching of compartments A and B occurs, whereas TAD positioning remains stable.

TAD positioning is evolutionarily conserved: 50–70% of TAD boundaries are shared between human and mouse ESCs.

The TAD is a stable unit of replication-timing regulation.

Figure4. Hi-C Detected Chromatin Folding Paradigms. TADs (more tightly folded than regions between them) are on-diagonal boxes of contact enrichment. Loops are radially symmetric peaks of contact intensity, often located at the corners of TADs in mammalian cells. Off-diagonal boxes indicate interactions due to compartmentation. Right: TADs and loops may be either mostly transcriptionally active (grey) or inactive (black). Loops may also be more tightly folded, but additionally have an increased likelihood of contact between their boundaries or anchors. Compartmentation is indicated by homotypic (active–active or inactive–inactive) TAD–TAD interactions. The bona fide pattern of chromatin folding is unknown and indicated only schematically. Figure by Eagen, Kyle P. Trends in Biochemical Sciences (2018).

TADs can be further divided into smaller sub-TADs, first observed in high-resolution 5C data from mouse ESCs. Like TADs, sub-TADs are self-associating, with a decrease in contact frequency across sub-TAD boundaries, and some sub-TAD boundaries are associated with CTCF/cohesin-mediated interactions. Unlike TADs, however, sub-TADs are less conserved across cell and tissue types and appear to be related to cell type–specific gene expression.

It has been recognized that cis-regulatory elements such as promoters and enhancers are often far apart along the linear genome in vertebrates. To exert a regulatory effect, the genome folds into a loop that brings the two elements into spatial proximity. This chromatin conformation is usually called a "chromatin loop". One well-known example is the locus control region (LCR) of the β-globin cluster, which interacts strongly, via long-range chromatin contacts, with its target genes in erythroid cells (where the β-globin gene is active) but shows little or no interaction in cells from different lineages.

Nucleosome-nucleosome interactions are the smallest scale of chromatin organization. For a long time, on the basis of in vitro electron microscopy, nucleosomes were thought to form arrays (often called 30 nm chromatin fibres) with either solenoid or zig-zag shapes. However, recent studies provide growing evidence for a more flexible structure in which nucleosomes are arranged in heterogeneous groups.

Figure5. Schematic of architectural proteins. ©Image: Tom DiCesare/Whitehead Institute.

Mediator is found at both the enhancers and the promoters of actively transcribed genes and promotes transcription by enabling pre-initiation complex (PIC) assembly and RNAPII elongation.

Cohesin is also important for DNA repair.

Cohesin is proposed to be a part of the loop-extrusion complex in interphase cells.

CTCF was originally characterized as an insulator protein, capable of restricting enhancer–promoter interactions. Only around 15% of CTCF sites are located at TAD boundaries in mammals; the majority lie within TADs and are thought to be involved in intra-TAD interactions. Another prominent feature is that CTCF sites at loop anchors occur predominantly in a convergent orientation, which suggests that not only binding but also the directionality of the binding sequence is important for the formation of a loop.

One interesting observation is that both Mediator and CTCF seem to be able to bind directly to RNA, and that knockdown of some Mediator-binding non-coding RNAs led to diminished loop formation between the ncRNA locus and its targets.
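The orientation rule for loop anchors can be expressed as a small classifier. This is an illustrative sketch: it assumes each loop is summarized by the strand of the CTCF motif at its upstream and downstream anchor, and the function name and toy loops are hypothetical.

```python
from collections import Counter

def classify_anchor_pair(upstream_strand, downstream_strand):
    """Classify the CTCF motif orientation at the two anchors of a loop.

    Convergent means the motifs point toward each other:
    '+' at the upstream anchor and '-' at the downstream anchor.
    """
    pair = (upstream_strand, downstream_strand)
    if pair == ("+", "-"):
        return "convergent"
    if pair == ("-", "+"):
        return "divergent"
    if pair in (("+", "+"), ("-", "-")):
        return "tandem"
    raise ValueError(f"unexpected strands: {pair}")

# Hypothetical toy loops, each given as (upstream strand, downstream strand).
toy_loops = [("+", "-"), ("+", "-"), ("+", "+"), ("-", "+")]
orientation_counts = Counter(classify_anchor_pair(up, down) for up, down in toy_loops)
print(orientation_counts)
```

Applied to real loop calls with motif annotations, a tally like this is how a predominance of convergent pairs would show up.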
