A bit of biology
Cells, chromosomes and DNA
Every living organism is made of cells.
Who has the highest number of cells?
| common name | bacteria E.coli | nematode C.elegans | strawberry | chimpanzee | humans |
|---|---|---|---|---|---|
| number of cells | 1 | 1,000 | a few million | ~100,000 billion | ~100,000 billion |
Sources:
Post-embryonic cell lineages of the nematode, Caenorhabditis elegans (1977)
Cell count and size in relation to fruit size among strawberry cultivars (1992)
An estimation of the number of cells in the human body (2013)
Each cell contains chromosomes.
Who has the largest number of chromosomes?
| common name | bacteria E.coli | goldfish | fern Ophioglossum | chimpanzee | humans | banana | Australian ant |
|---|---|---|---|---|---|---|---|
| number of chromosomes (2n) in a cell | 1 | 100 | 1,440 | 48 | 46 | 11 | 1 or 2* |
*Females, diploid, have 2 chromosomes; males, haploid, have only one.
Source: Wikipedia
DNA and genomes
A chromosome can be compared more or less to a compact ball of thread, where the thread is the DNA.
Adapted from Wikimedia Commons
DNA usually has a characteristic ‘double helix’ structure composed of two strands. Each strand is a long molecule consisting of a succession of 4 nucleotides, called A (adenine), T (thymine), G (guanine) and C (cytosine). The 2 strands are complementary:
an A on one strand faces a T in the other strand, a G faces a C.
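The pairing rule above (A with T, G with C) is enough to derive one strand from the other. Here is a minimal Python sketch; the function names `complement` and `reverse_complement` are illustrative choices, not part of the original text. Because the two strands of a double helix run in opposite directions, the partner strand is usually written as the reverse complement.

```python
# Pairing rule from the text: A faces T, G faces C.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def complement(strand: str) -> str:
    """Return the facing nucleotide for every position of the strand."""
    return strand.upper().translate(COMPLEMENT)


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction."""
    return complement(strand)[::-1]


fragment = "ACCCTAAACC"  # first 10 nucleotides of the banana sequence shown further down
print(complement(fragment))
print(reverse_complement(fragment))
```

Characters other than A, T, G and C (for example N) are left unchanged by this sketch.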
DNA is universal: it is found in all living organisms! It is also found in some viruses, sometimes in a slightly different form (single-stranded DNA).
What is the sequence of banana chromosome 3? What is the length of human chromosome 1 in centimeters (cm)?
Beginning of the sequence of banana chromosome 3 (total length: 30,470,407 bp; 1 cm):
>NC_025204.1 ACCCTAAACCCTAAACCCTAAACCCTAAACCCTAAACCCTAAACCCTAAACCCTAAACCCTAAACCCTAA ACCCTAAACCCTAAACCCTAAAAACCCTAAACCCTAAACCCTAAACCCTAAACCCTAAACCCTAAACCCT AAACCCTAAACCCTAAAACCAAAAAAAATGGAATAATTACTTTAAATCTTAATTATTCCTTTATTTTTGT TTTTTTTTTTTTTAATCTTGATGCCCGATTACCCGATATGTCGGCTGGGCGGGCGCTTGGACATTGCGCT CGTTGGGCCCAACCTGTGCTGGGCTTTTGCGTCGGCCTTTTCAATGTACTGGGTCAAACCTGAGTCATGA...
Beginning of the chromosome sequence of the E.coli bacterium (total length: 4,646,332 bp; 0.15 cm):
>AP009048.1 AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC TTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAA TATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC ATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAG CCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA...
A piece of sequence from human chromosome 1 (total length: 248,956,422 bp; 8.2 cm):
>CM000663.2 GGTAGAACCTCAGTAATCCGAAAAGCCGGGATCGACCGCCCCTTGCTTGCAGCCGGGCACTACAGGACCC GCTTGCTCACGGTGCTGTGCCAGGGCGCCCCCTGCTGGCGACTAGGGCAACTGCAGGGCTCTCTTGCTTA GAGTGGTGGCCAGCGCCCCCTGCTGGCGCCGGGGCACTGCAGGGCCCTCTTGCTTACTGTATAGTGGTGG CACGCCGCCTGCTGGCAGCTAGGGACATTGCAGGGTCCTCTTGCTCAAGGTGTAGTGGCAGCACGCCCAC CTGCTGGCAGCTGGGGACACTGCCGGGCCCTCTTGCTCCAACAGTACTGGCGGATTATAGGGAAACACCC...
Note: ‘N’s can be found in the sequences: this means that the nucleotides could not be identified during sequencing.
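The lengths in centimetres quoted for these chromosomes can be approximated from the number of base pairs. The sketch below assumes a spacing of about 0.33 nm between consecutive base pairs; this value is not given in the text and was chosen here because it roughly reproduces the quoted figures. The names `NM_PER_BP` and `length_cm` are illustrative.

```python
# Assumption (not stated in the text): ~0.33 nm between consecutive base pairs.
NM_PER_BP = 0.33
NM_PER_CM = 1e7  # 1 cm = 10,000,000 nm


def length_cm(n_bp: int, nm_per_bp: float = NM_PER_BP) -> float:
    """Approximate length of a stretched-out DNA molecule, in centimetres."""
    return n_bp * nm_per_bp / NM_PER_CM


# Base-pair counts taken from the text above.
chromosomes = {
    "banana chromosome 3": 30_470_407,
    "E.coli chromosome": 4_646_332,
    "human chromosome 1": 248_956_422,
}

for name, n_bp in chromosomes.items():
    print(f"{name}: {length_cm(n_bp):.2f} cm")
```

With a slightly different spacing (0.34 nm is also commonly quoted), the results shift by a few percent, so these numbers are best read as orders of magnitude.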
The size of genomes, usually expressed in base pairs (bp) or in millions of bases (Mb), is highly variable from one organism to another. And it is not always the organism that you think has the largest genome!
Who has the largest genome?
| common name | bacteria E.coli | virus SARS-CoV-2 | fruit fly | plant Paris japonica | humans | banana |
|---|---|---|---|---|---|---|
| size of the genome (bp) | 4,646,332 | 29,903 | 143 million | 150 billion | 3 billion | 472 million |
Note: this is the size of the ‘haploid’ genome: for humans, for example, 3 billion bp correspond to the sequence of 23 chromosomes.
Sources:
E.coli: NCBI Genome; SARS-CoV-2: NCBI Genome; Drosophila melanogaster: NCBI Genome; Paris japonica: Harvard bionumbers; Homo sapiens: NCBI Genome; Banana: NCBI Genome
Why is it so important to know the sequence of genomes?
Because genomes contain the information needed to build organisms and, in particular, to make proteins…
DNA and genes
The order of the nucleotides in DNA is very important and constitutes what is known as genetic information. A bit like a cookbook, DNA contains a number of recipes, called ‘genes’. We will be interested in the genes that code for proteins.
Who, among the following species, has the largest number of protein-coding genes?
| common name | bacteria E.coli | nematode C.elegans | banana | chimpanzee | humans |
|---|---|---|---|---|---|
| number of genes coding for proteins | 4,140 | 20,356 | 36,439 | 23,534 | 20,430 |
The number of genes varies depending on the method used for finding genes and can change over time!
Sources:
E.coli: OMA; C.elegans: OMA; Chimpanzee: OMA; Human: OMA; Banana: OMA
Here is a piece of DNA located on human chromosome 11 that corresponds to the gene coding for the protein insulin.
In eukaryotes, genes are not 'continuous': they are composed of non-coding regions (introns, in black) and coding regions (exons, in red). The introns are removed from the RNA copy of the gene before translation (protein synthesis), and only the exons are translated into an amino acid sequence.
Finding genes (and exons) in the genomes of different organisms is still a major challenge today!
Proteins
Proteins are essential to life.
Comics: A protein? A what?
When a cell or organism needs a protein, the corresponding gene will first be copied.
The copy, called messenger RNA (mRNA), is then transmitted to the ribosomes, the cellular machines which produce the proteins.
A note from the experts: a gene can code for several proteins
RNA undergoes a maturation process which leads to the elimination of introns. This process is called ‘splicing’.
Splicing can be alternative: the combination of exons retained in the mature mRNA can differ from one mRNA to another. A single gene can therefore produce different mRNAs… and hence different proteins.
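Splicing, and its alternative form, can be sketched on a toy sequence. Everything in this example is made up for illustration: the sequence, the exon coordinates and the function name `splice` are hypothetical and do not correspond to a real gene. Exons are written in upper case and introns in lower case only to make them easy to spot.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments, given as 0-based (start, end) pairs with end excluded."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy unspliced sequence: exon 1, intron, exon 2, intron, exon 3 (hypothetical).
pre_mrna = "ATGGCC" + "gtaagtttcag" + "CTGTGG" + "gtgagcacag" + "ATGCGCTAA"
exons = [(0, 6), (17, 23), (33, 42)]

# Splicing that keeps all three exons.
full_mrna = splice(pre_mrna, exons)

# Alternative splicing: the middle exon is skipped.
short_mrna = splice(pre_mrna, [exons[0], exons[2]])

print(full_mrna)
print(short_mrna)
```

The same starting sequence gives two different mRNAs, which is the mechanism by which one gene can lead to several proteins.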
One of the most extreme examples is the Drosophila (fruit fly) Dscam gene: this gene contains 95 alternative exons and can produce up to 38,000 different proteins. In other words, this single gene could produce more different proteins than there are genes in the entire fruit fly genome!
(Source: Role of RNA secondary structures in regulating Dscam alternative splicing (2019)).
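The figure of about 38,000 proteins comes from simple combinatorics: one alternative is chosen in each cluster of alternative exons, so the numbers of choices multiply. The per-cluster counts used below (12, 48, 33 and 2 alternatives for exons 4, 6, 9 and 17) are commonly cited values that do not appear in the text above, so they should be read as an assumption of this sketch.

```python
from math import prod

# Assumed number of alternative versions per exon cluster (not given in the text).
dscam_variants = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# Adding the alternatives gives the total number of alternative exons.
total_alternative_exons = sum(dscam_variants.values())

# One version is picked per cluster, so the choices multiply.
possible_isoforms = prod(dscam_variants.values())

print(total_alternative_exons)
print(possible_isoforms)
```

Under these assumptions the sum matches the 95 alternative exons mentioned above, and the product is consistent with "up to 38,000" different proteins.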
In evolutionary studies at the molecular level, biologists associate each gene with a representative (‘canonical’, consensus) protein. This is the (non-biological!) reason why, very often, the words gene or protein are used interchangeably to talk about genes or proteins which are in common between different species!
The ribosome translates the nucleotide sequence of the mRNA into an amino acid sequence and thus gives rise to a protein.
Proteins consist of a chain of amino acids. There are 20 different amino acids, each also referred to by a letter (G, E, N, I, A, L, ...). Three nucleotide 'letters' (a codon) correspond to one amino acid 'letter': for example, the codon GTG codes for the amino acid V (valine).
Biologists use the genetic code to translate a sequence of nucleotides into a sequence of amino acids.
In this representation of the genetic code, the amino acids are on the outer 2 rings of the circle and the 1st, 2nd and 3rd nucleotides are shown in the inner circles.
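The way the genetic code is used can be sketched in a few lines of Python. This example assumes the standard genetic code; the names `GENETIC_CODE` and `translate`, and the example sequence, are illustrative choices. Codons are read three letters at a time, translation stops at the first stop codon (written `*`), and any codon containing an unidentified nucleotide is reported as `X`.

```python
from itertools import product

# Standard genetic code, listed with the codons in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons starting with T
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)
GENETIC_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> str:
    """Translate a coding sequence into amino acid letters, stopping at a stop codon."""
    sequence = sequence.upper().replace("U", "T")  # accept RNA letters as well
    protein = []
    for i in range(0, len(sequence) - 2, 3):
        amino_acid = GENETIC_CODE.get(sequence[i:i + 3], "X")
        if amino_acid == "*":  # stop codon: the protein ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GTG"))                    # the example codon from the text
print(translate("ATGGCCCTGTGGATGCGCTAA"))  # a short illustrative coding sequence
```

Trailing nucleotides that do not form a complete codon are ignored in this sketch.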
Once synthesized, the chain of amino acids folds upon itself to adopt a specific 3D structure, which is essential to the proper functioning of the protein.
Different representations of the BRAF protein's 3D structure, composed of 766 amino acids. The positions of the amino acids L, A, T, V, and K are shown.
Proteins and biological functions
Proteins come in different sizes and shapes and perform different biological functions.
Certain proteins are only found in certain species:
- The proteins involved in photosynthesis are only found in plants, algae and cyanobacteria;
- The proteins involved in vision are not found in plants, algae or cyanobacteria...
Proteins that are found in a large number of species are involved in universal biological processes, such as protein synthesis or DNA replication.
These proteins (or the corresponding genes) are very useful for studying evolution!