Glossary

  • 3

  • The three-dimensional structure (3D structure) of a protein is the form it has in space; it depends on the amino acid sequence and is very important for the function of the protein.
  • a

  • A representation of two or more sequences, either nucleotides (DNA, RNA) or amino acids (proteins), in which the nucleotides or amino acids that are the same are positioned one under the other. The goal is to highlight the similarities or the differences between the sequences.
  • Chromosomes are found in multiple copies in the cell: for example, humans have 23 pairs of chromosomes. An allele is one of the versions of the nucleotide sequence found on one of the chromosomes.
  • Building block of proteins. Each protein consists of a succession of amino acids linked to each other. There are 20 amino acids, symbolized by the letters: A (Alanine), C (Cysteine), G (Glycine), T (Threonine), V (Valine), E (Glutamate), …
  • Information associated with a DNA sequence or a protein sequence (gene name, protein name, biological function, etc.). Annotations are added by experts (annotators) or by prediction programs. They are stored in databases, for instance UniProtKB.
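
The alignment entry above can be illustrated with a small sketch. It assumes the two sequences have already been aligned (same length, with `-` marking a gap); it does not compute an alignment itself. The function names `show_alignment` and `percent_identity` are hypothetical and chosen only for this illustration.

```python
def show_alignment(seq_a: str, seq_b: str) -> str:
    """Return the two pre-aligned sequences one under the other,
    with '|' marking the positions where both carry the same letter."""
    if len(seq_a) != len(seq_b):
        raise ValueError("pre-aligned sequences must have the same length")
    markers = "".join(
        "|" if a == b and a != "-" else " "
        for a, b in zip(seq_a, seq_b)
    )
    return "\n".join([seq_a, markers, seq_b])


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns holding identical letters (gaps never match)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("pre-aligned sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Two made-up six-letter DNA fragments that differ at one position.
    print(show_alignment("ATGGCC", "ATGACC"))
```

Real alignment programs also decide where to place gaps; this sketch only displays and scores an alignment that is given to it.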
  • b

  • Microorganisms consisting of a single cell, devoid of a nucleus and able to live in various environments (air, soil, water, organisms…). In everyday language, the term ‘bacteria’ covers both the group of bacteria proper and that of archaea (formerly archaebacteria), which generally live in extreme conditions (high acidity, high temperature…).
  • Bioinformatics is a branch of the life sciences that uses computational tools, mathematics and statistics to store, analyse, interpret and visualise biological data, such as DNA sequences, the 3D structure of proteins or the results of experiments. Bioinformatics plays a key role in the study of evolution.
  • Abbreviation for ‘base pairs’ used to indicate the number of nucleotides in a DNA sequence.
  • c

  • The cell is the smallest building block required to constitute a living being. The number of cells varies from one species to another: a bacterium consists of a single cell, the small worm Caenorhabditis elegans of about 1,000 cells, while a human contains on the order of 100,000 billion (10^14).
  • One chromosome can be compared to a more or less compact ball made out of a string of DNA. Each cell of an organism contains the same number of chromosomes. The number of chromosomes and the number of copies vary from one species to another. Humans have 23 pairs of chromosomes; the banana has 11 chromosomes, most often in 3 copies.
  • d

  • Electronic encyclopedia in which data are organised in a structured manner in order to efficiently store large amounts of information. For example, the UniProtKB database stores information on the proteins from all organisms and the OMA database stores information on orthologous genes. These databases are updated regularly.
  • A cell is said to be haploid when it contains only one copy of its chromosomes (n chromosomes). Cells called diploid contain two copies of their chromosomes (2n chromosomes). A human liver cell, for example, is diploid and contains 23 pairs of chromosomes (2x23 = 46 chromosomes). A human sex cell is haploid and contains 23 chromosomes.
  • DNA consists of a succession of 4 nucleotides linked to each other and symbolized by the letters a (adenine), c (cytosine), g (guanine) and t (thymine). The nucleotide order in the sequence is very important: it carries the genetic information.
  • e

  • Refers to an organism whose cells have a nucleus that contains the chromosomes. This is the case for animals (including humans), plants and yeasts. Unlike eukaryotes, prokaryotes do not have a nucleus and their chromosomes are ‘free’ within the cell.
  • Evolution refers to the transformation undergone by living organisms (animals, plants, bacteria) or viruses over several successive generations. It is the consequence of gradual and random genetic changes. It is at the origin of the creation of new species from a common ancestor. The evolutionary history of species can be represented graphically by a phylogenetic tree.
  • In eukaryotes, genes are composed of coding regions (exons) and non-coding regions (introns). The exons contain the genetic information to build proteins.
  • g

  • A gene is a piece of DNA which contains all the necessary information to build, in most cases, a protein.
  • Almost universal code that allows a sequence of nucleotides (DNA, mRNA) to be translated into a sequence of amino acids (protein). One amino acid is coded by 3 nucleotides (a codon).
  • Any material that carries the genetic information and transmits it from one generation to the next. For all living organisms known to date, the genetic material is almost exclusively constituted of DNA. In certain viruses, the carrier of the genetic information is RNA.
  • Totality of the DNA found in a cell. Generally, all cells of an organism contain the same genome.
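
The genetic code entry above can be sketched in a few lines. This example assumes the standard genetic code and a DNA coding sequence read from its first letter; the names `CODON_TABLE` and `translate` are hypothetical, and the input is the DNA fragment quoted in this glossary's sequence entry.

```python
from itertools import product

# Standard genetic code, written for DNA codons.
# Codons are enumerated with the bases in the order T, C, A, G
# (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon (3 nucleotides at a time).

    Translation stops at the first stop codon; trailing nucleotides that do
    not form a complete codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    fragment = "ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGCGCTGCTGGCC"
    print(translate(fragment))  # MALWMRLLPLLALLA
```

The 45 nucleotides give 15 amino acids, which match the beginning of the protein sequence quoted in the same glossary entry.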
  • h

  • A cell is said to be haploid when it contains only one copy of its chromosomes (n chromosomes). Cells called diploid contain two copies of their chromosomes (2n chromosomes). A human liver cell, for example, is diploid and contains 23 pairs of chromosomes (2x23 = 46 chromosomes). A human sex cell is haploid and contains 23 chromosomes.
  • Having a common ancestor. One talks about ‘homologous’ genes, proteins or species. For the genes and proteins, one sometimes distinguishes between two subtypes: orthologs and paralogs.
  • i

  • In eukaryotes, genes are composed of coding regions (exons) and non-coding regions (introns). The exons contain the genetic information to build proteins.
  • l

  • “Last Universal Common Ancestor”: the hypothetical common ancestor of all species. It theoretically lived between 3.5 and 4 billion years ago and was composed of a single cell.
  • m

  • A mutation is a change of one or more nucleotides in a nucleotide sequence. Mutations occur randomly throughout the life and divisions of the cell.
  • n

  • Basic building block of DNA and RNA. Nucleotides are molecules symbolized by letters. For DNA, the letters are A for adenine, T for thymine, G for guanine and C for cytosine. For RNA, T is replaced by U for uracil.
  • o

  • Used to refer to homologous genes or proteins, that originate from one and the same gene present in the last common ancestor of the species under consideration. Frequently (but not always), orthologous proteins have the same function in the different species and the corresponding genes are considered as “the same genes in different species”. Not to be confused with “paralog”.
  • p

  • Used to refer to homologous genes or proteins resulting from the duplication of genes (for example, hemoglobin alpha and hemoglobin beta in vertebrates are paralogs). Often (but not always), paralogs have different biological roles. Not to be confused with “ortholog”.
  • A schematic representation of a tree that shows the relationships between different species or groups of species.
  • Used to refer to an organism consisting of a single cell and whose DNA is not contained in a nucleus, contrary to eukaryotes. Bacteria and archaea are prokaryotes.
  • Proteins are essential to the construction and functioning of all organisms. They are responsible for thousands of functions, such as the control of cell division or the synthesis of DNA. Just like a pearl necklace, a protein consists of a succession of amino acids linked to each other by chemical bonds. There are 20 different amino acids. The order of amino acids in the chain is determined by the genetic information. Proteins have variable lengths and fold upon themselves to adopt a specific structure - their 3D structure - which is very important for their function.
  • r

  • RNA can be compared to a photocopy of a piece of DNA. With a chemical composition similar to that of DNA and consisting of a single strand, it most often serves as the ‘messenger’, essential in the production of a protein. Other RNAs exist that are not directly involved in the synthesis of proteins.
  • s

  • A DNA sequence is a succession of 4 nucleotides, symbolised by the letters A, T, C, G (G = guanine): ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGCGCTGCTGGCC… A protein sequence is a succession of 20 amino acids, symbolised by the letters G, E, N, I, A, L, … (G = glycine), in an order that is specific to each protein: MALWMRLLPLLALLALW…
  • Sequencing refers to the use of laboratory techniques to sequence DNA in particular, that is, to determine the order in which the nucleotides A, T, C and G are found in the DNA.
  • Single nucleotide polymorphisms, frequently called SNPs (pronounced “snips”), are the most common type of genetic variation among people. Each SNP represents a difference in a single nucleotide. For example, the nucleotide cytosine (C) may be replaced with the nucleotide thymine (T) in a certain DNA sequence.
  • Evolutionary process, most often very long, giving rise to the creation of a new species.
  • A biological species is a population consisting of individuals that can reproduce with each other and give rise to viable and fertile descendants, under natural conditions.
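
The SNP entry above describes a difference in a single nucleotide between two versions of a DNA sequence. A minimal sketch of how such differences could be listed, assuming two sequences of equal length that differ only by substitutions (no insertions or deletions); the function name `single_nucleotide_differences` and the example fragments are made up for this illustration.

```python
def single_nucleotide_differences(reference: str, variant: str):
    """List the positions where two equal-length DNA sequences differ.

    Returns tuples (position, reference_nucleotide, variant_nucleotide),
    with positions counted from 1 as is usual for sequences.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (position, ref_nt, var_nt)
        for position, (ref_nt, var_nt) in enumerate(
            zip(reference.upper(), variant.upper()), start=1
        )
        if ref_nt != var_nt
    ]


if __name__ == "__main__":
    # A cytosine (C) replaced by a thymine (T) at position 5.
    print(single_nucleotide_differences("ATGGCC", "ATGGTC"))  # [(5, 'C', 'T')]
```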
  • v

  • Microorganism which is necessarily a parasite and must infect a cell in order to reproduce. A viral particle consists of a protein envelope protecting its genetic information (DNA or RNA).