3D structureThe three-dimensional structure (3D structure) of a protein is the shape it adopts in space; it depends on the amino acid sequence and is very important for the function of the protein.
a
alignmentA representation of two or more sequences, either of nucleotides (DNA, RNA) or of amino acids (proteins), in which identical nucleotides or amino acids are positioned one under the other. The goal is to highlight the similarities or the differences between the sequences.
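As a minimal sketch of the idea, the snippet below takes two sequences that are assumed to be already aligned (same length, with `-` marking a gap) and builds the line of `|` marks usually drawn between them. The function names and the two example sequences are made up for illustration; computing the alignment itself is a separate problem that is not shown here.

```python
def match_line(seq_a, seq_b):
    """Return a string with '|' wherever the two aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "|" if a == b and a != "-" else " "
        for a, b in zip(seq_a, seq_b)
    )


def percent_identity(seq_a, seq_b):
    """Percentage of alignment columns in which both sequences are identical."""
    marks = match_line(seq_a, seq_b)
    return 100.0 * marks.count("|") / len(marks)


# Hypothetical aligned sequences; '-' is a gap.
first = "ATGGCC-TG"
second = "ATGACCCTG"

print(first)
print(match_line(first, second))
print(second)
print(f"{percent_identity(first, second):.1f}% identical")
```

Printing the three lines one under the other reproduces the layout described in the definition: identical positions are stacked and marked, while mismatches and gaps stand out as blanks.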
alleleChromosomes are found in multiple copies in the cell: for example, humans have 23 pairs of chromosomes. An allele is one of the versions of the nucleotide sequence found on one of the chromosomes.
amino acidBuilding block of proteins. Each protein consists of a succession of amino acids linked to each other. There are 20 amino acids, symbolized by the letters: A (Alanine), C (Cysteine), G (Glycine), T (Threonine), V (Valine), E (Glutamate), …
annotationInformation associated with a DNA sequence or a protein sequence (gene name, protein name, biological function, etc.). Annotations are added by experts (annotators) or by prediction programs. They are stored in databases, for instance UniProtKB.
b
bacteriaMicroorganisms consisting of a single cell, devoid of a nucleus and able to live in various environments (air, soil, water, organisms…). The term ‘bacteria’, in everyday language, includes both the group of bacteria proper and that of archaea (formerly archaebacteria), which often live in extreme conditions (high acidity, high temperature…).
bioinformaticsBioinformatics is a branch of the life sciences that uses computational tools, mathematics and statistics to store, analyse, interpret and visualise biological data, such as DNA sequences, the 3D structure of proteins or the results of experiments. Bioinformatics plays a key role in the study of evolution.
bpAbbreviation for ‘base pairs’ used to indicate the number of nucleotides in a DNA sequence.
c
cellThe cell is the smallest building block required to constitute a living being. The number of cells varies from one species to another: a bacterium consists of a single cell, the small worm Caenorhabditis elegans of about 1,000 cells, while a human contains tens of thousands of billions of cells (on the order of 10^13 to 10^14).
chromosomesOne chromosome can be compared to a more or less compact ball made out of a string of DNA. Each cell of an organism contains the same number of chromosomes. The number of chromosomes and the number of copies varies from one species to another. Humans have 23 pairs of chromosomes, the banana has 11 chromosomes, most often in 3 copies.
d
databaseElectronic encyclopedia in which data are organised in a structured manner in order to efficiently store large amounts of information. For example, the UniProtKB database stores information on the proteins from all organisms and the OMA database stores information on orthologous genes. These databases are updated regularly.
diploidA cell is said to be haploid when it contains only one copy of its chromosomes (n chromosomes). Cells called diploid contain two copies of their chromosomes (2n chromosomes). A human liver cell, for example, is diploid and contains 23 pairs of chromosomes (2x23 = 46 chromosomes). A human sex cell is haploid and contains 23 chromosomes.
DNADNA consists of a succession of nucleotides linked to each other. There are 4 different nucleotides, symbolized by the letters a (adenine), c (cytosine), g (guanine) and t (thymine). The order of the nucleotides in the sequence is very important: it carries the genetic information.
e
eukaryoteRefers to an organism whose cells contain a nucleus containing the chromosomes. This is the case of animals (including humans), plants and yeasts. Contrary to eukaryotes, prokaryotes do not have a nucleus and their chromosomes are ‘free’ within the cell.
evolutionEvolution refers to the transformation undergone by living organisms (animals, plants, bacteria) or viruses over successive generations. It is the consequence of gradual and random genetic changes. It is at the origin of the creation of new species from a common ancestor. The history of species can be represented graphically by a phylogenetic tree.
exonIn eukaryotes, genes are composed of coding regions (exons) and non-coding regions (introns). The exons contain the genetic information to build proteins.
g
genesA gene is a piece of DNA which contains all the necessary information to build, in most cases, a protein.
genetic codeAlmost universal code that allows a sequence of nucleotides (DNA, mRNA) to be translated into a sequence of amino acids (protein). Each amino acid is coded by a group of 3 nucleotides (a codon).
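The codon-by-codon reading can be sketched as follows. This is an illustration only: it assumes the standard genetic code, reads the DNA from its first letter, ignores any incomplete codon at the end, and uses `*` for a stop codon. The names `CODON_TABLE` and `translate` are chosen for this example. The DNA string is the beginning of the example given under the "sequence" entry of this glossary.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order:
# the first codon is TTT (F), then TTC (F), TTA (L), TTG (L), TCT (S), ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to a one-letter amino acid ('*' marks a stop).
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA sequence into a protein sequence, 3 letters at a time."""
    dna = dna.upper()
    usable_length = len(dna) - len(dna) % 3  # drop a trailing partial codon
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, usable_length, 3)
    )


dna = "ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGCGCTGCTGGCC"
print(translate(dna))  # MALWMRLLPLLALLA
```

The result matches the start of the protein sequence shown under "sequence" (MALWMRLLPLLALLALW…), since the 45 nucleotides listed there correspond to the first 15 amino acids.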
genetic materialAny material which contains the genetic information and transmits it from one generation to the next. For all living organisms known to date, the genetic material consists of DNA. In certain viruses, the carrier of the genetic information is RNA.
genomeTotality of the DNA found in a cell. Generally, all cells of an organism contain the same genome.
h
haploidA cell is said to be haploid when it contains only one copy of its chromosomes (n chromosomes). Cells called diploid contain two copies of their chromosomes (2n chromosomes). A human liver cell, for example, is diploid and contains 23 pairs of chromosomes (2x23 = 46 chromosomes). A human sex cell is haploid and contains 23 chromosomes.
homologHaving a common ancestor. One talks about ‘homologous’ genes, proteins or species. For the genes and proteins, one sometimes distinguishes between two subtypes: orthologs and paralogs.
i
intronIn eukaryotes, genes are composed of coding regions (exons) and non-coding regions (introns). The exons contain the genetic information to build proteins.
l
LUCA“Last Universal Common Ancestor”: the hypothetical common ancestor of all species. It is thought to have lived between 3.5 and 4 billion years ago and to have consisted of a single cell.
m
mutationsA mutation is a change of one or more nucleotides in a nucleotide sequence. Mutations occur randomly throughout the life and divisions of the cell.
n
nucleotideBasic building block of DNA and RNA. Nucleotides are molecules symbolized by letters. For DNA, the letters are A for adenine, T for thymine, G for guanine and C for cytosine. For RNA, T is replaced by U for uracil.
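The difference between the two alphabets can be shown with a short sketch. It assumes the DNA string is the coding strand, so the RNA copy has the same letters with U in place of T; the function name `dna_to_rna` is made up for this example, and the biological copying process (which works from the complementary strand) is not modelled.

```python
def dna_to_rna(dna):
    """Rewrite a DNA sequence in the RNA alphabet by replacing T with U."""
    dna = dna.upper()
    unexpected = set(dna) - set("ATGC")
    if unexpected:
        raise ValueError(f"not a DNA letter: {sorted(unexpected)}")
    return dna.replace("T", "U")


print(dna_to_rna("ATGGCCCTG"))  # AUGGCCCUG
```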
o
orthologUsed to refer to homologous genes or proteins, that originate from one and the same gene present in the last common ancestor of the species under consideration. Frequently (but not always), orthologous proteins have the same function in the different species and the corresponding genes are considered as “the same genes in different species”. Not to be confused with “paralog”.
p
paralogUsed to refer to homologous genes or proteins resulting from the duplication of genes (for example, hemoglobin alpha and hemoglobin beta in vertebrates are paralogs). Often (but not always), paralogs have different biological roles. Not to be confused with “ortholog”.
phylogenetic treeA tree-shaped diagram that shows the evolutionary relationships between different species or groups of species.
prokaryoteUsed to refer to an organism consisting of a single cell and whose DNA is not contained in a nucleus, unlike eukaryotes. Bacteria and archaea are prokaryotes.
proteinsProteins are essential to the construction and functioning of all organisms. They are responsible for thousands of functions, such as the control of cell division or the synthesis of DNA. Like the pearls of a necklace, the amino acids of a protein follow one another in a chain and are linked to each other by chemical bonds. There are 20 different amino acids. The order of amino acids in the chain is determined by the genetic information. Proteins have variable lengths and fold upon themselves to adopt a specific structure - their 3D structure - which is very important for their function.
r
RNARNA can be compared to a photocopy of a piece of DNA. With a chemical composition similar to that of DNA and consisting of a single strand, it most often serves as the ‘messenger’, essential in the production of a protein. Other RNAs exist that are not directly implicated in the synthesis of proteins.
s
sequenceA DNA sequence is a succession of nucleotides of 4 types, symbolised by the letters A, T, C, G (G = guanine): ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGCGCTGCTGGCC… A protein sequence is a succession of amino acids of 20 types, symbolised by the letters G, E, N, I, A, L, ... (G = glycine), in an order that is specific to each protein: MALWMRLLPLLALLALW…
sequencingSequencing refers to the use of laboratory techniques to determine the order of the building blocks in a molecule. For DNA in particular, it means determining the order in which the nucleotides A, T, C and G are found.
SNPSingle nucleotide polymorphisms, frequently called SNPs (pronounced “snips”), are the most common type of genetic variation among people. Each SNP represents a difference in a single nucleotide. For example, the nucleotide cytosine (C) may be replaced with the nucleotide thymine (T) in a certain DNA sequence.
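A single-nucleotide difference of this kind can be located by comparing two sequences position by position. The sketch below assumes two sequences of equal length that are already aligned without gaps; the function name `single_nucleotide_differences` and both example sequences are hypothetical. Strictly speaking, a difference between two sequences only counts as a SNP if it is observed as a variation within a population, which this comparison does not check.

```python
def single_nucleotide_differences(reference, sample):
    """List (position, reference_base, sample_base) for each differing position.

    Positions are counted from 1, as is usual when describing sequences.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base)
        in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Hypothetical sequences: C is replaced with T at position 6.
reference = "ATGGCCCTG"
sample = "ATGGCTCTG"
print(single_nucleotide_differences(reference, sample))  # [(6, 'C', 'T')]
```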
speciationEvolutionary process, most often very long, giving rise to a new species.
speciesA biological species is a population consisting of individuals that can reproduce with each other and give rise to viable and fertile descendants, under natural conditions.
v
virusMicroorganism which is an obligate parasite: it must infect a cell in order to reproduce. A viral particle consists of a protein envelope protecting its genetic information (DNA or RNA).