Alpha and beta diversity
Measures of community richness within an environment (alpha diversity) or of variation between different environments (beta diversity)

Assembly
Aligning and merging short fragments of sequenced DNA in order to reconstruct the original genome.

Coverage depth
Average number of times a base of a genome is sequenced.
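
As an illustration of this definition, the following minimal sketch estimates average coverage as the total number of sequenced bases divided by the genome size. It assumes reads of uniform length that are spread evenly over the genome; the function name and the example numbers are hypothetical.

```python
def coverage_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Estimate average coverage: total sequenced bases / genome size.

    Assumes uniform read length and even read distribution (a simplification).
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


# Hypothetical example: one million 150 bp reads from a 5 Mb genome.
print(coverage_depth(1_000_000, 150, 5_000_000))  # 30.0
```

In a metagenomic sample the reads are shared among many genomes of unequal abundance, so a per-genome estimate of this kind would need the number of reads assigned to that genome rather than the total read count.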

Contigs
Contiguous fragments of DNA sequence from an incomplete draft genome.

GC content
The GC content of a DNA sequence is the percentage of nucleotides that are either G or C.
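
The definition above can be sketched in a few lines. This is an illustrative helper, not a reference implementation: the function name is hypothetical, and it assumes that every character of the input counts toward the sequence length, including ambiguous bases such as N.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of bases in `sequence` that are G or C.

    Assumption: all characters (including ambiguous bases such as N)
    count toward the total length.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    return 100.0 * gc_count / len(seq)


print(gc_content("ATGC"))  # 50.0
```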

Giga base pairs (Gb)
Unit for the size of a metagenomic sample in base pairs; 1 Gb is 10^9 base pairs

Horizontal gene transfer (HGT)
Exchange or absorption of genetic material independent of reproduction

Environmental gene tags (EGTs)
Short DNA sequences that characterize microbial environments.

Library
Collection of DNA fragments prepared for sequencing

Metatranscriptomics
Identifying the complete set of transcripts (RNA-seq) from microbial environments

Metaproteomics
Profiling of community-wide protein abundances

Multilocus sequence typing (MLST)
Technique to detect variability of housekeeping genes for identifying bacterial strains

Operational taxonomic units (OTUs)
Sequence-based species clusters defined by 16S rRNA gene sequence similarity

Orthologous genes
Functionally equivalent genes in different species that evolved from a common ancestral gene

Pan-genome
The entire gene set of all strains of a species

Reads per kilobase per million mapped reads (RPKM)
Normalization of read counts by gene length and sequencing depth, used to compare the coverage of genes
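
A minimal sketch of this normalization, assuming the commonly used definition of RPKM: the number of reads mapped to a gene, divided by the gene length in kilobases and by the total number of mapped reads in millions. The function and parameter names are hypothetical.

```python
def rpkm(gene_reads: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads.

    RPKM = gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp gene, 10 million mapped reads.
print(rpkm(500, 2_000, 10_000_000))  # 25.0
```

Dividing by gene length prevents longer genes from appearing more abundant only because they collect more reads, and dividing by the total read count makes samples sequenced to different depths comparable.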

Ribosomal sequences 16S
Sequences of the 16S ribosomal RNA gene, used to identify and compare bacteria based on their sequence differences

Sampling depth
Amount of sequencing needed for metagenomic shotgun sequencing to adequately represent a sample

Strain-level metagenomics
Identifying the gene composition of individual strains in metagenomic samples

