INTRODUCTION
Waterton et al. (Reference Waterton, Ellis and Wynne2013) emphasized that ‘Taxonomy entered the 21st century wringing its hands in self-reflective concern at the fragmentation and lack of standing (including funding and new recruitment) of the field’. Parasitology, epidemiology and veterinary medicine face similar concerns: on the one hand, an increase in emerging and re-emerging infectious and parasitic diseases and, on the other, a decline in taxonomic expertise that compromises the rapid and accurate identification of pathogens, parasites, vectors and reservoirs.
In response to the loss of expertise in taxonomy, DNA barcoding was proposed by Hebert et al. (Reference Hebert, Cywinska, Ball and deWaard2003) as a new system of species identification using a short section of DNA from a standardized region of the genome. In the case of animals, the mitochondrial cytochrome c oxidase 1 gene (CO1) was chosen for species delineation and identification. However, this barcode marker soon proved to be far from universal. In the case of fungi, for example, the most commonly used marker is the ribosomal DNA ‘Internal Transcribed Spacer’ (ITS), although this marker does not work for all fungal groups. ITS is an excellent marker to distinguish species of the genus Pneumocystis, which comprises fungal pathogens residing in the pulmonary parenchyma of a wide range of mammals (Danesi et al. Reference Danesi, da Rold, Rizzoli, Hauffe, Marangon, Samerpitak, Demanche, Guillot, Capelli and de Hoog2016; Latinne et al. Reference Latinne, Bezé, Delhaes, Pottier, Gantois, Nguyen, Blasdell, Dei-Cas, Morand and Chabé2017). In the same manner, ITS appears to be a good candidate in several groups of protists (Wang et al. Reference Wang, Liu, Huang, Bengtsson-Palme, Chen, Zhang, Cai and Li2015), such as Trypanosoma species (Desquesnes et al. Reference Desquesnes, Kamyingkird, Yangtara, Milocco, Ravel, Wang, Lun, Morand and Jittapalapong2011).
The Barcode of Life (BOLI) promoted DNA barcoding as a way to speed up the identification work of traditional taxonomy (and even to reinvent taxonomy, as emphasized by its promoters), with the urgent task of identifying all unknown species before their disappearance (Meier, Reference Meier and Wheeler2008). BOLI is considered a tool in support of the Convention on Biological Diversity (CBD). The Consortium for the Barcode of Life (CBOL) was established in 2004 as an international initiative devoted to developing DNA barcoding (http://www.barcodeoflife.org/). The International Barcode of Life project (iBOL) was subsequently launched in 2010 as a research alliance of scientists, technologists and ethicists from 25 nations to construct a DNA barcode reference library (http://ibol.org/phase1/). The Barcode of Life Data Systems database (BOLD) was established as the identification tool for all organism barcodes. BOLD's infrastructure was initially designed to process and analyse only CO1, but it can now process multiple genes (http://www.boldsystems.org/). Barcoding needs both universal barcodes (CO1, ITS) and high-quality, accessible databases (sequences, systematics) (Shen et al. Reference Shen, Chen and Murphy2013).
BOLI was established at the rise of the genomics era. New high-throughput technologies [next-generation sequencing (NGS)] and newly adapted technologies such as matrix-assisted laser desorption–ionization time-of-flight (MALDI–TOF) mass spectrometry open new avenues (Ilina et al. Reference Ilina, Borovskaya, Malakhova, Vereshchagin, Kubanova, Kruglov, Svistunova, Gazarian, Maier, Kostrzewa and Govorun2009; Michelet et al. Reference Michelet, Delannoy, Devillers, Umhang, Aspan, Juremalm, Chirico, van derWal, Sprong, Boye Pihl, Klitgaard, Bødker, Fach and Moutailler2014).
Barcoding is now a common tool in parasitology and epidemiology, which require reliable identification not only of parasites and pathogens (Prosser et al. Reference Prosser, Velarde-Aguilar, León-Règagnon and Hebert2013; Ondrejicka et al. Reference Ondrejicka, Locke, Morey, Borisenko and Hanner2014) but also of vectors (Ruiz-Lopez et al. Reference Ruiz-Lopez, Wilkerson, Conn, McKeon, Levin, Quiñones, Póvoa and Linton2012; Chan et al. Reference Chan, Chiang, Hapuarachchi, Tan, Pang, Lee, Lee, Ng and Lam-Phua2014; Kumlert et al. Reference Kumlert, Chaisiri, Anantatat, Stekolnikov, Morand, Prasartvit, Makepeace, Sungvornyothin and Paris2018) and reservoirs (Galan et al. Reference Galan, Pagès and Cosson2012). Barcoding in parasitology concerns a wide range of organisms, from viruses, bacteria, fungi and protists to helminths, arthropods and molluscs, as well as vertebrate animals.
REFERENCE SPECIMENS AND OPEN DATABASES
The tasks are to collect and curate specimens, to obtain barcode records from these specimens, to use (or build) the informatics platform that stores these records and to enable their use by a large community. Following collection and barcoding of the specimens, two imperatives in barcoding practice remain: (i) to link DNA barcodes to voucher specimens (and collections) and (ii) to link data in open databases.
By definition, a morphological voucher is a preserved specimen archived in a collection facility such as a museum. In DNA barcoding, the preservation of morphological vouchers is a standard practice for specimens from which DNA barcode sequences were obtained. At the beginning of BOLI, CO1 barcoding was mostly used to link established Linnaean taxonomy with curated voucher specimens in museums. BOLI should be seen as a classification system and not as a taxonomic system (Vogler and Monahan, Reference Vogler and Monahan2006). With more accessible technologies in barcoding, it remains even more imperative to keep voucher specimens, which are representative of individual organisms identified using current technology and taxonomic classification. Collection facilities and biobanks for tissues or even living organisms (e.g. bacteria and protists) are then complementary to parasitology barcoding initiatives.
BOLD was established in 2005 as a repository platform of DNA barcodes for all eukaryotic life. The latest version of BOLD was released in 2015 (http://v4.boldsystems.org/) and now hosts more than 6 million barcodes from more than 270 000 species (including animals, plants and fungi). Barcode sequences are catalogued in GenBank (http://www.ncbi.nlm.nih.gov/genbank). Linking barcodes to voucher specimens, an important process in barcoding, should rely on an accurate and up-to-date taxonomic classification such as the Catalogue of Life (http://www.catalogueoflife.org/), which helps to resolve inaccurate identifications and/or changes in the systematics of the organisms under consideration.
An examination of BOLD showed that many parasitic organisms have not been barcoded and/or that their barcodes, where they exist, have not been archived in BOLD. For example, approximately 1300 species of Acanthocephala have been morphologically described (Poulin and Morand, Reference Poulin and Morand2004; García-Varela and Pérez-Ponce de León, Reference García-Varela, Pérez-Ponce de León, Morand, Krasnov and Littlewood2015), but only 38 species (< 3%) have their barcodes recorded in BOLD. Similarly, there are 663 species of Platyhelminthes with barcodes in BOLD, whereas there are currently around 30 000 known species (Caira and Littlewood, Reference Caira, Littlewood and Levin2013), so that only a little more than 2% are barcoded.
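The coverage figures quoted above follow from simple arithmetic on the species counts. The short Python sketch below reproduces them; the function name is chosen here for illustration only, and the counts are the ones given in the text.

```python
# Species counts quoted in the text: (species with barcodes in BOLD, described species).
counts = {
    "Acanthocephala": (38, 1300),
    "Platyhelminthes": (663, 30000),
}


def coverage_percent(barcoded: int, described: int) -> float:
    """Share of described species that have a barcode record, in percent."""
    return 100.0 * barcoded / described


for taxon, (barcoded, described) in counts.items():
    print(f"{taxon}: {coverage_percent(barcoded, described):.1f}% of described species barcoded")
```

With these counts the sketch gives about 2.9% for Acanthocephala and about 2.2% for Platyhelminthes.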
Another concern is the geographic localization of barcode specimens. Adding accurate geo-localization will enable the barcode specimens to be geo-referenced and linked to other international databases such as the Global Biodiversity Information Facility (GBIF) (https://www.gbif.org/).
ADVANCES AND NEW APPLICATIONS IN BARCODING
Development of the barcoding approach was enhanced by new high-throughput technologies such as NGS. The breakthrough brought by the integration of genomic data has been acknowledged in ecological genetics (Shafer et al. Reference Shafer, Northrup, Wikelski, Wittemyer and Wolf2016) and has proved highly relevant in epidemiology and public health. In microbiology, rapid NGS-based whole-genome sequencing (WGS) associated with bioinformatic pipelines has found increasing applications, from microbial taxonomy to public health surveillance of pathogens (Allard, Reference Allard2016).
Environmental genomics is a growing domain that studies molecular components, DNA and RNA in (meta)genomes and (meta)transcriptomes, in environmental samples (Taberlet et al. Reference Taberlet, Coissac, Hajibabaei and Rieseberg2012; Joly and Faure, Reference Joly and Faure2015), with wide applications in biodiversity, monitoring and conservation biology (Stat et al. Reference Stat, Huggett, Bernasconi, DiBattista, Berry, Newman, Harvey and Bunce2017). Environmental DNA (eDNA), as it is named in biodiversity screening, has found recent applications in parasitology (Bass et al. Reference Bass, Stentiford, Littlewood and Hartiakinen2015). eDNA may help to detect free-living stages of parasites (eggs, cysts, larvae) in environmental samples collected in water or soil surveys or within intermediate hosts (Huver et al. Reference Huver, Koprivnikar, Johnson and Whyard2015).
Among the available new technologies, MALDI–TOF mass spectrometry is becoming widely used as a new tool for barcoding (Sandrin et al. Reference Sandrin, Goldstein and Schumaker2013; Rothen et al. Reference Rothen, Githaka, Kanduma, Olds, Pflüger, Mwaura, Bishop and Daubenberger2016; Yssouf et al. Reference Yssouf, Almeras, Raoult and Parola2016; Diarra et al. Reference Diarra, Almeras, Laroche, Berenger, Koné, Bocoum, Dabo, Doumbo, Raoult and Parola2017), although this technique must rest on accurate species identification, both morphological and genetic.
Advances can also be observed in more user-friendly access to the different databases, thanks to the development of specific packages in the free statistical programming language R (R Core Team, 2018, www.R-project.org/). Among the available packages, one can cite ‘bold’ (Chamberlain, Reference Chamberlain2017), an interface to BOLD that offers functions to search sequences and specimens and to download trace files, and ‘BarcodingR’ (Zhang et al. Reference Zhang, Hao, Yangan and Shi2017), which provides a comprehensive implementation of species identification methods. R packages have also been developed for creating and analysing DNA barcodes, such as ‘DNAbarcodes’ (Buschmann, Reference Buschmann2017), which is useful for manipulating large datasets obtained by NGS.
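To make the idea of barcode-based species identification concrete, the following Python sketch assigns a query sequence to the closest entry of a small reference library using the uncorrected p-distance. It is a generic illustration under stated assumptions, not the method implemented in any of the packages cited above: the sequences and species labels are invented, the sequences are assumed to be already aligned to the same length, and the 3% cut-off is an illustrative value only.

```python
from typing import Dict, Optional, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Only compare sites where both sequences carry an unambiguous base.
    sites = [(x, y) for x, y in zip(a.upper(), b.upper()) if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def identify(query: str, library: Dict[str, str],
             threshold: float = 0.03) -> Tuple[Optional[str], float]:
    """Return (best label, distance); the label is None if no reference is close enough."""
    label, dist = min(((name, p_distance(query, seq)) for name, seq in library.items()),
                      key=lambda item: item[1])
    return (label if dist <= threshold else None, dist)


# Hypothetical 40-bp reference barcodes (real barcodes are much longer).
library = {
    "species_A": "ATGGCATTAGCCGGTACTGAATTCGGAACCCTTTATCTAG",
    "species_B": "ATGACGTTGGCTGGAACAGAGTTTGGTACTCTATACTTGG",
}
query = "ATGGCATTAGCCGGTACTGAATTCGGAACCCTTTATCTAA"  # one mismatch with species_A

print(identify(query, library))
```

A query that differs from every reference by more than the cut-off is returned without a label, which mirrors the situation where the species is missing from the reference database.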
NEW APPLICATIONS OF BARCODING
A first new application of DNA barcoding concerns the identification of ancient parasite DNA (Côté and Le Bailly, Reference Côté and Le Bailly2018; Wood, Reference Wood2018), which feeds a growing interest in paleogenomics and paleoparasitology.
In ecological parasitology, barcoding makes it possible to track vectors, the parasites they carry and, for blood feeders (i.e. biting arthropods), their feeding activities. Applications then extend to using blood-feeding arthropods as vertebrate samplers (Kocher et al. Reference Kocher, de Thoisy, Catzeflis, Valière, Bañuls and Murienne2017; Muturi et al. Reference Muturi, Ouma, Malele, Ngure, Rutto, Mithöfer, Enyaru and Masiga2011), or as ‘flying syringes’ to collect blood samples and detect parasites in them (Bitome-Essono et al. Reference Bitome-Essono, Ollomo, Arnathau, Durand, Mokoudoum, Yacka-Mouele, Okouga, Boundenga, Mve-Ondo, Obame-Nkoghe, Mbehang-Nguema, Njiokou, Makanga, Wattier, Ayala, Ayala, Renaud, Rougeron, Bretagnolle and Prugnolle2017).
CHALLENGES IN BARCODING
Several pitfalls can occur in DNA barcoding, linked to the events that have shaped the evolutionary history of the species under consideration.
First, DNA barcoding based on mitochondrial genes (CO1) may overestimate the number of species due to the presence of pseudogenes (Song et al. Reference Song, Buhay, Whiting and Crandall2008). The removal of nuclear mitochondrial pseudogenes requires a careful examination of sequences.
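One common first step in such an examination is to look for in-frame stop codons, which are not expected in a functional CO1 coding sequence. The Python sketch below illustrates this single check. It assumes an ungapped sequence whose reading frame is already known, and it only tests TAA and TAG, which are stop codons in both the vertebrate and the invertebrate mitochondrial genetic codes; the example sequences are invented. A sequence that passes this check is not thereby shown to be a genuine mitochondrial copy.

```python
from typing import FrozenSet, List

# Assumption: TAA and TAG only. Under the vertebrate mitochondrial code,
# AGA and AGG are also stop codons and could be added to this set.
DEFAULT_STOPS: FrozenSet[str] = frozenset({"TAA", "TAG"})


def in_frame_stops(seq: str, frame: int = 0,
                   stop_codons: FrozenSet[str] = DEFAULT_STOPS) -> List[int]:
    """Return the 0-based start positions of in-frame stop codons.

    A trailing incomplete codon is ignored, because amplicons do not
    necessarily end on a codon boundary.
    """
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in stop_codons]


clean = "ATGGCATTAGCCGGTACT"    # hypothetical fragment without in-frame stops
suspect = "ATGGCATAAGCCGGTACT"  # same fragment with a TAA codon at position 6

for name, seq in (("clean", clean), ("suspect", suspect)):
    stops = in_frame_stops(seq)
    print(name, "-> flagged at", stops if stops else "no position")
```

Sequences flagged in this way would then be inspected further, for example for insertions or deletions that shift the reading frame.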
Hybridization is the second issue in barcoding. Introgression of mitochondrial DNA due to hybridization and/or incomplete lineage sorting of mitochondrial DNA haplotypes may lead to misidentification as reported in several groups of mammals such as rodents and bats (Nesi et al. Reference Nesi, Nakouné, Cruaud and Hassanin2011; Pagès et al. Reference Pagès, Bazin, Galan, Chaval, Claude, Herbreteau, Michaux, Piry, Morand and Cosson2013; Ermakov et al. Reference Ermakov, Simonov, Surin, Titov, Brandler, Ivanova and Borisenko2016), which are important reservoirs of diseases.
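A simple symptom of this problem in a reference library is an identical barcode sequence recorded under more than one species name. The Python sketch below flags such shared haplotypes. The records and species labels are invented for illustration, and a shared sequence may equally result from a misidentified voucher, so the output is a list of cases to re-examine rather than evidence of introgression.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def shared_haplotypes(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each sequence recorded under several species to the sorted species names."""
    species_by_seq = defaultdict(set)
    for species, seq in records:
        species_by_seq[seq.upper()].add(species)
    return {seq: sorted(names) for seq, names in species_by_seq.items() if len(names) > 1}


# Hypothetical (species label, barcode) records.
records = [
    ("species_A", "ATGGCATTAGCC"),
    ("species_A", "ATGGCATTAGCT"),
    ("species_B", "ATGGCATTAGCC"),  # identical to a species_A haplotype
    ("species_C", "ATGACGTTGGCT"),
]

for seq, names in shared_haplotypes(records).items():
    print(seq, "is shared by", ", ".join(names))
```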
POLICY RELEVANCE OF BARCODING
The 2010 Nagoya Protocol identified key common goals between BOLI and the CBD, among others: to promote capacity building in species identification and discovery, and to support the CBD with respect to biodiversity targets (i.e. Aichi targets), national biodiversity strategies and action plans, monitoring, indicators and assessments, and invasive alien species (Vernooy et al. Reference Vernooy, Haribabu, Muller, Vogel, Hebert, Schindel, Shimura and Singer2010). In March 2018, GBIF formalized its collaboration with the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) (https://www.gbif.org/news/2gg|IFqrxre4im4oSMeMmA2/gbif-formalizes-collaboration-with-biodiversity-assessment-platform). The Memorandum of Understanding formalizes continuing cooperation between GBIF and IPBES and states that GBIF will help IPBES to identify and access biodiversity datasets relevant to IPBES assessments and indicators and, using knowledge gaps, to identify and prioritize the mobilization of new data through GBIF. The ongoing process should favour the interoperability of major international databases (BOLD, GBIF, GenBank), although it still leaves open questions concerning access and sharing.
In the public and animal health sectors, the development of eDNA for pathogen discovery in the environment may have relevance for policy. As emphasized by Bass et al. (Reference Bass, Stentiford, Littlewood and Hartiakinen2015), ‘the detection of apparently specific genomic material from a politically important (listed) pathogen in an environmental matrix relates to the universally applied principles of ‘infection’ and ‘disease’ detection and reporting according to the World Organization for Animal Health (Office International des Epizooties, OIE) (www.oie.int)’ (see also Stentiford et al. Reference Stentiford, Feist, Stone, Peeler and Bass2014). Similar public health concerns must be taken into consideration when eDNA studies and metabarcoding of environmental samples (water, soil, vectors) detect important human pathogens or parasites (HealthBold, http://www.healthbol.org/).
ETHICS, LEGAL ISSUES
The CBD has changed the old taxonomic practices for collecting and archiving specimens in museum collections (Lajaunie et al. Reference Lajaunie, Morand and Huan2014). More stringent regulation now applies following the implementation of the Nagoya Protocol and of Access and Benefit Sharing (ABS) of biodiversity, particularly for the health sector (Lajaunie and Mazzega, Reference Lajaunie and Mazzega2016). iBOL took charge of this new international issue of access to, and collection and curation of, materials (specimens, genes, data) and made recommendations to the ABS to facilitate access to biodiversity samples for pure ‘non-commercial’ research (using distinct Material Transfer Agreements and arrangements for Prior Informed Consent) and to improve the access of provider countries to information generated by the scientific use of their biodiversity and genetic resources. A Memorandum of Understanding (MoU) was signed in Nagoya at COP10 between the iBOL Board Chair and the CBD Executive Secretary.
THE SPECIAL ISSUE
The special issue ‘Advances and challenges in the barcoding of parasites, vectors and reservoirs’ aims to illustrate some recent advances and new research avenues of barcoding in parasitology (Table 1). A wide range of organisms is covered, including bacteria (Guernier et al. Reference Guernier, Allan and Goarant2018; Kosoy et al. Reference Kosoy, Mckee, Albayrak and Fofanov2018), protists (Hutchinson and Stevens, Reference Hutchinson, Jamie and Stevens2018; Kocher et al. Reference Kocher, Valière, Bañuls and Murienne2018; Šlapeta, Reference Šlapeta2018), platyhelminths (Aivelo and Medlar, Reference Aivelo and Medlar2018; Boon et al. Reference Boon, Van de Broeck, Faye, Volckaert, Mboup, Katja Polman and Huyse2018) and arthropod vectors (Beebe, Reference Beebe2018; Laroche et al. Reference Laroche, Bérenger, Gazelle, Blanchet, Raoult and Parola2018; Nebbak et al. Reference Nebbak, Koumare, Willcox, Berenger, Raoult, Almeras and Parola2018). The contributions emphasize new technologies such as metabarcoding (Aivelo and Medlar, Reference Aivelo and Medlar2018) and MALDI–TOF MS (Laroche et al. Reference Laroche, Bérenger, Gazelle, Blanchet, Raoult and Parola2018; Nebbak et al. Reference Nebbak, Koumare, Willcox, Berenger, Raoult, Almeras and Parola2018), but also target-enrichment capture methods based on DNA hybridization in paleoparasitology (Côté and Le Bailly, Reference Côté and Le Bailly2018), LAMP and NASBA in protistology (Hutchinson and Stevens, Reference Hutchinson, Jamie and Stevens2018), and core genome MLST in bacteriology (Guernier et al. Reference Guernier, Allan and Goarant2018).
Applications of metabarcoding in paleoparasitology were reviewed (Côté and Le Bailly, Reference Côté and Le Bailly2018; Wood, Reference Wood2018), opening new ways to investigate the health of ancient communities of humans, domesticated animals and wildlife.
The limitations of barcoding were illustrated in the case of Schistosoma (Boon et al. Reference Boon, Van de Broeck, Faye, Volckaert, Mboup, Katja Polman and Huyse2018) and mosquitoes (Beebe, Reference Beebe2018). Although the mitochondrial DNA barcode (CO1) is useful for discriminating cryptic/sibling species, its use becomes problematic when incomplete lineage sorting and introgression events lead to indistinguishable CO1 sequences.
Finally, and as emphasized above, there are important ethical and legal issues of barcoding and biobanking. These issues are comprehensively addressed by Lajaunie and Ho (Reference Lajaunie and Ho2018), who provided guidelines for implementing barcode research in parasitology.
ACKNOWLEDGEMENTS
I thank Professor John Ellis for his great help in preparing this special issue.
FINANCIAL SUPPORT
This study is supported by the French ANR FutureHealthSEA (ANR-17-CE35-0003).
CONFLICT OF INTEREST
None.
ETHICAL STANDARDS
Not applicable.