Hostname: page-component-745bb68f8f-d8cs5 Total loading time: 0 Render date: 2025-02-02T21:29:55.378Z Has data issue: false hasContentIssue false

Structured, inbred, and plastic: the genome and population genetics of the weed false cleavers (Galium spurium)

Published online by Cambridge University Press:  30 October 2024

Sara L. Martin*
Affiliation:
Research Scientist, Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, Canada
Elizabeth Sears
Affiliation:
Research Technician, Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, Canada
Tracey James
Affiliation:
Research Technician, Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, Canada
Shaun Sharpe
Affiliation:
Research Scientist, Saskatoon Research and Development Centre, Agriculture and Agri-Food Canada, Saskatoon, Canada
Breanne D. Tidemann
Affiliation:
Research Scientist, Lacombe Research and Development Centre, Agriculture and Agri-Food Canada, Lacombe, Canada
*
Corresponding author: Sara L. Martin; Email: [email protected]
Rights & Permissions [Opens in a new window]

Abstract

False cleavers (Galium spurium L.) is an aggressive weed from the Rubiaceae. Here we assemble a chromosome-scale draft of its genome, laying the foundations for determining the genetic basis of auxinic herbicide resistance and for systematic research into its polyphyletic genus. We use the genome to examine the population genetics of material from the Canadian Prairies and, in concert with a common greenhouse experiment, to examine whether the phenotypic variation observed in the field results primarily from genetic or environmental factors. The genome assembly covers approximately 85% of G. spurium’s expected 360-Mbp genome size, with 94% of BUSCO (Benchmarking Universal Single-Copy Orthologs) genes complete and most single copy (89%). Approximately 37% of the genome is repetitive elements and 35,540 genes were annotated using RNA-seq data, including 100 homologues for genes involved in, or potentially involved in, herbicide resistance. The genome shows strong synteny with other members of the Rubiaceae, including smooth bedstraw (Cruciata laevipes Opiz) and robusta coffee [Coffea canephora (Pierre ex Froehner]. Double-digested RAD-seq data for the 19 populations from the Canadian Prairies indicated that G. spurium has high levels of population structure (FST = 0.54) and inbreeding (FIS = 0.86) with low levels of hetrozygosity (HO = 0.02) and nucleotide diversity (π = 0.0003). Variation in flowering time and seed weight largely overlapped among populations grown in the greenhouse. A redundancy analysis investigating genotype–phenotype associations showed few associations between single-nucleotide polymorphism (SNP) variation and these characteristics. In contrast, the majority of SNPs under selection were associated with mericarp hook density. This suggests that for most traits, environmental variation rather than genetic variation likely underlies phenotypic differences observed in the field. Several genes of interest, including several homologues involved in the assembly of the Skp1-Cullin-F-Box IR1/AFB E3 ubiquitin ligase complex (e.g., CAND1, ECR1), are located in areas of the genome with evidence of selection and are targets for further investigation.

Type
Research Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© Crown Copyright - His Majesty the King in Right of Canada, as represented by the Minister of Agriculture and Agri-Food Canada, 2024. Published by Cambridge University Press on behalf of Weed Science Society of America

Introduction

Rubiaceae is the fourth-largest plant family and contains several economically important crop plants, including one essential for scientific research—coffee (Coffea arabica L.). Galium is the largest and most widespread genus within the tribe Rubieae with 600 to 700 species and a cosmopolitan distribution that primarily centers on temperate regions (Chen and Ehrendorfer Reference Chen and Ehrendorfer2001; Soza and Olmstead Reference Soza and Olmstead2010). Although Galium species have been used as a coffee substitute (Turner and Szczawinski Reference Turner and Szczawinski1978), the main economic impact of the genus is as weeds, with 18 listed as weeds in jurisdictions around the world (Holm et al. Reference Holm, Pancho, Herberger and Plucknett1991).

A particularly problematic species is false cleavers (Galium spurium L.) (2n = 20, 2C = 0.72 pg, 360 Mb), which can cause significant crop losses in cereals, canola (Brassica napus L.), and sugar beet (Beta vulgaris L.) (Malik and Vanden Born Reference Malik and Vanden Born1988). The large variability of G. spurium has resulted in many synonyms and misidentifications with other taxa of Galium, and adding to the confusion and likely to morphological diversity, the species is considered to be a parental linage within the polymorphic polyploid complex with Galium aparine L. (Chen and Ehrendorfer Reference Chen and Ehrendorfer2001). This occasionally results in G. spurium not being recognized in keys for the genus, which exacerbates difficulties in distinguishing the species by limiting the number of locally relevant keys providing for this separation (Gleason and Cronquist Reference Gleason and Cronquist1991; Looman and Best Reference Looman and Best1979; Voss and Reznicek Reference Voss and Reznicek2012). Differentiation of Galium species based on incompletely represented morphology can be extremely difficult. However, G. aparine is found in temperate zones around the world and at higher elevations in the tropics, often in non-agricultural settings such as moist deciduous woods. Galium spurium has a similarly broad distribution, but typically occurs in sunnier, more open, and disturbed environments such as fencerows, roadsides, and waste grounds. It is considered to be the weedier and more aggressive of the two species (Malik and Vanden Born Reference Malik and Vanden Born1988). As the species climbs and then grows over top of crop species, it can cause lodging and difficulties with harvest while reducing yield. A similar seed size and shape also make it a serious contaminant of canola (Hall et al. Reference Hall, Stromme, Horsman and Devine1998). The abundance of the species has rapidly increased across the Canadian Prairies since the 1970s, and it is now one of the top 10 most abundant weeds (Leeson et al. Reference Leeson, Thomas, Hall, Brenzil, Andrews, Brown and Van Acker2005).

The evolution of herbicide resistance has added complications to the control of G. spurium, with resistance to acetolactate synthesis inhibitors (ALS) and synthetic auxins such as quinclorac reported (Beckie et al. Reference Beckie, Warwick, Sauder, Kelln and Lozinski2012; Hall et al. Reference Hall, Stromme, Horsman and Devine1998; Heap Reference Heap2023). The genetic basis of ALS-inhibitor resistance is a point mutation at the target site. However, while the genetic basis of quinclorac resistance has been shown to be controlled by a locus distinct from the ALS point mutation and to be a single, recessive nuclear trait in material from Alberta, the locus involved has not yet been identified (Van Eerd Reference Van Eerd2004; Van Eerd et al. Reference Van Eerd, McLean, Stephenson and Hall2004). The spread of ALS-inhibitor resistance has been documented in the Canadian Prairie herbicide resistance surveys since 2007 (Beckie et al. Reference Beckie, Lozinski, Shirriff and Brenzil2013), but the spread and extent of the auxin-resistant biotype is currently unknown.

Here we assemble a chromosome-level draft for the genome of G. spurium and situate it in its evolutionary context by comparing its genomic structure to that of its nearest relatives with available chromosome-level genome assemblies: crosswort (Cruciata laevipes Opiz), field madder (Sherardia arvensis L.), leptodermis (Leptodermis oblonga Bunge), and robusta coffee (Coffea canephora Pierre ex Froehner). We then use the genome to facilitate investigation of the species’ population biology. Specifically, we examine the diversity and structure of 19 populations collected from Alberta and Saskatchewan, Canada, using reduced genome representation (ddRAD tags) (Peterson et al. Reference Peterson, Weber, Kay, Fisher and Hoekstra2012). We complement this genetic analysis with data from a common garden experiment conducted in the greenhouse to examine whether phenotypic variation observed in the field is the result of genetic or environmental variation. Finally, we examine homologues of genes known or suspected to be involved in the evolution of herbicide resistance and examine their proximity to single-nucleotide polymorphisms (SNPs) showing evidence of selection. This work provides insights into the population genetics and genomics of this tenacious weed species while providing a foundation for future work examining the genetic basis of herbicide resistance and, more broadly, the systematics of this complicated, currently polyphyletic, genus (Yang et al. Reference Yang, Meng, Peng, Nie and Sun2018).

Materials and Methods

Plant Material and Phenotypic Measurements

Seed was collected in bulk from 19 locations in Canada, 11 in Alberta and 8 in Saskatchewan (Figure 1), and sent to the Ottawa Research and Development Centre in Ottawa, ON, Canada for growth and analysis. While initial plans were to use a random “W” sample across fields selected for collection, drought conditions in the year of collection severely limited the availability of G. spurium populations. As a result, the G. spurium populations were collected in bulk with no set protocol, aside from collecting enough seed from the population at each location to allow conductance of various planned experiments with the material. A voucher for the individual sequenced to chromosome-level is available in the DAO herbarium with the bar code: 01-01667380. The map of locations was produced in R (v. 4.1.1 “Kick Things”) (R Core Team 2021) using the packages: maps (Brownrigg et al. Reference Brownrigg, Minka and Deckman2022), rnaturalearth (South Reference South2017), sf (Pebesma Reference Pebesma2018), tidyverse (Wickham et al. Reference Wickham, Averick, Bryan, Chang, McGowan, François, Grolemund, Hayes, Henry, Hester, Kuhn, Pedersen, Miller, Bache and Müller2019), and units (Pebesma et al. Reference Pebesma, Mailund and Heibert2016).

Figure 1. Map of the Canadian Prairie provinces Albert and Saskatchewan indicating locations for sample populations collected for use in this research, with latitude indicated on the left and longitude indicated along the bottom.

Seeds were sown into Pro-Mix® soil (soil, peat, and sand; 1:2:1; Pro-Mix, Rivière-du-Loup, QC, Canada) in 10-cm-diameter pots and placed in the greenhouse with day and night temperatures set to 20 C days and 18 C nights and a 16-h light cycle. In the first experiment, germination rates were tested, with 20 seeds per population placed on potting soil, covered lightly, and then scored for emergence weekly for 1 mo. In the second experiment, a common garden experiment, 19 to 22 plants per population were grown and randomly assigned a location in the greenhouse and were re-randomized 2 wk after planting, but were too large at 4 wk for an additional round of randomization. These plants were grown in a common environment to evaluate their morphological and phenotypic characteristics while minimizing the effect of environmental variation. Plant height was measured using a measuring tape each week for 3 wk starting when the first plants began flowering until the majority of the plants were too large and had to be folded over and tied to stakes. Plant height was measured from the soil surface to the most distal end of the longest branch. Flowering status was evaluated in three categories—vegetative, flowering but no seeds, and seeds forming—and was recorded weekly for the 3 wk over the same period. Flower and seed size was measured using a dissecting scope and digital calipers for three flowers or seeds of each of three individuals from each population. Plants were harvested after flowering by clipping the stems at the soil surface and placing them in a large paper bag. These bags were then placed in a drying oven at 30 C for 7 d before weighing until seed cleaning to keep the material dry. Total plant (Ohaus Voyager Pro Precision, Parsippany, NJ, USA) and total seed weight (Mettler-Toledo AT200, Columbus, OH, USA) as well as three replicates of 100-seed weight were measured. The total seed weight and average 100-seed weight were used to estimate seed production, unless fewer than 300 seeds were produced. In these cases, the seeds were counted exactly. Seed trichome density was scored using a dissecting scope as naked, sparsely hooked, or densely hooked (Figure 2).

Figure 2. The three visual categories scored for hook density on the mericarps of Galium spurium: (A) naked, (B) sparsely hooked, and (C) densely hooked.

Flow cytometry was conducted to verify the DNA content of each individual included in the greenhouse experiment analysis according to the procedure followed in the Martin Laboratory (Martin et al. Reference Martin, Smith, James, Shalabi, Kron and Sauder2017), except that individuals were quickly screened with a single run. Radish (Raphanus sativus L.) was used as the internal standard. The flowPloidy package (Smith et al. Reference Smith, Kron and Martin2018) in R was used to analyze data from the flow cytometer.

Statistical Analyses of Phenotypic Data

The statistical analysis of phenotypic data was conducted with R. Specifically, phenotypic data were evaluated following Box-Cox transformation from the MASS package (Venables and Ripley Reference Venables and Ripley2002) using one-way analyses with the function oneway.test, which uses Welch tests that do not assume equal variances across populations as the fixed variable followed by Tukey’s post hoc test. The proportions of plants germinating, flowering, or with seed at different time points were analyzed with Pearson’s chi-square test for binary variables (chisq.bintest) with a Benjamini and Hochberg correction for the P-values to control family-wise error rates from the package RVAideMemoire (Hervé Reference Hervé2022). Exploration of the data and analyses was aided by functions in the car package (Fox and Weisberg Reference Fox and Weisberg2019). Confidence intervals for proportions were estimated using the score method (Newcombe Reference Newcombe1998). The function mulcompLetters from the multcompView package (Graves et al. Reference Graves, Piepho, Selzer and Dorai-Raj2019) was used to simplify the results of post hoc pairwise comparisons for plotting. Additionally, functions from Hmisc (Harrell Reference Harrell2022) and plotrix (Lemon Reference Lemon2006) were used for visualization.

DNA and RNA Sampling and Sequencing

One individual was chosen for genome assembly, and leaf tissue was collected on dry ice before extraction using the protocol published by Workman et al. (Reference Workman, Renee, Kilburn, Hao, Liu and Timp2018). High-molecular-weight DNA was extracted, using a Nanobind Plant Nuclei Big DNA kit (Circulomics, Baltimore, MD, USA), followed by a Short Read Eliminator Kit (Circulomics), following the manufacturer’s directions. DNA was first quantified with an Invitrogen™ Qubit™ (Thermo Scientific, Waltham, MA, USA) using the dsDNA HS (high-sensitivity) assay according to the manufacturer’s instructions. Following quantification, DNA was then assessed for quality (impurities and fragment size) using a DropSense (Trinean, Pleasanton, CA, USA) and the Agilent TapeStation 4200 (Agilent, Santa Clara, CA, USA) on the Genomic DNA Screentape. Sequencing was completed using 4 Flow Cells on the Oxford Nanopore Technology MinION system (ONT; Oxford Nanopore Technologies, Oxford Science Park, UK). DNA was also sent to Genome Quebec for sequencing of Illumina 150-bp paired-end reads.

To facilitate annotation of the genome, RNA was extracted from rosette leaves. Tissue was first flash frozen in liquid nitrogen and stored at −80˚C until ready for extraction. RNA extraction was performed on 50 mg of rosette tissue using the Qiagen RNeasy Plant Mini Kit (Qiagen, Germantown, MD, USA) according to the manufacturer’s instructions, adding an extra wash step to ensure a thorough flushing of the column. RNA was first quantified using a NanoDrop™ One (Thermo Scientific), then assessed for quality using a DropSense (Trinean) and the Agilent TapeStation 4200 (Agilent) on the RNA Screentape following the manufacturer’s instructions. Once it was confirmed that RNA samples met the required parameters for sequencing, three samples were sent to BGI (BGI, San Jose, CA, USA) for mRNA library preparation and sequencing of 24 million 150-bp paired-ends per sample.

DNA was extracted using the NucleoSpin-96 Plant II kit (Macherey-Nagel, Dueren, Germany) (supplied by D-Mark Biosciences) for all individuals grown in the greenhouse experiment and sent to the University of Laval for ddRAD-tag library creation and to Genome Quebec for sequencing (Peterson et al. Reference Peterson, Weber, Kay, Fisher and Hoekstra2012).

Packages in R were then used to analyze or visualize the data, including functions from ape (Paradis and Schliep Reference Paradis and Schliep2019), apex (Schliep et al. Reference Schliep, Jombart, Kamvar, Archer and Harris2020), Biostrings (Pagès et al. Reference Pagès, Aboyoun, Gentleman and DebRoy2021), geosphere (Hijmans Reference Hijmans2022), ggplot2 (Wickham Reference Wickham2016), graph4lg (Savary et al. Reference Savary, Foltête, Moal, Vuidel and Garnier2020), IRanges (Lawrence et al. Reference Lawrence, Huber, Pagès, Aboyoun, Carlson, Gentleman, Morgan and Carey2013), msa (Bodenhofer et al. Reference Bodenhofer, Bonatesta, Horejs-Kainrath and Hochreiter2015), phangorn (Schliep Reference Schliep2011), plyr (Wickham et al. Reference Wickham, François and Müller2022), Rsamtools (Morgan et al. Reference Morgan, Pagès, Obenchain and Hayden2021), seqinr (Charif and Lobry Reference Charif and Lobry2007), and stringr (Wickham Reference Wickham2022).

Genome Assembly and Annotation

Long-read ONT data (116X coverage) were called with Guppy (v. 5.0) and assembled using CANU (v. 2.2) (Koren et al. Reference Koren, Walenz, Berlin, Miller, Bergman and Phillippy2017). Contigs that represented the chloroplast were removed based on identity scores and length from Mummer (v. 4.0) (Marçais et al. Reference Marçais, Delcher, Phillippy, Coston, Salzberg and Zimin2018). The remaining contigs were checked using Blobtools (v. 1.1.1) to determine whether they represented contaminates (Laetsch and Blaxter Reference Laetsch and Blaxter2017). However, no appreciable contamination was detected. This assembly was then polished three times using the Illumina data (113X coverage) using Pilon (v. 1.23) (Walker et al. Reference Walker, Abeel, Shea, Priest, Abouelliel, Sakthikumar, Cuomo, Zeng, Wortman, Young and Earl2014) after alignment with Burrows-Wheeler Alignment tool (BWA) (0.7.17) (Li and Durbin Reference Li and Durbin2009). The polished assembly was scaffolded into a chromosome-level assembly by Phase Genomics (Seattle, WA, USA) using their proprietary pipeline and chromosomal conformation data (HiC) after processing of DNA with a Proximo Hi-C 2.0 Kit in the Martin Laboratory. Mummer and the Whole-Genome Duplication Integrated analysis tool kit (WGDI v. 0.6.1) (Sun et al. Reference Sun, Jiao, Yang, Shan, Li, Li, Xi, Wang and Liu2022) were used to compare G. spurium’s genome with those of C. laevipes (GCA_963678965.1), S. arvensis (GCA_948330725.1), L. oblonga (Guo et al. Reference Guo, Wang, Zhang and Wang2021), C. canephora (AUK_PRJEB4211_v1), and common milkweed (Asclepias syriaca L.) (GCA_027405835.1). The genome assembly of G. spurium was assessed using QUAST v. 5.1.0rc1 (Gurevich et al. Reference Gurevich, Saveliev, Vyahhi and Tesler2013), BUSCO (5.4.2, with the embryophyte_odb10 dataset of 2,192 genes) (Simao et al. Reference Simao, Waterhouse, Ioannidis, Kriventseva and Zdobnov2015), and Samtools v. 1.9 (Li et al. Reference Li, Handsaker, Wysoker, Fennell, Ruan, Homer, Marth, Abecasis and Durbin2009). MCScanx (Wang et al. Reference Wang, Tang, Debarry, Tan, Li, Wang, Lee, Jin, Marler, Guo, Kissinger and Paterson2012) was used to produce files for analysis with WGDI. The genome assembly is available as National Center for Biotechnology Information (NCBI) BioProject PRJNA1143567, on the International Weed Genomics Consortium online database WeedPedia (https://www.weedgenomics.org/species/galium-spurium), and from the corresponding author on reasonable request.

RNA sequence data were trimmed, cleaned, and filtered using SOAPnuke (v. 2.1.5) (Chen et al. Reference Chen, Chen, Shi, Huang, Zhang, Li, Li, Ye, Yu, Li, Zhang, Wang, Yang, Fang and Chen2018) and aligned to the genome using HISAT2 (v. 2.2.1) (Kim et al. Reference Kim, Paggi, Park, Bennett and Salzberg2019). Ninety-nine percent of the 72 million read pairs mapped to the genome. RepeatModeler (v. 2.0.3) (Smit and Hubley Reference Smit and Hubley2008–2015) and RepeatMasker (v. 4.1.2.p1) (Smit et al. Reference Smit, Hubley and Green2010–2013) were used to generate a masked version of the genome. The BRAKER pipeline (v. 1.9) (Brůna et al. Reference Brůna, Hoff, Lomsadze, Stanke and Borodovsky2021; Hoff et al. Reference Hoff, Lomsadze, Borodovsky and Stanke2019) was used to produce annotations based on this masked genome and the hints files generated by HISAT2. This pipeline relied on AUGUSTUS (Stanke et al. Reference Stanke, Diekhans, Baertsch and Haussler2008), GeneMark-ET (Lomsadze et al. Reference Lomsadze, Burns and Borodovsky2014), and Samtools. This procedure was also followed to add annotations to the genome of L. oblonga using RNA-seq data downloaded from the NCBI’s Sequence Read Archive (SRR9839476) to allow comparative analysis with G. spurium using WGDI and GENESPACE (Lovell Reference Lovell2023; Lovell et al. Reference Lovell, Sreedasyam, Schranz, Wilson, Carlson, Harkess, Emms, Goodstein and Schmutz2022). GENESPACE uses other tools, including OrthoFinder (Emms and Kelly Reference Emms and Kelly2018, Reference Emms and Kelly2019), which constructed the phylogeny of the species included here using the STAG method. Following masking, AUGUSTUS was used to annotate the genomes of C. laevipes, S. arvensis, and A. syriaca, with tomato (Solanum lycopersicum L.) as the model.

Repetitive DNA was annotated using the Extensive de novo TE Annotator (EDTA v. 1.8.3) (Ou et al. Reference Ou, Su, Liao, Chougule, Agda, Hellinga, Lugo, Elliott, Ware, Peterson, Jiang, Hirsch and Hufford2019), which relies on LTRHarvest (Ellinghaus et al. Reference Ellinghaus, Kurtz and Willhoeft2008), LTR_Finder (Xu and Wang Reference Xu and Wang2007), LTR_Retriever (Ou and Jiang Reference Ou and Jiang2018), TIR-Learner (Su et al. Reference Su, Gu and Peterson2019), Generic Repeat Finder (Shi and Liang Reference Shi and Liang2019), HelitronScanner (Xiong et al. Reference Xiong, He, Lai, Dooner and Du2014), and TEsorter (Zhang et al. Reference Zhang, Wang, Ou and Li2019). LTRHarvest, LTR_Finder, and LTR_Retriever were also used to calculate the LTR Assembly Index (LAI) (Ou et al. Reference Ou, Chen and Jiang2018). The genome sequence was also analyzed to detect Helitrons with EAHelitron (Hu et al. Reference Hu, Xu, Wen, Yi, Shen, Ma, Fu, Ouyang and Tu2019).

Genes that could be involved in the evolution of herbicide resistance were identified through homology using blastp, the amino acid sequences of the genes identified using the BRAKER pipeline, and amino acid sequences of the genes of interest downloaded from GenBank or The Arabidopsis Information Resource (TAIR) (Supplementary Table 1). A cutoff of 1-e50 was used to identify the best hits from these searches of the annotated proteins with score considered as a secondary indication to eliminate hits that had much smaller scores (<50%) than the best hit for the reference protein. BLAST was run on the coding sequence for the protein (Supplementary Table 2) to cross-check that the best hits (Supplementary Table 3) with identifications indicated that the protein of interest had been likely been identified.

SNP Calling and Population Genetics Analyses

SNP calling was completed for the RAD-seq data using the reference-based pipeline in STACKS (v. 2.6) (Catchen et al. Reference Catchen, Hohenlohe, Bassham, Amores and Cresko2013) after alignment to the genome using BWA with the parameter (−L) for reducing soft clipping set to 500 (Li and Durbin Reference Li and Durbin2009). Following the creation of the variant call file (vcf) by STACKS, data were further filtered using VCFtools (0.1.16) (Danecek et al. Reference Danecek, Auton, Abecasis, Albers, Banks, DePristo, Handsaker, Lunter, Marth, Sherry, McVean and Durbin2011) using the recommendations of O’Leary et al. (Reference O’Leary, Puritz, Willis, Hollenbeck and Portnoy2018) to create two datasets for further analysis. STACKS initially identified 25K loci covering 6.7 million bp of the genome with 86K variants. The first filtered dataset included only genotypes with a read depth of greater than 5, loci with a mean read depth greater than 15, SNPS with a quality greater than 20, and SNPs with a minimum allele count greater than 3. These data contained just over 40K SNPs for 365 individuals with less than 1% missing and were used for examining genome-wide nucleotide diversity. This 40K set was further filtered for missingness following the O’Leary protocol (O’Leary et al. Reference O’Leary, Puritz, Willis, Hollenbeck and Portnoy2018) and thinned so that SNPs were at least 500 bp apart to reduce linkage. This second dataset contained just under 4K SNPs with less than 1% missing for 337 individuals and was used for examining population structure, the association between genetic and morphological variation, and evidence of selection. VCFtools was also used to calculate nucleotide diversity, with the remaining population summary statistics calculated with hierfstat (Goudet and Jombart Reference Goudet and Jombart2022). A Mantel test (mantel.randtest::ade4) (Thioulouse et al. Reference Thioulouse, Dray, Dufour, Siberchicot, Jombart and Pavoine2018) was used to determine whether there was a relationship between genetic and geographic distance, and redundancy analysis (vegan::rda) (Oksanen et al. Reference Oksanen, Simpson, Blanchet, Kindt, Legendre, Minchin, O’Hara, Solymos, Stevens, Szoecs, Wagner, Barbour, Bedward, Bolker and Borcard2022) was used to determine whether the morphological variation observed in the greenhouse was associated with variation within the SNPs. The R package pcadapt (v. 4.3.5) was used to examine linkage disequilibrium and to look for evidence of SNPs in regions under selection with the final analysis using K = 5, linkage disequilibrium parameters set to size = 200 and threshold = 0.1, and a minimum minor allelic frequency of 0.02 (Luu et al. Reference Luu, Bazin and Blum2017; Privé et al. Reference Privé, Luu, Vilhjalmsson and Blum2020). This method examines association between population structure and genetic variants with the assumption that outlying variants are indicative of local adaptation (Privé et al. Reference Privé, Luu, Vilhjalmsson and Blum2020). Tools for population genetics and genomics in R were used to analyze and visualize the SNP data, including functions from the following packages: adegenet (Jombart Reference Jombart2008), circlize (Gu et al. Reference Gu, Gu, Eils, Schlesner and Brors2014), colorspace (Zeileis et al. Reference Zeileis, Fisher, Hornik, Ihaka, McWhite, Murrell, Stauffer and Wilke2020), dartR (Mijangos et al. Reference Mijangos, Gruber, Berry, Pacioni and Georges2022), Gviz (Hahne and Ivanek Reference Hahne, Ivanek, Mathe and Davis2016), gwscaR (Flanagan Reference Flanagan2023), magrittr (Bache and Wickham Reference Bache and Wickham2022), pegas (v. 1.1) (Paradis Reference Paradis2010), phytools (v. 1.2-0) (Revell Reference Revell2012), png (Urbanek Reference Urbanek2022), poppr (Kamvar et al. Reference Kamvar, Tabima and Grünwald2014), pinfsc50 (Knaus and Grünwald Reference Knaus and Grünwald2016), psych (Revelle Reference Revelle2023), qvalue (Storey et al. Reference Storey, Bass, Dabney and Robinson2022), VariantAnnotation (v. 1.42.1), and vcfR (Knaus and Grünwald Reference Knaus and Grünwald2016).

ChatGPT

In line with suggestions on how to improve time use efficiency in research (Pividori Reference Pividori2024) ChatGPT version 4 (OpenAI 2024) was used to reformat the references from EndNote Style “Weed Science J” after adding a DOI field for journal articles using the following prompt “I want you to pretend you are an editor at a scientific journal and reformat these references using these 9 criteria. 1. Journal names should not be in italics. 2. Author names should be in plain text. 3. DOIs should not be hyperlinks. 4. URLs should not be hyperlinks. 5. Plant scientific names should be in italics. 6. Plant names in English should not be in italics. 7. References should not be numbered. 8. Only proper names should be capitalized in the reference titles. 9. The journal names should be abbreviated based on the abbreviations listed at this website https://woodward.library.ubc.ca/woodward/research-help/journal-abbreviations/.” References were then checked and corrections made where necessary.

Results and Discussion

With the evolution of herbicide resistance, it is becoming increasingly important to be able to correctly identify weed species, comprehend their biology well enough to enable integrated management techniques, and understand the genetic basis of herbicide resistance so that individuals with these resistances can be identified. As a result, having the genetic tools to accomplish these goals is key. Galium spurium is a weedy species that grows up and over crops, causes lodging, reduces yield quantity and quality, and creates difficulties at harvest (Hall et al. Reference Hall, Stromme, Horsman and Devine1998). It has developed herbicide resistance and belongs to a large genus of difficult to identify species. Here we produce a chromosome-level assembly of G. spurium, annotate it with RNA-seq data, and then use the genome in investigations of population genetics and to examine patterns of selection across the genome. This represents a foundational step toward identifying the genetic basis of auxinic herbicide resistance and clarifying evolutionary relationships within Galium.

Genome Assembly and Annotation

The flow cytometry estimates for 2C DNA content for G. spurium accessions grown in the greenhouse ranged from 0.71 to 0.74 pg with an average of 0.72 pg (SD ±0.02) and indicated a haploid genome size of approximately 360 Mbp. The chromosome-level draft sequence produced for Galium spurium covers 85% this expected genome size and includes 94% of the expected core eukaryotic genes. The initial assembly of the G. spurium’s genome from the long-read ONT data (116X coverage) using CANU resulted in a draft assembly with 290 contigs with an NG50, the sequence length of the shortest contig at half the expected genome size when combined with the larger contigs, of 10.7 Mb and total assembly size of 318.5 Mb. A total of 128 of these contigs were placed on 10 scaffolds by Phase Genomics using Hi-C data, resulting in a chromosome-level assembly (Table 1). After the assembly was polished with Pilon, analysis with BUSCO indicated 94% of the expected genes were complete, with the vast majority (89%) represented in a single copy and a minority (5%) found to be duplicated. Just under 150 million RNA data reads were used to annotate the genome, with 98% of these aligning to the genome. The BRAKER pipeline identified and annotated 35,549 genes based on this data, with 32,269 genes annotated using the same pipeline and downloaded data for L. oblonga. Repeat annotation using EDTA indicated that 37% of the genome was composed of repetitive elements, with the largest proportion of these identified as class 1 retrotransposons of the “Gypsy” type (11.5%) (Figure 3). Helitron density was assessed by EAHelitron as 6.7, and the LAI was 18.27, indicating that the genome meets the level of completeness for a reference-level genome (Ou et al. Reference Ou, Chen and Jiang2018).

Table 1. Genome assembly statistics for Galium spurium.

a Number of base pairs represented with N rather than a sequenced nucleotide usually as a result of Ns placed between contigs during scaffolding.

Figure 3. Circle plot of the Galium spurium genome with (A) gene density, (B) single-nucleotide polymorphism (SNP) density, (C) heterozygosity, (D) alternative allelic frequency, and (E) long terminal repeat frequency. An indication of chromosome length is plotted along the outside edge.

Comparative Genomics

The genome of G. spurium has 10 chromosomes, but for the genus, other genera in the Rubiaceae, and in sister families such as the Apocynaceae, the more common base chromosome number is n = 11. Rubiaceae is divided into three subfamilies: Rubioideae, which includes Galium; Ixoroideae, which includes Coffea; and Cinchonoideae, which includes Cinchona (sources of quinine) (Bremer Reference Bremer2009; Stevens Reference Stevens2017). Galium, Cruciata, and Sherardia are all within the tribe Rubieae (Rubioideae), while Leptodermis is in a different tribe—the Paederieae (Rubioideae). As a result, C. laevipes (n = 11) and S. arvensis (n = 11) are the currently available chromosome-level assemblies that are most closely related to G. spurium with this chromosome number followed by L. oblonga (n = 11), and then more distantly by C. canephora (n = 11) and A. syriaca (n = 11), which is an out group from the Apocynaceae (Stevens Reference Stevens2017; Figure 4).

Figure 4. A phylogenetic tree indicating the evolutionary relationships among species studied here produced by OrthoFinder. The values at the nodes indicate support calculated by OrthoFinder using the Species Tree Inference from All Genes (STAG) method and indicates the proportion of orthogroup trees that had the bipartition. Galium, Cruciata, and Sherardia are all within the tribe Rubieae (Rubioideae), while Leptodermis is in the tribe Paederieae (Rubioideae), and Coffea canephora is in a different subfamily, Ixoroideae. Asclepias syriaca is an outgroup from the Apocynaceae.

The genome of G. spurium shows strong synteny with the genomes of the other species from the Rubiaceae (Figures 5 and 6), including C. canephora, despite the divergence of their genera approximately 68 MYA (Kumar et al. Reference Kumar, Suleski, Craig, Kasprowicz, Sanderford, Li, Stecher and Hedges2022) (Figure 6B). Three chromosomes are largely syntenic when the genomes of C. laevipes and G. spurium are compared; G. spurium’s chromosomes 6, 7, and 10 show strong synteny with C. laevipes’s chromosomes 5, 7, and 9, respectively. However, there is no clearly identifiable mechanism, neither end-to-end joining nor nested chromosome fusion (Lysak Reference Lysak2022), that contributed to chromosome number reduction in G. spurium (Figure 6A). Rather, the genes from the chromosomes of this closest assembled relative are distributed among multiple chromosomes in G. spurium. For example, genes from C. laevipes’s chromosome 3 (cl03) are distributed among five of G. spurium’s chromosomes with multiple reciprocal translocations required to explain the current structure. This indicates that although the genome has strong synteny across the entirety of some chromosomes (e.g., cl05 [or cc01] and gs06), more information is needed to reconstruct the evolution of G. spurium’s current chromosome structure. Comparison with other genomes in Galium or Asperula would likely be informative for understanding chromosome evolution of G. spurium.

Figure 5. A riparian plot of syntenic relationships among species from the Rubiaceae generated by GENESPACE showing genomic structure across the family, with Coffea canephora at the bottom and working upward: Leptodermis oblonga, Sherardia arvensis, Cruciata laevipes, and finally Galium spurium at the top of the plot. Asterisks indicate that the chromosome was inverted to clarify the plot. Note that C. canephora’s cc09 has been shifted to better indicate its position within the chromosomes of the Rubiaceae species.

Figure 6. Plots of synteny between proteins of Galium spurium and (A) Cruciata laevipes and (B) Coffea canephora. Red dots indicate the best syntenic matches and blue dots indicate secondary matches, with gray dots indicating lower-order matches. For G. spurium and C. laevipes, chromosomes are colored based on syntenic relationships with C. canephora. We see from these plots that several chromosomes are almost entirely syntenic (e.g., gs06, gs07, and gs10), but there is no clear pattern suggesting that end-to-end joining or nested chromosome fusion led to the reduction in chromosome number in G. spurium from n = 11 to n = 10.

Population Genetics

The overall nucleotide diversity (π), the average number of pairwise differences between individuals, averaged 3 × 10−4. Across all populations, 83% of individuals were homozygous for the major allele, and 16% were homozygous for the minor allele (Figure 7A). Populations had low observed heterozygosity (HO) with a value of 0.02 overall (Figure 7B). The FST for the populations was 0.54, with pairwise values ranging from 0.15 to 0.79 (Figure 7C), indicating strong population differentiation (Conner and Hartl Reference Conner and Hartl2004), and there were many more homozygotes (FIS = 0.86) in the population than expected given random mating. Overall, Provesti’s genetic distance between individuals averaged 0.23. The number of alleles that were identical by state and shared between two individuals within a population averaged 88%, and the values were only a little lower (75%) when individuals from different populations were compared. This level of genetic diversity and homozygosity is most similar to that of the germplasm of genetically depauperate crop species. For example, nucleotide diversity was found to be π = 2.3 × 10−4 for 736 accessions of coffee (C. arabica) (compare with lower-ploidy ancestors: C. canephora π = 2.6 × 10−3 and Coffea eugenioides S. Moore π = 1.1 × 10−3), major allelic frequencies that were largely greater than 95% and a deficiency of heterozygotes—likely a result of reductions in diversity following an origin involving a single allopolyploidization event (Scalabrin et al. Reference Scalabrin, Toniutti, Di Gaspero, Scaglione, Magris, Vidotto, Pinosio, Cattonaro, Magni, Jurman, Cerutti, Suggi Liverani, Navarini, Del Terra and Pellegrino2020). The level of heterozygosity in G. spurium is similar to that of 376 tea accessions [Camellia sinensis (L.) Kuntze] from the Rwebitaba Tea Research Centre that had an HO of 0.06 (Tadeo et al. Reference Tadeo, Ronald, Vereriano and Robooni2024) and 31 accessions of peanut (Arachis hypogaea L.) from Taiwan that had major allelic frequencies of 0.87 and an average genetic distance of 0.17. This suggests a high rate of self-fertilization, inbreeding, low gene flow among populations, and low genetic diversity.

Figure 7. For the 19 populations studied here (A) overall proportion of individuals showing homozygosity for the major allele (blue), heterozygosity (purple), and homozygosity (magenta) for the minor allele; (B) levels of heterozygosity by population; and (C) a heat map of pairwise FST values for with cooler colors indicating lower FST values and warmer indicating higher FST values. Heterozygosity was low across all populations with the majority of individuals (83%) homozygous for the reference allele with evidence of limited outcrossing. On average, the FST values were 0.54, but AB-08 showed lower pairwise FST values across comparisons and SK-05 showed higher values.

This high level of population structure and inbreeding has likely contributed to the relatively slow spread of ALS-inhibitor and quinclorac resistance across the Canadian Prairies. For example, the rate of spread of ALS-inhibitor, glyphosate, or auxinic herbicide resistance in the wind-pollinated, outcrossing species kochia [Bassia scoparia (L.) A. J. Scott] with low population structure (FST = 0.01) (Martin et al. Reference Martin, Benedict, Wei, Sauder, Beckie and Hall2020) has been rapid, increasing from 4% of sampled sites (Hall et al. Reference Hall, Beckie, Low, Shirriff, Blackshaw, Kimmel and Neeser2014) to 78% of sites in 10 yr (Geddes et al. Reference Geddes, Pittman, Hall, Topinka, Sharpe, Leeson and Beckie2023). In contrast, ALS inhibitor–resistant Galium was first documented in 1996 in Alberta, Canada (Hall et al. Reference Hall, Stromme, Horsman and Devine1998); field surveys from 2007 to 2011 and then 2012 to 2017 found increases in Alberta from 17% to 44%, but lower, more constant levels in Saskatchewan (20%) and Manitoba (17%) (Beckie et al. Reference Beckie, Lozinski, Shirriff and Brenzil2013, Reference Beckie, Shirriff, Leeson, Hall, Harker, Dokken-Bouchard and Brenzil2020). The structured nature of these populations is also reflected in (1) a significant Mantel test indicating a correlation (P < 0.001) between geographic distance and Provesti’s genetic distance (dist.genpop::adegenet; Figure 8), (2) 55% of the variation being explained by population (44% by individual) in an analysis of molecular variance, and (3) the clumping of populations as visualized in the principal component analysis of molecular variation (Figure 9). The low genetic diversity of these populations may also limit the potential of the populations to evolve resistances in response to selection from herbicides without recruiting additional variation from processes such as mutation, hybridization, and polyploidization.

Figure 8. Provesti’s genetic distance (y axis) by geographic (landscape) distance (m) for pairings of the 19 populations. This graph (graph41g::scatter_dist) shows a smooth (loess) line with 95% confidence interval. A Mantel test with 1,000 Monte Carlo repetitions indicates a significant (P < 0.001) positive relationship providing evidence of isolation by distance.

Figure 9. Principal component analysis (PCA) of the genetic variation of single-nucleotide polymorphism (SNP) data. Circles in warm colors represent individuals from Alberta and triangles in cool colors represent those from Saskatchewan. Clumping of individuals from the same population is evident, but variation among populations provides spread. Axis1 explains 76% of the variation, while axis 2 explains 44%.

Phenotype

A striking aspect of G. spurium populations in the field is the level of morphological variation observed. In the common greenhouse environment, while the characteristics we measured showed significant variation, the majority of the populations were statistically equivalent (Figures 10 and 11). A total of 384 plants were included in the common garden greenhouse experiment, with 19 to 22 individuals representing each of the 19 populations. All accessions had the morphological characteristics expected for G. spurium, specifically green flowers and small seeds (mericarps), rather than the expected morphological characteristics for G. aparine of white flowers and larger mericarps. Some individuals from one accession were shorter, with little distance between internodes and narrower leaves giving them a shorter “mossy” habit; however, the flowers, mericarps, and DNA content were all consistent with the other individuals grown here, and other members of the same accession had more typical morphology, suggesting this is part of the variation for the species.

Figure 10. Phenological data from the 19 populations grown in the greenhouse, including (A) proportion of seed germinated after 7 d, (B) proportion flowering at 6 wk, and (C) proportion of plants with seed at 8 wk. In all cases, statistical differences are observed, but the majority of populations are highly variable and show strong overlap with values from the other populations.

Figure 11. Morphological data from the 19 populations grown in the greenhouse for key characteristics, including (A) plant height at 6 wk, (B) total estimated seed count, and (C) seed (mericarp) diameter.

In the germination experiment, 51% of the seeds (20 per population) on average had emerged after 7 d (Figure 10A). Variation was seen by population, with AB-06 showing faster germination (90%) than 10 other populations including the slowest to germinate SK-08 (15%), but most populations showed overlapping confidence intervals and were not statistically different because of high variability. At 6 wk after planting, an average of 43% of the plants had begun flowering (Figure 10B). Again, there was statistically significant variation among populations, with AB-02 and AB-06 having significantly more flowering individuals than 13 populations, including AB-09, AB-10, or SK-08, but most of the populations were not statistically different. Similarly, after 8 wk, at the end of the greenhouse experiment, the majority of populations had a high proportion of plants that had set seed (79%), and while population AB-10 had statistically fewer plants with seed than most other populations, most populations were not statistically distinct (Figure 10C).

The morphological characteristics measured showed less variation than the phenological characteristics, but they still showed a few statistical differences overall. The average height of the plants at 8 wk was 107.2 cm, with the tallest population, AB-04, averaging 132.0 cm, and the shortest, AB-10, averaging less than half of this at 53.8 cm (Figure 11A). Total estimated seed production averaged 3,158, with AB-01 and SK-01 producing the most seed (more than 3,500 seeds each), and AB-10 producing the least with half of this amount (1,720; Figure 11B). Average diameter of a mericarp was 1.8 mm, with the populations with the largest mericarps, such as AB-04 (2.04 mm), statistically different from the populations with the smallest mericarps SK-06 (1.54 mm); (Figure 11C). However, as with germination, flowering, and seed production, the majority of populations were not statistically different from each other for height, total seed production, or seed size. Flower size did not differ statistically across populations and averaged 1.9 mm (0.18 mm standard deviation).

A redundancy analysis (RDA) was used to investigate the association between genotype and the phenotype observed in the greenhouse. Five characteristics measured in the greenhouse were retained for the analysis after eliminating continuous variables that had strong correlation to each other and factors that had too few individuals in each category. The retained characteristics were height and flowering status at 4 wk, total plant weight, seed count, and the density of hooks on the mericarp. The RDA indicated these predictors explained (R2) 5.5% of the variation in the data after adjustment, and the model was significant according to a permutation test (1,000; P < 0.001). A total of 34 candidate SNPs with RDA loadings 3 SDs or more from the center of their distributions were examined. This minimized false-positive rates for SNPs of interest by setting the two-tailed P-value at 0.0027. Of these 34 SNPs, the majority (24) were associated with the density of hooks of the mericarp (Figures 2 and 12). Among the remaining SNPs with high loadings, nine were associated with total plant weight at harvest and one with early flowering time. This indicates that the majority of these SNPs are likely to be under selection as a result of association with the density of hooks on their mericarps. Individuals that were scored as without hooks or with a low density of hooks occurred sporadically across the populations, but were most frequent in populations SK06, SK07, and SK08 (Figures 1 and 2). Variation in the density of spines on the mericarps of G. spurium, has been previously described and used to separate individuals into two forms: G. spurium f. spurium with smooth fruit, and G. spurium f. vaillantii with hooked spines (Moore Reference Moore1975). Moore (Reference Moore1975) indicates that Linnaeus’s specimen of the species was the smooth form, but that the smooth form was less common in the material examined from the Canadian Prairies. Further, he noted that one of the two sites with smooth-form specimens was near Melfort, SK, close to where our populations with the most smooth-fruited individuals were collected. Malik and Vanden Born (Reference Malik and Vanden Born1988) noted that they observed an intermediate variant in material from Alberta and Saskatchewan and that the presence of more than one form in a field was common. The presence of the hooked spines on the mericarp are inferred to provide an advantage for dispersal by animals (Malik and Vanden Born Reference Malik and Vanden Born1988). More work is needed to understand the genetic basis of hooks on these fruits and how the character influences the fitness of these populations, but the species may offer an excellent opportunity for this work, as the development of nearly isogenic lines that differ for this character should be possible.

Figure 12. Redundancy analysis (RDA) examining how variation in the single-nucleotide polymorphism (SNP) data is associated with variation in key traits. A total of 5.5% of the variation in the SNP data was explained by the morphological characteristics included. The majority of highly loading SNPs (70%), with a cutoff of 3 SDs from the mean, were associated with hook density and are highlighted in red. Most of the remaining SNPs are associated with total seed weight (26%) and highlighted in purple, while the one SNP associated with flowering time is in yellow.

The data from the common environment, combined with the result of the population genetics presented earlier, suggest that some of the disjunct morphological variation observed within populations is likely the result of a combination of the inbred nature of groups of individuals observed within the same field and environmentally induced variation. Differences among individuals in a field, such as a lack of hooks on the mericarp, likely reflects homozygosity for an alternative allele. However, given the minimal amount of genetic variation observed overall and the strong genetic similarity among samples even from different populations, much of the variation in characteristics such as emergence and flowering time is likely attributable to environmental variation, given differing conditions rather than genetic differences.

Potential and Selection Herbicide-Resistance Genes

Management of G. spurium in western Canadian crops has been primarily reliant on herbicides from the following modes of action: ALS inhibitors (Group 2), synthetic auxins (Group 4), 5- enolpyruvylshikimate-3-phosphate synthase (EPSPS) inhibitors (Group 9), glutamine synthetase inhibitors (Group 10), and more recently, protoporphyrinogen oxidase (PPO) inhibitors (Group 14) and 4-hydroxyphenylpyruvate dioxygenase (HPPD) inhibitors (Group 27) (Anonymous 2022). In some crops, such as some pulse crops, the ALS-inhibiting herbicides are nearly the only option. Resistance to ALS inhibitors has spread through the populations of G. spurium on the Canadian Prairies, although it has not yet reached fixation (Beckie et al. Reference Beckie, Shirriff, Leeson, Hall, Harker, Dokken-Bouchard and Brenzil2020). The ALS proteins from G. aparine and C. arabica mapped to G. spurium’s chromosome gs06. The amino acid sequence captured in this genome assembly has one of the point mutations, Trp-574-Leu, reported to confer ALS-inhibitor resistance in G. aparine by Deng et al. (Reference Deng, Di, Cai, Chen and Yuan2019). No changes are observed at residue Pro-197 or Asp-376, which are the two other sites Deng et al. (Reference Deng, Di, Cai, Chen and Yuan2019) reported to confer ALS-inhibitor resistance in G. aparine. Two other changes in the amino acid sequence are present compared with the susceptible sequence in GenBank (Sun et al. Reference Sun, Wang, Zhang, Liu and Bian2011): Thr-59-Ala and Ala-380-Thr, but these sites do not correspond to any of the other sites (Ala-122, Ala-205, Arg-377, Ser-653, or Gly-654) reported to confer resistance in other weed species (Tranel et al. Reference Tranel, Wright and Heap2023). We used the R package PCAdapt to detect SNPs under selection and compared them with the location of the genes of interest (Luu et al. Reference Luu, Bazin and Blum2017; Privé et al. Reference Privé, Luu, Vilhjalmsson and Blum2020). Five principal components were chosen based on the scree plot, and alpha was set to 0.01 for a false discovery rate of 1:100 after correction with the Benjamini-Hochberg procedure. The SNPs were thinned to reduce the effect of linkage disequilibrium, with size set to 200, threshold set to 0.1, and the minimum minor allelic frequency set to 2%. A total of 192 SNPs were identified as outliers, and their position was compared with the position of the genes of interest for involvement in herbicide resistance (Figure 13; Table 2; Supplementary Table 4). Two outlying SNPs were found within 0.5 Mbp of the ALS gene on gs06, a potential indication of the selection on the locus that is occurring with the ongoing spread of the ALS-inhibitor resistance.

Figure 13. Manhattan plots of almost 4,000 single-nucleotide polymorphisms (SNPs) across the genome, with 192 SNPs showing evidence of selection after the Benjamini-Hochberg procedure was used to adjust significance value to control false discovery, the rate was set to 1%, and SNPs were thinned to reduce the effects of linkage disequilibrium, highlighted in red. The green bar at the top of each chromosome plot indicates gene density, with darker areas annotated with more genes. The locations of genes with proteins that are homologues to known or potential contributors to herbicide resistance are shown on the x-axes.

Table 2. Top 20 homologues of genes with known or potential roles in herbicide resistance ranked by proximity to single-nucleotide polymorphisms (SNPs) under selection. a

a Columns contain information on the homologue, the gene’s position in the genome, the distance to the nearest SNP indicating selection, and the number of these SNPs within a 1-Mbp window centered on the gene’s midpoint. See Supplementary Table 4 for all results. SNPs were identified as under selection using ordination with almost 4,000 SNPs across the genome included. The Benjamini-Hochberg procedure was used to adjust significance value to control false discovery, which was set to 1%. SNPs were thinned to reduce the effects of linkage disequilibrium.

In contrast to ALS-inhibitor resistance, resistance to quinclorac, a synthetic auxin, has not yet become widespread in these populations (Beckie et al. Reference Beckie, Shirriff, Leeson, Hall, Harker, Dokken-Bouchard and Brenzil2020; Van Eerd Reference Van Eerd2004; Van Eerd et al. Reference Van Eerd, McLean, Stephenson and Hall2004), although surveying has been limited. There has been a lower global incidence of synthetic auxin herbicide resistance compared with ALS-inhibitor resistance (Heap Reference Heap2023). This is likely a consequence of the importance of auxin, which should be viewed not as a hormone, but as an ancient, complex, connective, impetus signal with self-organizing transport streams of indole-3-acetic acid (IAA) (Zažímalová et al. Reference Zažímalová, Petrášek, Benková, Mueller-Roeber and Balazadeh2014). The target site of synthetic auxins such as quinclorac is still being elucidated, with the complexity, flexibility, and redundancy of the auxin pathway complicating our understanding of both its normal and abnormal functioning (Todd et al. Reference Todd, Figueiredo, Morran, Soni, Preston, Kubes, Napier and Gaines2020; Zažímalová et al. Reference Zažímalová, Murphy, Yang, Hoyerová and Hosek2010). Synthetic auxins in susceptible plants are highly stable IAA mimics that lead to an “auxin overdose” (Grossmann Reference Grossmann2010). The cause of demise in G. spurium is suspected to be the production and accumulation of reactive oxygen species, including hydrogen peroxide, and resistance, controlled by a recessive nuclear gene, has been hypothesized to be the result of a mutation at a target site or along the signal transduction pathway (Van Eerd et al. Reference Van Eerd, Stephenson, Kwiatkowski, Grossmann and Hall2005).

One critical auxin receptor is TIR1, an F-box protein found in the nucleus (Dharmasiri et al. Reference Dharmasiri, Dharmasiri and Estelle2005a; Kepinski and Leyser Reference Kepinski and Leyser2005). Auxins regulates gene expression by acting as “molecular glue,” filling the cavity between TIR1 and the substrate by forming a continuous hydrophobic core (Tan et al. Reference Tan, Calderon-Villalobos, Sharon, Zheng, Robinson, Estelle and Zheng2007). Multiple homologues of TIR1 have been identified (Dharmasiri et al. Reference Dharmasiri, Dharmasiri, Weijers, Lechner, Yamada, Hobbie, Ehrismann, Jürgens and Estelle2005b) and Arabidopsis mutants for auxin signaling F-box protein 5 (AFB5) have demonstrated differential resistance to picolinic acids (picloram, clopyralid) but no other synthetic auxins (Walsh et al. Reference Walsh, Neal, Merlo, Honma, Hicks, Wolff, Matsumura and Davies2006). TIR1/AFB is the substrate recognition subunit of SCFTIR1/AFB, an E3 ubiquitin ligase complex that catalyzes the conjugation of ubiquitin to Aux/IAA repressor proteins, which then is targeted by the 26S proteasome for degradation (Quint and Gray Reference Quint and Gray2006). Once the AUX/IAA repressor proteins are degraded, auxin response factor (ARF) transcription factors can bind to auxin-responsive genes in the DNA (Li et al. Reference Li, Xie, Hu and Zhang2016).

The herbicide 2,4-D was found to similarly influence the binding pocket of TIR1 as IAA, which promotes SCF ubiquitin ligase–catalyzed degradation of AUX/IAA (Tan et al. Reference Tan, Calderon-Villalobos, Sharon, Zheng, Robinson, Estelle and Zheng2007). Transcription of acetyl-CoA carboxylase ACC synthase (ACS) genes are upregulated with auxins, shown with an antisense construct in transgenic tomato (Grossmann and Schmülling Reference Grossmann and Schmülling1995). In G. aparine, this was shown to promote an upregulation of NCED gene expression (Kraft et al. Reference Kraft, Kuglitsch, Kwiatkowski, Frank and Grossmann2007), ACC synthase activity, ACC, and ethylene production, where ethylene also upregulated abscisic acid ABA levels (Hansen and Grossmann Reference Hansen and Grossmann2000). It has been suggested that the ethylene-induced increase in ABA promotes growth inhibition and senescence (Grossmann Reference Grossmann2010; Grossmann and Hansen Reference Grossmann and Hansen2001). In peas (Pisum sativum L.) treated with 2,4-D, it was shown that reactive oxygen species accumulated, leading to protein and lipid membrane damage (Pazmiño et al. Reference Pazmiño, Rodriguez-Serrano, Romero-Puertas, Archilla-Ruiz, Del Río and Sandalio2011). The authors also found that there was differential tolerance based on staging, whereby mature tissue was more tolerant than reproductively immature tissue.

In their recent review, Todd et al. (Reference Todd, Figueiredo, Morran, Soni, Preston, Kubes, Napier and Gaines2020) suggest that target-site mutations in auxin perception and signaling could involve three major groups of auxin signaling proteins: (1) auxin efflux and influx transporters (PIN, ABCB, AUX/LAX); (2) auxin perception and signaling proteins: ARFs, transcriptional repressors (AUX/IAA), and the Skp1-Cullin-F-Box TIR1/AFB ubiquitin ligase complex (AXR1, ECR1, RCF1, HSP90/SGT1, RB, NEDD8, CAND1, COP9, CUL1); and (3) the transmembrane kinase family (TMK). The ARF proteins compete for promoter target sites of auxin response genes and dimerize with AUX/IAA repressor proteins at low auxin concentrations, while when auxin concentrations are high, AUX/IAA repressor proteins interact with TIR1/AFB and the AUX/IAA proteins become ubiquitinated and ultimately degraded via the 26S proteasome pathway (reviewed in Luo et al. Reference Luo, Zhou and Zhang2018). In Arabidopsis mutants, loss of TMK activity affected auxin signal transduction in root and potentially shoot development, cell expansion in roots, hypocotyls, and stamen filaments, and leaf cell expansion and cell proliferation (Dai et al. Reference Dai, Wang, Patterson and Bleecker2013). Additionally, a mutation in the auxin receptor auxin binding protein 1 (ABP1) was identified in Group 4, synthetic auxin–resistant wild mustard (Sinapis arvensis L.) from Manitoba (Grossmann Reference Grossmann2010; Zheng and Hall Reference Zheng and Hall2001).

Multiple homologues for these auxin efflux and influx transports were identified in the G. spurium genome: 4 of the PIN-FORMED auxin efflux carrier protein (PIN1), 3 of the AUXIN1/LIKE-AUX1 (AUX/LAX) auxin influx carrier proteins, and 13 of the ATP-binding cassette B (ABCB) proteins (Supplementary Table 5). Nine proteins with homology to the transcriptional repressors (IAA), nine with homology to ARFs, and six with homology to both were also found (IAA/ARF). Five homologues of TIR1/AFB were found and at least one homologue of most of the genes involved in the formation of the Skp1-Cullin-F-Box TIR1/AFB E3 ubiquitin ligase complex were located, as was one homologue of ABP1 (Supplementary Table 5). Four homologues of TMK were also located. We note that the annotation of the genome will not be complete, as there will be genes that were not expressed in the rosette at the stage or which have not been annotated correctly by the pipeline. Additional RNA work could improve these annotations and may indicate additional homologues.

Fifteen genes associated with synthetic auxin resistance (Group 4) (Table 2; Figure 13) are among the top 20 gene homologues with known or potential roles in herbicide resistance that contain SNPs identified as likely under selection within 1 Mbp. This includes nine genes associated with auxin perception: (1) five outlying SNPs of a AUX/IAA repressor protein homologue on gs06; (2) four SNPs each near CAND1 and ERC1 (components for Skp1-Cullin-F-Box TIR1/AFB E3 ubiquitin ligase complex) on gs02 and gs05, respectively; (3) three SNPs close to an ARF homologue on gs08; (4) one SNP near ARF on gs02; (5) one SNP near IAA/ARF on gs08; (6) one SNP near TIR/AFB on gs06; and (7) one SNP near AUX/IAA on gs01 (Table 2). For auxin transport genes, five homologues were identified: (1) four SNPs were found close to the auxin influx transporter (AUX/LAX) gene on gs08, (2) two SNPs were found close to AUX/LAX on gs04, (3) two SNPs were located near ABCB on gs06, (4) one SNP was found near a PIN homologue on gs01, and (5) another SNP was found near a second PIN homologue on gs01. Finally, two SNPs were located near the TMK gene on gs08 and two SNPs were located near ABP1 on gs06 (Table 2). SNPs showing evidence of selection are not necessarily contained in a gene under selection, but rather can be linked to nearby genes under selection. This means that while this analysis indicates regions of the genome where selection is occurring, the region contains numerous genes, making conclusive determinations that selection is occurring on a specific gene impossible, and we cannot determine what is causing that selection. More specifically, an SNP showing strong selection near known herbicide-resistant genes could be indicative of selection on that gene or on other nearby genes and may or may not be related to the evolution of herbicide resistance. As a result, while these SNPs cannot be conclusively attributed to selection on these genes, they can motivate further more targeted research, such as differential expression analysis in herbicide-resistant and herbicide-susceptible individuals. This chromosome-level assembly provides a fundamental resource for this analysis.

One or two homologues of the genes targeted by the remaining herbicides used to control G. spurium (EPSPS inhibitors, S-glutamine synthetase inhibitors, PPO inhibitors, and HPPD inhibitors; Anonymous 2022) were also found in the assembly. The predicted EPSPS protein was found as a single copy on gs04 and, as expected from a lack of reported resistance, did not show evidence of carrying a point mutation at Thr-102, Ala-103, or Pro-106 (Gaines and Heap Reference Gaines and Heap2023). A single homologue of protoporphyrinogen oxidase 2 (PPO2) was identified on gs02. The amino acids where changes are reported to potentially confer PPO2 resistance (Asn-98, Arg-128, Gly-383, Leu-385, Leu-398, Gly-399, Tyr-400, Leu-401, and Phe-421) are all conserved in this genome sequence (Rangani et al. Reference Rangani, Salas-Perez, Aponte, Knapp, Craig, Mietzner, Langaro, Noguera, Porri and Roma-Burgos2019). Two glutamine synthetase homologues were identified on gs03 and gs10, each with a single candidate SNP within 1 Mbp of their positions. A single homologue of HPPD was identified on gs02; however, while Amaranthus spp. and wild radish (Raphanus raphanistrum L.) have evolved HPPD resistance, the mechanism for this resistance appears to be a change in the plant’s metabolism or the gene’s expression pattern rather than point mutations in the target site (Lu et al. Reference Lu, Yu, Han, Owen and Powles2020; Nakka et al. Reference Nakka, Godar, Wani, Thompson, Peterson, Roelofs and Jugulam2017).

Two of the genes with the highest number of SNPs in the vicinity are the two of the four homologues of alpha-tubulin genes TOR2 and TUA5 located on gs08 and gs10. Alpha-tubulins are the target of Group 3, dinitroaniline herbicides such as trifluralin and ethalfluralin. While trifluralin does not control G. spurium and ethalfluralin only provides suppression (Anonymous 2022), it is possible the species is exposed to these microtubule inhibitors when other species are the primary targets for control. Point mutations at six sites in the alpha-tubulin genes have been determined to be the source of Group 3 resistance in grass species: goosegrass [Eleusine indica (L.) Gaertn.] (Anthony et al. Reference Anthony, Waldin, Ray, Bright and Hussey1998; Yamamoto et al. Reference Yamamoto, Zeng and Baird1998), green foxtail [Setaria viridis (L.) P. Beauv.] (Délye et al. Reference Délye, Menchari, Michel and Darmency2004), shortawn foxtail (Alopecurus aequalis Sobol.) (Hashim et al. Reference Hashim, Jan, Sunohara, Hachinohe, Ohdan and Matsumoto2012), and rigid ryegrass (Lolium rigidum Gaudin) (Chu et al. Reference Chu, Chen, Nyporko, Han, Yu and Powles2018). None of these point mutations are represented in our draft genome, and these genes are involved in fundamental processes such as successful mitosis and movement of organelles (Ludwig et al. Reference Ludwig, Oppenheim, Silflow and Snustad1987), so it is entirely possible that selection on these genes, if it is occurring, is unrelated to herbicide resistance.

Here we have produced a reference-quality, chromosome-level assembly for G. spurium and compared the genome with its nearest relatives with similar information available. We note that the currently sequenced genomes with n = 11 in the family show relative stability, but there is no clear mechanism that caused the reduction to n = 10 in G. spurium, given the available genomes. The ddRAD-tag data for the 19 Canadian Prairie populations indicate very low heterozygosity and high inbreeding rates that result in structured populations with limited gene flow. This likely explains the relatively slow movement of herbicide resistance genes in these populations. This work also indicates some gene candidates of interest that may be involved in the evolution of auxinic resistance in this species. These can be targeted for further investigation and include the IAA gene on chromosome 6 and the AUX gene on chromosome 8. The availability of the G. spurium genome will facilitate future research into the genetic basis of auxin resistance in the species and systematics research into the polyphyletic genus Galium.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/wsc.2024.79

Acknowledgments

We would like to thank the ORDC greenhouse staff for their help in growing the plants for this work; Taylor Withey for counting, measuring, and photographing the seeds produced during the greenhouse trial; and Shefali Vishwakarma for initial work on the RAD-seq files in STACKS. We would also like to thank Dr. Martin Laforest for suggestions on which genes to include in the list of interest for the evolution of herbicide resistance.

Funding statement

This work was funded by Agriculture and Agri-Food Canada through the Agricultural Plant Systematics Project (J-002275).

Competing interests

The authors declare no conflicts of interest.

Footnotes

Associate Editor: Caio Brunharo, Penn State University

References

Anonymous (2022) 2022 Guide to Crop Protection. Regina: Saskatchewan Ministry of Agriculture. 36 pGoogle Scholar
Anthony, RG, Waldin, TR, Ray, JA, Bright, SWJ, Hussey, PJ (1998) Herbicide resistance caused by spontaneous mutation of the cytoskeletal protein tubulin. Nature 393:260263 Google Scholar
Bache, SM, Wickham, H (2022) magrittr: A Forward-Pipe Operator for R. R Package Version 2.0.3. https://magrittr.tidyverse.org/. Accessed: Nov 1, 2022Google Scholar
Beckie, H, Shirriff, S, Leeson, J, Hall, L, Harker, K, Dokken-Bouchard, FB, Brenzil, CA (2020) Herbicide-resistant weeds in the Canadian prairies: 2012 to 2017. Weed Technol 34:461474 Google Scholar
Beckie, HJ, Lozinski, C, Shirriff, S, Brenzil, CA (2013) Herbicide-resistant weeds in the Canadian prairies: 2007 to 2011. Weed Technol 27:171183 Google Scholar
Beckie, HJ, Warwick, SI, Sauder, CA, Kelln, GM, Lozinski, C (2012) Acetolactate synthase inhibitor–resistant Galium spurium in western Canada. Weed Technol 26:151155 Google Scholar
Bodenhofer, U, Bonatesta, E, Horejs-Kainrath, C, Hochreiter, S (2015) msa: an R package for multiple sequence alignment. Bioinformatics 31:3997–3999Google Scholar
Bremer, B (2009) A review of molecular phylogenetic studies of Rubiaceae. Ann MO Bot Gard 96:426 Google Scholar
Brownrigg, R, Minka, TP, Deckman, A (2022) maps: Draw Geographical Maps. R Package Version 3.4.1. https://cran.r-project.org/web/packages/maps/index.html. Accessed: Nov 1, 2022Google Scholar
Brůna, T, Hoff, KJ, Lomsadze, A, Stanke, M, Borodovsky, M (2021) BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database . NAR Genom Bioinform 3:lqaa108 Google Scholar
Catchen, J, Hohenlohe, PA, Bassham, S, Amores, A, Cresko, WA (2013) Stacks: an analysis tool set for population genomics. Mol Ecol 22:31243140 Google Scholar
Charif, D, Lobry, JR (2007) Seqin{R} 1.0-2: a contributed package to the {R} project for statistical computing devoted to biological sequences retrieval and analysis. Pages 207–232 in Bastolla U, Porto M, Roman HE, Vendruscolo M, eds. Structural Approaches to Sequence Evolution: Molecules, Networks, Populations. New York: SpringerGoogle Scholar
Chen, T, Ehrendorfer, F (2001) Galium. Pages 104–141 in Committee FoCE, ed. Flora of China. http://www.efloras.org/florataxon.aspx?flora_id=2&taxon_id=250090777. Accessed: Jan 6, 2025Google Scholar
Chen, Y, Chen, Y, Shi, C, Huang, Z, Zhang, Y, Li, S, Li, Y, Ye, J, Yu, C, Li, Z, Zhang, X, Wang, J, Yang, H, Fang, L, Chen, Q (2018) SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience 7:16 Google Scholar
Chu, Z, Chen, J, Nyporko, A, Han, H, Yu, Q, Powles, S (2018) Novel alpha-tubulin mutations conferring resistance to dinitroaniline herbicides in Lolium rigidum . Front Plant Sci 9:97 Google Scholar
Conner, JK, Hartl, DL (2004) A Primer of Ecological Genetics. Sunderland, MA: Sinauer. 304 pGoogle Scholar
Dai, N, Wang, W, Patterson, SE, Bleecker, AB (2013) The TMK subfamily of receptor-like kinases in Arabidopsis display an essential role in growth and a reduced sensitivity to auxin. PLoS ONE 8:e60990 Google Scholar
Danecek, P, Auton, A, Abecasis, G, Albers, CA, Banks, E, DePristo, MA, Handsaker, RE, Lunter, G, Marth, GT, Sherry, ST, McVean, G, Durbin, R, 1000 Genomes Project Analysis Group (2011) The variant call format and VCFtools. Bioinformatics 27:21562158 Google Scholar
Délye, C, Menchari, Y, Michel, S, Darmency, H (2004) Molecular bases for sensitivity to tubulin-binding herbicides in green foxtail. Plant Physiol 136:39203932 Google Scholar
Deng, W, Di, Y, Cai, J, Chen, Y, Yuan, S (2019) Target site resistance mechanisms to tribenuron-methyl and cross-resistance patterns to ALS-inhibiting herbicides of catchweed bedstraw (Galium aparine) with different ALS mutations. Weed Sci 67:183188 Google Scholar
Dharmasiri, N, Dharmasiri, S, Estelle, M (2005a) The F-box protein TIR1 is an auxin receptor. Nature 435:441445 Google Scholar
Dharmasiri, N, Dharmasiri, S, Weijers, D, Lechner, E, Yamada, M, Hobbie, L, Ehrismann, JS, Jürgens, G, Estelle, M (2005b) Plant development is regulated by a family of auxin receptor F-box proteins. Dev Cell 9:109119 Google Scholar
Ellinghaus, D, Kurtz, S, Willhoeft, U (2008) LTRharvest: an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9:18 Google Scholar
Emms, DM, Kelly, S (2018) STAG: species tree inference from all genes. bioRxiv 10.1101/267914Google Scholar
Emms, DM, Kelly, S (2019) OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20:238 Google Scholar
Flanagan, SP (2023) gwscaR: Genome-Wide Selection Components Analysis in R. R Package Version 0.1.0. https://github.com/spflanagan/gwscaR. Accessed: Nov 15, 2023Google Scholar
Fox, J, Weisberg, S (2019) An R Companion to Applied Regression. 3rd ed. Thousand Oaks, CA: Sage. 607 pGoogle Scholar
Gaines, TA, Heap, IM (2023) Mutations in Herbicide-Resistant Weeds to Inhibition of Enolpyruvyl Shikimate Phosphate Synthase. www.weedscience.com. Accessed: August 9, 2023Google Scholar
Geddes, CM, Pittman, MM, Hall, LM, Topinka, AK, Sharpe, SM, Leeson, JY, Beckie, HJ (2023) Increasing frequency of multiple herbicide-resistant kochia (Bassia scoparia) in Alberta. Canadian Journal of Plant Science 103, 233237. https://doi.org/10.1139/CJPS-2022-0224 Google Scholar
Gleason, HA, Cronquist, A (1991) Manual of Vascular Plants of Northeastern United States and Adjacent Canada. 2nd ed. New York: New York Botanical Garden Google Scholar
Goudet, J, Jombart, T (2022) hierfstat: Estimation and Tests of Hierarchical F-Statistics. R Package Version 0.5-11. https://rdrr.io/cran/hierfstat/#:∼:text=Estimates%20hierarchical%20F%2Dstatistics%20from,)%2C%2052%3A950. Accessed: Nov 15, 2023Google Scholar
Graves, S, Piepho, H-P, Selzer, L, Dorai-Raj, S (2019) multcompView: Visualizations of Paired Comparisons. R Package Version 0.1-8. https://cran.r-project.org/web/packages/multcompView/index.html. Accessed Nov 1, 2022Google Scholar
Grossmann, K (2010) Auxin herbicides: current status of mechanism and mode of action. Pest Manag Sci 66:113120 Google Scholar
Grossmann, K, Hansen, H (2001) Ethylene-triggered abscisic acid: a principle in plant growth regulation? Physiol Plant 113:914 Google Scholar
Grossmann, K, Schmülling, T (1995) The effects of the herbicide quinclorac on shoot growth in tomato is alleviated by inhibitors of ethylene biosynthesis and by the presence of an antisense construct to the 1-aminocyclopropane-1-carboxylic acid (ACC) synthase gene in transgenic plants. Plant Growth Regul 16:183188 Google Scholar
Gu, Z, Gu, L, Eils, R, Schlesner, M, Brors, B (2014) circlize: implements and enhances circular visualization in R. Bioinformatics 30:2811–2812Google Scholar
Guo, X-M, Wang, Z-F, Zhang, Y, Wang, R-J (2021) Chromosomal-level assembly of the Leptodermis oblonga (Rubiaceae) genome and its phylogenetic implications. Genomics 113:30723082 Google Scholar
Gurevich, A, Saveliev, V, Vyahhi, N, Tesler, G (2013) QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:10721075 Google Scholar
Hahne, F, Ivanek, R (2016) Visualizing genomics data using gviz and bioconductor. Pages 335351 in Mathe, E, Davis, S, eds. Statistical Genomics: Methods and Protocols. New York: Springer Google Scholar
Hall, LM, Stromme, KM, Horsman, GP, Devine, MD (1998) Resistance to acetolactate synthase inhibitors and quinclorac in a biotype of false cleavers (Galium spurium). Weed Sci 46:390396 Google Scholar
Hall, LM, Beckie, HJ, Low, R, Shirriff, SW, Blackshaw, RE, Kimmel, N, Neeser, C. (2014) Survey of glyphosate-resistant kochia (Kochia scoparia L. Schrad.) in Alberta. Canadian Journal of Plant Science 94, 127130. https://doi.org/10.4141/CJPS2013-204 Google Scholar
Hansen, H, Grossmann, K (2000) Auxin-induced ethylene triggers abscisic acid biosynthesis and growth inhibition. Plant Physiol 124:14371448 Google Scholar
Harrell, FE (2022) Hmisc: Harrell Miscellaneous. CRAN Version 4.7-2. https://cran.r-project.org/web/packages/Hmisc/index.html. Accessed: Nov 1, 2022Google Scholar
Hashim, S, Jan, A, Sunohara, Y, Hachinohe, M, Ohdan, H, Matsumoto, H (2012) Mutation of alpha-tubulin genes in trifluralin-resistant water foxtail (Alopecurus aequalis). Pest Manag Sci 68:422429 Google Scholar
Heap, I (2023) The International Herbicide-Resistant Weed Database. weedscience.org/Home.aspx. Accessed: January 8, 2023Google Scholar
Hervé, M (2022) RVAideMemoire: Testing and Plotting Procedures for Biostatistics. R Package Version 0.9-81-2. https://cran.r-project.org/web/packages/RVAideMemoire/index.html. Accessed: Nov 1, 2022Google Scholar
Hijmans, RJ (2022) geosphere: Spherical Trigonometry. R Package Version 1.5-18. https://cran.r-project.org/web/packages/geosphere/. Accessed: Nov 1, 2022Google Scholar
Hoff, KJ, Lomsadze, A, Borodovsky, M, Stanke, M (2019) Whole-genome annotation with BRAKER. Methods Mol Biol 1962:6595 Google Scholar
Holm, LG, Pancho, JV, Herberger, JP, Plucknett, DL (1991) A Geographical Atlas of World Weeds. Malabar, FL: Krieger. 731 pGoogle Scholar
Hu, K, Xu, K, Wen, J, Yi, B, Shen, J, Ma, C, Fu, T, Ouyang, Y, Tu, J (2019) Helitron distribution in Brassicaceae and whole genome Helitron density as a character for distinguishing plant species. BMC Bioinformatics 20:354 Google Scholar
Jombart, T (2008) adegenet: an R package for the multivariate analysis of genetic markers. Bioinformatics 24:1403–1405Google Scholar
Kamvar, Z, Tabima, J, Grünwald, N (2014) Poppr: an R package for genetic analysis of populations with clonal partially clonal and/or sexual reproduction. PeerJ 2:e281 Google Scholar
Kepinski, S, Leyser, O (2005) The Arabidopsis F-box protein TIR1 is an auxin receptor. Nature 435:446451 Google Scholar
Kim, D, Paggi, JM, Park, C, Bennett, C, Salzberg, SL (2019) Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37:907915 Google Scholar
Knaus, BJ, Grünwald, NJ (2016) vcfR: a package to manipulate and visualize (VCF) data in R. bioRxiv 10.1101/041277Google Scholar
Koren, S, Walenz, BP, Berlin, K, Miller, JR, Bergman, NH, Phillippy, AM (2017) Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27:722736 Google Scholar
Kraft, M, Kuglitsch, R, Kwiatkowski, J, Frank, M, Grossmann, K (2007) Indole-3-acetic acid and auxin herbicides up-regulate 9-cis-epoxycarotenoid dioxygenase gene expression and abscisic acid accumulation in cleavers (Galium aparine): interaction with ethylene. J Exp Bot 58:14971503 Google Scholar
Kumar, S, Suleski, M, Craig, JE, Kasprowicz, AE, Sanderford, M, Li, M, Stecher, G, Hedges, SB (2022) TimeTree 5: an expanded resource for species divergence times. Mol Biol Evol 39:msac174 Google Scholar
Laetsch, DR, Blaxter, ML (2017) BlobTools: interrogation of genome assemblies. F1000Research 6:1287 Google Scholar
Lawrence, M, Huber, W, Pagès, H, Aboyoun, P, Carlson, M, Gentleman, R, Morgan, M, Carey, V (2013) Software for computing and annotating genomic ranges. PLoS Comput Biol 9:e1003118 Google Scholar
Leeson, JY, Thomas, AG, Hall, LM, Brenzil, CA, Andrews, T, Brown, KR, Van Acker, RC (2005) Prairie Weed Surveys of Cereal, Oilseed, and Pulse Crops from the 1970s to the 2000s. Saskatoon, Canada: Agriculture and Agri-Food Canada. 395 pGoogle Scholar
Lemon, J (2006) plotrix: a package in the red light district of R. R News 6:8–12Google Scholar
Li, H, Durbin, R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:17541760 Google Scholar
Li, H, Handsaker, B, Wysoker, A, Fennell, T, Ruan, J, Homer, N, Marth, G, Abecasis, G, Durbin, R, Genome Project Data Processing S (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:20782079 Google Scholar
Li, S-B, Xie, Z-Z, Hu, C-G, Zhang, J-Z (2016) A review of auxin response factors (ARFs) in plants. Front Plant Sci 7:47 Google Scholar
Lomsadze, A, Burns, PD, Borodovsky, M (2014) Integration of mapped RNA-seq reads into automatic training of eukaryotic gene-finding algorithms. Nucleic Acids Res 42:e119 Google Scholar
Looman, J, Best, KF (1979) Budd’s Flora of the Canadian Prairie Provinces. Canada: Agriculture Canada. Hull, Quebec, Canada. 872 pGoogle Scholar
Lovell, JT (2023) GENESPACE: Synteny- and Orthology-constrained Comparative Genomics. R Package Version 1.0.13. https://github.com/jtlovell/GENESPACE. Accessed: Nov 15, 2023Google Scholar
Lovell, JT, Sreedasyam, A, Schranz, ME, Wilson, M, Carlson, J, Harkess, A, Emms, D, Goodstein, DM, Schmutz, J (2022) GENESPACE tracks regions of interest and gene copy number variation across multiple genomes. eLife 11:e78526 Google Scholar
Lu, H, Yu, Q, Han, H, Owen, MJ, Powles, SB (2020) Evolution of resistance to HPPD-inhibiting herbicides in a wild radish population via enhanced herbicide metabolism. Pest Manag Sci 76:19291937 Google Scholar
Ludwig, SR, Oppenheim, DG, Silflow, CD, Snustad, DP (1987) Characterization of the a-tubulin gene family of Arabidopsis thaliana . Proc Natl Acad Sci USA 84:58335837 Google Scholar
Luo, J, Zhou, JJ, Zhang, JZ (2018) Aux/IAA gene family in plants: molecular structure, regulation, and function. Int J Mol Sci 19:259 Google Scholar
Luu, K, Bazin, E, Blum, MGB (2017) pcadapt: an R package to perform genome scans for selection based on principal component analysis. Mol Ecol Res 17:67–77Google Scholar
Lysak, MA (2022) Celebrating Mendel, McClintock, and Darlington: on end-to-end chromosome fusions and nested chromosome fusions. Plant Cell 34:24752491 Google Scholar
Malik, N, Vanden Born, WH (1988) The biology of Canadian weeds 86. Galium aparine L. and Galium spurium L. Can J Plant Sci 68:481–499Google Scholar
Marçais, G, Delcher, AL, Phillippy, AM, Coston, R, Salzberg, SL, Zimin, A (2018) MUMmer4: a fast and versatile genome alignment system. PLoS Comput Biol 14:e1005944 Google Scholar
Martin, SL, Benedict, L, Wei, W, Sauder, CA, Beckie, HJ, Hall, LM (2020) High gene flow maintains genetic diversity following selection for high EPSPS copy number in the weed kochia (Amaranthaceae). Sci Rep 10:18864 Google Scholar
Martin, SL, Smith, TW, James, T, Shalabi, F, Kron, P, Sauder, CA (2017) An update to the Canadian range, abundance, and ploidy of Camelina spp. (Brassicaceae) east of the Rocky Mountains. Botany 95:405417 Google Scholar
Mijangos, JL, Gruber, B, Berry, O, Pacioni, C, Georges, A (2022) dartR v2: an accessible genetic analysis platform for conservation ecology and agriculture. Methods Ecol Evol 13:21502158 Google Scholar
Moore, RJ (1975) The Galium aparine complex in Canada. Can J Bot 53:877893 Google Scholar
Morgan, M, Pagès, H, Obenchain, V, Hayden, N (2021) Rsamtools: Binary Alignment (BAM), FASTA, Variant Call (BCF), and Tabix File Import. R Package Version 2.10.0. bioconductor.org/packages/Rsamtools. Accessed: Nov 1, 2022Google Scholar
Nakka, S, Godar, AS, Wani, PS, Thompson, CR, Peterson, DE, Roelofs, J, Jugulam, M (2017) Physiological and molecular characterization of hydroxyphenylpyruvate dioxygenase (HPPD)-inhibitor resistance in Amaranthus palmeri . Front Plant Sci 8:555 Google Scholar
Newcombe, RG (1998) Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med 17:857872 Google Scholar
Oksanen, J, Simpson, GL, Blanchet, FG, Kindt, R, Legendre, P, Minchin, PR, O’Hara, RB, Solymos, P, Stevens, MHH, Szoecs, E, Wagner, H, Barbour, M, Bedward, M, Bolker, B, Borcard, D, et al. (2022) vegan: Community Ecology Package. R Package Version 2.6-4Google Scholar
O’Leary, SJ, Puritz, JB, Willis, SC, Hollenbeck, CM, Portnoy, DS (2018) These aren’t the loci you’re looking for: principles of effective SNP filtering for molecular ecologists. Mol Ecol 27:31933206 Google Scholar
OpenAI (2024) ChatGPT (GPT-4). www.openai.com. Accessed Jul 28, 2024Google Scholar
Ou, S, Chen, J, Jiang, N (2018) Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res 46:e126 Google Scholar
Ou, S, Jiang, N (2018) LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol 176:14101422 Google Scholar
Ou, S, Su, W, Liao, Y, Chougule, K, Agda, JRA, Hellinga, AJ, Lugo, CSB, Elliott, TA, Ware, D, Peterson, T, Jiang, N, Hirsch, CN, Hufford, MB (2019) Benchmarking transposable element annotation methods for creation of a streamlined comprehensive pipeline. Genome Biol 20:275 Google Scholar
Pagès, H, Aboyoun, P, Gentleman, R, DebRoy, S (2021) Biostrings: Efficient Manipulation of Biological Strings. R Package Version 2.62.0. https://rdrr.io/bioc/Biostrings/. Accessed: Nov 1, 2022Google Scholar
Paradis, E (2010) pegas: an R package for population genetics with an integrated-modular approach. Bioinformatics 26:419–420Google Scholar
Paradis, E, Schliep, K (2019) ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35:526–528Google Scholar
Pazmiño, DM, Rodriguez-Serrano, M, Romero-Puertas, MC, Archilla-Ruiz, A, Del Río, LA, Sandalio, LM (2011) Differential response of young and adult leaves to herbicide 2,4-dichlorophenoxyacetic acid in pea plants: role of reactive oxygen species. Plant Cell Environ 34:18741889 Google Scholar
Pebesma, E (2018) Simple features for R: standardized support for spatial vector data. R J 10:439446 Google Scholar
Pebesma, E, Mailund, T, Heibert, J (2016) Measurement units in R. R J 8:486494 Google Scholar
Peterson, BK, Weber, JN, Kay, EH, Fisher, HS, Hoekstra, HE (2012) Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS ONE 7:e37135 Google Scholar
Pividori, M (2024) Chatbots in science: what can ChatGPT do for you? Nature, August 14, 2024, https://doi.org/10.1038/d41586-024-02630-z Google Scholar
Privé, F, Luu, K, Vilhjalmsson, BJ, Blum, MGB (2020) Performing highly efficient genome scans for local adaptation with R package pcadapt. Mol Biol Evol 37:21532154 Google Scholar
Quint, M, Gray, WM (2006) Auxin signaling. Curr Opin Plant Biol 9:448453 Google Scholar
Rangani, G, Salas-Perez, RA, Aponte, RA, Knapp, M, Craig, IR, Mietzner, T, Langaro, AC, Noguera, MM, Porri, A, Roma-Burgos, N (2019) A novel single-site mutation in the catalytic domain of protoporphyrinogen oxidase IX (PPO) confers resistance to PPO-inhibiting herbicides. Front Plant Sci 10:568 Google Scholar
Revell, LJ (2012) phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol Evol 3:217–223Google Scholar
Revelle, W (2023) psych: Procedures for Psychological Psychometric and Personality Research. R Package Version 2.3.6. https://cran.r-project.org/web/packages/psych/index.html. Accessed: Nov 15, 2023Google Scholar
Savary, P, Foltête, J-C, Moal, H, Vuidel, G, Garnier, S (2020) graph4lg: a package for constructing and analyzing graphs for landscape genetics in R. Methods Ecol Evol 1:1–9Google Scholar
Scalabrin, S, Toniutti, L, Di Gaspero, G, Scaglione, D, Magris, G, Vidotto, M, Pinosio, S, Cattonaro, F, Magni, F, Jurman, I, Cerutti, M, Suggi Liverani, F, Navarini, L, Del Terra, L, Pellegrino, G, et al. (2020) A single polyploidization event at the origin of the tetraploid genome of Coffea arabica is responsible for the extremely low genetic variation in wild and cultivated germplasm. Sci Rep 10:4642 Google Scholar
Schliep, K, Jombart, T, Kamvar, ZN, Archer, E, Harris, R (2020) apex: Phylogenetic Methods for Multiple Gene Data. R Package Version 1.0.4. https://cran.r-project.org/web/packages/apex/. Accessed: Nov 1, 2022Google Scholar
Schliep, KP (2011) phangorn: phylogenetic analysis in R. Bioinformatics 27:592–593Google Scholar
Shi, J, Liang, C (2019) Generic repeat finder: a high-sensitivity tool for genome-wide de novo repeat detection. Plant Physiol 180:18031815 Google Scholar
Simao, FA, Waterhouse, RM, Ioannidis, P, Kriventseva, EV, Zdobnov, EM (2015) BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:32103212 Google Scholar
Smit, A, Hubley, R (2008–2015) RepeatModeler Open-1.0. www.repeatmasker.org. Accessed: Nov 1, 2022Google Scholar
Smit, A, Hubley, R, Green, P (2010–2013) RepeatMasker Open-4.0. www.repeatmasker.org. Accessed: Nov 1, 2022Google Scholar
Smith, T, Kron, P, Martin, S (2018) flowPloidy: an R package for genome size and ploidy assessment of flow cytometry data. Appl Plant Sci 6:e01164Google Scholar
South, A (2017) rnaturalearth: World Map Data from Natural Earth. R Package Version 0.1.0. https://cran.r-project.org/web/packages/rnaturalearth/vignettes/rnaturalearth.html. Accessed: Nov 1, 2022Google Scholar
Soza, VL, Olmstead, RG (2010) Molecular systematics of tribe Rubieae (Rubiaceae): evolution of major clades, development of leaf-like whorls, and biogeography. Taxon 59:755771 Google Scholar
Stanke, M, Diekhans, M, Baertsch, R, Haussler, D (2008) Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24:637644 Google Scholar
Stevens, PF (2017) Angiosperm phylogeny website v14. www.mobot.org/mobot/research/apweb/welcom.html. Accessed: February 6, 2024Google Scholar
Storey, JD, Bass, AJ, Dabney, A, Robinson, D (2022) qvalue: Q-Value Estimation for False Discovery Rate Control. R Package Version 2.28.0. https://github.com/StoreyLab/qvalue. Accessed: Nov 15, 2023Google Scholar
Su, W, Gu, X, Peterson, T (2019) TIR-Learner: a new ensemble method for TIR transposable element annotation provides evidence for abundant new transposable elements in the maize genome. Mol Plant 12:447460 Google Scholar
Sun, J, Wang, J-x, Zhang, H-j, Liu, J-l, Bian, S-n (2011) Study on mutations in ALS for resistance to tribenuron-methyl in Galium aparine L. Agric Sci China 10:86–91Google Scholar
Sun, P, Jiao, B, Yang, Y, Shan, L, Li, T, Li, X, Xi, Z, Wang, X, Liu, J (2022) WGDI: a user-friendly toolkit for evolutionary analyses of whole genome duplications and ancestral karyotypes. Mol Plant 15:18411851 Google Scholar
Tadeo, K, Ronald, K, Vereriano, T, Robooni, T (2024) Genetic diversity and structure of Ugandan tea (Camellia sinensis (L.) O. Kuntze) germplasm and its implication in breeding. Genet Resour Crop Evol 71:481496 Google Scholar
Tan, X, Calderon-Villalobos, LIA, Sharon, M, Zheng, C, Robinson, CV, Estelle, M, Zheng, N (2007) Mechanism of auxin perception by the TIR1 ubiquitin ligase. Nature 446:640645 Google Scholar
R Core Team (2021) R: A Language and Environment for Statistical Computing. Version 4.1.1. Vienna, Austria: R Foundation for Statistical Computing. www.R-project.org Google Scholar
Thioulouse, J, Dray, S, Dufour, A-B, Siberchicot, A, Jombart, T, Pavoine, S (2018) Multivariate Analysis of Ecological Data with ade4. New York: Springer. 329 pGoogle Scholar
Todd, OE, Figueiredo, MRA, Morran, S, Soni, N, Preston, C, Kubes, MF, Napier, R, Gaines, TA (2020) Synthetic auxin herbicides: finding the lock and key to weed resistance. Plant Sci 300:110631 Google Scholar
Tranel, PJ, Wright, TR, Heap, IM (2023) Mutations in Herbicide-Resistant Weeds to Inhibition of Acetolactate Synthase. www.weedscience.com. Accessed: August 9, 2023Google Scholar
Turner, NJ, Szczawinski, AF (1978) Wild Coffee and Tea Substitutes of Canada. Ottawa: National Museum of Natural Sciences. 111 pGoogle Scholar
Urbanek, S (2022) png: Read and Write PNG Images. R Package Version 0.1-8. https://cran.r-project.org/web/packages/png/index.html. Accessed: Nov 15, 2023Google Scholar
Van Eerd, LL (2004) Characterization and Inheritance of Quinclorac Resistance in a False Cleavers (Galium spurium L.) Biotype. Doctoral dissertation. Guelph, Canada: University of Guelph. 199 pGoogle Scholar
Van Eerd, LL, McLean, MD, Stephenson, GR, Hall, JC (2004) Resistance to quinclorac and ALS-inhibitor herbicides in Galium spurium is conferred by two distinct genes. Weed Res 44:355365 Google Scholar
Van Eerd, LL, Stephenson, GR, Kwiatkowski, J, Grossmann, K, Hall, JC (2005) Physiological and biochemical characterization of quinclorac resistance in a false cleavers (Galium spurium L.) biotype. Pest Manag Sci 53:11441151 Google Scholar
Venables, WN, Ripley, BD (2002) Modern Applied Statistics with S. 4th ed. New York: Springer. 498 pGoogle Scholar
Voss, EG, Reznicek, AA (2012) Field Manual of Michigan Flora. Ann Arbor: University of Michigan Press. 1008 pGoogle Scholar
Walker, BJ, Abeel, T, Shea, T, Priest, M, Abouelliel, A, Sakthikumar, S, Cuomo, CA, Zeng, Q, Wortman, J, Young, SK, Earl, AM (2014) Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9:e112963 Google Scholar
Walsh, TA, Neal, R, Merlo, AO, Honma, M, Hicks, GR, Wolff, K, Matsumura, W, Davies, JP (2006) Mutations in an auxin receptor homolog AFB5 and in SGT1b confer resistance to synthetic picolinate auxins and not to 24-dichlorophenoxyacetic acid or indole-3-acetic acid in Arabidopsis . Plant Physiol 142:542552 Google Scholar
Wang, Y, Tang, H, Debarry, JD, Tan, X, Li, J, Wang, X, Lee, TH, Jin, H, Marler, B, Guo, H, Kissinger, JC, Paterson, AH (2012) MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res 40:e49 Google Scholar
Wickham, H (2016) ggplot2: Elegant Graphics for Data Analysis. Use R! New York: Springer. Second Edition. https://doi.org/10.1007/9. 260 pGoogle Scholar
Wickham, H (2022) stringr: Simple Consistent Wrappers for Common String Operations. R Package Version 1.5.0Google Scholar
Wickham, H, Averick, M, Bryan, J, Chang, W, McGowan, LDA, François, R, Grolemund, G, Hayes, A, Henry, L, Hester, J, Kuhn, M, Pedersen, TL, Miller, E, Bache, SM, Müller, K, et al. (2019) Welcome to the tidyverse. J Open Source Softw 4:1696 Google Scholar
Wickham, H, François, R, Müller, HL (2022) dplyr: A Grammar of Data Manipulation. R Package Version 1.0.10. https://cran.r-project.org/web/packages/dplyr/index.html. Accessed: Nov 15, 2023Google Scholar
Workman, R, Renee, F, Kilburn, D, Hao, S, Liu, K, Timp, W (2018) Recalcitrant plant species for third generation sequencing. Protocol Exchange. https://doi.org/10.17504/protocols.io.4vbgw2n. Accessed: July 05 2019Google Scholar
Xiong, W, He, L, Lai, J, Dooner, HK, Du, C (2014) HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. Proc Natl Acad Sci USA 111:1026310268 Google Scholar
Xu, Z, Wang, H (2007) LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res 35(Suppl 2):W265W268 Google Scholar
Yamamoto, E, Zeng, L, Baird, WV (1998) Alpha-tubulin missense mutations correlate with antimicrotubule drug resistance in Eleusine indica . Plant Cell 10:297308 Google Scholar
Yang, LE, Meng, Y, Peng, DL, Nie, ZL, Sun, H (2018) Molecular phylogeny of Galium L. of the tribe Rubieae (Rubiaceae)—emphasis on Chinese species and recognition of a new genus Pseudogalium . Mol Phylogenet Evol 126:221232 Google Scholar
Zažímalová, E, Murphy, AS, Yang, H, Hoyerová, K, Hosek, P (2010) Auxin transporters—why so many? Cold Spring Harb Perspect Biol 2:a001552 Google Scholar
Zažímalová, E, Petrášek, J, Benková, E, Mueller-Roeber, B, Balazadeh, S (2014) Auxin and Its Role in Plant Development. Vienna: Springer. https://doi.org/10.1007/978-3-7091-1526-8. 444 pGoogle Scholar
Zeileis, A, Fisher, J, Hornik, K, Ihaka, R, McWhite, C, Murrell, P, Stauffer, R, Wilke, C (2020) colorspace: a toolbox for manipulating and assessing colors and palettes. J Stat Softw 96:1–49Google Scholar
Zhang, R-G, Wang, Z-X, Ou, S, Li, G-Y (2019) TEsorter: lineage-level classification of transposable elements using conserved protein domains. bioRxiv 10.1101/800177Google Scholar
Zheng, H-g, Hall, JC (2001) Understanding auxinic herbicide resistance in wild mustard: physiological, biochemical, and molecular genetic approaches. Weed Sci 49:276281 Google Scholar
Figure 0

Figure 1. Map of the Canadian Prairie provinces Albert and Saskatchewan indicating locations for sample populations collected for use in this research, with latitude indicated on the left and longitude indicated along the bottom.

Figure 1

Figure 2. The three visual categories scored for hook density on the mericarps of Galium spurium: (A) naked, (B) sparsely hooked, and (C) densely hooked.

Figure 2

Table 1. Genome assembly statistics for Galium spurium.

Figure 3

Figure 3. Circle plot of the Galium spurium genome with (A) gene density, (B) single-nucleotide polymorphism (SNP) density, (C) heterozygosity, (D) alternative allelic frequency, and (E) long terminal repeat frequency. An indication of chromosome length is plotted along the outside edge.

Figure 4

Figure 4. A phylogenetic tree indicating the evolutionary relationships among species studied here produced by OrthoFinder. The values at the nodes indicate support calculated by OrthoFinder using the Species Tree Inference from All Genes (STAG) method and indicates the proportion of orthogroup trees that had the bipartition. Galium, Cruciata, and Sherardia are all within the tribe Rubieae (Rubioideae), while Leptodermis is in the tribe Paederieae (Rubioideae), and Coffea canephora is in a different subfamily, Ixoroideae. Asclepias syriaca is an outgroup from the Apocynaceae.

Figure 5

Figure 5. A riparian plot of syntenic relationships among species from the Rubiaceae generated by GENESPACE showing genomic structure across the family, with Coffea canephora at the bottom and working upward: Leptodermis oblonga, Sherardia arvensis, Cruciata laevipes, and finally Galium spurium at the top of the plot. Asterisks indicate that the chromosome was inverted to clarify the plot. Note that C. canephora’s cc09 has been shifted to better indicate its position within the chromosomes of the Rubiaceae species.

Figure 6

Figure 6. Plots of synteny between proteins of Galium spurium and (A) Cruciata laevipes and (B) Coffea canephora. Red dots indicate the best syntenic matches and blue dots indicate secondary matches, with gray dots indicating lower-order matches. For G. spurium and C. laevipes, chromosomes are colored based on syntenic relationships with C. canephora. We see from these plots that several chromosomes are almost entirely syntenic (e.g., gs06, gs07, and gs10), but there is no clear pattern suggesting that end-to-end joining or nested chromosome fusion led to the reduction in chromosome number in G. spurium from n = 11 to n = 10.

Figure 7

Figure 7. For the 19 populations studied here (A) overall proportion of individuals showing homozygosity for the major allele (blue), heterozygosity (purple), and homozygosity (magenta) for the minor allele; (B) levels of heterozygosity by population; and (C) a heat map of pairwise FST values for with cooler colors indicating lower FST values and warmer indicating higher FST values. Heterozygosity was low across all populations with the majority of individuals (83%) homozygous for the reference allele with evidence of limited outcrossing. On average, the FST values were 0.54, but AB-08 showed lower pairwise FST values across comparisons and SK-05 showed higher values.

Figure 8

Figure 8. Provesti’s genetic distance (y axis) by geographic (landscape) distance (m) for pairings of the 19 populations. This graph (graph41g::scatter_dist) shows a smooth (loess) line with 95% confidence interval. A Mantel test with 1,000 Monte Carlo repetitions indicates a significant (P < 0.001) positive relationship providing evidence of isolation by distance.

Figure 9

Figure 9. Principal component analysis (PCA) of the genetic variation of single-nucleotide polymorphism (SNP) data. Circles in warm colors represent individuals from Alberta and triangles in cool colors represent those from Saskatchewan. Clumping of individuals from the same population is evident, but variation among populations provides spread. Axis1 explains 76% of the variation, while axis 2 explains 44%.

Figure 10

Figure 10. Phenological data from the 19 populations grown in the greenhouse, including (A) proportion of seed germinated after 7 d, (B) proportion flowering at 6 wk, and (C) proportion of plants with seed at 8 wk. In all cases, statistical differences are observed, but the majority of populations are highly variable and show strong overlap with values from the other populations.

Figure 11

Figure 11. Morphological data from the 19 populations grown in the greenhouse for key characteristics, including (A) plant height at 6 wk, (B) total estimated seed count, and (C) seed (mericarp) diameter.

Figure 12

Figure 12. Redundancy analysis (RDA) examining how variation in the single-nucleotide polymorphism (SNP) data is associated with variation in key traits. A total of 5.5% of the variation in the SNP data was explained by the morphological characteristics included. The majority of highly loading SNPs (70%), with a cutoff of 3 SDs from the mean, were associated with hook density and are highlighted in red. Most of the remaining SNPs are associated with total seed weight (26%) and highlighted in purple, while the one SNP associated with flowering time is in yellow.

Figure 13

Figure 13. Manhattan plots of almost 4,000 single-nucleotide polymorphisms (SNPs) across the genome, with 192 SNPs showing evidence of selection after the Benjamini-Hochberg procedure was used to adjust significance value to control false discovery, the rate was set to 1%, and SNPs were thinned to reduce the effects of linkage disequilibrium, highlighted in red. The green bar at the top of each chromosome plot indicates gene density, with darker areas annotated with more genes. The locations of genes with proteins that are homologues to known or potential contributors to herbicide resistance are shown on the x-axes.

Figure 14

Table 2. Top 20 homologues of genes with known or potential roles in herbicide resistance ranked by proximity to single-nucleotide polymorphisms (SNPs) under selection.a

Supplementary material: File

Martin et al. supplementary material

Martin et al. supplementary material
Download Martin et al. supplementary material(File)
File 162 KB