Workshop Descriptions

WS1 Applying Modern Genomic Tools to the Management and Characterization of Plant Genetic Resources

A. Preliminary program (abstracts below)

Morning session – ‘setting the scene’

8:00 – 8:05 Welcome address
8:05 – 8:40 Keynote address: Dr. Christopher M. Richards: The influence of large-scale genomics and the changing role of ex situ collections
8:40 – 9:00 Dr. David Spooner: DNA barcoding: An oversimplified solution to a complex problem
9:00 – 9:20 Dr. Jan Engels: Towards a rational, secure and effective long-term conservation strategy
9:20 – 9:40 Dr. Ken Richards: Challenges of applying molecular techniques to PGR management – a Canadian perspective
9:40 – 10:00 Q & A session
10:00 – 10:30 TEA and COFFEE
10:30 – 11:10 Keynote address: Dr. Loren Rieseberg: Population genetic challenges and the potential of modern genomics technologies for the management and characterization of plant genetic resources
11:10 – 11:30 Dr. Rich Cronn / Dr. Aaron Liston: Multiplex Sequencing of Plant Chloroplast Genomes Using Illumina/Solexa Sequencing-By-Synthesis Technology
11:30 – 11:50 Dr. Nolan Kane: Chloroplast genome sequencing using Solexa and SOLiD
11:50 – 12:10 Dr. Katrina Dlugosch: Prospects and challenges of 454 sequencing for PGR characterization
12:10 – 12:30 Q & A session
12:30 – 13:30 LUNCH Sponsored by Applied Biosystems

Afternoon session – ‘opportunities through genomics’

13:30 – 14:30 Speakers from the private sector
14:30 – 15:00 Q & A session on genomics technologies
15:00 – 15:30 TEA and COFFEE
15:30 – 16:30 Discussion session
16:30 – 17:00 Conclusions and wrap-up (Chair: Quentin Cronk, Professor in Plant Science, UBC)

B. Talk abstracts (in order of program)

Keynote: The influence of large-scale genomics and the changing role of ex situ collections
Christopher M. Richards
The development of large-scale genomics resources in non-model organisms promises to have a fundamental impact on the utilization of genetic resources. Technical innovation in high-throughput sequencing has reduced the cost to a point where genome-wide SNP development is feasible across a range of taxa, including wild relatives of domesticated cultivars. While the focus of these efforts is clearly on gene discovery and assessment of candidate loci for crop improvement, the methods and markers developed here will be important in genebank management. Large-scale surveys of wider genetic variation, with significantly more biological complexity than model organisms, will present a number of technical and analytical challenges. Increasingly, the technical work of crop improvement programs making use of genetic resources draws on the disciplines of phylogeography, molecular evolution and ecological genetics. I will describe how the development of novel analytical approaches for population genomics will in turn influence the way we collect, monitor and database the diversity in large ex situ collections, and I will present ideas on the future roles genebanks may play as providers of both viable germplasm and associated data.

DNA barcoding: An oversimplified solution to a complex problem
David M. Spooner
DNA barcoding (“barcoding”) has been proposed as a rapid and practical molecular tool to identify species via diagnostic variation in short orthologous DNA sequences from one or a few universal genomic regions. It seeks to overcome the “taxonomic impediment” caused by a need for species identifications that exceeds the supply of taxonomic specialists. A number of barcoding regions have been proposed for plants, including the internal transcribed spacer of nuclear ribosomal DNA (ITS), and the plastid markers trnH-psbA intergenic spacer, matK, and other plastid regions, with the first three being the most variable. This study tests the utility of barcoding with these three regions in a complicated plant group, Solanum sect. Petota (wild potatoes). These DNA regions fail to provide species-specific markers in sect. Petota because ITS has too much intraspecific variation and the plastid markers lack sufficient polymorphism. Wild potatoes are not alone in failing to work with barcoding regions. Addressing the taxonomic impediment will require a comprehensive and integrative program of research and training using a variety of data sets appropriate to different species groups. Barcoding, in contrast, is impeded by common complicating biological phenomena, is a retroactive procedure that relies on well-defined species to function, is based solely on DNA sequences that are often inappropriate at the species level, has been poorly tested with replicate samples, and ignores substantial practical and theoretical problems in defining species.

Towards a rational, secure and effective long-term conservation strategy
Johannes M.M. Engels (presenter) and Robbert van Treuren
A rough analysis of how most existing germplasm collections were established, compared with what one would expect such collections to contain in terms of genetic diversity for a given genepool, leads to the conclusion that the content of existing ex situ collections leaves room for improvement, especially from a long-term conservation perspective. Many of these collections have grown out of breeders’ working collections that consisted of a selected set of accessions and/or have been established by countries and/or national or institutional genebanks with the aim of providing genetic diversity, in particular specific traits, to users (i.e. predominantly plant breeders) of those collections in a given country. This approach has resulted in considerable redundancy and in gaps, from both a genetic diversity and a geographic perspective.
A long-term global or regional ex situ germplasm collection for a given crop genepool should contain an adequate representation of the total existing genetic diversity in that genepool (both in situ and ex situ) in as few samples (i.e. accessions) as possible in order to be rational. This principle raises the question of whether a long-term conservation collection should aim at storing genotypes or genes/alleles.
Modern genomic and information management tools now allow more efficient and effective conservation approaches and methodologies to be applied, resulting in, among other things:
o better monitoring of routine conservation activities (e.g. collecting and regeneration);
o attempts to work towards more adequately composed collections for long-term conservation, including the identification of collection gaps, unwanted duplicates and genetic redundancy (e.g. proposed approach to establish a global strategic base collection for cacao);
o better coordinated and more complementary conservation efforts between in situ (natural habitats and on-farm) and ex situ conservation programmes;
o more efficient collaboration efforts between genebanks, countries as well as between regions (e.g. rationalizing Allium collections, predominantly garlic; establishment and operation of a virtual European genebank system, i.e. AEGIS; rationalization efforts of the Global Crop Diversity Trust);
o more rational and cost-efficient global or regional conservation efforts;
o better services to users (core collection and core selection formation).

Challenges of applying molecular techniques to PGR management – a Canadian perspective
Ken Richards
Genetic resources are playing an increasingly important role in Canadian agriculture for the betterment of Canadian and world societies. Recently, Agriculture and Agri-Food Canada consulted with national stakeholders about research priorities and determined one to be: “Understanding and conserving Canadian bioresources”. In response to this national priority, the Canadian Genetic Resources Program developed long-term objectives: to protect and conserve the genetic diversity of Canadian bioresources; contribute to the security, protection and safety of the food system; enhance the environmental performance of the Canadian agricultural system; contribute to the development of new opportunities for agriculture, thereby enhancing food and feed quality, Canadian health and wellness, and economic benefits for the industry; and support bioresource-related regulatory requirements. The Program also developed specific short-term objectives:
a) develop new techniques to conserve and regenerate plant, animal and microbial germplasm to maintain genetic integrity and minimize genetic erosion;
b) create new phenotypic and genotypic information including identifying new sources of disease resistance, abiotic stress resistance, nutritional quality and bioactive compounds, through characterization and evaluation of bioresource attributes;
c) assess genetic diversity changes in domesticated plant and animal germplasm;
d) improve the structure of the GRIN-CA database for delivery of bioinformation; and
e) contribute to access and benefit sharing regimes (acquire, donate, maintain, regenerate germplasm) consistent with Canada's commitments to international treaties, e.g. the Convention on Biological Diversity (CBD) and the FAO International Treaty on Plant Genetic Resources for Food and Agriculture (ITPGRFA).
Plant Gene Resources of Canada has applied various molecular techniques to help meet some of the above objectives, namely those associated with characterization and diversity changes of plant germplasm. Examples from diverse crop and wild species will illustrate the advances made in use of techniques and also some of the limitations experienced.

Keynote: Population genetic challenges and the potential of modern genomics technologies for the management and characterization of plant genetic resources
Loren Rieseberg
The development of molecular diagnostic tools for the management and characterization of crop germplasm, such as landraces, breeders’ varieties, and populations of wild relatives, is useful for several reasons. An appropriate method could provide a standardized means for identifying and categorizing germplasm across species and across institutions. It could also be used to reduce unwanted duplication in germplasm repositories, assess genetic relationships, develop a more stable classification of domesticated and wild populations, and detect contaminated or admixed samples. Furthermore, if biologically relevant molecular variation were assayed, it might be feasible to predict the likely value of germplasm for breeding and crop improvement. A variety of different approaches are currently being employed to characterize germplasm in different crops, ranging from allozymes to microsatellites to single nucleotide polymorphisms (SNPs) in nuclear loci. Also, DNA-barcoding approaches, including whole plastome sequencing, are now being considered for analyses of clonal and selfing crops. I will explore the strengths and weaknesses of the primary methods currently being employed (or that have recently become technically feasible) for germplasm characterization. I will also discuss the population genetic challenges associated with the development of a widely applicable, stable, and cost-effective strategy for analyzing crops that vary in mating system, ploidy level, and means of propagation. When assessing different approaches, I will do so in the light of rapid advances in sequencing and SNP genotyping technologies that are providing new technological solutions to old problems.

Multiplex Sequencing of Plant Chloroplast Genomes Using Illumina/Solexa Sequencing-By-Synthesis Technology
Richard Cronn, Aaron Liston
Chloroplast and mitochondrial organellar genomes are widely used in plant germplasm characterization because they offer a simple means to evaluate cytoplasmic diversity, germplasm differentiation, and taxonomic affinities. Due to their haploid nature and (typically) uniparental transmission, these genomes are highly responsive to drift. These positive attributes are counterbalanced by two undesirable features: a large size and a highly conservative mutation rate. Haplotype variation – when present – is rarely found in a single mutational ‘hotspot’, but is usually dispersed across the genome in simple repeats, small rearrangements, and single nucleotide polymorphisms. Because of the limited variation detected in most taxa, a host of genes, spacers, introns, and microsatellite repeats are frequently pre-screened to identify “tortoises” and “hares” so that genotyping efforts can be tailored to specific taxonomic questions.
An alternative to this endless pursuit is to sequence entire genomes and evaluate all mutational classes genome-wide. In this presentation, we show how multiplex “sequencing-by-synthesis” (MSBS) on the Illumina Genome Analyzer is one way to achieve this goal. We have successfully used MSBS to sequence PCR-amplified plastomes from 4 to 6 species of Pinus simultaneously. By ‘tagging’ each genome with a unique adapter, microreads (36–40 bp) can be sorted and independently assembled using a combination of de novo and reference-guided steps. Results to date show that draft genomes that are 85% to 98% complete, with an average sequencing depth over 50X, can be rapidly assembled. The power of this approach is highlighted with Pinus torreyana, a species that has yet to reveal intraspecific cpDNA divergence in previous RFLP and cpSSR studies. In a comparison of two individuals, we identified 5 SNPs in 101 kb of chloroplast DNA. We conclude this talk by considering how MSBS might be applied to population-level screening of additional genes and genomes.
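The tag-based sorting of microreads mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the 3-base tag sequences, the sample names and the function name `demultiplex` are all hypothetical, and the sketch assumes that the tag occupies the first bases of each read and must match exactly.

```python
from collections import defaultdict

# Hypothetical tags and sample names; the abstract gives no real adapter sequences.
TAGS = {"ACG": "sample_A", "GTC": "sample_B", "TGA": "sample_C"}
TAG_LEN = 3  # assumed tag length, for illustration only


def demultiplex(reads, tags=TAGS, tag_len=TAG_LEN):
    """Group reads by their leading tag and strip the tag from assigned reads.

    Reads whose prefix matches no known tag are kept whole under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        prefix, insert = read[:tag_len], read[tag_len:]
        sample = tags.get(prefix)
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(insert)
    return dict(bins)


if __name__ == "__main__":
    example_reads = ["ACGTTTGCA", "GTCAAACCC", "ACGGGGTTT", "NNNACGTAC"]
    for sample, seqs in sorted(demultiplex(example_reads).items()):
        print(sample, seqs)
```

After sorting, each sample's reads could be passed to a separate assembly step. A real workflow would also need to handle sequencing errors within the tag, which this sketch ignores.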

Chloroplast genome sequencing using Solexa and SOLiD
Nolan Kane
Next-generation sequencing technology enables rapid, inexpensive characterization of small genomes, but cannot easily deal with the highly complex genomes of most eukaryotes. However, the smaller organellar genomes can be isolated and sequenced, enabling these technologies to be applied to eukaryotic systems. Here we report the use of Solexa and SOLiD to sequence whole chloroplast genomes from several species from the Compositae (Asteraceae), the largest plant family. The advantages of each of these technologies are discussed, and several potential uses are examined.

Prospects and challenges of 454 sequencing for PGR characterization
Katrina Dlugosch
Roche 454 Life Sciences GS FLX sequencing is a next-generation technology that currently yields ~100 megabases of sequence per run, in ~250 bp lengths. These are relatively long reads among the next-gen approaches available, and they offer the potential to reconstruct complex (repetitive) nuclear genomic sequence without the use of an existing template for assembly. Where a template is available to aid assembly, long reads also offer the possibility of obtaining large amounts of sequence at low coverage. I will relate some of our own experiences with comparative genomics from low-coverage GS FLX data, and I will detail the laboratory and bioinformatic requirements of using this approach to manage plant genetic resources.
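To make the per-run figures above concrete, the following back-of-the-envelope sketch converts ~100 megabases in ~250 bp reads into an approximate read count and a mean coverage. The 1 Gb genome size is a hypothetical value chosen purely for illustration, and the function names are not taken from any sequencing toolkit.

```python
def reads_per_run(bases_per_run, read_length):
    """Approximate number of reads, given total yield and read length."""
    return bases_per_run // read_length


def mean_coverage(bases_per_run, genome_size):
    """Mean depth of coverage: total sequenced bases divided by genome size."""
    return bases_per_run / genome_size


BASES_PER_RUN = 100_000_000   # ~100 megabases per run (figure from the abstract)
READ_LENGTH = 250             # ~250 bp reads (figure from the abstract)
GENOME_SIZE = 1_000_000_000   # hypothetical 1 Gb nuclear genome (assumption)

if __name__ == "__main__":
    print("reads per run:", reads_per_run(BASES_PER_RUN, READ_LENGTH))
    print("mean coverage:", mean_coverage(BASES_PER_RUN, GENOME_SIZE))
```

Under these assumptions a single run gives roughly 400,000 reads and about 0.1X coverage of the hypothetical genome, which is the sense in which such data are "low coverage".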

C. Workshop outline
As new sequencing technologies rapidly become available, the price of sequencing is predicted to drop continuously. The human genomics community in particular is pressing hard for cheaper and faster sequencing methods as they promise new and improved treatments in the area of medicine. The great potential of these technologies for the field of plant genetic resource management has so far remained largely untapped. However, sequencing large areas of the genome in order to obtain information about inter- and intraspecific variability is about to become a reality due to the ever-decreasing cost of sequencing technologies. Soon, germplasm will become distinguishable at the level of varieties and landraces with standardized methods that are fast, reliable and affordable. Such methods could include the use of massively parallel sequencing to decipher the genetic code of whole plastids and/or chip-based approaches that could survey SNP variation at many nuclear loci for many individuals. This will allow researchers to tackle problems such as landrace genotyping, species-level identification of wild relatives in a genebank setting, the detection of duplicate accessions, greater efficiency of germplasm management and a standardized molecular characterization protocol between different genebanks across the globe.
The great opportunities for plant genetic resource management that arrive with these new technologies need to be explored. Challenges of already established methods, such as plant DNA-barcoding, should be addressed and limitations of such techniques should be discussed in the context of plant genetic resource management. The diversity of crops regarding their reproductive biology, agricultural management and genetic make-up poses a particular challenge that needs special consideration for the development of global standards. Furthermore, the generation of tools, such as a centralized database where standardized methods can be documented and characterization results can be submitted, ought to be a topic of discussion.

At this workshop, we aim to discuss how best to make use of these emerging possibilities and how to actively influence the development of accompanying bioinformatics methods so as to adapt them to the plant genetic resource community’s needs. The debate about the usefulness of many of these methods needs to be moved from the informal setting of ‘institutional hallways’ to an inter-institutional level in order to work on a common strategy to capitalize on these rapidly emerging opportunities for the management of plant genetic resources.
The workshop will consist of a series of lectures, ranging from technical and theoretical viewpoints to more applied aspects. We are also planning several ‘breakout sessions’, in which the participants will be able to get first-hand experience with some of the new methods and analysis techniques under the guidance of experts in the field and representatives of the private sector. Furthermore, a mediated discussion forum is envisioned, where scientists can freely exchange their ideas on this topic and debate controversial issues.