Metagenome analysis of benthic marine invertebrates

Project Details

Project Lead
Eric Schmidt 
Project Manager
Zhejian Lin 
Project Members
Earl Middlebrook, Diarey Tianero, Thomas Waller, Thomas Kakule, Malcolm Zachariah, Jason Kwan, Russell Green, Zhejian Lin, Ashaimaa Moussa, Thomas Smith, Elizabeth Pierce  
Institution
University of Utah, Department of Medicinal Chemistry  
Discipline
Biology (603) 
Subdiscipline
40.05 Chemistry 

Abstract

We are carrying out deep sequencing of environmental DNA from benthic marine organisms that are important components of their community but that have not been extensively examined genomically. In these organisms, symbiotic bacteria are demonstrably critical to host survival. The metagenomes are extremely complex, yet robust assemblies can sometimes be achieved. These properties make benthic marine invertebrates excellent models for evaluating next-generation sequencing (NGS) technology. In this project, we will use Future Grid resources to carry out de novo assembly of marine invertebrate metagenomic sequence data, a process that requires large amounts of memory and CPU power due to the volume of data.

Intellectual Merit

This work will help determine the potential utility in metagenomics of NGS technology, which produces a large amount of data but only as relatively short reads.

Broader Impacts

In the course of our work we will determine the practical aspects of processing large and complex Illumina sequencing data to obtain de novo genome assemblies of very minor members of the metagenome. This will be of great use to the metagenomics community.

Scale of Use

Assemblies using the program MetaVelvet require a single node with a large amount of memory (~150 GB). Ideally, we would be able to SSH into a single node to run the assembly. In the long term, we may explore more distributed workflows.
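
As an illustration of the memory requirement above, the following is a minimal sketch of a check that could be run after logging in to a node and before starting an assembly. It is not part of the project's actual workflow: the function names are hypothetical, the 150 GB threshold is simply the approximate figure quoted above, and the memory query assumes a Linux or similar Unix node where os.sysconf exposes the page size and page count.

```python
import os


def total_memory_bytes():
    """Return the node's physical memory in bytes.

    Assumes a Linux or similar Unix system where these sysconf names exist.
    """
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def has_enough_memory(total_bytes, required_gb):
    """Return True if total_bytes covers required_gb gigabytes (10**9 bytes each)."""
    return total_bytes >= required_gb * 10**9


if __name__ == "__main__":
    required_gb = 150  # approximate requirement quoted in the text (assumption)
    total = total_memory_bytes()
    if has_enough_memory(total, required_gb):
        print(f"OK: {total / 10**9:.0f} GB available on this node")
    else:
        print(f"Too small: {total / 10**9:.0f} GB < {required_gb} GB required")
```

The check only compares installed memory against the threshold; it does not account for memory already in use by other jobs on the node.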

Results

We have successfully assembled the complete genome of a previously unknown endosymbiotic bacterium from metagenomic sequence data obtained from a marine invertebrate, even though the bacterium accounted for only ~0.6% of the data. The complete genome afforded many insights into the symbiotic relationship, which we have reported in a paper published in Proceedings of the National Academy of Sciences. The insights gained in this effort have allowed us to develop new methods in data processing and assembly, which we are currently refining and which will be the subject of future publications. We will continue to use Future Grid in these efforts to gain insight into other symbiotic systems. The broader scientific impact of this work is twofold. First, these symbiotic relationships are a key, yet poorly understood, aspect of coral reef biodiversity. Second, these symbioses lead to the production of bioactive small molecules. By understanding the origin of these compounds, we are developing new methods to tap biodiversity for potential application in medicine, agriculture, and other areas.