ABSTRACT Title of Dissertation: DIET AND STOMACH MICROBIOTA OF GULF MENHADEN, A KEY FORAGE FILTER FEEDING FISH SPECIES Ammar Hanif, Doctor of Philosophy, 2020 Dissertation directed by: Professor, Rosemary Jagus, and University of Maryland Center for Environmental Science Menhaden represent a family of important filter feeding forage fish that serves as a trophic link between plankton and piscivorous predators in the marine environment. Dietary analysis is difficult because diet items are small and >80 % of the stomach content is amorphous material. DNA metabarcoding combines mass-amplification of short DNA sequences (barcodes) with high- throughput sequencing. This application allows the simultaneous identification of many taxa within the same environmental sample, as well as the analysis of many samples simultaneously, providing a comprehensive assessment of diet items and gut microbiota. Here we present a methodological approach using DNA metabarcoding suitable for a small filter feeding fish to identify the stomach contents of juvenile Gulf menhaden (Brevoortia patronus), collected within Apalachicola Bay, Florida. I describe the optimization of DNA extraction, comparison of two primers and sequencing protocols, estimation of menhaden DNA contamination, quality filtering of sequences, post-sequence processing and taxonomic identification of recovered sequences. I characterized the prokaryotic community using 16S universal ribosomal RNA (rRNA) gene sequencing primers in the V3-V4 hypervariable regions. Using two different sequencing protocols employing different ?universal? 16S rRNA gene sequencing primers. Although no difference in overall operational taxonomic units (OTUs) was found, the two sequencing protocols gave differences in the relative abundancies of several bacterial classes. The dominant OTUs resulting from 16S rRNA gene sequencing at the phylum level were assigned to Proteobacteria, Acidobacteria, Actinobacteria and Chloroflexi and included oil eating bacteria consistent with the Gulf of Mexico location. Stomach microbiota and diet were compared in juvenile Gulf menhaden, Brevoortia patronus, caught at two locations, Two Mile Channel and St. Vincent Sound, in Apalachicola Bay, FL in May and July of 2013. The stomach microbiota of samples from both locations showed a predominance of Proteobacteria, Chloroflexi, Bacteroidetes, Acidobacteria and Actinobacteria, although significant differences in composition at the class level were seen. The stomach microbiota from fish from Two-Mile Channel showed a higher level of taxonomic richness and there was a strong association between the microbiota and sampling location, correlating with differences in salinity. Approximately 1050 diet items were identified, although significant differences in the species represented were found in samples from the two locations. Members of the Stramenopile/ Alveolate/Rhizaria (SAR) clade accounted for 66 % representation in samples from Two Mile Channel, dominated by the diatoms Cyclotella and Skeletonema, as well as the ciliate Oligotrichia. In contrast, Metazoa (zooplankton) dominated in samples from St. Vincent Sound, accounting for over 80 % of the reads. These are mainly Acartia copepods. Since ciliates are considered to be microzooplankton, this means there is just over 60 % representation of phytoplankton in samples from Two Mile Channel and over 90 % representation of zooplankton in samples from St. Vincent Sound. Overall, I demonstrate the diversity of juvenile menhaden stomach contents that supports a characterization of menhaden as environmental samplers. DIET AND STOMACH MICROBIOTA OF GULF MENHADEN, A KEY FORAGE FILTER FEEDING FISH SPECIES by Ammar W. Hanif Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2020 Advisory Committee: Professor Rosemary Jagus, Chair Professor Allen Place Professor David Secor Professor Jeff Shields Dr. Kevin Friedland ? Copyright by Ammar W. Hanif 2020 Preface Chapter 2 of this thesis will be submitted for publication in Limnology and Oceanography Methods and is formatted under the guidelines specified for that journal. Chapter 3 of this thesis will be submitted for publication in Frontiers in Microbiology and is formatted under the guidelines specified for that journal. ii Foreword This project began as a joint project between the NOAA-EPP funded Cooperative Science Centers established to help train under-represented minorities in marine sciences. The partners of the Cooperative Science Centers, the Living Marine Resources Cooperative Science Center, LMRCSC, lead institution, the University of Maryland Eastern Shore, UMES, along with the Environmental Cooperative Science Center, ECSC, lead institution Florida A&M University's, FAMU, embarked on a multi-institutional study of forage fish. The skills and resources of the partner institutions and their students were brought together to work on unanswered questions and gaps in our knowledge of these key species. Atlantic and Gulf menhaden are among the largest US fisheries used primarily for feeding to farmed animals and aquaculture fish. There are concerns that this will cause unintended and unconsidered consequences, affecting not only the populations of predator fish that depend on the prey fish, but more widely on ecosystem integrity and function. My part in this has been to bring new technology to investigating the microbiota and diet of the Gulf menhaden, Brevoortia patronus, using DNA metabarcoding. iii Dedication This work is dedicated to my children as an example of what they can accomplish with patience, perseverance, prayer, and support and to all those that came before me. iv Acknowledgements I would like to thank those who assisted me in this endeavor. My special regards go out to all my committee members, who have provided excellent support and the skills to move forward. My deepest gratitude is reserved for Dr. Rosemary Jagus, my mentor and advisor. Many thanks and much appreciation to fellow graduate students. This work was supported by the National Oceanic and Atmospheric Administration, Educational Partnership Program award to the Living Marine Resources Cooperative Science Center (LMRCSC) NA11SEC4810002 and NA16SEC4810007. I was supported in part by MD SeaGrant Graduate Student Fellowship, NA10OAR4170072 SA7528129-D and in part as an LMRCSC Graduate Fellow. Thanks to Dr. Stacy Smith, Delaware State University, for coordinating the menhaden collection. Thanks to James White, Resphera Bioscience, for assistance with data analysis, Sabeena Nazar, for input and assistance with sequencing, Ernest Williams, Institute for Marine and Environmental Technology, for assistance on many experiments, Dr. Ryan Woodland, Chesapeake Bay Laboratories, for discussions on stable isotope analysis and Dr. Russell Hill, Executive Director of the Institute for Marine and Environmental Technology who supported and encouraged me during my final push to completion. Thanks also to LMRCSC summer undergraduate interns Alexis Peterson, Marcus Hughes, Nicolas Dawson, Rashawnda Wright, and Malisa Smith and to Centennial High School intern, Eric Sibanda for assistance with this work. v Table of Contents Preface ................................................................................................................ ii Foreword .............................................................................................................iii Dedication .......................................................................................................... iv Acknowledgements ............................................................................................ v Table of Contents .............................................................................................. vi List of Tables .................................................................................................... viii List of Figures .................................................................................................... ix List of Abbreviations .......................................................................................... xi Chapter 1: Introduction ....................................................................................... 1 1.1 Forage fish .......................................................................................................... 1 1.2 Menhaden ............................................................................................................ 2 1.3 Gulf menhaden .................................................................................................... 4 1.4 Apalachicola Bay, FL ......................................................................................... 8 1.6 Menhaden diet ................................................................................................... 10 1.7 DNA-based approaches to the identification of diet ......................................... 10 1.8 Use of DNA metabarcoding for the analysis of the stomach content of juvenile Gulf menhaden .............................................................................................................. 15 1.9 Research questions ............................................................................................ 16 Chapter 2: Methodology for the identification of stomach contents in the filter feeding fish (Brevoortia patronus) using DNA metabarcoding........................ 17 2.1 Abstract ............................................................................................................. 17 2.2 Introduction ....................................................................................................... 18 2.3 Materials and Procedures .................................................................................. 23 2.3.1 Sample collection ...................................................................................... 24 2.3.2 Optimization of DNA extraction ............................................................... 25 2.3.3 Sample preparation and DNA extraction ................................................. 26 2.3.4 High throughput sequencing ..................................................................... 26 2.3.5 Post-sequencing pipeline .......................................................................... 28 2.3.6 Analysis of diversity .................................................................................. 29 2.3.7 Menhaden 18S rRNA gene sequence and estimation of menhaden DNA contamination ........................................................................................................... 29 2.4 Assessment of method ...................................................................................... 30 2.4.1 Optimization of DNA extraction and quality assessment ......................... 30 2.4.2 Assessment of menhaden DNA contamination in stomach DNA samples 31 2.4.3 Comparison of results using different sequencing methods ..................... 35 2.4.4 Variation in taxonomic composition in each sample at the class level using the two different primers/sequencing protocols ........................................................ 43 2.4.5 Validation of the presence of taxa found by high throughput sequencing with end-point PCR using group specific primers .................................................... 47 2.5 Discussion ......................................................................................................... 49 2.5.1 Strengths of DNA metabarcoding ............................................................. 49 2.5.2 Limitations of DNA metabarcoding .......................................................... 50 2.5.3 Other considerations ................................................................................. 51 2.6 Comments and recommendations ..................................................................... 52 2.7 Acknowledgments............................................................................................. 53 vi Chapter 3: The stomach microbiota of juvenile Gulf menhaden (Brevoortia patronus) differs with location in Apalachicola Bay, FL ................................... 55 3.1 Abstract ............................................................................................................. 55 3.2 Introduction ....................................................................................................... 56 3.3 Methods............................................................................................................. 58 3.3.1 Sample collection ...................................................................................... 58 3.3.2 Sample preparation, DNA extraction, and estimation of DNA quality..... 60 3.3.3 MiSeq library preparation and high throughput sequencing ................... 60 3.3.4 Post-sequencing pipeline .......................................................................... 61 3.3.5 Microbiota data analysis .......................................................................... 62 3.4 Results ............................................................................................................... 62 3.4.1 Characterization of stomach bacteria communities from menhaden caught at Two-Mile Channel and St. Vincent Sound ............................................................ 62 3.4.2 Shared and unique OTUs .......................................................................... 67 3.4.3 Alpha and beta diversity analysis ............................................................. 74 3.5 Discussion ......................................................................................................... 77 3.6 Acknowledgements ........................................................................................... 82 Chapter 4: Diet of juvenile Gulf Menhaden (Brevoortia patronus) using DNA metabarcoding .................................................................................................. 83 4.1 Abstract ............................................................................................................. 83 4.2 Introduction ....................................................................................................... 84 4.3 Methods............................................................................................................. 87 4.3.1 Sample collection ...................................................................................... 87 4.3.2 Sample preparation and DNA extraction ................................................. 89 4.3.3 High throughput sequencing ..................................................................... 89 4.3.4 Post sequencing pipeline........................................................................... 90 4.3.5 Data analysis ............................................................................................ 90 4.3.6 Stable isotope analysis .............................................................................. 91 4.4 Results ............................................................................................................... 91 4.4.1 Characterization of stomach 18S rRNA gene sequences from menhaden caught at Two-Mile Channel and St. Vincent Sound ................................................ 91 4.4.2 Alpha and beta diversity analyses show that stomach samples from TMC are more diverse than those from SVS ...................................................................... 99 4.4.3 Stable isotope analysis ............................................................................ 102 4.5 Discussion ....................................................................................................... 104 4.6 Acknowledgements ......................................................................................... 108 Chapter 5: Future steps and final thoughts ................................................... 112 5.1 Limitations ...................................................................................................... 112 5.2 How should we convert sequence reads to dietary data? ................................ 113 5.3 Studying ecosystems with DNA metabarcoding ............................................ 113 5.4 Use of filter feeders as environmental biomonitors in environmental metabarcoding studies ................................................................................................. 114 5.5 Accounting for functional ecological importance/significance of the stomach microbiota ................................................................................................................... 116 5.6 The evolution of DNA metabarcoding analysis methods ............................... 117 Bibliography .................................................................................................... 119 vii List of Tables Table 2.1: Primers used for sequencing Table 2.2: DNA recovery and quality using different DNA extraction kits: Table 2.3: Effect of different gene sequencing methods on the number of raw sequence and post-processing sequence reads Table 2.4: Mean alpha diversity metrics from 16S rDNA gene sequences Table 2.5: Significant differences in mean 16S rRNA reads at the class level using Fadrosh versus Illumina sequencing protocols Table 2.6: Number of 18S rRNA reads and OTUs per sample Table 2.7: Taxon-specific primers for validation of sequence assignments Supplemental Table 2.1: Characteristics of water at collection sites in Apalachicola Bay Table 3.1: Water quality measurements from each site Table 3.2: Comparison of the number of the number of raw reads and post- processing reads from Two Mile Channel and St Vincent Sound samples Table 3.3: Representation of the most abundant proteobacteria from Two Mile Channel and St Vincent Sound samples Table 3.4: Taxonomic composition of the shared OTUs from all samples Table 3.5: Unique OTUs from Two Mile Channel and St Vincent Sound samples Table 4.1: Water quality measurements from each sample site in Apalachicola Bay Table 4.2: Comparison of the number of the number of raw reads and post- processing reads from Two Mile Channel and St Vincent Sound samples Supplemental Table 4.1: Representation of the most abundant higher taxa from Two Mile Channel and St Vincent Sound samples Supplemental Table 4.2: Taxonomic composition of the SAR clade at the genus level of stomach contents from Two Mile Channel and St Vincent Sound Supplemental Table 4.3: Taxonomic composition of Metazoa at the genus level of stomach contents from Two Mile Channel and St Vincent Sound Supplemental Table 4.4: Length of fish caught at TMC and SVS viii List of Figures Figure 1.1: Conceptual life history model of Gulf menhaden Figure 1.2: Perspective map of Apalachicola Bay, Florida Figure 2.1: Summary of workflow Figure 2.2: Collection sites in Apalachicola Bay Figure 2.3: Percentage of menhaden 18S rRNA gene sequences in stomach DNA samples Figure 2.4: Multiple alignment of Gulf menhaden 18S rRNA gene sequence Figure 2.5: Schematic of the 16S rRNA gene sequencing primers Figure 2.6: Comparison of 16S rRNA gene read counts generated by Fadrosh and Illumina protocols pre- and post-processing Figure 2.7: Comparison of alpha diversity metrics of OTUs at the class level generated by the Fadrosh or Illumina protocols Figure 2.8: Comparison of apparent community beta diversity by unweighted and weighted UniFrac measures Figure 2.9: Comparison of the relative abundance of OTUs at the class level derived using Fadrosh and Illumina protocols for 16S rRNA gene sequencing Figure 2.10: Comparison of the relative abundance of OTUs at the phylum level determined using primers for 18S rRNA gene sequences Figure 2.11: Validation of taxa found by endpoint PCR Figure 3.1. Collection sites in Apalachicola Bay Figure 3.2: Taxonomic composition at the phylum level of menhaden stomach microbiota from Two Mile Channel and St Vincent Sound samples Figure 3.3: Taxonomic composition at the class level of Proteobacteria and Chloroflexi from Two Mile Channel and St Vincent Sound samples Figure 3.4: Variation in taxonomic composition at the class level of stomach microbiota in each sample from Two Mile Channel and St Vincent Sound Figure 3.5: Taxonomic composition of the shared OTUs from all samples Figure 3.6: Comparison of the differences in the microbial community from each sample by alpha diversity metrics Figure 3.7: Comparison of the differences in the microbial community from each sample by beta diversity analysis Figure 4.1: Collection sites in Apalachicola Bay Figure 4.2: Taxonomic composition at the higher taxa level of menhaden stomach contents from Two Mile Channel and St Vincent Sound samples ix Figure 4.3: Variation in taxonomic composition of stomach contents of the major taxa in each sample from Two Mile Channel and St Vincent Sound Figure 4.4: Variation in taxonomic composition of stomach content OTUs within the SAR clade in each sample from Two Mile Channel and St Vincent Sound Figure 4.5: Variation in taxonomic composition of stomach content OTUs within the Metaozoa in each sample from Two Mile Channel and St Vincent Sound Figure 4.6: Comparison of the differences in stomach contents from each sample by alpha diversity metrics Figure 4.7: Comparison of the differences in stomach contents from each sample by beta diversity analysis Figure 4.8: Significant differential abundance in OTUs from Two Mile Channel and St Vincent Sound Figure 4.9: Stable isotope analysis of muscle x List of Abbreviations 16S rRNA: 16S (Svedberg) ribosomal ribonucleic acid 18S rRNA: 18S (Svedberg) ribosomal ribonucleic acid ANERR: Apalachicola National Estuarine Research Reserve ARISA: automated ribosomal intergenic spacer analysis BASLab: Bioanalytical Service Laboratory BLAST: basic local alignment search tool bp: base pairs CASIF: Central Appalachian Laboratory Stable Isotope Facility COI: cytochrome c oxidase subunit I CTAB: cetyl trimethyl ammonium bromide DGGE: denaturing gradient gel electrophoresis DSU: Delaware State University DNA: deoxyribonucleic acid ECSC: Environmental Cooperative Science Center eDNA: environmental deoxyribonucleic acid FAMU: Florida Agriculture and Mechanical University FDR: false discovery rate FISH: fluorescence in situ hybridization GSI: gonadal somatic index GUI: graphical user interface HTS: high-throughput sequencing ITS: internal transcribed spacer KNIME: Konstanz information miner LMRCSC: Living Marine Resources Cooperative Science Center OTU: operational taxonomic unit PAH: polycyclic aromatic hydrocarbon PCoA: principal coordinates analysis PCR: polymerase chain reaction PSU: practical salinity units QIIME: quantitative insights into microbial ecology RNA: ribonucleic acid RT-PCR: reverse transcriptase-polymerase chain reaction SAR: Stramenopiles, Alveolates and Rhizaria SEDAR: Southeast data, assessment, and review SDS: sodium dodecyl sulfate SL standard length SVS: Saint Vincent Sound TL: TMC: Two-mile Channel T-RFLP: terminal restriction fragment length polymorphism xi Chapter 1: Introduction 1 1.1 Forage fish 2 Forage fish, also called prey fish, are small pelagic fish that are preyed on by 3 larger predators for food. Predators include larger fish, seabirds and marine 4 mammals. Typical ocean forage fish feed near the base of the food chain on 5 plankton, often by filter feeding. Forage fish species play both an ecological role 6 in food webs and an economic role in commercial fisheries (Pikitch et al, 2012; 7 Pikitch et al, 2014). Forage fish species are exceptionally important to the 8 structure and functioning of marine ecosystems, serving as the main conduit of 9 energy flow from lower to upper trophic levels (Pikitch et al, 2012). They can 10 exact middle out control on both plankton and predators. 11 Marine ecosystems that exhibit this community configuration, featuring many 12 species at the lower and upper trophic levels, but constricted to one, or at most 13 several, dominant planktivorous forage-fish species at the crucial mid-level, have 14 been referred to as ?wasp-waist? ecosystems (Bakun et al 2006; Bakun et al, 15 2009; Alder et al, 2008; Cury et al, 2000). Wasp-waist species support a high 16 diversity of larger predators that are highly susceptible to fluctuations in prey 17 biomass (Cury et al, 2000). Variations in the abundance of forage fish species 18 will propagate to both higher trophic levels (which may depend on them as a 19 major food item) and to lower trophic levels (on which they may exert very heavy 20 grazing pressures) (Bakun, 2006). Numerous ecosystem models have shown the 21 overall importance of forage fish to the ecosystem (Geers et al, 2016; O?Farrell et 22 al, 2017; Plaganyi & Essington, 2014). Forage fish are the main diet for many 23 fish, seabirds, and marine mammals (Alder et al, 2008; Essington et al, 2015). It 24 is estimated that forage fish can make up to 20 % of the diet of marine mammals 25 and 12.5 % of the diet of predatory seabirds (Alder et al, 2008; Essington et al, 26 2015). Furthermore, they can be vital to local economies by supplying fish for 27 large volume fisheries that support industrial or reduction fisheries. Such fisheries 28 are found throughout the world?s oceans, except Antarctica. Some of the largest 29 1 fisheries occur on the west coast of South America (southeast Pacific), northern 30 Europe, and the United States (east and Gulf coasts and Alaska) where the 31 principal catches are of Peruvian anchovy, capelin, and Atlantic and Gulf 32 menhaden respectively (Alder 2008). Historically herring, sardines and 33 menhaden have been the main forage fish species targeted for the reduction 34 fisheries. In the late 1950s, Peruvian anchovy was added to this list and fishing 35 for this species increased. Piktich found that while the global catch of forage fish 36 worldwide was valued at $5.6 billion, the fisheries they supported were valued at 37 more than twice that at $11.3 billion (Pikitch et al, 2014). 38 1.2 Menhaden 39 Menhaden (Brevoortia spp) are a small common pelagic schooling genus of 40 the family Clupeidae occupying the coasts and estuaries of the United States 41 Atlantic and Gulf regions. There are four recognized species of menhaden in 42 North American marine waters, three of which are found in the Gulf of Mexico. 43 Recent taxonomic work, using DNA sequence comparisons, have organized 44 these into large-scaled (Gulf and Atlantic menhaden, B. patronus and B. 45 tyrannus, respectively) and small-scaled (Finescale and Yellowfin menhaden, B. 46 gunteri and B. smithi, respectively) (Anderson, 2007). There is higher relatedness 47 within the small-scaled and large-scaled species than amongst other members of 48 the genus. Of these, only the large-scaled, B. patronus and B. tyrannus, support 49 an established reduction fishery where fish are reduced to fish oil and fishmeal. 50 Atlantic menhaden range along the Atlantic coast from Nova Scotia to 51 southeastern Florida. Gulf menhaden dominate the menhaden fishery in the Gulf 52 of Mexico, with the other two Gulf species menhaden species representing less 53 than 1 % of the annual catch (Ahrenholz, 1981). Gulf menhaden range from 54 Veracruz, Mexico to southwestern Florida. Though Finescale and Gulf menhaden 55 stay within the Gulf of Mexico, the Yellowfin menhaden overlap the ranges of the 56 other menhaden and can be found from the Mississippi River Delta to Virginia. 57 Finescale menhaden range just east of the Mississippi River Delta to Campeche, 58 Mexico. 59 2 Menhaden form large schools, which can be found migrating throughout 60 estuaries and near-shore regions. Menhaden depend on these environments for 61 spawning and nursery grounds. Although Atlantic menhaden have been 62 considered to be the most migratory of the four species (Ahrenholz, 1991), recent 63 reassessments of tagging data show their seasonal movements to be more 64 regional than coastal (Liljestrand et al, 2019). The adult population is generally 65 distributed from Florida to Maine. During May-June, an estimated 86 % of 66 Atlantic menhaden from North and South Carolina move northwards. They 67 remain largely within the same coastal region from June to October. In winter, 68 approximately 20 % of tagged fish north of the Chesapeake Bay move southward 69 to the Chesapeake Bay and North and South Carolina. However, most appear to 70 over-winter in the northern part of their range (Liljestrand et al, 2019). This is 71 consistent with high Atlantic menhaden larval abundance in near-shore waters 72 during the winter in regions north of the Maryland-Delaware line and the 73 Chesapeake Bay region (Simpson et al, 2016). The other menhaden species do 74 not exhibit extensive migration patterns. 75 More is known of the spawning behavior of Atlantic menhaden compared to 76 the other species. Analysis of Northeast Fisheries Science Center 77 ichthyoplankton surveys (1977-1987 and 2000-2013) shows that Atlantic 78 menhaden spawning occurs primarily near shore over a large spatial range, from 79 southern New England to North Carolina (Simpson et al, 2017) with hotspots in 80 the Mid Atlantic Bight between the Chesapeake Bay and the Delaware Bay and 81 near Long Island in New York. Spawning activity takes place throughout the year 82 and population range, but peaks during November and December. 83 Neither Yellowfin nor Finescale menhaden show any evidence of systemic 84 seasonal migration but rather remain nearshore or in the estuaries throughout 85 the year (Gunter, 1945; Dahlberg, 1970; Ahrenholz, 1991). Based on the 86 collection of ripening adults and times when eggs are found in the water column, 87 the spawning activity of both Yellowfin and Finescale menhaden is presumed to 88 be from November to March where they filter-feed extensively on phytoplankton 89 3 and zooplankton. As larvae, menhaden are attack feeders that subsequently 90 undergo an ontogenic change in feeding behavior to filter feeding. Due to their 91 collective filtering capacity, it has been postulated that menhaden can exert a 92 significant grazing pressure on common phytoplankton blooms that occur near 93 and in estuaries (Deegan, 1993; Lynch et al, 2010). 94 1.3 Gulf menhaden 95 Gulf menhaden (Brevoortia patronus) is the main menhaden species in the 96 Gulf of Mexico, ranging from the northern Gulf from Brazos Santiago, Texas, to 97 Tampa Bay, Florida (Christmas & Gunter, 1960). This species distributes along 98 the U.S Gulf coast nearshore waters during late spring and summer then utilize 99 deeper waters offshore beginning in October and for the winter (Ahrenholz, 100 1991). Though Gulf menhaden do not stratify with age, as seen in Atlantic 101 menhaden, there are data to indicate a tendency for Gulf menhaden from the 102 extreme eastern and western ranges to move toward the center of their range 103 with age (Ahrenholz, 1981). The species is an important forage fish species 104 along the Gulf coast providing an important food source for fish, seabirds and 105 marine mammals (Vaughan et al, 2007). Many of the commercially and 106 recreationally harvested fish species along the Gulf coast rely on the abundant 107 schools of menhaden, including king mackerel (Scomberomorus cavalla), 108 Spanish mackerel (Scomberomorus maculates), dorado (Coryphaena hippurus), 109 crevalle jack (Caranx hippos), tarpon (Megalops atlanticus), red drum and bonito 110 (Sarda sarda) (Dailey et al, 2008; Franklin, 2007; Sagarese et al, 2016). Among 111 other species, the diet of the blacktip shark (Carcharhinus limbatus) and the 112 brown pelican (Pelecanus occidentalis), Louisiana's state bird, can consist of 113 over 95 % menhaden (Franklin, 2011). 114 Gulf menhaden support a large directed reduction fishery and, along with 115 shrimp, support the largest fisheries by landings and by revenue in the Gulf of 116 Mexico (Vaughan et al, 2007; O?Farrell et al, 2017). Gulf menhaden are reduced 117 to fish oil and meal used in livestock feed, aquaculture feed and omega-3 fatty 118 acid-rich menhaden oil, is used in pharma- and nutraceuticals, cosmetics and 119 4 other consumer products (Olsen et al, 2014; SEDAR, 2013; Menhaden Advisory 120 Commission, 2015). Assessment of the fishery in 2018 concluded that Gulf 121 menhaden are not overfished or undergoing overfishing (SEDAR63, 2018). 122 However, the assessment panel concluded that data and techniques are 123 insufficient at present to incorporate factors that could describe the ecosystem 124 value of Gulf Menhaden adequately. Without adequate assessment of the 125 ecological role of the species, the determination of fishery reference points 126 remains inadequate. In particular, if a wasp-waist species such as Gulf 127 menhaden decreases in abundance, the architecture of energy flows can 128 become highly vulnerable and unreliable (Jordan et al, 2005; Robinson et al, 129 2015; Geers et al, 2016). This raises the possibility that the potential exists for 130 the large reduction fishery to impose a substantial ecological impact (Sagarese et 131 al, 2016). One deficit in understanding the ecological role of Gulf menhaden is 132 the dearth of specific dietary data for menhaden. Most of the menhaden dietary 133 studies have been on the allopatric species, the Atlantic menhaden, Brevoortia 134 tyrannus, which does not overlap with B. patronus spatially; hence emphasizing 135 the need for a more complete understanding of the diet of Gulf menhaden. 136 The valid scientific name for Gulf menhaden is Brevoortia patronus (Goode). 137 The life history of Gulf menhaden has been described by several authors (Hoode 138 & Fore, 1973; Ahrenholz 1991). In general, Gulf menhaden life history is typical 139 of the cycle followed by most estuarine-dependent species in the Gulf of Mexico 140 (Figure 1.1). Gulf menhaden have been found spawning from near-shore to sixty 141 miles offshore along the entire U.S. Gulf coast from October through early March 142 (Suttkus, 1956; Turner, 1969; Combs, 1969; Fore, 1970; Christmas & Waller, 143 1975; Lassuy, 1983; Shaw et al, 1985). Peak spawning periods fluctuate from 144 year-to-year, probably in response to varying environmental conditions (Suttkus 145 1956). Using samples archived at the NMFS Beaufort Laboratory from 1944- 146 2014, as well as fresh samples collected from Mississippi and Louisiana waters 147 from 2014-2016, gonadal somatic index (GSI) values were found to increase in 148 early October for both males and females, and reach peak values for females by 149 the second half of October (Brown-Peterson et al, 2017). Female GSI remained 150 5 elevated but gradually decreased from late October through March, whereas 151 male GSI remained elevated from early October through March without a gradual 152 decline. Mean male and female GSI values suggest a spawning season of 5.5 153 months, extending from early October through the end of March (Brown-Peterson 154 et al, 2017). Work looking at the histology of female fish concluded that Gulf 155 menhaden are asynchronous batch spawners, potentially spawning every 4-7 156 days. The Brown-Peterson study provided an estimated total annual fecundity of 157 10-20 times higher than that estimated in earlier studies (Lewis & Roithmayr, 158 1981) and the value used in the last benchmark assessment of menhaden stock 159 (SEDAR 13). 160 Spawning occurs offshore, and the larvae move into estuarine nursery areas 161 where they spend the early part of their lives (Figure 1.1) (Christmas et al, 1982; 162 Reid. 1955). Egg hatch and early growth of planktonic larvae occur when 163 currents from offshore spawning grounds transport them to low-salinity estuary 164 nursery grounds (Minello & Webb, 1997). The use of estuaries as nursery habitat 165 is a common theme in the life history of marine fishes because the protected 166 environment and abundant food provide an ideal location (Able, 2005; Potter et 167 al, 2013). Gulf menhaden are unusual in that the juveniles depend on estuaries 168 and can be considered to be marine estuarine opportunists (Potter et al, 2013). 169 Planktonic larvae make their way into estuaries. At hatching, larvae are poorly 170 developed with undeveloped mouths and fin rays as well as nonfunctional, 171 unpigmented eyes (Reintjes 1962; Houde & Fore 1973). The metamorphosis of 172 Gulf menhaden larvae to juveniles occurs between 28-35 mm SL (Deegan 1986) 173 and at a reported age range of 88-103 days (Deegan & Thompson 1987). 174 Juvenile growth and development occur primarily in estuaries (Robinson et al, 175 2015). The duration of this stage, and the ultimate size reached, varies based on 176 estuarine conditions and the absolute age of individual fish (relative to when they 177 were spawned during the season) (Lassuy 1983; Ahrenholz 1991). At the time of 178 hatching, larval Gulf menhaden are from 2.8-3.1 mm standard length (SL). First 179 feeding occurs at 2.9-5.7 days at 4.3 mm (SL) Powell (1993). As larvae transform 180 6 into juveniles, 181 body depth and 182 weight increase 183 substantially with 184 only a minimal 185 increase in length 186 (Ahrenholz 1991). 187 Significant 188 changes in 189 internal 190 morphology occur; 191 the maxillary and 192 dentary teeth 193 become 194 nonfunctional and 195 disappear. Gill 196 rakers increase in 197 length, number, 198 Figure 1.1 Conceptual life history model for Gulf menhaden Diagram from Christmas et.al. 1982 and complexity, 199 and pharyngeal 200 pockets appear. The alimentary tract folds forward, and a muscular stomach 201 (gizzard) and many pyloric caeca develop while the intestine forms several coils 202 (June & Carlson 1971). 203 Young-of-the-year Gulf menhaden are ubiquitous members of the northern 204 Gulf of Mexico estuarine nekton communities and occupy fresh to brackish 205 waters. After transformation from the larval form, juveniles remain in low salinity, 206 near-shore areas where they travel about in dense schools, often near the 207 surface (Lassuy, 1983). Menhaden are omnivorous filter feeders, feeding by 208 straining phytoplankton and zooplankton from water. As juveniles, Gulf 209 menhaden live in tidal creeks, marsh and open bay areas where they filter the 210 water column via their gill rakers and where they remain until late summer or fall 211 7 (Deegan, 1990). The migration pattern of juvenile Gulf menhaden involves the 212 sequential use of marsh creek and open bay areas, coinciding with the 213 productivity peaks in those areas (Deegan, 1990). By occupying tidal creeks 214 early in the year, they can take advantage of high primary productivity stimulated 215 by the influx of nutrients with high spring river flow, the flushing of detritus off the 216 marsh surface from the river mouth, as well as temperatures warmer than the 217 open bay area (Deegan, 1993). The combination of warm water and high 218 productivity in tidal creeks in the spring provides an environment that promotes 219 rapid growth. When food availability in the tidal creeks begins to decline, the fish 220 move to the open bay area where phytoplankton and zooplankton are increasing. 221 By fall, the schools disappear from near-shore waters and are thought to move 222 offshore, wintering on the inner and middle continental shelf (Roithmayr & Waller, 223 1963). The extent of the offshore range is unknown (SEDAR 63, 2018). 224 1.4 Apalachicola Bay, FL 225 My model system for studying Gulf menhaden stomach contents is the 226 Apalachicola Bay estuary. It is a highly productive lagoon and barrier island 227 complex on the upper Gulf coast of Florida (Figure 1.2). The high productivity is 228 a result of the Apalachicola River delivering freshwater and nutrients to the bay 229 (Livingston 1984, Mortazavi et al 2000a, 2000b, 2001). Nutrient input supports 230 high levels of phytoplankton productivity (Mortazavi et al, 2000b) which in turn 231 supports the bay?s secondary productivity (Chanton & Lewis 2002). It covers 232 about 212 square miles and serves as the interface between the river system 233 and the Gulf of Mexico. Four barrier islands bound the bay: St. Vincent Island, St. 234 George Island, Little St. George Island, and Dog Island. The bay area, including 235 Apalachicola Bay, East Bay, St. George Sound, St. Vincent Sound, Indian 236 Lagoon, and Alligator Harbor, is about 65 km long and 5 to 10 km wide. 237 Apalachicola Bay is a river-dominated system with the major source of 238 freshwater input coming from the alluvial Apalachicola River. The Apalachicola 239 River is the largest Florida river in terms of flow (NWFWM report 2017). 240 8 Maximum river flows 241 occur during late 242 winter to early spring 243 months and are 244 highly correlated 245 with Georgia 246 rainfalls (Meeter et 247 al, 1979). The bulk 248 of seawater flow is 249 from the east 250 entering St. George 251 Sound. The western 252 Figure 1.2: Perspective map of Apalachicola Bay, Florida end of Apalachicola 253 Bay is linked to the Gulf of Mexico by Indian Pass, the narrow channel between 254 St. Vincent Island and the mainland, with a maximum water depth of about 4 m. 255 St. Vincent Sound itself is shallow, with an average depth of little more than 1 m, 256 containing numerous oyster bars. Within the bay's shallow waters, with an 257 average depth of 3 m, are numerous oyster reefs and sandy shoals. The 258 surrounding wooded lowlands consist of saltwater and freshwater marshes, and 259 freshwater swamps. Two-Mile Channel follows the coastline from the west side of 260 the Apalachicola River estuary for approximately 2 miles. 261 Apalachicola Bay has been designated by NOAA as a National Estuarine 262 Research Reserve (ANERR). There is ongoing work with Florida A&M 263 University's (FAMU) Environmental Sciences Institute as part of the 264 Environmental Cooperative Science Center (ECSC) to develop a conceptual 265 model of Apalachicola Bay to help in management decisions and fill in data gaps 266 about the system. This made it an ideal site in which to pursue my studies on the 267 diet of juvenile Gulf menhaden 268 9 1.6 Menhaden diet 269 Assessment of menhaden diet is technically challenging because the visible 270 food items are small (5-100 ?m) and menhaden have a gizzard-like stomach that 271 grinds ingested items to an amorphous paste (Friedland et al, 1984). As a result, 272 most of the menhaden stomach content is unrecognizable and has been 273 described as amorphous material (Lewis & Peters, 1994) making microscopic 274 techniques extremely difficult for identification. Microscopic examination also has 275 disadvantages that include being labor intensive and needing highly skilled 276 individuals for the morphological identification of semi-digested or fully digested 277 plant and animal fragments (Holechek et al, 1982; Ingerson-Mahar, 2002; 278 Moreby, 1988). Friedland et al (1984) showed in the allopatric species, Atlantic 279 menhaden (Brevoortia tyrannus), that the minimum-sized filtered particle for 280 juvenile menhaden is 7 to 9 ?m; however, maximum filtration efficiency is for 281 particles approximately 100 ?m. It has been postulated that the detrital material 282 plays a role in the retention of these smaller particles (Friedland et al, 1984). 283 Detritus is primarily structural material of plant origin that also commonly includes 284 bacteria, fungi, microalgae, protozoa and small animals (Deegan et al, 1990). 285 However, detrital material will be reflective of the water filtered since constituents 286 of detritus vary by location (VanValkenburg, 1978). In addition, the amount of 287 detrital material in the stomach may differ depending on location of feeding 288 (Lewis 1994). Furthermore, although the role of ingested detritus has largely 289 been ignored (Lewis & Peters, 1994), it has been shown, using physiological and 290 stable isotope evidence that detrital material can be used as a food source in 291 juvenile Gulf menhaden (Deegan et al, 1990; Olsen et al, 2018). 292 1.7 DNA-based approaches to the identification of diet 293 In this study, I sought to establish the molecular technique of DNA 294 metabarcoding to provide an unambiguous forensic tool to identify stomach 295 contents of menhaden. DNA metabarcoding has become the method of choice in 296 characterizing living communities in any environment. This approach provides a 297 comprehensive culture-independent approach that has been used to obtain a full 298 10 inventory of gut microbiota (Jami et al, 2015; Tarnecki et al, 2017; Egerton et al, 299 2018) and prey items (Jakubavi?i?t? et al, 2017; Riccioni et al, 2018; Waraniak et 300 al, 2019) from multiple fish species. Advances in DNA based metabarcoding 301 have made comprehensive assessment of diet and gut microbiota feasible by 302 combining mass-amplification of short DNA sequences (barcodes) with high- 303 throughput sequencing. Through molecular barcoding methods, organisms in the 304 stomach contents of filter feeders, where most prey items lack diagnostic 305 taxonomic features, can now be assessed (Pompanon et al, 2012). Similarly, 306 DNA barcoding and high-throughput sequencing can also be used to evaluate 307 the microbiota of stomach contents (King et al, 2012) and for characterizing the 308 biodiversity of microbial communities. 309 The basic principle of DNA-based methods is to analyze the DNA extracted 310 directly from a sample derived from a site of interest. This could be water, 311 sediment, or gut contents. The earliest DNA-based methods extracted DNA from 312 a microbial community and probed for targeted genes of interest using the 313 technique fluorescent in situ hybridization (FISH). This method uses fluorescently 314 labeled, specific oligonucleotides probes as marker genes, which hybridize to the 315 target DNA (Amann et al, 1995). An alternative method was to sequence 316 amplicons of specific gene regions that were subsequently cloned into 317 Escherichia coli (Ward et al, 1990). These methods are only sufficient for low- 318 throughput applications, they do not deliver exhaustive insights into microbial or 319 prey diversity and are expensive and time consuming. Other DNA-based 320 methods such as automated ribosomal internal transcribed spacer analysis 321 (ARISA), terminal restriction fragment length polymorphism (T-RFLP), denaturing 322 gradient gel electrophoresis (DGGE) and microarrays were also pursued. 323 However, the development of high-throughput sequencing technology quickly 324 replaced those methods as the go-to method for DNA-based organism 325 identification in either microbial community or diet analysis by being able to 326 generate millions to billions of short-fragment DNA reads within hours. 327 11 High-throughput sequencing for identification uses a phylogenetically 328 conserved marker region as a DNA barcode to identify organisms. A suitable 329 barcode sequence needs to have conserved sequences at the 5?- and 3?-ends 330 with hypervariable region(s) in between. This allows amplification of the bar code 331 region by polymerase chain reaction. Over the last decade, barcode-based 332 approaches have been applied to study microbial communities and diet 333 (Hiergeist et al, 2015; Pompanon et al, 2012). These studies have provided 334 insights into diet, host health, complex trophic interactions and ecosystem 335 function. Barcode based approaches use the DNA sequence of the amplified 336 product to identify organisms. There are several types of barcodes available, 337 each with their own set of limitations and specific applications. Perhaps the most 338 widely used are the 16S (prokaryote) or 18S (eukaryote) ribosomal RNA gene 339 (rRNA) sequences or the mitochondrial cytochrome oxidase I (COI) sequence 340 (eukaryote). These targets have allowed a much more detailed analysis of gut 341 and fecal contents compared to earlier microscopic examinations. Barcoding 342 methods allow the identification of organisms in the stomach content of filter 343 feeders where prey items lack diagnostic taxonomic features (Pompanon et al, 344 2012). Early barcoding approaches to diet analysis and microbial communities 345 used a DNA profiling technique through amplification of the sample using general 346 or group-specific primers followed by denaturing gradient gel electrophoresis 347 (DGGE) analysis (Deagle et al, 2005; Harper et al, 2006; Hiergeist et al, 2015). 348 However, this method does not lend itself to high throughput analysis. Another 349 early method to identify taxa included bacterial cloning and sequencing of 350 amplicons generated by general or group-specific primers. Jo et al (2016) used 351 DNA barcoding and cloning of amplicons to identify diet items consumed by 352 brown trout, an invasive generalist feeder in Tasmanian lakes. Using primers for 353 COI (specific for metazoans), they identified a 1.4-fold higher number of dietary 354 items overall compared to visual quantification. However, the diversities of 355 coleopterans and dipterans identified via DNA barcoding were 2.7-fold and 7-fold 356 higher, respectively. This application has been used in the diet analysis of 357 passerine birds, whales, marine amphipods and bivalves, penguins, dolphins, 358 12 bats, and humans (Blankenship & Yayanos, 2005; Deagle et al, 2007; Dunshea 359 et al, 2008; Jarman et al, 2004; Rollo et al, 2002; Zeale et al, 2011). 360 DNA barcoding is a method of species identification using a short section of 361 DNA from a specific gene or genes. DNA metabarcoding, using high-throughput 362 sequencing technologies, allows the simultaneous amplification and identification 363 of bar code sequences from many taxa within the same environmental sample 364 and also allows the analysis of many samples simultaneously. DNA 365 metabarcoding coupled with high-throughput sequencing technologies has 366 allowed information on diet and microbiome studies to be obtained more rapidly 367 and has uncovered much higher diversity (Pompanon et al 2012). Within recent 368 years the lower cost of high throughput sequencing, expansion of sequence 369 databases, and more user-friendly data analysis and bioinformatics tools have 370 increased the use of DNA metabarcoding and high-throughput sequencing for 371 microbiome and diet studies (Deagle et al 2013; Pompanon et al, 2012). The first 372 papers using this method primarily dealt with analyzing fecal material from larger 373 mammals such as Australian fur seals, penguin, and bats (Bohmann et al, 2011; 374 Deagle et al, 2009; Deagle et al, 2010; Murray et al, 2011). In the marine 375 environment DNA metabarcoding and high-throughput sequencing have become 376 more common in studying the diet and microbiota of ecologically significant 377 animals as well as the microbial community of water and sediments. Harms- 378 Tuohy et al (2016) used this method to study the stomach content of the invasive 379 lionfish. They identified species that were missed in the digested liquiform 380 material from the stomach, using traditional microscopic methods. Given their 381 results they argued that DNA analyses of fish gut contents could be used in 382 monitoring or evaluating biodiversity. Filter feeders such as forage fish and 383 mollusks provide an opportunity to supplement evaluations of biodiversity or 384 environmental monitoring because they are effective environmental samplers. 385 King et al (2016) filled a gap in oyster gut and stomach microbiome diversity by 386 analyzing the stomach and gut contents of the eastern oyster. By using DNA 387 metabarcoding and high-throughput sequencing, they found two distinct rich 388 microbial communities in oyster stomachs and guts collected from two different 389 13 sites in Louisiana, USA. Furthermore, they found an approximate 8-fold increase 390 in the number of species in gut and stomach core microbiota than had been 391 identified previously. Previous studies had been predominantly culture- 392 dependent, focusing on characterizing human or oyster pathogens. Even culture- 393 independent studies using fingerprinting methods fund only a few important taxa 394 to compare the microbial patterns in the gut of different oyster populations. 395 Although DNA metabarcoding and high-throughput sequencing have 396 substantially enhanced diet and microbiome studies, heedless application of this 397 method can lead to the introduction of bias. Chapter 2 discusses some of the 398 technical issues that can introduce bias into DNA metabarcoding analysis and 399 describes optimization of the method for looking at Gulf menhaden stomach 400 contents. The choice of sequencing platforms is also important. There are 401 several available high-throughput sequencing platforms each with their own set 402 of limitations contributing to the sources of bias. To minimize errors and avoid 403 bias, the appropriate sequencing application must be selected based on research 404 goals. Perhaps the primary variable in sequencing technology is read length and 405 error rates. Earlier microbiome and diet studies used the pyrosequencing 406 technology of the Roche/454 platform (Roche), which had the advantage of 407 producing longer read lengths over other available platforms. However, this came 408 at the cost of having higher error rates that can lead to an overestimation of 409 diversity. The Illumina sequencing platform has lower error rates, but the shorter 410 read lengths, posing a challenge for designing barcodes that have sufficient 411 discriminatory power for species level identification. Though the Illumina platform 412 is currently a popular choice due to cost and lower sequencing errors, near full- 413 length fragments longer than 1,300 bps are required for a comprehensive and 414 reliable estimation of taxa richness (Yarza et al, 2014). Unfortunately, only 23 % 415 of the 16S rRNA sequences are longer than 900 bps. Clooney et al (2016) did a 416 comparison study across sequencing platforms, amplicon choice and sequence 417 analysis software and found that the choice of taxonomic binning software 418 proved to be more important in discriminatory power over sequencing platform 419 and choice of amplicon. High-throughput sequencing technologies produce 420 14 millions of reads giving gigabytes worth of sequencing data. It is only by using 421 bioinformatics tools that these data become usable to answer research 422 questions. These tools are charged with the task of discarding sequences with 423 errors, sorting the remaining sequences according to their tags, clustering them 424 based on sequence similarity and assigning them to a taxon. Once this is 425 complete, sequences can then be analyzed for ecological significance, e.g. by 426 measuring diversity or statistically determining group abundance. Such 427 processing tools also have the potential to introduce bias error giving a false 428 sense of diversity. For example, one artifact of sequencing technology is the 429 generation of chimeric sequences. Comparing the sequence to a known 430 database or determining distance from other sequences usually finds chimeras 431 (Soininen et al, 2009). Another critical step in the bioinformatics analysis is 432 sorting sequences into clusters of operational taxonomic units (OTU) in which 433 16S or 18S rRNA sequences are considered to be from the same taxon (Edgar, 434 2013). Operational taxonomic unit is an operational definition to group sequences 435 by similarity, equivalent to classical Linnaean or evolutionary taxonomy, i.e. 436 OTUs are proxies for ?species?. This is useful since not every organism has an 437 rRNA sequence in available databases. Sequences can be clustered according 438 to their similarity to each other, based on a similarity threshold (usually 97 %). 439 Clusters are sorted (binned) based on a percent similarity depending on the 440 sequence threshold set (usually 97 %). Poor threshold assignment can lead to 441 errors in the estimation of diversity. Taxonomic assignment accuracy is largely 442 dependent on the database being used. It is important to recognize that public 443 databases may contain sequencing errors and incorrectly assigned taxa (Harris 444 et al, 2003). In addition, databases may not contain sufficient taxonomic breadth. 445 1.8 Use of DNA metabarcoding for the analysis of the stomach content of 446 juvenile Gulf menhaden 447 Despite the importance of Gulf menhaden as a dominant prey fish and its 448 economic importance in fisheries, there is limited specific dietary information for 449 this species and a more complete understanding of the diet is needed. 450 15 Assessments of the fishery have concluded that Gulf menhaden are not 451 overfished or undergoing overfishing (Vaughan et al, 2007). However, this 452 assessment did not fully consider the ecological role of the species and its 453 ecological importance has not been adequately quantified. This suggests that the 454 potential exists for this large fishery to yield an ecological impact (Olsen et al, 455 2014; Sagarese et al, 2016). 456 To understand food web dynamics, the whole dietary breadth needs to be 457 described. I have developed a method for determining the stomach content of 458 juvenile menhaden using DNA metabarcoding to look at diet items in the 459 stomach, as well as the stomach microbiota. I have compared the results in fish 460 caught at two different locations within Apalachicola Bay, one a low salinity 461 location close to the Apalachicola estuary at Two Mile Channel (May 2013) and 462 the other high salinity location in St. Vincent Sound (July 2013). 463 1.9 Research questions 464 1. Can a method based on DNA metabarcoding using rRNA gene sequences be 465 used to describe the microbiota (using 16S rRNA sequences) and prey items 466 (18S rRNA sequences) from the stomach contents of Gulf menhaden? 467 2. Does the stomach microbiota, as assessed by 16S rRNA metabarcoding differ 468 in menhaden caught at different locations? i.e. could Gulf menhaden, as filter 469 feeders, function as environmental samplers, and therefore be used as a 470 possible biomonitor species to assess microbial diversity in inland bays and 471 estuaries? 472 3. Does the application of DNA metabarcoding provide a wider description of 473 menhaden prey items compared with previous methods? 474 4. Do the stomach diet items change with the location at which the fish are 475 caught? This could differentiate between selective filtration of food items 476 reflecting size or developmental stage of menhaden from an opportunistic 477 feeding strategy. 478 479 16 Chapter 2: Methodology for the identification of stomach 480 contents in the filter feeding fish (Brevoortia patronus) using 481 DNA metabarcoding 482 2.1 Abstract 483 Menhaden are filter feeding forage fish that serve as a trophic link between 484 plankton and piscivorous predators in the marine environment. Dietary analysis is 485 difficult in juvenile menhaden because >80 % of the stomach content is 486 amorphous material. DNA metabarcoding allows a comprehensive assessment 487 of stomach contents by combining mass-amplification of short DNA sequences 488 (bar codes) with high-throughput sequencing. Here we describe a method for the 489 assessment of diet items and gut microbiota of juvenile Gulf menhaden 490 (Brevoortia patronus), collected within Apalachicola Bay, FL. The method 491 describes the optimization of DNA extraction, effects of different sequencing 492 protocols, estimation of menhaden DNA contamination, quality filtering of 493 sequences, post-sequence processing and taxonomic identification of 494 sequences. We characterized the stomach prokaryotic community using 495 universal 16S ribosomal RNA (rRNA) gene sequencing primers in the V3-V4 496 hypervariable region. We explored the effects of two different sequencing 497 protocols employing different ?universal? 16S rRNA gene sequencing primers. 498 The two protocols gave differences in the relative abundancies of several 499 bacterial classes. The dominant OTUs resulting from 16S rRNA gene sequencing 500 were assigned to Proteobacteria, Acidobacteria, Actinobacteria and Chloroflexi 501 and included oil eating bacteria consistent with the Gulf of Mexico location. 502 Eukaryotic diet items were determined using universal sequencing primers 503 targeting the V4-V5 hypervariable region of the 18S rRNA gene sequence. We 504 identified OTUs belonging predominantly to copepods and diatoms. Overall, this 505 study demonstrated a greater taxonomic richness of stomach contents than 506 previously described, consistent with a depiction of menhaden as environmental 507 samplers. 508 17 2.2 Introduction 509 Menhaden (Brevoortia spp) are small common pelagic schooling members of 510 the family Clupeidae occupying the coasts and estuaries of the United States 511 Atlantic and Gulf regions. Gulf menhaden (Brevoortia patronus) range from 512 Veracruz, Mexico to southwestern Florida. They form large schools, which can 513 be found moving throughout estuaries and near-shore regions in late spring and 514 summer (Ahrenholz, 1991). Gulf menhaden is an important forage fish species 515 along the Gulf coast, providing forage for several commercially important fishes 516 in the Gulf of Mexico, including mackerel, bluefish, sharks, white and spotted 517 seatrout, longnose gars and red drum (Sagarese et al 2016; Etzold & Christmas, 518 1979; Reintjes, 1970; Simmons & Breuer, 1964). They also constitute up to 97 % 519 of food consumed by birds such as the brown pelican (Arthur 1931) and common 520 loon (Pendleton, 1989; Ahrenholz, 1991). Furthermore, Gulf menhaden support 521 the largest fishery by landings in the Gulf of Mexico. Gulf menhaden are reduced 522 to fish oil and meal for use in livestock feed, aquaculture, pharma- and 523 nutraceuticals, cosmetics and other consumer products (Nicholson, 1978). 524 Gulf menhaden are obligate filter feeders and juveniles have been shown to 525 feed on phytoplankton, zooplankton and particulate organic matter. However, 526 neither the full breadth of species consumed, nor the representation of any 527 species in the diet can be determined by visual methods, because food is ground 528 to an amorphous paste in the gizzard-like stomach (Lewis & Peters, 1994). To 529 understand food web dynamics, the whole dietary breadth needs to be 530 measured. Targeted PCR identification using species-specific bar codes can only 531 show the presence or absence of species already thought to be present. DNA 532 metabarcoding can greatly reduce the bias of sequence-specific methods by 533 combining amplification of a ?universal? gene region as a DNA barcode with high- 534 throughput sequencing (next generation sequencing). Unbiased community 535 structure could be monitored by denaturing gradient gel electrophoresis (DGGE) 536 and clone library methods based on rRNA gene sequences. However, this 537 approach requires extensive man-hours for picking/cloning/sequencing. DNA 538 18 barcoding is a method of species identification using a short section of DNA from 539 a specific gene or genes. DNA metabarcoding allows for simultaneous 540 identification of many taxa within the same environmental sample and allows the 541 analysis of many samples simultaneously. DNA metabarcoding methods have 542 been shown to enhance diet studies substantially in a range of fish species and 543 can be used with the total and partially degraded DNA extracted from fish 544 stomachs (Jakubavi?i?t? et al, 2017; Waraniak et al, 2019). Using molecular 545 barcoding methods, it is possible to identify prey items that lack visible diagnostic 546 taxonomic features (Pompanon et al, 2012). Similarly, DNA metabarcoding and 547 high-throughput sequencing can also be used to evaluate the microbiota of 548 stomach contents (King et al, 2012). 549 Although DNA barcoding and high-throughput sequencing have substantially 550 enhanced diet and microbiome studies, inappropriate application of this method 551 without an appreciation of its limitations can lead to the introduction of bias. A 552 number of steps, that include DNA extraction and PCR amplification, may 553 hamper the objective of obtaining results that are truly representative of the 554 source of DNA studied. Two factors that affect the successful amplification of 555 extracted DNA are the quality and quantity of DNA (Eichmiller et al 2016; Li et al, 556 2018; Majaneva et al, 2018). Uneven or inefficient DNA extraction can result in 557 only the most abundant organisms being sequenced and the subsequent 558 underestimation of the diversity of diet and/or microbiota. To allow all organisms 559 present to be identified requires uniform and efficient DNA extraction. However, it 560 should be noted that there is no universal best method for isolating DNA from 561 stomach contents and optimization is needed for each species and source 562 investigated (Pollock et al, 2018). 563 An extensive literature can be found comparing DNA extraction methods 564 often looking at different target taxa and environments and showing differences 565 in DNA yield and PCR amplification success amongst methods (reviewed 566 Schiebelhut et al, 2017). Commercial spin-column based methods make use of 567 the binding of DNA to a silica membrane in the presence of a high concentration 568 19 of chaotropic salts (Boom et al, 1990) and non-target substances are rinsed off 569 while the target DNA is bound to the silica membrane and can be eluted. 570 Published DNA extraction protocols vary widely depending on sample type and 571 intended use. Some choices include enzymatic digestion with lysozyme and/or 572 Proteinase K, use of surfactants like sodium dodecyl sulfate (SDS) or 573 cetrimonium bromide (CTAB), strong chaotropic agents like guanidine 574 thiocyanate or urea, or physical methods such as bead-beating or freeze-thaw 575 cycles. A combination of strategies is essential to maintain the balance between 576 maximum cell disruption, low DNA degradation and efficient DNA extraction from 577 all cell types. The gut microbial community can contain organisms from more 578 than thirty different phyla that range in susceptibility to cell breakage and release 579 of DNA and therefore efficiency of DNA extraction (Lagier et al, 2012; Rajili?- 580 Stojanovi? et al, 2007; Hoffmann et al, 2013). If cells from only some species are 581 lysed, the community analysis data will be skewed. The inclusion of a bead- 582 beating step, in which a sample is agitated rapidly with beads or balls in a device 583 that shakes the homogenization vessel, has been linked to higher DNA yields 584 (Schiebelhut et al, 2017; Ushio, 2019). The inclusion of a bead beating step has 585 been shown to allow more efficient extraction of DNA from Gram-positive and 586 spore-forming bacteria uncovering a higher bacterial diversity (Han et al, 2019; 587 Jiang et al, 2019; Ketchum et al, 2018). DNA can also be lost by adsorption to 588 the surface of various particles in the sample from sludge or soil and so 589 contribute to bias (Vanysacker et al 2010). Extracting DNA from the stomach 590 contents of a filter feeding marine organism such as menhaden, in which 591 phytoplankton and plant detritus can be present, can lead to adsorption to the 592 surface of various particles in the sample (Vanysacker et al, 2010). 593 Co-extraction of PCR inhibitors with DNA can interfere with downstream 594 amplification (Claassen et al, 2013). Complex polysaccharides, bile salts, lipids, 595 and urates in stomach contents are all known PCR inhibitors that require 596 additional steps to be removed (Schrader et al, 2012). Many manufacturers have 597 developed DNA extraction kits designed to remove PCR inhibitors. Other 598 products are available to cope with the presence of PCR inhibitors when included 599 20 in PCR amplification reactions and sequencing. PCR inhibition may lead to only 600 the most abundant organisms being sequenced and the subsequent 601 underestimation of the diversity of diet and/or microbiota. Conversely, extra 602 purification steps can reduce DNA yield and lead to the underestimation of 603 diversity. 604 Another essential aspect demanding careful consideration is the choice of 605 barcode. The 16S rRNA gene sequence has long been the gold standard for 606 identifying prokaryotes, making use of variable regions that occur between highly 607 conserved sites within the 16S rRNA gene of bacteria (Degnan et al, 2012; Quast 608 et al, 2013). In eukaryotes, the 18S rRNA gene or the mitochondrially encoded 609 cytochrome c oxidase I (COI, also called COX1) gene are most commonly used. 610 (Folmer et al, 1994; Martin et al, 2006; Jarman et al, 2006; Pompanon et al, 611 2012; King et al, 2012; Wang et al, 2014; Hugerth et al, 2014). However, 612 Klindworth et al (2012) have shown that the barcodes should be optimized in 613 respect to their overall coverage and phyletic spectrum expected. The 614 mitochondrially encoded COI was developed as a potential bar code for 615 eukaryotes. It works well for vertebrates and many metazoa, but COI gene 616 sequences lacks the discriminatory power to identify protist and fungal species 617 (Bellemain et al, 2010). Newer barcodes targeting 18S rRNA genes have been 618 developed for identification of species across eukaryotes (Hugerth et al, 2014) 619 and are used here for identifying eukaryotic diet items. 18S rRNA gene 620 sequences provide better coverage of eukaryotic taxa than that provided by 621 available sequences of mitochondrial CO1 (Deagle et al, 2014) and the SILVA 622 database is superior for annotating 18S rDNA sequences to finer taxonomic 623 levels than the NCBI nt database (Pruesse et al 2007; Lindeque et al, 2013). 624 Pertinent to our study here, significantly higher total OTU richness was recovered 625 from the marine zooplankton community off the Florida Keys using 18S rRNA 626 gene sequencing data compared with COI (Djurhuus et al, 2018). 627 A suitable barcode sequence needs to have conserved sequences at the 5?- 628 and 3?-ends with hypervariable region(s) in between. Primers to amplify these 629 21 regions are designed using multiple sequence alignment to identify sites that are 630 conserved within a group of taxa, but unique between groups, ideally with equal 631 amplification efficiency from all taxa sequenced (Jarman et al, 2004; Jarman et 632 al, 2006; King et al, 2008). Again, having a priori knowledge of the system is 633 useful as the primers chosen can selectively introduce bias by underestimating 634 one taxon and over-estimating another. When looking at communities, the design 635 of ?universal? PCR primers can have an effect on phylogenetic resolution. No 636 primer set is truly universal and some commonly used 16S rRNA gene sequence 637 primers have proved ineffective at amplifying biologically relevant bacteria, 638 (Go??biewski & Tretyn, 2019). Here we compare results using two different 639 ?universal? 16S rRNA gene sequence primers and a modified sequencing 640 protocol that provide differential amplification of DNA. 641 Another source of bias can be in the choice of sequencing platforms. There 642 are several available high-throughput sequencing platforms each with their own 643 set of limitations contributing to the sources of bias. To minimize errors and avoid 644 bias, the appropriate sequencing application must be selected based on research 645 goals. The primary variables in sequencing technology are read length and error 646 rates. The Illumina MiSeq system has become the most commonly used 647 sequencing platform for 16S and 18S rRNA gene metabarcoding and is used in 648 this study. In general, the MiSeq platform produces the longest and most 649 accurate sequences and has a much higher throughput than the other platforms. 650 This enables more samples to be sequenced at higher depth or lower cost 651 (Forin-Wiart et al, 2018; Quail et al, 2012). 652 High-throughput sequencing technologies produce millions of sequences 653 giving gigabytes worth of sequencing data. It is only by using bioinformatics tools 654 to analyze big data sets that these data become usable to answer research 655 questions. Such tools are needed for quality checks to discard sequences with 656 errors. Sequences can be sorted based on sequence similarity to sequences in 657 the 16S and 18S rRNA databases and assigned to a taxon (Grabowski & 658 Rappsilber, 2019). Sequence analysis software and the choice of taxonomic 659 22 binning software can also make a significant difference to results (Clooney et al, 660 2016). 661 The purpose of this study was to develop a method to characterize stomach 662 contents of a key filter feeding forage fish species, Gulf menhaden. We did this 663 by reducing sources of bias such as that introduced by the DNA extraction 664 method, refining the amplification and sequencing methods, choosing primers 665 wisely, and analyzing the results using common bioinformatic tools and diversity 666 metrics. The method allowed characterization of the stomach prokaryotic 667 community, as well as eukaryotic diet items. We intend to use this method in the 668 future to examine the effects of location, season and developmental stage on the 669 diet and microbiota of menhaden. It could also be used to monitor Gulf 670 menhaden stomach contents over time to tease out effects of climate change. 671 We also anticipate applying the method to other filter feeding species. Our overall 672 purpose was to provide a path to gain a better understanding of microbial 673 diversity in menhaden stomachs, as well a better understanding of their diet. 674 2.3 Materials and Procedures 675 A summary of the workflow involved in the identification of the stomach 676 contents of juvenile Gulf menhaden (Brevoortia patronus) using DNA 677 metabarcoding and high throughput sequencing are shown in Figure 2.1. 678 23 Summary of work flow Sample processing DNA quality assesment stomach content removal PCR DNA extraction Nanodrop Read processing High throughput sequencing quality filtering Illumina?s Hi-Seq platform Build OTU table Pick OTUs CGTGGGATGATGGGC OTU1 S1 S2 S3 TAAAGTCGGATGCTG Closed - reference ATCGTTCGATCGAAA Sample by OTU1 45 27 68 OTU1 TGCATGGGATTCGA OTU2 50 5 62 Open - reference observation matrixGGACTTAGGCTTGAG OTU1 OTU3 42 34 36 ATGCGAGATTGAGTG De novo ACGTGAACTGTGATG OTU4 58 30 79 OTU1 GCTGTAAAGTAGACT Data visualization Diversity assessment Alpha diversity Beta diversity Shannon Bray - Curtis PCoA plots Simpson weighted and taxonomy charts Observed unweighted Unifrac PC1 sample Chao 1 Figure 2.1: Summary of workflow 679 2.3.1 Sample collection 680 Ten Gulf menhaden stomach samples were received from our Delaware 681 State University and Florida Fish and Wildlife collaborators. The fish were 682 collected from Apalachicola Bay, FL on July 2, 2013, at four sites (SVS02, 683 SVS03, SVS04, SVS05) in St. Vincent Sound (SVS) with latitude and longitude 684 coordinates corresponding to 29.68, -85.2; 29.70, -85.1; 29.68, -85.1; 29.6, -85.1, 685 24 PC2 relative abundandance respectively (Figure 2.2). Collections were made using a seine net at 1 m depth. 686 Water quality measurements of temperature, salinity, pH, dissolved oxygen, and 687 turbidity were recorded using a YSI 556 multiparameter water quality meter 688 (Supplemental Table 2.1). 689 Figure 2.2: Collection sites in Apalachicola Bay. The sites where menhaden were collected within St. Vincent Sound, in July 2013, are shown as red circles 690 2.3.2 Optimization of DNA extraction 691 Given that menhaden are filter feeders and their stomachs contain 692 amorphous environmental material, we assumed that the stomach contents could 693 possess many potent PCR inhibitors. Furthermore, the menhaden diet is known 694 to consist of organisms that resist cell lysis necessitating the need to evaluate 695 DNA extraction and quality. There are a range of commercial kits available to 696 produce high quality DNA free of PCR inhibitors in high yields. We extracted the 697 DNA of menhaden stomach contents using several of these and compared DNA 698 quality, DNA yield, and its ability to be amplified by PCR. Four commercial DNA 699 extraction kits, DNeasy Blood & Tissue kit (Qiagen, Hilden, Germany), PowerSoil 700 DNA Isolation kit (Mo Bio, Santa Anna, CA, US), Quick-DNA Miniprep kit (Zymo, 701 Irvine, CA, US), and High Pure PCR Template DNA Preparation kit (ROCHE 702 25 Diagnostics, Indianapolis, IN, US) were compared to ascertain the best method 703 for extraction of menhaden stomach contents. Each kit was used as per the 704 manufacturer's instructions, although all included a bead beating step. DNA 705 quality was assessed spectrophotometrically using the Nanodrop 1000 706 (ThermoFisher Scientific, Waltham, MA, US) and by PCR amplification using the 707 universal prokaryotic 16S rRNA gene sequence primers 27F-1492R (Weisburg et 708 al, 1991). PCR amplicons were visualized by ethidium bromide staining after 1 % 709 agarose gel electrophoresis. 710 2.3.3 Sample preparation and DNA extraction 711 Whole stomachs were removed from fish and placed in 95 % ethanol prior to 712 transit to IMET. Using a surgical blade, stomachs were cut in half and gently 713 shaken in 1 ml of 100 % ethanol until contents were released. 714 DNA extraction was done using the High Pure PCR Template DNA 715 Preparation kit (ROCHE Diagnostics, Indianapolis, IN, US) with some 716 modifications to the manufacturer?s protocol. The stomach contents in ethanol 717 were left at 60 ?C until dry. Two hundred ?l of lysis buffer (ROCHE Diagnostics, 718 Indianapolis, IN, US) was added to the sample along with 10 ?l Proteinase K (20 719 mg/ml) and incubated at 70 ?C for 1 h. This extract was transferred to a 2 ml 720 screw cap tube containing garnet beads (Mo Bio Santa Ana, CA, US). Disruption 721 of the sample was achieved by bead beating using the FastPrep FP120 bead 722 beating apparatus (Savant Instruments, Inc., Holbrook, NY) for 20 sec at 4500 723 rpm. Once this step was completed, manufacturer?s protocols were followed for 724 the subsequent cellular debris removal, washes and DNA elution steps. 725 2.3.4 High throughput sequencing 726 Sequencing of both the 16S rRNA gene and 18S rRNA gene amplicons of all 727 10 samples was performed on Illumina?s MiSeq platform located in the 728 BioAnalytical Services Laboratory (BAS Lab) at the University of Maryland 729 Center for Environmental Science-Institute of Marine and Environmental 730 Technology. To assess bias in sequencing methods, 16S rRNA genes were 731 26 sequenced using two different protocols and primers that amplify the V3-V4 732 variable regions in 16S rRNA gene sequence. The first method used the 733 recommended Illumina protocol for the MiSeq as described in the Nextera DNA 734 Library Prep Reference Guide (Illumina Document #15027987v1) and is referred 735 to as the Illumina dataset. The second method used a dual indexing amplification 736 and sequencing approach as described in Fadrosh et al (2014) and is referred to 737 as the Fadrosh data set. In the Fadrosh protocol, the inclusion of heterogeneous 738 spacers at the ends of the primers increases the efficiency of primer hybridization 739 and overcomes the under-representation of low frequency and low primer 740 homology sequences during the first several cycles of a sequencing run. Both the 741 recommended Illumina and Fadrosh sequencing protocols use sequencing 742 primers targeting the V3-V4 variable regions of 16S rRNA gene sequence, as 743 described in Klindworth et al (2012) and Fadrosh et al (2014) were sequenced in 744 both directions. The primer sets are shown in Table 2.1. 745 For eukaryotic species, the sequencing was done as described above for 746 16S rRNA sequencing except that the sequencing primer set used was for the 747 18S rRNA gene sequence. The primers used were 574*F 748 (CGGTAAYTCCAGCTCYV) and 1132R (CCGTCAATTHCTTYAART) in the 18S 749 rRNA gene V4-V5 region as developed by Hugerth et al (2014). The position 750 numbering refers to the position in the rRNA sequence, as identified in the 751 Saccharomyces cerevisiae strain FM-sc-08 18S rRNA gene, NCBI accession 752 number Z75578. The 18S rRNA primers are also given in Table 2.1. 753 The list of primers is given in Table 2.1. Individual ?index" sequences are 754 added to DNA fragments from each sample during the library preparation so that 755 sequences from each sample can be identified, edited and sorted before the final 756 data analysis. 757 27 Table 2.1: Primers used for sequencing Primer sequences used for amplification and sequencing of 16S and 18S rRNA gene sequences. 758 2.3.5 Post-sequencing pipeline 759 Sequence identity and data quality assessment were performed on the 760 MiSeq instrument itself. MacQIIME was used to process and assess quality of 761 output sequences (called reads) from the sequencing primers. Post-sequence 762 processing was done using the recommended QIIME pipeline for Illumina reads 763 (Caporaso et al, 2010). Removal of index sequences, called de-multiplexing, 764 takes place on the MiSeq instrument. The MacQIIME script join_paired_ends.py 765 was used to join forward and reverse sequences. Paired reads were filtered for 766 low quality reads (quality score of <25) and short read length (<200 bp) and 767 removed from the library using the split_libraries.py command. Chimeric 768 sequences were identified de novo using the USEARCH61 (v6.1.544) algorithm 769 with the script identify_chimeric_seqs.py (Edgar 2010). This was followed by the 770 removal of PhiX sequences by a BLAST analysis with the scripts 771 parallel_blast.py and filter_fasta.py. After the removal of chimeric and PhiX 772 sequences, the remaining reads were used for operational taxonomic unit (OTU) 773 picking using the script pick_otus.py. For prokaryotic species in particular, many 774 more species exist than have been cultured and identified taxonomically. An 775 OTU is an operational term, most commonly defined as a cluster of reads with 97 776 % similarity, based on the expectation that OTUs can be used as a proxy for 777 species (Sneath & Sogal, 1973; Mysara et al, 2017). A UCLUST de novo 778 clustering method within the USEARCH61 (v6.1.544) algorithm was used to pick 779 OTUs for both 16S sequencing protocols (Edgar 2010). Taxonomic assignment 780 was done using the script assign_taxonomy.py with Greengenes gg_13_8 as a 781 28 reference database at 90 % similarity (Lawrence Berkeley National Laboratory, 782 http://greengenes.lbl.gov). These data were used to construct an OTU table for 783 subsequent relative sequence abundance and diversity analysis. 784 The 18S rRNA gene sequences were also processed in MacQIIME using the 785 appropriate scripts as described above. OTUs generated from 18S rRNA gene 786 sequencing were picked and taxonomy was assigned using the UCLUST method 787 against the Silva 111 Eukaryote-only database (Edgar 2010; Quast et al, 2013). 788 The resulting OTU tables and metadata table are available at 789 https://github.com/Hanif82/gulfmenhadenotutable.git. 790 2.3.6 Analysis of diversity 791 Observed OTUs, Good?s coverage, Fisher?s alpha, Chao1 richness, Shannon 792 index, and Inverse Simpson index metrics were used to assess alpha diversity 793 using the programs R and MacQIIME. Metastats was used to test for differentially 794 abundant taxa with p-value adjustment using the False Discovery Rate (FDR) 795 (White et al, 2009). To analyze how closely the samples were related to each 796 other and to compare observed differences in the microbial community from the 797 two sequencing methods, beta diversity analyses were determined based on the 798 unweighted and weighted UniFrac phylogenetic distances metric and visualized 799 using the ordination method principal coordinate analysis (PCoA) into two- 800 dimensional plots. 801 2.3.7 Menhaden 18S rRNA gene sequence and estimation of menhaden DNA 802 contamination 803 Using the SILVA 111 eukaryote-only database for taxonomic assignment, we 804 observed a significant number of reads that were assigned to Reeves shad, 805 Tenulosa reevsii, a clupeid fish closely related to Gulf menhaden. Given that 806 there is no record of Reeves shad in the Gulf of Mexico, we assumed these 807 reads to be of the host, Gulf menhaden. Based on this, we calculated the 808 percentage of reads assigned to Reeves shad as a proxy for estimating the 809 percentage of reads from menhaden DNA in the stomach DNA samples. Using 810 29 the 18S rRNA DNA sequence deposited in GenBank for Reeves shad (accession 811 number EU12003.1) five primer sets (a combination of universal eukaryotic 812 primers from the literature and primers designed in this study) were used to 813 amplify and sequence the menhaden 18S rRNA gene producing a partial 814 sequence of 1489 base pairs (accession number MN335220, Supplemental 815 Figure 2.1). 816 2.4 Assessment of method 817 2.4.1 Optimization of DNA extraction and quality assessment 818 In order to make the extraction method transferable to other investigations, 819 we opted for the use of commercially available kits with the expectation of having 820 to do some adjustment to the manufacturer?s protocol. 821 Spectrophotometry is a commonly used method to assess DNA quality and 822 quantity. Here we used Nanodrop 1000 (ThermoFisher Scientific, Waltham, MA, 823 US) to measure absorbance from 230-320 nm to calculate the concentration of 824 DNA and determine A260/A280 and A260/A230 ratios. Good quality DNA should 825 have a A260/A280 and A260/A230 ratio of 1.7-2.0 and 2.0-2.2 respectively. 826 Strong absorbance at 230 nm and/or 280 nm will lower expected ratios indicating 827 contaminants such as organic or chaotropic salts. Several commercial kits use 828 silica to bind nucleic acids which can co-elute and has a measured absorbance 829 of approximately 230 nm lowering the A260/A230 ratio. However, this does not 830 30 interfere with downstream molecular applications such as PCR. Of the four kits 831 we tested, the Qiagen DNeasy Blood & Tissue kit and the Roche High Pure PCR 832 Template Preparation kit gave DNA with acceptable A260/A280 ratios and the 833 Roche kit gave the best A260/A230 ratios. However, only the High Pure PCR 834 Template 835 Preparation kit 836 (Roche, Diagnostics, 837 Indianapolis, IN, US) 838 gave extracted DNA 839 that was amplifiable 840 using the 16S rRNA 841 gene sequence 842 primers 27F and 843 1492R (Weisburg et 844 al, 1991) (Table 2.2). 845 The size of the 846 amplicon, 1465 bp 847 was also an 848 Table 2.2: DNA recovery and quality using different DNA indication that little 849 extraction kits. DNA was extracted from menhaden stomach contents using four different commercial DNA extraction kits. The degradation had 850 quality and quantity of DNA recovered was assessed by occurred during 851 spectrophotometry from A230 to A320 using NanoDrop 1000, as well as by its ability to be amplified by universal primers 27F and purification. 852 1492R for 16S rRNA genes (Weisburg et al, 1991). 853 DNA was extracted from menhaden stomach contents using four 854 different commercial DNA extraction kits. The quality and quantity 2.4.2 Assessment of menhaden DNA contamination in stomach DNA samples 855 of DNA recovered was assessed by spectrophotometry from A230 to A320 using NanoDrop 1000, as well as by its ability to be ampInlif itehde b ay nuanilvyesrissa lo pf raim mersic 2ro7Fb ianl dc 1o4m92mRu fnoirt y16 iSn rsRtoNmA ach contents by high 856 genes (Weisburg et al, 1991). throughput sequencing, too much host DNA contamination can bias results. 857 Though care was taken during DNA extraction to minimize menhaden DNA 858 contamination, it was not eliminated. Initially there was no straightforward way to 859 test for this prior to the high-throughput sequencing step because the sequence 860 for menhaden 18S rRNA gene had not been found in gene databases. However, 861 31 after the taxonomic assignments were made using the SILVA 111 database, we 862 noticed that a high 863 percentage of reads 864 closely matched a 865 related clupeid fish, 866 Tenulosa reevsii, 867 Reeves shad, 868 (Richardson, 1846). 869 Reeves shad belongs 870 to the same family as 871 menhaden but is 872 native to the 873 Northwest Pacific 874 Ocean. Given the 875 close phylogenetic 876 relationship and no 877 known reports of 878 Reeves shad in the 879 Gulf of Mexico, we 880 assumed these reads 881 represented 882 Figure 2.3: Percentage of menhaden 18S rRNA gene menhaden DNA. 883 sequences in stomach DNA samples: The percentage of menhaden 18S rRNA gene sequences in stomach DNA samples Using the sequence 884 were estimated by calculating the percentage of reads matching the Reeves shad 18S rRNA gene. Turquoise colored bars of Reeves shad 18S 885 represent the percentage of reads assigned to Reeves shad. rRNA gene sequence 886 Blue colored bars represent percentage of reads assigned to all other OTUs. as a proxy for 887 menhaden 18S rRNA sequence, we estimated the amount of menhaden DNA 888 contamination in individual fish. The percentage of the 18S rRNA reads 889 considered to be from menhaden ranged from 9 to 81 % (Figure 2.3). In view of 890 these results, the reads considered to be menhaden were excluded from the 891 32 eukaryotic community analysis in all samples. Despite this, the samples gave 892 good depth of coverage, as indicated by the Goods coverage index (see below). 893 Prior to this study, sequence for the menhaden 18S rRNA gene was absent 894 in gene databases. This lack of information in the database is a prime example of 895 the shortcomings of metabarcoding methods that rely on well curated gene 896 databases for high taxonomic resolution and assignment. Primer sets based on 897 the sequence of T. reevsii allowed amplification of menhaden 18S rRNA DNA 898 using DNA purified from menhaden fin clips and generated a partial sequence of 899 1489 bp (accession number MN335200) (Supplemental Figure 2.1). 900 A BLAST analysis of the menhaden 18S rRNA partial gene sequence 901 recovered against the GenBank database showed >99 % identity to several other 902 closely related Clupeidea. An alignment of the menhaden sequence against the 903 top hits, Tenulosa reevsii (accession number EU12003.1), Potamulosa 904 richmondia, (accession number KJ774739.1) and Nematalosa erebi (accession 905 number HQ615575.1) is shown in Figure 2.4. All four species are from different 906 parts of the world, but all are from the family Clupeidae. Identification of the 907 menhaden 18S rRNA sequence confirmed our assumption that the apparent 908 Tenulosa reevsii sequences represented Gulf menhaden 18S rRNA gene 909 sequences and justified our exclusion of these sequences from the community 910 analysis. 911 912 33 Figure 2.4: Multiple alignment of menhaden 18S rRNA gene sequence. The partial menhaden sequence (designated with an asterisk (*) (Accession number, MN335220) is shown here in a multiple alignment with 18S rRNA gene sequence from Potamalosa richmondia, Australian freshwater herring, accession number, KJ774739.1; Nematalosa erebi, Australian river gizzard shad, accession number HQ615575.1; and Tenulosa reevsii, accession number EU12003.1. 913 34 2.4.3 Comparison of results using different sequencing methods 914 One reason why high-throughput sequencing has become a powerful tool in 915 microbiome studies is the identification of ?universal? primers. However, for 16S 916 and 18S rRNA gene sequences, previous studies have indicated that primer 917 selection can influence results and bias abundance data (Meusnier et al, 2008; 918 Hugerth et al, 2014; Pollock et al, 2018). In order to determine if there was 919 introduction of community composition bias due to primer and/or sequencing 920 protocol, all samples analyzed here were sequenced with two published 921 sequencing protocols that use different ?universal? prokaryotic primer sets. One 922 protocol, termed ?Illumina? is the Illumina-recommended sequencing protocol 923 using primers based on the study of Klindworth et al, 2012). The other protocol, 924 termed ?Fadrosh?, was based on the study of Fadrosh et al, 2014). The Fadrosh 925 protocol uses a dual indexing approach in which heterogeneous spacers are 926 attached to the sequencing primers. This protocol is used to increase depth of 927 coverage by capturing low abundance organisms. Both primer sets are based on 928 the V3-V4 hypervariable region the E. coli 16S rRNA gene sequence (accession 929 number FJ950694.1). Both primer sets give PCR products of approximately 500 930 bp in length. Figure 2.5 shows the main differences between the two primer sets 931 is that the Illumina forward primer is shorter and more degenerate in comparison 932 to Fadrosh forward primer and the Fadrosh reverse primer is shorter and more 933 degenerate that the Illumina reverse primer. Both forward and reverse primers 934 have heterogeneous spacers following the 12 bp index sequence (not shown). 935 Figure 2.5: Schematic of the 16S rRNA gene sequencing primers: The Fadrosh primers are taken from Fadrosh et al (Fadrosh et al, 2014) and the Illumina primers are taken from Klindworth et al (Klindworth et al, 2012). Note that only the sequencing PCR primers are shown and do not include the index or adapter sequences. The position numbers are based on the E. coli 16S rRNA gene sequence, accession number FJ950694.1. 35 936 We retrieved 937 a total of 938 1,141,141 and 939 906,483 raw 940 reads using the 941 Illumina and 942 Fadrosh 943 sequencing 944 methods, 945 respectively 946 (Table 2.3). After 947 post-sequencing 948 processing to 949 Figure 2.6: Comparison of 16S rRNA gene read counts generated by Fadrosh and Illumina protocols pre- and post-processing. obtain quality 950 Counts of reads per sample shown are shown by sequencing primer/protocol used and before and after processing in QIIME. Red reads, these 951 and blue bars represent number of reads generated by Fadrosh and Illumina protocols, respectively. Solid bars represent the number of numbers were 952 reads pre-processing. The bars with hash lines represent the number of reads post-QIIME processing. reduced to 953 296,734 and 954 898,66 for the Illumina and Fadrosh protocols, respectively. The number of reads 955 varied between samples, with a wide range between the highest and lowest 956 number of reads (Table 2.3 and Figure 2.6). However, post-processing, the 957 Fadrosh sequencing method consistently gave the higher number of reads. 958 Using the Illumina sequencing method, sample 09 produced 356,332, the 959 maximum number of raw reads and sample 07 produced the least number of raw 960 reads, at 11,705. The Fadrosh sequencing method did not produce such wide 961 differences in number of reads per sample. The most striking difference between 962 the two methods is the comparison of the number of raw and processed reads as 963 shown in Figure 2.6. Overall, there are far fewer raw reads lost with processing 964 using the Fadrosh sequencing method indicating a greater percentage of high- 965 quality reads. Looking at the average number of reads the percent difference pre- 966 36 and post- 967 processing 968 was 74 % and 969 0.86 % for the 970 Illumina and 971 Fadrosh 972 sequencing 973 method, 974 respectively. 975 Post- 976 Table 2.3: Effect of different gene sequencing protocols on the processing, 977 number of raw reads and post-processing reads. DNA was extracted from the stomach contents of 10 juvenile menhaden. A region of sample 07 978 approximately 469 bp encompassing the V3-V4 hypervariable regions of 16S rRNA genes was targeted for sequencing using the sequencing from the 979 protocols described by Fadrosh et al (Fadrosh et al, 2014) and those Illumina reads 980 recommended by Illumina (Klindworth et al, 2012) using the Fadrosh and Illumina sequencing protocols. The raw and post-processing reads from had the lowest 981 each were expressed as reads per sample, mean number of reads per sample and total reads. number of 982 reads with 983 2,690 reads. This was excluded in our downstream analysis because it did not 984 meet the minimum rarefaction requirements. The difference in the number of raw 985 reads generated by the two protocols probably reflects the stringency of 986 hybridization of the two primer sets to rRNA sequences. The Illumina primers 987 were smaller and had higher degeneracy and would have hybridized less 988 stringently in early rounds of amplification giving a higher number of products that 989 were not of sufficient quality. The Fadrosh primer set had less degeneracy and 990 also had the spacer sequences that allowed better amplification of rare or more 991 diverse 16S rRNA sequences after the initial rounds of amplification and gave 992 reads of better quality. 993 To provide an understanding of the apparent differences in the microbiota 994 suggested from each sequencing method, we looked at several commonly used 995 alpha diversity metrics using OTU abundancies (Figure 2.7). We also tested the 996 null hypothesis of differences between sequencing methods using Mann-Whitney 997 37 statistical test (Table 2.4). Observed OTUs (species), Chao1 estimator, Fisher?s 998 alpha (?) indexes are commonly used to measure richness simply defined as the 999 number of different species in a sample. Observed OTUs is the simplest 1000 measurement of species richness; it counts the number of different OTUs in a 1001 sample. In all but sample 06, Observed OTUs were greater using the Fadrosh 1002 sequencing protocol but showed no significant difference between sequencing 1003 method (p = 0.661). In sample 06, the number of Observed OTUs was higher 1004 using the Illumina sequencing protocol, although the difference is very small. 1005 Chao1 diversity metric is another estimator of species richness; however, it 1006 makes the assumption that all possible species are often not accounted for 1007 because of sequencing depth in a sample. The metric attempts to account for 1008 this by providing an estimate of ?true? species richness. The Chao1 shows a 1009 similar result as Observed OTUs where this measure is greater in all but 3 1010 samples (05, 06 and 10) using the Fadrosh sequencing method. However, it 1011 does not show a significant difference between the two sequencing methods (p = 1012 0.78). Fisher?s (?) is used as a sample-size independent estimator to address the 1013 issue of OTU richness bias due to sample size (sequencing depth). Similar to 1014 Table 2.4: Mean alpha diversity metrics from 16S rDNA gene sequences. The mean alpha diversity metric value from sequences derived using the Fadrosh or Illumina sequencing protocols are given, looking at: Observed OTUs; Chao1 estimator; Good?s coverage; Fisher?s alpha index; Simpson reciprocal index; Shannon index; along with the p-value using Mann-Whitney statistical tests. 38 Observed OTUs, Fisher?s (?) was greater using the Fadrosh sequencing protocol 1015 but the difference was not significant (p = 0.661). 1016 1017 Figure 2.7: Comparison of alpha diversity metrics of OTUs at the class level generated by the Fadrosh or Illumina protocols. Six alpha diversity metrics were used to measure diversity; a) Chao 1 estimator; b) Fisher?s Alpha index value; c) Shannon index value; d) Simpson reciprocal index value; e) Good?s coverage; f) Observed OTUs. Red and blue bars represent values from each menhaden stomach sample sequenced using the Fadrosh and Illumina sequencing protocols, respectively. 39 In sample 06, Fishers (?) was greater using the Illumina sequencing method, 1018 however the difference is very small. The metric Good?s coverage, a method of 1019 estimating what percent of the total number of species is represented in a 1020 sample, only varied slightly per sample and sequencing protocol and showed no 1021 significant difference between the two sequencing protocols (p = 0.870). The 1022 Shannon index and Simpson reciprocal are diversity metrics that measure 1023 richness but take into account the relative abundance or evenness of each 1024 group. However, they do so in different ways. The Shannon index gives an equal 1025 weight to richness and evenness, thus as richness and evenness increase, so 1026 does the Shannon index. The Simpson?s index calculates the probability that any 1027 two reads randomly sampled from a community will belong to the same 1028 taxonomic assignment, thus is more heavily influenced by dominant taxa. 1029 Typically, these two metrics compliment well and will follow the same trend. Both 1030 Shannon and Simpson reciprocal show that all samples have a higher diversity 1031 index using the Fadrosh sequencing method except for sample 10. The 1032 difference is shown to be significant only in the Simpson reciprocal index (p = 1033 0.035). Taken together the diversity metrics show that the two protocols gave 1034 sufficient sampling depth and are similar in species richness. However, the 1035 significant difference in the Simpson reciprocal indicates the presence of 1036 dominant taxa in samples sequenced with the Illumina sequencing method that 1037 reduce the evenness. A possible reason for this is that the samples sequenced 1038 using the Illumina sequencing method are dominated by chloroplast reads in 1039 comparison to the samples obtained using the Fadrosh sequencing protocol, 1040 probably reflecting the lower stringency of the Illumina primers. 1041 In order to see the apparent differences of the bacterial communities 1042 between each sample and the apparent differences found between the two 1043 sequencing protocols, we used weighted and unweighted Unifrac distance 1044 metrics visualized by principal coordinate analysis (PCoA) plots (Lozupone et al, 1045 2007) (Figure 2.8). Unweighted Unifrac analysis is a qualitative measure. In 1046 contrast, weighted Unifrac analysis reveals quantitative community differences 1047 due to relative taxon abundance, or in this case apparent differences in relative 1048 40 taxon abundance. The weighted PCoA plots show a clear separation of samples 1049 based on Illumina and Fadrosh sequencing methods. The sample set sequenced 1050 with the Illumina sequencing protocol form two clusters with 95 % confidence, 1051 with sample 01 being an outlier. In contrast, the sample set sequenced with 1052 Fadrosh sequencing method is more scattered but forms three clusters with 95 % 1053 confidence, with samples 01 and 05 being outliers. Samples 06 and 10 1054 sequenced by each sequencing method vary only along PC2 (y-axis) in the 1055 weighted PCoA plot, indicating that these show the least change between the 1056 two sequencing protocols. 1057 41 1058 Figure 2.8: Comparison of apparent community beta diversity by unweighted and weighted UniFrac measures. Principal coordinate analysis of sequences produced from Fadrosh and Illumina sequencing methods using a) unweighted and b) weighted Unifrac distance metrics (Lozupone et al, 2007) to measure differences between the sample sets. Red and blue circles represent each menhaden stomach sample sequenced with Fadrosh and Illumina sequencing methods, respectively. The unweighted PCoA plots also show a clear separation of samples based 1059 on sequencing protocol. Again, the Illumina sequencing protocol shows two 1060 distinct clusters, with sample 01 again being an outlier. Samples group in the 1061 same cluster for both weighted and unweighted analyses, except for sample 08. 1062 Overall, it is clear that the sequencing methods do affect overall community 1063 42 results. However, based on the scale of the axes however, the differences are 1064 probably too small to impact our estimate of overall community composition. 1065 Overall the microbial community, using either set of sequencing primers, shows 1066 the same microbial taxa, although the relative abundances of those taxa differ 1067 depending on which sequencing primers are used. 1068 2.4.4 Variation in taxonomic composition in each sample at the class level 1069 using the two different primers/sequencing protocols 1070 The most dominant taxa at the class level consistently across all samples 1071 regardless of sequencing primers/method used was from chloroplast 16S rRNA 1072 sequences and thus are not representative of prokaryotes (Figure 2.9). These 1073 sequences accounted for 25.1 % and 37.0 % of the 16S rRNA gene sequences 1074 using Fadrosh and Illumina sequencing protocols respectively. Following 1075 chloroplast, the next dominant groups were Gammaproteobacteria and 1076 Anaerolinae with relative abundance averaging approximately 10.0 % to 18.0 % 1077 of the bacteria community respectively. Though these were on average the next 1078 dominant groups, there was variability between samples. For example, in sample 1079 05 only, Synechococcophyideae were more dominant than Anaerolinae (Figure 1080 2.9). Using MetaStats to test for significant differences between the sequencing 1081 methods, there were significant differences in taxa from only six classes; 1082 Acidimicrobia,Planctomycetia, Sva0725, Betaproteobacteria, Acidobacter-6 and 1083 OS-K (Table 2.5). Using MetaStats to test for significant differences between the 1084 sequencing methods, there were significant differences in taxa from only six 1085 classes; Acidimicrobia, Planctomycetia, Sva0725, Betaproteobacteria, 1086 Acidobacter-6 and OS-K (Table 2.5). There are decreased numbers of reads of 1087 Acidimicrobia, Sva0725, Betaproteobacteria, Acidobacteria-6 and OS-K using the 1088 Illumina sequencing method compared with the Fadrosh sequencing method. 1089 Only Planctomycetia showed more reads using the Illumina sequencing protocol. 1090 1091 43 1092 Figure 2.9: Comparison of the relative abundance of OTUs at the class level derived using Fadrosh and Illumina protocols for 16S rRNA gene sequencing. The relative abundance of OTUs at the class level from individual fish, caught at different St. Vincent Sound sites, when sequenced using primers for 16S rRNA gene sequences along with the Fadrosh and Illumina sequencing method. 1093 44 1094 Table 2.5: Significant differences in mean 16S rRNA reads at the class level using Fadrosh versus Illumina sequencing protocols. Differences in 16S rRNA reads at the class level using Fadrosh versus Illumina sequencing protocols, along with p-values and q-values for each and showing whether using the Fadrosh protocol gives and increase or decrease in the number of reads belonging to an OTU. 2.4.4 Community assessment of stomach content using universal eukaryotic 1095 primers 1096 Given the success of assessing the prokaryotic microbiota using universal 1097 prokaryotic primers, we applied this method using universal eukaryotic primers to 1098 gain an understanding of prey items. We used metabarcoding targeting a portion 1099 of the V4-V5 hypervariable regions of 18S rRNA gene sequences as developed 1100 by Hugerth et al. (Hugerth et al, 2014). 1101 The total 1102 number of raw 1103 reads using 18S 1104 rRNA gene 1105 primers was 1106 350,223, with 1107 349,697 reads 1108 Table 2.6: Number of 18S rRNA reads and OTUs per sample. a. generated post 1109 18S OTUs per sample:?Number of 18S OTUs per sample using 97 % similarity of the SILVA 111 eukaryote only sequence database. OTUs processing 1110 matching menhaden were removed before analysis. b. Comparison of (Table 2.6). 1111 18S rRNA reads generated by each sample: Number of reads per sample before and after processing pipeline using QIIME. Looking at the 1112 18S rRNA gene 1113 45 sequencing results, we see that the majority of the reads that were assigned to 1114 chloroplasts by the 16S rRNA gene sequencing analysis, belong to the groups 1115 Alveolata and Stramenopiles. These are the dominant protists, consisting mainly 1116 of ciliates, dinoflagellates and diatoms (Figure 2.10). Overall, 18S rRNA gene 1117 sequences are dominated by metazoan sequences. In each sample, between 57 1118 % to 97 % of the 18S rRNA gene sequences are assigned to Metazoa (Figure 1119 2.10). The four most dominant metazoan phyla represented are Arthropoda, 1120 Craniata, Mollusca, and Annelida respectively. The reads from the Arthropoda 1121 consisted almost entirely of the copepod, Acartia tonsa. The relative abundance 1122 of Acartia ranged from 12.7 to 86.9 % of metazoan species found. The remainder 1123 Figure 2.10: Comparison of the relative abundance of OTUs at the phylum level determined using primers for 18S rRNA gene sequences: The relative abundance of OTUs at the class level from individual fish, caught at different St. Vincent Sound sites, determined after sequencing using primers for 18S rRNA gene sequences. 46 of the 18S rRNA reads from Crustacea are from several genera of barnacles, the 1124 nauplius and cyprid larvae of which should be retained by juvenile menhaden gill 1125 rakers. The reads assigned to the phylum Mollusca are dominated by two 1126 organisms. One is a gastropod of the genus Deroceras, an air breathing land 1127 slug from permanently wet habitats. D. laeve is widespread in the Gulf of Mexico 1128 coastal region. 1129 Both eggs and adults of D. laeve can survive when submerged; the adult is 1130 25-35 mm; eggs 1.7-2 mm. The other mollusk is a bivalve of the rock boring 1131 genus Leiosolenus which is also found in the Gulf of Mexico with L. aristata, 1132 dispersed through ballast water. The most unexpected metazoan representatives 1133 were from the Craniata. This group included several organisms commonly found 1134 in the freshwater aquarium pet trade, such as zebrafish and spotted gar. Given 1135 that the adults from these unexpected taxa are inconsistent with the size of 1136 menhaden gill rakers, we conclude that their presence may represent captured 1137 earlier life stages, degraded cellular material, or eDNA. 1138 2.4.5 Validation of the presence of taxa found by high throughput sequencing 1139 with end-point PCR using group specific primers 1140 In order to validate the method, we targeted several taxa by amplification 1141 using their published taxon specific primers for 16S or 18S rRNA genes 1142 47 (Ashelford et al, 2002; M?hling et al, 2008; Bradley et al, 2016) (Table 2.7) 1143 Table 2.7: Taxon-specific primers for validation of sequence assignments. Published rRNA gene primers for amplifying the 16S and 18S rRNA genes of different taxa were taken from Ashelford et al (Ashelford et al, 2002) for Betaproteobacteria; M?hling et al (M?hling et al, 2008), for Gamma proteobacteria and Cyanobacteria/chloroplasts; and Bradley et al (Bradley et al, 2016) for microalgae/phytoplankton. 1144 For bacteria, primers designed to amplify 16S rRNA gene sequences from 1145 Gammaproteobacteria were chosen due to the relatively high abundance of 1146 Gammaproteobacteria amongst the samples. Primers to amplify 16S rRNA gene 1147 sequences of Betaproteobacteria was chosen to represent a taxon that was not 1148 as widely abundant. For eukaryotic taxa, primers that are designed to amplify 1149 18S rRNA gene sequences of phytoplankton were chosen, which also includes a 1150 wide variety of protists. The presence of the above taxa was considered to be 1151 confirmed if the relevant primers produced an amplicon of expected product 1152 length by end-point PCR. Amplicons corresponding to the expected sizes were 1153 seen in all cases (Figure 2.11). 1154 48 Figure 2.11: Validation of taxa found by endpoint PCR. Validation of taxa found was by endpoint PCR using 16S rRNA gene primers for a) Gammaproteobacteria (M?hling et al, 2008); b) Betaproteobacteria (Ashelford et al, 2002); c) Cyanobacteria/chloroplasts (M?hling et al, 2008); or 18S rRNA gene primers for d) phytoplankton/protists (Bradley et al, 2016). These primers should give amplicons of a) 476 bp; b) 323 bp; c) 424 bp; and d) 88 bp, respectively. 1155 1156 2.5 Discussion 1157 2.5.1 Strengths of DNA metabarcoding 1158 DNA metabarcoding, whether for the analysis of microbiota or diet items 1159 allows high taxonomic resolution and enables the simultaneous analysis of many 1160 samples. The general workflow is well established; extraction of total DNA from 1161 the dietary sample, PCR amplification of DNA barcode markers from taxa of 1162 interest and then DNA sequencing for taxonomic classification of the recovered 1163 sequences. A clearer picture of bacterial diversity will be seen if chloroplast 1164 sequences are removed prior to analysis. 1165 Applying DNA techniques to diet identification has recently increased 1166 identification resolution, particularly in marine systems (Blankenship & Yayanos, 1167 2005; Riemann et al, 2010; Cleary et al, 2012; Jakubavi?i?t? et al, 2017; 1168 Waraniak et al, 2019). DNA metabarcoding enables the identification of most 1169 prey items, even when diets are broad and diverse, as well as the simultaneous 1170 analysis of many samples. This work has identified trophic linkages within food 1171 webs, as well as predator diet breadth and preference. One of the few 1172 investigations examining the potential of using DNA analyses of fish gut contents 1173 in the monitoring of ecosystem function is the study of the stomach contents of 1174 coral reef fish (Leray et al, 2013). 1175 49 Visual identification has conventionally been used for the analysis of gut 1176 contents of fish based on prey morphology (Hyslop, 1980). The method is time 1177 consuming and requires taxonomic expertise. However, most studies based on 1178 visual inspection, particularly in small filter feeding fish have the following 1179 disadvantages: ambiguous prey specimen identification due to extensive 1180 digestion, the presence of unidentified partial tissues, and a lack of expert 1181 knowledge of identification (at least higher than family or order level) (Baker et al, 1182 2014). Stable isotope analysis can allow a sense of trophic level but can only 1183 inform of relative proportion of phytoplankton and zooplankton. Because of this, 1184 PCR methods for gut-contents analysis have developed rapidly and they now 1185 dominate the diagnostic methods used for gut-contents analysis in field-based 1186 research (Pompanon et al, 2012). 1187 2.5.2 Limitations of DNA metabarcoding 1188 Problems with DNA barcoding using universal primers do exist. Because it is 1189 PCR amplification based, problems of contamination can occur. This includes 1190 field-based contamination from nonfood environmental DNA, laboratory 1191 contamination (De Barba et al, 2014). In addition, misassignment of sequence-to- 1192 sample during high throughput sequencing (Schnell et al, 2015). 1193 Analysis by 18S rRNA gene metabarcoding is only semi-quantitative 1194 producing relative read abundancies (Deagle et al, 2019). Although it does not 1195 provide quantitative estimates of prey items, it can provide a guide for future 1196 targeted study design. The choice of primers can make a big difference; even 1197 closely related primer sets can affect apparent representation of some taxa, as 1198 shown in this study, even though in this study no differences overall in what 1199 dominant OTUs was found. Furthermore, there is no perfect set of universal 1200 primers for either 16S or 18S rRNA gene sequences. Even now,18S rRNA gene 1201 sequences are not well represented across the eukaryotic tree of life, particularly 1202 for aquatic species (Weigand et al, 2019). 1203 50 2.5.3 Other considerations 1204 DNA purification methods will continue to improve to allow for amplification of 1205 difficult samples. In addition, remedies against PCR inhibitors have been 1206 developed. For example, Quantabio (Beverly, MA, US) has developed an 1207 engineered Taq DNA polymerase that is combined with high avidity monoclonal 1208 antibodies. These antibodies bind the polymerase and keep it inactive prior to the 1209 initial PCR denaturation step preventing binding to potential PCR inhibitors. This 1210 polymerase, in Quantabio?s ToughMix, comes with additives that neutralize PCR 1211 inhibitors like polysaccharides, humic acid, and polyphenols to ensure reliable 1212 assay performance with a spectrum of starting materials including environmental 1213 samples. 1214 Host or predator DNA contamination is a common problem in DNA barcoding 1215 research. Because prey samples are collected from the predator?s gut, there is a 1216 very high probability that predator DNA is included with the prey samples. In this 1217 study, we analyzed this directly by assessing what percentage of reads 1218 represented menhaden rRNA gene sequences and were able to eliminate 1219 samples in which host DNA contamination was too high. This required 1220 knowledge of the menhaden 18S rRNA gene sequence which we were able to 1221 determine. However, with other tissues such as gill, which is also used for diet 1222 studies, the problem of host DNA contamination can be problematic. This can be 1223 resolved by stratagems such as blocking the detection of predator DNA with 1224 ligase or using a blocking oligonucleotide (Cleary et al, 2012; Craig et al, 2013), 1225 although such methods may reduce depth of coverage. 1226 High-throughput sequencing (HTS) technologies are now increasingly used 1227 in fisheries research and are producing ever-increasing quantities of data. While 1228 many laboratories and even undergraduate students generate high throughput 1229 sequencing data, analyzing these results requires a skill set that is traditionally 1230 reserved for bioinformaticians. Learning to program, using languages such as R 1231 and Python, and making sense of the vast amounts of available ?omics data have 1232 become easier thanks to a multitude of available resources. This can empower 1233 51 bench researchers to perform more complex computational analyses. Although 1234 not used here, KNIME is an example of an accessible entry point for researchers 1235 daunted by programming. It is a graphical user interface (GUI) analytics 1236 environment that offers a ?point and click? alternative to classical programming 1237 (Berthold et al, 2009; Fillbrunn et al, 2017). These GUI programs are becoming 1238 more popular as applications for metabarcoding expand. Since the beginning of 1239 this work QIIME has been superseded by QIIME2 which includes a semi-GUI 1240 application (Caporaso et al, 2010, Bolyen et al, 2019). MOTHUR is another 1241 example of a semi-GUI HTS analysis platform (Schloss et al, 2009). These tools 1242 such as KNIME, QIIME2 and MOTHER, together with a growing number of 1243 tutorials and courses, have been crucial in providing simple user interfaces to 1244 conduct complex analyses, making big data accessible to biologists (Grabowski 1245 & Rappsilber, 2019). However, these platforms offer less flexibility for pipeline 1246 development compared with programming languages such as R and Python. 1247 Furthermore, they can still require hours of set-up and use. 1248 2.6 Comments and recommendations 1249 Assessment of menhaden stomach contents is technically challenging 1250 because the visible food items are small (5-100 ?m) and menhaden have a 1251 gizzard-like stomach that grinds ingested items to an amorphous paste 1252 (Friedland et al, 1984). The methods reported here shows that DNA 1253 metabarcoding can be applied successfully to the study of the stomach 1254 microbiota and diet items of Brevoortia patronus, the Gulf menhaden. The 1255 method allowed characterization of the stomach prokaryotic community, as well 1256 as eukaryotic diet items. Neither the diet not the stomach microbiota of Gulf 1257 menhaden has been analyzed previously by DNA metabarcoding. The analysis 1258 of 18S rRNA sequences has uncovered a greater taxonomic richness than 1259 previously described with OTUs (species) per sample ranging from 1500 to over 1260 10,000. The methods described are suitable also for the study of Atlantic 1261 menhaden, Brevoortia tyrannus. The method can be applied to studying the 1262 effects of location, salinity, season and developmental stage on the diet and 1263 52 stomach microbiota of menhaden and can be applied to investigating the 1264 microbiota of different regions of the gut. It could also be used to monitor Gulf 1265 menhaden stomach contents over time to tease out effects of climate change. 1266 We also anticipate applying the method to other filter feeding species. Increased 1267 data management resources and the reduction in cost for high throughput 1268 sequencing make DNA metabarcoding an attractive alternative to traditional 1269 methods. If routinely included in the investigation of ecosystem function, DNA 1270 metabarcoding has the potential to complement other approaches and ultimately 1271 enhance ecosystem-based management and biomonitoring (Taberlet et al, 2012; 1272 Evans et al, 2016; Deiner et al, 2017; Bohan et al, 2017). 1273 2.7 Acknowledgments 1274 This work was supported by the National Oceanic and Atmospheric 1275 Administration, Educational Partnership Program (NOAA-EPP) award to the 1276 Living Marine Resources Cooperative Science Center (LMRCSC) 1277 NA11SEC4810002 and NA16SEC4810007. Ammar Hanif was supported in part 1278 by MD SeaGrant Graduate Student Fellowship, NA10OAR4170072 SA7528129- 1279 D and in part as an LMRCSC Graduate Fellow. Thanks to Sabeena Nazar for her 1280 technical expertise in optimizing the high throughput sequencing. Thanks to Dr. 1281 Stacy Smith, Delaware State University for coordinating the collection of 1282 menhaden, to LMRCSC summer undergraduate intern Malisa Smith and to 1283 Centennial High School intern, Eric Sibanda for assistance in determining the 1284 menhaden 18S rRNA gene sequence. 1285 1286 53 Supplementary Materials 1287 Sample Site Salinity Temp pH DO Secchi ppt oC S1 SVS02 37.6 28.4 8.0 5.5 0.6 S2 SVS02 37.6 28.4 8.0 5.5 0.6 S3 SVS02 37.6 28.4 8.0 5.5 0.6 S4 SVS03 36.3 27.8 7.9 5.5 0.6 S5 SVS04 36.4 28.7 8.0 5.8 0.4 S6 SVS04 36.4 28.7 8.0 5.8 0.4 S7 SVS04 36.4 28.7 8.0 5.8 0.4 S8 SVS05 36 29 7.8 5.7 0.4 S9 SVS05 36 29 7.8 5.7 0.4 S10 SVS05 36 29 7.8 5.7 0.4 Supplemental Table 2.1: Characteristics of water at collection sites in Apalachicola Bay Water quality measurements of temperature, salinity, pH, dissolved oxygen, and turbidity Ire recorded at the time and place of sample collection using a YSI 556 multiparameter water quality meter. 1288 Sequence data 1289 Sequence data for this project can be found at 1290 https://github.com/Hanif82/gulfmenhadenotutable.git. 1291 1292 54 Chapter 3: The stomach microbiota of juvenile Gulf 1293 menhaden (Brevoortia patronus) differs with location in 1294 Apalachicola Bay, FL 1295 3.1 Abstract 1296 As part of a study investigating the ecology of juveniles of the Gulf menhaden 1297 (Brevoortia patronus) from Apalachicola Bay, FL, the microbiota of the stomach 1298 was characterized. As juveniles, Gulf menhaden live in tidal creeks, marsh and 1299 open bay areas. In the present study we compared the stomach microbiota of 1300 juvenile Gulf menhaden at two different locations within Apalachicola Bay, FL, 1301 using DNA metabarcoding. Juvenile menhaden from the same year class were 1302 collected in Apalachicola Bay from Two-mile Channel and St. Vincent Sound 1303 representing different salinities. MiSeq Illumina high-throughput sequencing was 1304 used to analyze DNA amplicons from the V3-V4 region of the 16S ribosomal 1305 RNA (rRNA) gene from the stomachs of menhaden at each site. 14,184 unique 1306 operational taxonomic units (OTUs) were identified, 14,145 of which were found 1307 in samples from both locations. The stomach microbiota of samples from both 1308 locations showed a predominance of Proteobacteria, Chloroflexi, Bacteroidetes, 1309 Acidobacteria and Actinobacteria, although significant differences in composition 1310 at the class level were seen. Ninety-six OTUs were present in all samples from 1311 both locations, representing 32.45 % of total reads. Alpha diversity metrics 1312 showed that samples from Two-mile Channel showed a higher level of taxonomic 1313 richness with thirty-seven OTUs unique to these samples. Beta diversity analysis 1314 showed a strong association between the microbiota in menhaden stomachs and 1315 sampling location, possibly reflecting differences in water quality. Proteobacteria 1316 represent 51 % and 49 % of the total reads in samples from Two-mile Channel 1317 and Vincent Sound, respectively. Within the most abundant of these were 1318 families with the potential to contribute to menhaden nutrition by the provision of 1319 B vitamins or enzymes that could aid in digestion of cellulose and chitin and 1320 degrade detrital sulfated polysaccharides. Within the Proteobacteria, only five 1321 55 OTUs could be assigned at the genus level; Anaerospora, Desulfosarcina, 1322 Desulfobacca, Shewanella and Halomonas. Such a consortium has been linked 1323 with the processes of anaerobic hydrocarbon degradation, sulfate reduction, 1324 denitrification and/or methanogenesis associated with petroleum biodegradation. 1325 3.2 Introduction 1326 The microbiota of fish is among the better characterized among the non- 1327 mammalian vertebrates (Colston & Jackson, 2016). Earlier work on the microbial 1328 communities of fish has been largely culture based, although, as with studies of 1329 the microbiota of other host taxa, culture-independent studies have increased 1330 more recently and the patterns of fish microbiota has been reviewed (Clements 1331 et al, 2014). The stomach is not often included for gut microbial composition 1332 analyses in fish. However, fish have a unique and close interaction with their 1333 surrounding environment as well as with the microorganisms that co-exist there 1334 compared to tetrapods. This is particularly true for filter-feeders that travel with 1335 open mouths taking in material in the same proportion as they occur in the 1336 surrounding water although the size of the gill rakers can provide some 1337 selectivity. Effectively, menhaden can concentrate organisms from the water in 1338 which they travel. In the present study we compared the stomach microbiota of 1339 juvenile Gulf menhaden (Brevoortia patronus) using DNA metabarcoding with 1340 primers for the DNA encoding the 16S rRNA gene sequence. DNA 1341 metabarcoding has rapidly become the method of choice in characterizing living 1342 communities in any environment. This approach provides a comprehensive 1343 culture-independent approach that has been used increasingly to obtain a full 1344 inventory of the gut microbiota from multiple fish species (Jami et al, 2015; 1345 Tarnecki et al, 2017; Egerton et al, 2018). Because of the intimate association of 1346 fish with their surrounding environment, we compared the stomach microbiota of 1347 Gulf menhaden at two different locations within Apalachicola Bay, Florida, Two 1348 Mile Channel (TMC) and St. Vincent?s Sound (SVS), representing very different 1349 salinities. 1350 B. patronus is the main menhaden species in the Gulf of Mexico, ranging 1351 from the northern Gulf from Brazos Santiago, Texas, to Tampa Bay, Florida 1352 56 (Christmas & Gunter, 1960). Young-of-the-year Gulf menhaden are ubiquitous 1353 members of the northern Gulf of Mexico estuarine nekton communities and 1354 occupy fresh to brackish waters using both marsh and open bay habitats 1355 (Deegan et al , 1990). After transformation, juveniles remain in low salinity, near- 1356 shore areas where they travel about in dense schools, often near the surface 1357 (Lassuy, 1983). As juveniles, menhaden live in tidal creeks, marsh and open bay 1358 areas where they filter the water column via their gill rakers. The migration 1359 pattern of juvenile Gulf menhaden involves the sequential use of marsh creek 1360 and open bay areas where they remain until late summer or fall (Deegan et al, 1361 1990). This migration pattern coincides with the productivity peaks in tidal creek 1362 and open bay areas. Menhaden are omnivorous filter feeders, feeding by 1363 straining phytoplankton and zooplankton from water. Their movement as 1364 juveniles is related to the availability of food in the water. 1365 The Apalachicola Bay estuary is a highly productive lagoon and barrier island 1366 complex on the upper Gulf coast of Florida. It covers about 212 square miles and 1367 serves as the interface between the river system and the Gulf of Mexico. Four 1368 barrier islands bound the bay: St. Vincent Island, St. George Island, Little St. 1369 George Island, and Dog Island. The bay area, including Apalachicola Bay, East 1370 Bay, St. George Sound, St. Vincent Sound, Indian Lagoon, and Alligator Harbor, 1371 is about 65 km long and 5 to 10 km wide. Apalachicola Bay is a river-dominated 1372 system with the major source of freshwater input coming from the Apalachicola 1373 River. The alluvial Apalachicola River is the largest Florida river in terms of flow 1374 (NWFWM report 2017). Maximum river flows occur during late winter to early 1375 spring months and are highly correlated with Georgia rainfalls (Meeter et al, 1376 1979). The bulk of seawater flow is from the east entering St. George Sound. 1377 The western end of Apalachicola Bay is linked to the Gulf of Mexico by Indian 1378 Pass, the narrow channel between St. Vincent Island and the mainland, with a 1379 maximum water depth of about 4 m. St. Vincent Sound itself is shallow, with an 1380 average depth of little more than 1 m, containing numerous oyster bars. Within 1381 the bay's shallow waters, with an average depth of 3 m, are numerous oyster 1382 reefs and sandy shoals. The surrounding wooded lowlands consist of saltwater 1383 57 and freshwater marshes, and freshwater swamps. Two-Mile Channel follows the 1384 coastline from the west side of the Apalachicola River estuary for approximately 1385 2 miles. 1386 Given that menhaden sample the environment through filtering, it seemed 1387 likely that the stomach microbiota would reflect the microbial communities in the 1388 water of the two locations although their contributions to menhaden health or as 1389 indicators of environmental quality was unknown. The present study looks at the 1390 diversity of prokaryotic microorganisms within menhaden juvenile stomachs with 1391 regard to location and water quality, their capability to provide nutritional benefits, 1392 as well as the potential of menhaden to function as a biomonitors for coastal and 1393 near-shore bays. 1394 3.3 Methods 1395 Detailed methodology for DNA metabarcoding using universal primers for 1396 16S rRNA genes has been provided in Chapter 2. 1397 3.3.1 Sample collection 1398 Colleagues from Delaware State University (DSU) collected samples in May 1399 2013 at Two-Mile Channel (TMC) in Apalachicola Bay, Florida, (TMC01, 1400 Lat:29.712467, Long: -85.01525) (Figure 3.1). Gulf menhaden samples were 1401 collected from this location using a 10-foot cast net. At this time of year and 1402 location, the menhaden were concentrated in large, slow-moving, and tightly 1403 packed schools in lower salinity water, allowing all fish to be collected at the 1404 same site within this location. A second collection was made by Florida Fish and 1405 Wildlife in July 2013 from the St Vincent Sound location (SVS) in Apalachicola 1406 Bay, Florida. The choice of location was determined by where the menhaden 1407 were found. At this time, 07/02/2013, the menhaden were not found in tightly 1408 packed schools or in the lower salinity waters of TMC. Because the menhaden 1409 were not so tightly grouped, sampling took place at four sites within SVS (SVS02, 1410 SVS03, SVS04, SVS05), with latitude and longitude coordinates corresponding 1411 to 29.6813, -85.206167 (SVS02); 29.700267, -85.17595 (SVS03); 29.686033, - 1412 85.1372 (SVS04) and 29.688233, -85.124167 (SVS05), respectively (Figure 1413 58 3.1). Collections at SVS sites were done using a 183 m seine net at 1 m depth. 1414 Water quality measurements of temperature, salinity, pH, dissolved oxygen, and 1415 turbidity were also recorded at all sites (Table 3.1). 1416 Figure 3.1. Collection sites in Apalachicola Bay, Florida The blue circle represents Two-Mile Channel site where Gulf menhaden (Brevoortia patronus) were collected in May 2013. The red circles represent the sites at St. Vincent Sound where menhaden were collected in July 2013. 1417 59 Location Sample Site Salinity Temp pH DO Secchi ppt oC T1 TMC01 1.8 27.1 7.5 5.7 0.7 T2 TMC01 1.8 27.1 7.5 5.7 0.7 T3 TMC01 1.8 27.1 7.5 5.7 0.7 T4 TMC01 1.8 27.1 7.5 5.7 0.7 Two Mile T5 TMC01 1.8 27.1 7.5 5.7 0.7 Channel T6 TMC01 1.8 27.1 7.5 5.7 0.7 T7 TMC01 1.8 27.1 7.5 5.7 0.7 T8 TMC01 1.8 27.1 7.5 5.7 0.7 T9 TMC01 1.8 27.1 7.5 5.7 0.7 T10 TMC01 1.8 27.1 7.5 5.7 0.7 S1 SVS02 37.6 28.4 8.0 5.5 0.6 S2 SVS02 37.6 28.4 8.0 5.5 0.6 S3 SVS02 37.6 28.4 8.0 5.5 0.6 S4 SVS03 36.3 27.8 7.9 5.5 0.6 St. S5 SVS04 36.4 28.7 8.0 5.8 0.4 Vincent S6 SVS04 36.4 28.7 8.0 5.8 0.4 Sound S7 SVS04 36.4 28.7 8.0 5.8 0.4 S8 SVS05 36 29 7.8 5.7 0.4 S9 SVS05 36 29 7.8 5.7 0.4 S10 SVS05 36 29 7.8 5.7 0.4 Table 3.1: Water quality measurements from each site Water quality measurements of temperature, salinity, pH, dissolved oxygen, and turbidity were recorded at the time and place of sample collection using a YSI 556 multiparameter water quality meter. The sites and samples are color coded to be consistent with the colors used in Figures 3.6 and 3.7. 1418 3.3.2 Sample preparation, DNA extraction, and estimation of DNA quality 1419 Sample preparation, DNA extraction and estimation of DNA quality was done 1420 as described in Chapter 2. 1421 3.3.3 MiSeq library preparation and high throughput sequencing 1422 Sequencing of DNA from all twenty stomach samples was performed on the 1423 Illumina MiSeq platform located in the BioAnalytical Services Laboratory (BAS 1424 Lab) at the University of Maryland Center for Environmental Science-Institute of 1425 Marine and Environmental Technology. The library was prepared by Illumina?s 1426 standard library construction protocol as detailed in the 16S metagenomic 1427 sequencing library preparation procedures available at: 1428 https://support.illumina.com/downloads/16s_metagenomic_sequencing_library_p 1429 60 reparation.html, with the exception that a dual indexing approach was used, as 1430 described in Fadrosh et al (Fadrosh et al, 2014; Hanif et al, 2020). PCR was 1431 performed using denaturation at 95? ?C for 3?min, followed by 8?cycles at 95? ?C 1432 for 30?s, 55? ?C for 30?s, 72??C for 30?s, and a final extension step at 72??C for 1433 5?min. PCR clean-up was done using AMPure XP beads to purify the 16S rRNA 1434 gene V3 amplicon away from free primers and primer-dimer species. After PCR 1435 cleanup, libraries were quantified, normalized and pooled. A 2 x 300 cycle run 1436 was performed in the MiSeq, providing high?quality, paired reads of the V3-V4 1437 16S rRNA gene. 1438 3.3.4 Post-sequencing pipeline 1439 Removal of index sequences (called de-multiplexing), base calling and data 1440 quality assessment were performed on the MiSeq instrument. MacQIIME 1441 (Caporaso et al, 2010) was used to process and assess quality of output reads 1442 from the sequencing primers. Post sequence processing was done using the 1443 recommended QIIME pipeline for Illumina reads (Caporaso et al, 2010). The 1444 MacQIIME script, join_paired_ends.py, was used to join forward and reverse 1445 reads. Paired reads were filtered for low quality reads (quality score of <25) and 1446 short read length (<200 bp) which were removed from the library using the 1447 split_libraries.py command. Chimeric sequences were identified de novo using 1448 the USEARCH61 (v6.1.544) algorithm with the script identify_chimeric_seqs.py 1449 (Edgar, 2010). This was followed by the removal of PhiX sequences by a BLAST 1450 analysis with the scripts parallel_blast.py and filter_fasta.py. The remaining reads 1451 were used for operational taxonomic unit (OTU) picking using the script 1452 pick_otus.py. A UCLUST de novo clustering method within the USEARCH61 1453 (v6.1.544) algorithm was used to pick OTUs for both sequencing methods 1454 (Edgar, 2010). Taxonomic assignment was done using the script 1455 assign_taxonomy.py with the Greengenes gg_13_8 as a reference database at 1456 90 % similarity. These data were used to construct an OTU table for subsequent 1457 relative sequence abundance analysis and diversity analysis. 1458 61 3.3.5 Microbiota data analysis 1459 For this study, 16S rRNA sequences assigned as chloroplasts were 1460 considered protist prey and removed from this analysis. The package Phyloseq 1461 within the statistical software R and QIIME was used to assess OTU richness 1462 and evenness using the estimates of alpha-diversity metrics, Observed OTUs 1463 (species), Chao 1, Shannon and Simpson index. To analyze how closely the 1464 samples were related to each other and to compare observed differences in the 1465 microbial community from the two Apalachicola Bay sites, beta diversity analyses 1466 were determined based on Bray-Curtis distance metric and visualized using the 1467 ordination method principal coordinate analysis (PCoA) into two-dimensional 1468 plots. 1469 3.4 Results 1470 3.4.1 Characterization of stomach bacteria communities from menhaden 1471 caught at Two-Mile Channel and St. Vincent Sound 1472 DNA extracted from twenty menhaden stomachs, ten from each sampling 1473 location, were sequenced were sequenced using the Illumina MiSeq high- 1474 throughput sequencing platform. A total number of 1,420,063 and 906,483 raw 1475 reads were retrieved from TMC and SVS samples respectively. After post 1476 sequencing processing these numbers were reduced to 1,411,708 and 898,662 1477 reads from TMC and SVS samples respectively (Table 3.2). Binning at 0.03 % 1478 divergence resulted in 14,184 total OTU assignments from sixty-five phyla. There 1479 were ten phyla that represented greater than 1 % of the total relative read 1480 abundance are shown in Figure 3.2. All were represented in stomachs from fish 1481 caught at both locations although in very different relative abundances (percent 1482 reads). The stomach microbial communities from menhaden caught at TMC were 1483 dominated (representation greater than 5 %), in descending order, by 1484 Proteobacteria, Chloroflexi, Acidobacteria, Actinobacteria and Bacteroidetes, 1485 (Figure 3.2A). The stomach microbial communities from menhaden caught at 1486 SVS were dominated (representation greater than 5 %), in descending order, by 1487 Proteobacteria, Chloroflexi, Acidobacteria, Actinobacteria, and Bacteroidetes 1488 62 (Figure 3.2B). Proteobacteria was the most dominant taxon representing 49.8 % 1489 and 44.1 % of the relative abundance for menhaden stomach DNA from TMC 1490 and SVS samples respectively. This was followed by Chloroflexi, representing a 1491 much lower fraction at 11.5 % and 16.5 % for menhaden from TMC and SVS 1492 samples respectively. Due to their abundance I looked more deeply into these 1493 two phyla. 1494 Figure 3.2: Taxonomic composition at the phylum level of menhaden stomach microbiota from Two Mile Channel and St Vincent Sound samples Proportional representation of all phyla with more than 1 % abundance (percent reads) are shown as pie charts: (A) TMC and (B) SVS. 1495 The classes represented in Proteobacteria and Chloroflexi in stomach 1496 samples from TMC and SVS can be seen in Figure 3.3. In samples from both 1497 locations, Proteobacteria were represented by the classes 1498 Gammaproteobacteria, Deltaproteobacteria, Alphaproteobacteria, and 1499 Betaproteobacteria with Gammaproteobacteria showing the highest 1500 representation. The next most represented classes were Deltaproteobacteria in 1501 TMC samples and Alphaproteobacteria in SVS samples. Of the four mentioned 1502 classes, Betaproteobacteria was the least represented. The four most abundant 1503 classes from the phylum Chloroflexi were Anaerolineae, Ellin6529, SL56, and 1504 Dehalcoccoidetes, with Anaerolineae having the highest representation in both 1505 TMC and SVS samples. Anaerolineae representation was greater in samples 1506 from SVS than from TMC samples. The next most represented class was 1507 63 Ellin6529 in both TMC and SVS samples with a greater representation in TMC 1508 samples. 1509 Figure 3.3: Taxonomic composition at the class level of Proteobacteria and Chloroflexi from Two Mile Channel and St Vincent Sound samples Relative abundance of the dominant classes from the two most abundant phyla: A) 1510 Proteobacteria and B) Chloroflexi 64 There is much variation between individual samples, even from the same 1511 location. Figure 3.4 shows the variation in the relative abundance of classes 1512 between each sample (Figure 3.4). These represent classes that have a relative 1513 abundance of greater than 1 %. Classes that have a relative abundance of less 1514 than 1 % abundance were grouped in Other Taxa. To facilitate comparison 1515 between samples, the number of OTUs was normalized to 34,000 per sample. It 1516 should be noted that there are five taxa that were less than 1 % of the relative 1517 abundance in SVS samples but greater than 1 % of the relative abundance in 1518 TMC samples. Two of these were from the phylum Bacteroidetes assigned as 1519 Saprospirae and Bacteroidia. The other three were assigned as BPC102, PRR- 1520 12, and OM190 from the phyla Acidobacteria, WS3, and Planctomycetes 1521 respectively. Nine phyla were represented by eighteen classes. Proteobacteria, 1522 Acidobacteria, and Bacteroidetes were represented by four classes each. The 1523 remaining phyla were represented by a single class. Chloroflexi was the second 1524 most abundant phylum, but it was represented solely by Anaerolineae. The 1525 relative abundance other classes of Chloroflexi that are shown in Figure 3.4 1526 accounted for less than 1 % representation. 1527 Figure 3.4: Variation in taxonomic composition at the class level of stomach microbiota in each sample from Two Mile Channel and St Vincent Sound The relative abundance of OTUs at the class rank from each sample from TMC and SVS sites, as measured by percent reads. The number of OTUs per sample was normalized to 34,000. Classes are grouped by phyla. 1528 65 Given the high relative abundance of Proteobacteria, I looked further into this 1529 phylum. Table 3.3 shows the most abundant proteobacterial OTUs expressed as 1530 the mean number of OTUs and mean number of reads per sample. The lowest 1531 classification given to the OTUs is shown here. Not all OTUs were represented in 1532 each sample but are reflected in the mean number of OTUs. Overall, the highest 1533 representation is of the family Piscirickettsiaceae (class Gammaproteobacteria). 1534 Samples from both sampling locations have roughly the same number of OTUs 1535 representing this family although the mean number of reads were higher in SVS 1536 samples. This is followed by percent representation of the OTUs from the family 1537 Marinicellaceae (class Gammaproteobacteria), again with approximately the 1538 same number of OTUs from each location and again with higher percent 1539 representation in samples from SVS compared to those from TMC. The number 1540 of OTUs from the order Chromatiales, another gammaproteobacterial class is 1541 similar in samples from both locations although the percent representation in 1542 samples from SVS is more than two-fold higher than in samples from TMC. The 1543 number of OTUs from the family Rhodobacteracea (class Alphaproteobacteria) is 1544 comparable in samples from both locations, however, representation in SVS 1545 samples is more than two-fold higher in comparison to samples from TMC. In 1546 TMC samples, although OTUs from the family Piscirickettsiaceae had the highest 1547 percent representation, the highest diversity of OTUs was from the order 1548 Chromatiales. For all taxa, except OTUs from the Alphaproteobacteria genus, 1549 Anaerospora, the highest percent representation was seen in samples from SVS. 1550 Halomonas spp showed the greatest difference between TMC and SVS samples 1551 with six-fold higher representation at St. Vincent Sound from a small number of 1552 OTUs, probably reflecting the differences in salinity. Though sequences from 1553 both sampling locations were dominated by Proteobacteria, the most abundant 1554 OTUs in all samples was from Synechococcus spp. 1555 66 Two Mile Channel St. Vincent Sound Lowest Class Mean # Mean # Mean # Mean # classification reads OTUs reads OTUs f:Rhodobacteracea 518 62.7 1282.7 71.8 Alphaproteobacteria g:Anaerospora 65 2.6 40.1 71.8 Betaproteobacteria f:Oxalobacteraceae 104.2 4.6 576.7 3.1 g:Desulfococcus 376.1 44 700.9 40.5 Deltaproteobacteria g:Desulfosarcina 29.4 3.1 60 5.1 f:Desulfobulbaceae 377.4 47.8 700.9 31.9 g:Shewanella 31.9 2.7 98 2.4 o:Chromatiales 534.7 88.1 1155.7 72.3 f:HTCC2089 67.4 11.6 149.3 9.7 Gammaproteobacteria g:Halomonas 61.7 2.2 429 3.3 f:Thiohalorhabdales 29.9 2.9 299.4 4.7 f:Piscirickettsiaceae 1202 81.1 1831.4 79.9 f:Marinicellaceae 626.2 43.4 914.4 46 Table 3.3: Representation of the most abundant proteobacteria from Two Mile Channel and St Vincent Sound samples Representation of the most abundant proteobacteria by the mean number of reads and mean number of OTUs from all samples from TMC and SVS. The data from each site was normalized to 34,000 reads/sample. The lowest assigned taxonomic levels are shown. 1556 3.4.2 Shared and unique OTUs 1557 There are sixty classes of bacteria not represented in SVS samples, however 1558 only 44 of these could be attributed to a known taxonomic assignment. Most of 1559 the taxonomic assignments were given a generic identifier, for example class B5- 1560 096 of the phylum Fibrobacteres. The unidentified taxa were of low abundance 1561 (less than 1 % relative abundance) and most belonged to the phylum 1562 Planctomycetes. In contrast there were only seven classes that were not 1563 represented in TMC samples five of which could not be given a named 1564 taxonomic assignment. These were also of low abundance. 1565 Shared OTUs: In order to get an idea of which bacteria could be considered 1566 candidates of a resident core microbiota in menhaden stomachs, we looked at 1567 OTUs that were present in all samples regardless of sampling site or location 1568 (Figure 3.5 and Table 3.4). These shared OTUs, could indicate a common 1569 menhaden stomach microbiome for further study, or could just reflect the taxa 1570 present in the water from all sites. Ninety-six OTUs from six phyla were observed 1571 in all samples, representing 32.45 % of the total reads. The number of OTUs and 1572 their lowest classification can be seen in Table 3.2. The highest number of OTUs 1573 67 were assigned to the phylum Proteobacteria represented by forty-nine OTUs 1574 (Figure 3.5A and Table 3.4). This was followed by the phylum Actinobacteria, 1575 Acidobacteria, Chloroflexi, Bacteroidetes and Cyanobacteria which had 1576 seventeen, thirteen, twelve, three, and two OTUs, respectively. Figure 3.5, 1577 panels B-G shows that the number of reads per OTU varied greatly between 1578 each OTU as well as between samples from the TMC and SVS locations. The 1579 number of reads from OTUs having the same lowest classification were summed. 1580 For example, Cyanobacteria had only two OTUs both of which had the lowest 1581 classification of g:Synechococcus. The number of reads for each OTU were 1582 summed to represent the number of OTUs for Synechococcus. Samples from 1583 SVS had more reads for all OTUs, samples except for those assigned to the 1584 genera Anaerospora and Synechoccocus. 1585 68 Figure 3.5: Taxonomic composition of the shared OTUs from all samples Shared OTUs, defined as OTUs found in every sample at both sampling locations. A) Pie chart depicting the number of shared OTUs per phylum. B-G) Bar charts depicting read number (y-axis) of each shared OTU classified to its lowest taxonomic level from each phylum. B) Proteobacteria; C) Acidobacteria; D) Actinobacteria; E) Bacteroidetes; F) Chloroflexi; G) Cyanobacteria. Blue and red bars represent TMC and SVS samples respectively. 1586 69 Table 3.4: Taxonomic composition of the shared OTUs from all samples The taxonomic composition of the shared OTUS are shown from the phylum level to the lowest assigned taxonomic levels. The number of OTUs are given for each of the lowest assigned taxonomic levels. 1587 70 The greatest taxonomic resolution of shared taxa was seen at the order level 1588 with twenty-four orders in total being represented. The highest diversity belonged 1589 to Gammaproteobacteria with nine known orders represented. This was followed 1590 by Deltaproteobacteria, Alphaproteobacteria, and Betaproteobacteria with, five, 1591 two and one order(s) being represented, respectively. The order with the highest 1592 representation was Thiotrichales from the Gammaproteobacteria (Table 3.4). At 1593 the family level this order was represented by Piscirickettsiaceae and 1594 Thiotrichaceae, with most reads coming from the former. The order 1595 Desulfobacterales, from the Deltaproteobacteria, contained the second largest 1596 number of reads, represented by the families Desulfobacteraceae and 1597 Desulfobulbaceae (genus Desulfococcus), with the most reads being assigned to 1598 the former of the two. The order with the highest representation in the phylum 1599 Acidobacteria was Sva0725. All OTUs from the Actinobacteria were from order 1600 Acidimicrobiales, with the highest representation from the family koll13. From the 1601 phylum Bacteroidetes, all shared OTUs were from the family Flavobacteriaceae. 1602 The orders GCA004 and SO208 from the phylum Chloroflexi had the highest 1603 number of reads. Despite Proteobacteria having the greatest number of both 1604 OTUs and shared OTUs, the shared OTUs with the greatest number of reads 1605 from Cyanobacteria, the two Synechoccus spp. 1606 Unique OTUs: In terms of unique taxa, there are thirty-six OTUs represented 1607 solely in TMC samples and one OTU solely represented in SVS samples (Table 1608 3.5). Unique taxa are defined here as taxa represented in all samples from one 1609 sampling location with no representation in the other sampling location. Amongst 1610 the OTUs unique to TMC samples, there were eight belonging to the phylum 1611 Cyanobacteria, all of which were assigned at the genus level as either 1612 Synechococcus or Pseudanabaena. There were six OTUs assigned to the 1613 phylum Bacteroidetes and further classified to the families Saprospiraceae, 1614 Flavobacteriaceae, and Cryomorphaceae and one OTU assigned only to the 1615 order Flavobacteriales. The phylum Proteobacteria represented the greatest 1616 number of unique OTUs at thirteen. These were further classified as 1617 g:Rhodobacter; o:Myxococcales; f:Acetobacteraceae; o:Rickettsiales; 1618 71 f:Xanthomonadaceae; and f:Alcaligenaceae, all of which were only represented 1619 by a single OTU, as well as f:Sinobacteraceae and f:Comamonadaceae which 1620 were represented by two and three OTUs respectively. The phylum 1621 Actinobacteria and Planctomycetes were represented by two and three OTUs 1622 respectively. f:C111 and g:Candidatus represented Actinobacteria. 1623 f:Gemmataceae; o:Phycisphaerales; and o:CL500-15 represented 1624 Planctomycetes. The remaining phyla Acidobacteria, Chloroflexi, 1625 Gemmatimonadetes, and WS3 were each represented by a single OTU, 1626 classified as c:AT-s54; f:A4b; c:Gemm-2; and o:Sediment, respectively. The 1627 single OTU unique to SVS samples was from the phylum Cyanobacteria with the 1628 lowest classification as g:Synechococcus. Overall, the most abundant of the 1629 unique OTUs were from the genus Synechococcus. However, unique OTUs 1630 accounted for less than 1 % of the relative representation. 1631 72 Phylum Lowest Classification TMC % SVS % representation representation Cyanobacteria g:Pseudanabaena 0.262 g:Pseudanabaena 0.018 g:Pseudanabaena 0.017 g:Synechococcus 0.011 g:Synechococcus 0.010 g:Synechococcus 0.008 g:Synechococcus 0.007 g:Synechococcus 0.005 g:Synechococcus 0.007 Bacteroidetes f:Saprospiraceae 0.047 f:Saprospiraceae 0.019 f:Flavobacterioceae 0.011 f:Cryomorphaceae 0.011 f:Cryomorphaceae 0.010 o:Flavobacteriales 0.005 Chloroflexi F:A4b 0.047 Proteobacteria o:Myxococcales 0.025 g:Rhodobacter 0.031 f:Acetobacteraceae 0.023 o:Rickettsiales 0.019 f:Comamonadaceae 0.014 f:Xanthomonadaceae 0.014 f:Sinobacteraceae 0.012 f:Alcaligenaceae 0.009 o:PHOS-HD298108 0.009 f:Aeromonadaceae 0.008 f:Comamondaceae 0.008 f:Sinobacteraceae 0.005 f:Comamondaceae 0.004 Planctomycetes o:Phycisophaerales 0.029 f:Gemmataceae 0.008 o:CL500-15 0.005 Gemmatimonadet c:Gemm-2 0.023 es WS3 o-Sediment-1 0.016 Actinobacteria f:C111 0.015 g:Candidatus 0.009 Unassigned - 0.009 Acidobacteria c:AT-s54 0.005 Table 3.5: Unique OTUs from Two Mile Channel and St Vincent Sound samples 1632 The taxonomic composition of OTUs unique to TMC and SVS are shown, as well as their mean percent representation at the lowest ass7ig3n ed taxonomic levels classifications for TMC and SVS. 3.4.3 Alpha and beta diversity analysis 1633 Samples from TMC show higher diversity: Alpha diversity was investigated using 1634 Observed OTUs, Chao 1, Shannon and Simpson indices (Figure 3.6). The 1635 combined samples from TMC on average have more unique OTUs in 1636 comparison to the combined samples from SVS based on both Observed 1637 species and Chao1 estimates, both common measures of species richness. This 1638 indicates that the combined TMC samples have more unique OTUs in 1639 comparison to SVS samples, in most cases double the number of unique OTUs. 1640 Both Shannon and Simpson diversity metrics look at species richness and 1641 evenness but places more emphasis on richness and evenness respectively. 1642 However, both diversity metrics indicate higher diversity for SVS samples. 1643 Furthermore, TMC samples have a greater range of values for both diversity 1644 metrics. This may indicate that the microbial communities of SVS samples more 1645 closely resemble each other in comparison to TMC samples. In samples from 1646 both locations, there are three samples that appear to have a higher number of 1647 rare or unique OTUs. This could indicate that such OTUs may provide 1648 adaptability to environmental changes, since higher numbers of rare OTUs often 1649 indicate a redundancy in the environmental microbial processes. Within the TMC 1650 samples, there is higher diversity between samples, even though all samples 1651 from this location were from the same site. In contrast, the SVS samples seem to 1652 have a similar diversity to each other. Furthermore, the relative abundance of the 1653 OTUs in the SVS samples are more like each other than OTUs in the TMC 1654 samples. 1655 74 Observed Chao1 Shannon Simpson 1.00 6 6000 0.95 9000 5 0.90 Site 4000 SVS02 6000 SVS03 SVS04 SVS05 0.85 TMC01 4 2000 3000 0.80 3 0.75 SVS TMC SVS TMC SVS TMC SVS TMC Figure 3.6: Comparison of the differences in the microbial community from each sample by alpha diversity metrics The alpha diversity metrics using Observed species (OTUs), Chao1, Shannon, and Simpson analyses are shown for all samples from each sampling site to measure OTU richness and evenness. The samples are color coded to reflect the sampling sites and locations as in Table 3.1. 1656 Beta diversity analyses shows differences in samples from TMC and SVS: A 1657 hierarchical clustering of samples based on Bray-Curtis analysis shows cluster 1658 75 Alpha Diversity Measure 0.2 Site SVS02 SVS03 SVS04 SVS05 0.0 TMC01 SiteName St. Vincent Sound Two?Mile Channel ?0.2 ?0.4 ?0.2 0.0 0.2 Axis.1 [44.2%] Figure 3.7: Comparison of the differences in the microbial community from each sample by beta diversity analysis Beta diversity of the samples from all sites was determined using Bray-Curtis distance metrics visualized by using Principal Coordinate Analysis (PCoA) plots. The samples are color coded to reflect the sampling sites and locations as in Table 3.1. 1659 relationships between samples, indicating differences in bacterial composition 1660 between samples from Two Mile Channel and St. Vincent Sound (Figure 3.7). 1661 PCoA plots shows a clear distinction between TMC and SVS samples that 1662 correlates with the differences in salinity at the two sites. One sample from site 1663 SVS05 and one sample from site SVS04 more closely resemble each other than 1664 other samples from the same sites. It should be noted that water from the four 1665 SVS sites differ from each other with respect to turbidity and dissolved oxygen 1666 (Table 3.1). The samples from SV04 and SV05 that come from water with higher 1667 dissolved oxygen and lower turbidity cluster away from the samples at SV02 and 1668 SV03. 1669 76 Axis.2 [14.8%] 3.5 Discussion 1670 For fish in general, it remains unclear how diversity of the microbiota is 1671 partitioned among niches, either between stomach and lower gut or at other body 1672 sites like gill. While a great diversity of gut microorganisms has been uncovered 1673 across fish species, most communities have been dominated by the 1674 Proteobacteria (Roeselers et al, 2011; Sullam et al, 2012; Givens et al, 2015; 1675 Ghanbari et al, 2015; Song et al, 2016). Consistent with this, in Gulf menhaden 1676 from the TMC and SVS locations in Apalachicola Bay, Proteobacteria had the 1677 highest representation in terms of total reads, shared OTUs and unique OTUs in 1678 samples. Although samples from both sampling locations were dominated by 1679 Proteobacteria, the most abundant OTUs in all samples was from 1680 Synechococcus spp. Comparison of 16S rRNA gene sequences in Gulf 1681 menhaden from TMC and SVS suggest the stomach microbiota to be dynamic, 1682 correlating with water characteristics at the different sites. Unfortunately, 1683 although water physical characteristics are known, water samples at each site 1684 were not taken for comparison. However, in general the stomach microbiota is 1685 expected to change with location, reflecting the different physical conditions and 1686 therefore microbial community of the surrounding water. Salinity can affect the 1687 microbial community of water (Egerton et al, 2018). Similarly, turbidity and 1688 dissolved oxygen can also influence the microbial community. Peck, as early as 1689 1893, hypothesized that menhaden stomach contents reflected the plankton in 1690 the surface waters they occupied (Peck, 1893). However, although differences in 1691 composition of the microbial community at the class level from the two locations 1692 reflected differences in water characteristics, a predominance of the same 1693 bacterial phyla; Proteobacteria, Chloroflexi, Bacteroidetes, Acidobacteria and 1694 Actinobacteria was seen at each site in each location. Ninety-six OTUs were 1695 present in all samples, representing twenty-four orders and comprising 32.45 % 1696 of the total reads. These are candidates for a core menhaden stomach 1697 microbiome. 1698 77 Whether or not some of the stomach microbiota constitute a needed core 1699 microbiome, their role in the menhaden should be considered and their possible 1700 roles in the digestion of food or as food items themselves should be investigated. 1701 Conclusive evidence for a metabolically active gastric microbiota is difficult to 1702 provide from 16S rRNA gene sequencing-based microbiota analysis alone. 1703 However, consideration of highly represented and shared taxa should be 1704 deliberated in the context of the menhaden feeding apparatus. 1705 In the similarly sized allopatric species, Atlantic menhaden (B. tyrannus), the 1706 plankton filtering capability of juveniles was calculated from the experimental 1707 determination of clearing rates of uniformly sized particles. For 14-cm fork-length 1708 fish, a size representing the upper limit of juvenile size ranges, the fish can filter 1709 uniformly sized particles in a range of 7?9 ?m in diameter at filtration efficiencies 1710 of approximately 10 % (Friedland et al, 1984). Based on this, the effective pore 1711 size of the gill rakers of menhaden juveniles typical of spring and summer 1712 nursery residents (7.5-cm fork-length fish) is 60 % of this, or 4.2-5.4 mm 1713 diameter. However, for fish in the wild, particle retention at these sizes is 1714 enhanced when a background of other particles, typical of a water column are 1715 present, such as detritus and other plankton particles. Detritus has certainly been 1716 observed in the esophagus of juveniles (Durbin & Durbin, 1975). Coupled with 1717 this, scanning electron microcopy and histology revealed complex 1718 hydrodynamical arrangements that utilize both the structure of the gill rakers and 1719 the fluid dynamics of mucous to trap, translocate, and consume extremely small 1720 food particles (Friedland et al, 2006). From this, it was suggested that juvenile 1721 menhaden retain organisms smaller than their effective filtering capability, and it 1722 has been anticipated that juveniles can retain picoplankton including 1723 bacterioplankton. Epifluorescent microscopy of 0.2 ?m-filtered material showed 1724 that autofluorescing coccoid cyanobacterial cells, 1-2 ?m in diameter, probably 1725 Synechococcus spp, could be seen in the esophagus consistent with the findings 1726 here that Synechococcus spp are found in all stomach samples (Friedland et al, 1727 2006). However, the same cells could also be seen intact and fluorescing in the 1728 78 region immediately before the anal vent suggesting they can survive gut passage 1729 and so are unlikely to have been digested and used as food. 1730 There is a range of opinions on the importance of detritus as a food source. 1731 In juvenile Atlantic menhaden in estuarine creeks up to 80 % of the stomach 1732 content of has been reported to be amorphous matter/detritus and up to 50 % in 1733 adults in coastal waters. It has been argued that phytoplankton and zooplankton 1734 production are not sufficient to support observed levels of juvenile menhaden 1735 populations and that marsh detritus be an important food source (Peters & 1736 Schaaf, 1981; Lewis & Peters, 1994). This was confirmed for Gulf menhaden 1737 using a combination of ?13C, ?15N, and ?34S stable isotope analysis to trace the 1738 flow of organic matter (Deegan et al, 1990). These studies showed that juvenile 1739 Gulf menhaden caught near salt marshes derived approximately 30 % of their 1740 nutrition from the detritus. If this is the case, the stomach microbiota could play a 1741 role in utilization of such a food source. The OTUs from Rhodobacteraceae in 1742 menhaden juvenile stomachs could be contributing to this. There are seven 1743 OTUs shared in all samples from the Rhodobacteraceae and OTUs from this 1744 family are among the most abundant reads with 1.5 % representation in samples 1745 from TMC and 3.8 % in samples from SVS. Marine Rhodobacteraceae encode 1746 arylsulphatase genes required for cleaving sulphate from breakdown products of 1747 the polysaccharide fucoidan, a component of diatom extracellular matrix and the 1748 means by which diatoms can bind to detrital matter (Wustman et al, 1998). The 1749 arylsulfatase activity in stomach Rhodobacteraceae has the potential to make 1750 sulfated polysaccharides in the amorphous material available for digestion. 1751 Furthermore, the high abundance of Rhodobacteraceae in the stomach 1752 microbiota of juvenile menhaden could reflect their association with macroalgae 1753 or detrital particles. Fucoidan desulphonation is also important in the Eastern 1754 oyster (Crassostrea virginica). The ability to use sulfated polysaccharides is 1755 thought to come from sulfohydrolases from Planctomycete OTUs in the stomach 1756 microbiota that help make sulfated polysaccharides available for digestion (King 1757 et al, 2012). Marine Rhodobacteraceae are also major vitamin suppliers for B12- 1758 auxotrophic prokaryotes and eukaryotic primary producers, such as 1759 79 chlorophytes, diatoms, dinoflagellates, coccolithophores and brown algae and in 1760 this capacity, could also play a role in menhaden nutrition. Flavobacteriales and 1761 Rhodobacter encode chitinases which could assist in the digestion of copepods 1762 and aeromonads which have cellulase systems could assist in the digestion of 1763 microalgal cell walls (Munoz et al, 2014). 1764 Halomonas species are also among the most abundant Proteobacteria in 1765 samples from both TMC and SVS and are among the shared OTUs. Halomonas 1766 spp have been found in a broad variety of saline environments, including 1767 estuaries, the ocean, and saline lakes and grow over the range of NaCl from 5 to 1768 25 %. Halomonas spp have the capacity to oxidize sucrose, glycerol and 1769 hydrolyze the disaccharide, cellobiose, as well as producing halotolerant alkaline 1770 protease, giving them a potential role in digestion. 1771 Where oil is naturally present, for example, from seeps associated with oil- 1772 containing strata on the floor of the Gulf of Mexico, the community of microbes 1773 that collectively feeds on all the different compounds contained in the oil is well 1774 established and diverse. Even where the background levels of oil are low, a few 1775 microbes with the capability of degrading oil always seem to be present. There 1776 are a large number of hydrocarbon-degrading bacterial species in oil-rich 1777 environments, such as oil spill areas and oil reservoirs and that their abundance 1778 and quantity are closely related to the types of petroleum hydrocarbons and the 1779 surrounding environmental factors. Within the most abundant Proteobacteria and 1780 the shared OTUs in our samples from TMC and SVS are five genera from the 1781 Proteobacteria, Anaerospora, Desulfococcus, Desulfosarcina, Shewanella and 1782 Halomonas, that have been linked with the processes of anaerobic hydrocarbon 1783 degradation, sulfate reduction, denitrification and/or methanogenesis associated 1784 with petroleum biodegradation (Kostka et al, 2011; Koo et al, 2015). 1785 Desulfosarcina is a hydrocarbon degrader. It is found in oil contaminated 1786 marine sediments can function as a chemoorganotroph or chemoautotroph, 1787 using formate, acetate, propionate, butyrate, higher fatty acids, other organic 1788 acids, alcohols, and benzoate or similar aromatic compounds as electron donors 1789 80 for carbon sources produced by other members of an oil degrading consortia. 1790 Halomonas spp are recognized for producing exopolysaccharides (EPS) with 1791 amphiphilic properties that bind to hydrophobic substrates, such as 1792 hydrocarbons, increasing the solubilization of aromatic hydrocarbons and 1793 enhancing their biodegradation the microbial community. It was associated with 1794 oil-contaminated surface waters collected during the active phase of the 1795 Deepwater Horizon oil spill (Gutierrez et al, 2013; Kostka et al, 2011) and has 1796 been found associated with bacterial communities in crude oil contaminated 1797 marine sediments (Bargiela et al, 2015). Halomonas can also participate in the 1798 conversion of uric acid to ammonia stimulating growth of hydrocarbonoplastic 1799 bacteria (Gertler et al, 2015). Rhodobacteraceae can be used as sentinels for the 1800 later stages of degradation when more recalcitrant oil hydrocarbon compounds 1801 such as polycyclic aromatic hydrocarbons predominate. Bacteria with the 1802 potential to degrade PAHs express genes for oxygenases or peroxidases, such 1803 as those found in Rhodococcus (alkB1 and alkB2) (Di Gennaro et al, 2010; Koo 1804 et al, 2015). A similar line-up of key players in oil degradation can be seen in 1805 Atlantic cod in the gut microbiota of fish in the waters close to the oil fields of 1806 southern Norway (Walter et al, 2019). These included Gammaproteobacteria 1807 from the orders Oceanospiralles and Alteromonadales (Shewanella), 1808 Alphaproteobacteria from the order Rhodobacteraceae (f: Rhodobacteraceae) 1809 and Deltaproteobacteria from the order Desulfobacterales (Desulfosarcina). 1810 In summary, analysis of 16S rRNA gene sequences in the stomachs of Gulf 1811 menhaden from TMC and SVS were dominated by Proteobacteria with the 1812 overall composition reflecting differences in water characteristics at the different 1813 sites, mainly salinity. This is consistent with the idea that the stomach microbiota 1814 to a large extent reflect the microbial constituents of the surface waters they 1815 occupy through their filter-feeding lifestyle. Reflecting this picture of the 1816 menhaden microbiota functioning as a biomonitor, the most abundant 1817 Proteobacteria contain genera that have been associated with petroleum 1818 biodegradation in keeping with the location of Apalachicola Bay in the northern 1819 Gulf of Mexico. Beyond mirroring the bacterial consortia in the surface waters, 1820 81 some of the ninety-six OTUs present in all samples suggest that a core stomach 1821 microbiota may be providing a contribution to the nutrition of these rapidly 1822 growing juveniles. Among this consortium are bacteria that could provide a 1823 source of B-vitamins, and others that have the capacity to oxidize sucrose and 1824 glycerol and hydrolyze cellobiose, as well as those having chitinase and cellulase 1825 activities. In addition, arylsulfatases from the Rhodobacteraceae could make 1826 sulfated polysaccharides from diatoms available for digestion. This would be 1827 particularly valuable for juveniles feeding in estuarine waters in early summer in 1828 which up to 80 % of their stomach contents consist of detritus from particulate 1829 organic matter, mainly from the Apalachicola River and surrounding freshwater 1830 wetlands, and coastal marshes. 1831 3.6 Acknowledgements 1832 This work was supported by the National Oceanic and Atmospheric 1833 Administration, Educational Partnership Program award to the Living Marine 1834 Resources Cooperative Science Center (LMRCSC) NA11SEC4810002 and 1835 NA16SEC4810007. Ammar Hanif was supported in part by MD SeaGrant 1836 Graduate Student Fellowship, NA10OAR4170072 SA7528129-D and in part as 1837 an LMRCSC Graduate Fellow. Thanks to Dr. Stacy Smith, Delaware State 1838 University for coordinating the menhaden collection. 1839 1840 82 Chapter 4: Diet of juvenile Gulf Menhaden (Brevoortia 1841 patronus) using DNA metabarcoding 1842 4.1 Abstract 1843 The Gulf menhaden, Brevoortia patronus, is a key forage fish species that 1844 serves as a trophic link between the plankton and piscivorous predators in the 1845 Gulf of Mexico. As juveniles, in early spring, these obligate filter-feeders forage 1846 on plankton and plant detritus in the lower salinity tidal creeks and near-shore 1847 marshes of Appalachicola Bay. Later in the summer, they are found in the open 1848 bay in areas like St. Vincent Sound. Diet assessment is difficult because the food 1849 items are small and easily digested. Indirect assessment of diet by stable isotope 1850 analysis is complicated by system-specific processes. Here we show that DNA 1851 metabarcoding, can provide an alternative method for assessing diet as defined 1852 by eukaryotic prey. The MiSeq Illumina high-throughput sequencing platform was 1853 used to sequence the V4-V5 region of the 18S ribosomal (rRNA) gene of the 1854 stomach contents from juvenile menhaden collected at two locations within 1855 Apalachicola Bay. Fish were collected in early spring (May 2013) at Two Mile 1856 Channel close to the Apalachicola River, and six weeks later in summer at St. 1857 Vincent Sound, in the enclosed Apalachicola Bay. Of the water characteristics 1858 measured, salinity was observed to make the greatest difference to organisms 1859 found in the stomach. 1048 unique OTUs (species) were identified, 1035 of 1860 which were found in samples from both locations. Members of the 1861 Stramenopile/Alveolalata/Rhizaria (SAR) clade account for 66 % representation 1862 in samples from Two Mile Channel, dominated by the diatoms Cyclotella and 1863 Skeletonema, as well as the ciliate Oligotrichia. In contrast, Metazoa 1864 (zooplankton) dominate in samples from St. Vincent Sound, accounting for 83.77 1865 % of the reads. These are mainly Acartia copepods. Not only does the diet shift 1866 from SAR to Metazoa at these two locations, but significant differences in the 1867 species represented within these taxa occur. This dietary shift is indicative of a 1868 difference in trophic level in fish caught at SVS, supported by an increase in ?15N 1869 in the muscle of fish from SVS. The results from 18S rRNA gene sequences 1870 83 have provided a more complete description of the diet of menhaden from the two 1871 locations as the fish move away from the estuary to the enclosed bay, pointing to 1872 an early trophic shift in juveniles of this ecologically and economically important 1873 fish. 1874 4.2 Introduction 1875 Forage fish species are exceptionally important to the structure and 1876 functioning of marine ecosystems, serving as the main conduit of energy flow 1877 from lower to upper trophic levels (Pikitch et al, 2014). They can exact either top- 1878 down control on plankton or bottom-up control on predators (Cury et al, 2000). 1879 Gulf menhaden (Brevoortia patronus) is a key forage fish species that serves as 1880 a trophic link between the plankton and piscivorous predators in the Gulf of 1881 Mexico. Numerous ecosystem models have shown the overall importance of Gulf 1882 menhaden (Brevoortia patronus) to the ecosystem (Geers et al, 2016; O?Farrell 1883 et al, 2017). It is an important forage fish species along the Gulf coast providing 1884 an important food source for fish, predatory seabirds and marine mammals 1885 (Vaughan et al, 2007). Many of the commercially and recreationally harvested 1886 fish species along the Gulf coast rely on the abundant schools of menhaden, 1887 including king mackerel (Scomberomorus cavalla), Spanish mackerel 1888 (Scomberomorus maculates), dorado (Coryphaena hippurus), crevalle jack 1889 (Caranx hippos), tarpon (Megalops atlanticus), red drum and bonito (Sarda 1890 sarda) (Vaughan et al, 2007; Franklin, 2011; Sagarese et al, 2016). Among other 1891 species, the diets of the blacktip shark (Carcharhinus limbatus) and the brown 1892 pelican (Pelecanus occidentalis), Louisiana's state bird, consist of over 95 % 1893 menhaden (Franklin, 2011). 1894 Gulf menhaden along with shrimp support the largest fisheries by landings 1895 and by revenue in the Gulf of Mexico (Vaughan et al, 2007; O?Farrell et al, 2017). 1896 Gulf menhaden also support a large directed reduction fishery. Gulf menhaden 1897 are reduced to fish oil and meal used in livestock feed, aquaculture feed, 1898 menhaden oil, high in omega-3 fatty acids, has been used for pharma- and 1899 nutraceuticals, cosmetics and other consumer products (Olsen et al, 2014; 1900 84 SEDAR, 2013; Menhaden Advisory Commission, 2015). Assessments of the 1901 fishery have concluded that Gulf menhaden are not overfished or undergoing 1902 overfishing (Vaughan et al, 2007). However, this assessment does not fully 1903 consider the ecological role of the species and its ecological importance has not 1904 been adequately quantified suggesting that the potential exists for this large 1905 fishery to yield an ecological impact (Olsen et al, 2014; Sagarese et al, 2016). 1906 Despite the importance of Gulf menhaden as a dominant prey fish and its 1907 economic importance in fisheries, there is a dearth of specific dietary data for this 1908 species. Most of the menhaden dietary studies have been on the allopatric 1909 species, the Atlantic menhaden, Brevoortia tyrannus, which does not overlap with 1910 B. patronus. In view of this, a more complete understanding of the diet of Gulf 1911 menhaden is needed. Juvenile and adult Gulf menhaden are obligate filter 1912 feeders. After menhaden spawn in coastal waters and bays, their larvae are 1913 transported into various estuarine habitats where they metamorphose into 1914 juveniles. After transformation, juveniles remain in nearshore areas where they 1915 travel in dense schools, often near the surface (Lassuy, 1983). The 1916 morphological changes associated with transformation from the larval form are 1917 accompanied by a change in feeding behavior. The larvae are carnivorous 1918 particulate feeders, the juveniles switch to omnivorous filter feeders, reported to 1919 consume phytoplankton, zooplankton, and detritus (Deegan et al, 1990). Gulf 1920 menhaden are obligate filter feeders and as such are constantly sampling the 1921 environment as they feed. As juveniles they live in the mesohaline and 1922 oligohaline zones of estuaries where they filter the water column via their gill 1923 rakers. Friedland et al (1984) calculated the plankton filtering capability of Atlantic 1924 menhaden juveniles by measuring filtration efficiencies of uniformly sized 1925 particles in captive fish (Friedland et al, 1984). For 14-cm fork-length fish, a size 1926 representing the upper limit of juvenile size ranges for this species, the fish can 1927 filter uniformly sized particles in a range of 7?9 ?m in diameter at filtration 1928 efficiencies of approximately 10 %, although maximum filtration efficiency is for 1929 particles approximately 100 ?m (Friedland et al, 1984). Based on this, the 1930 effective pore size of the gill rakers of menhaden juveniles typical of spring and 1931 85 summer nursery residents (7.5-cm fork-length fish) was estimated to be 60 % of 1932 this, or 4.2-5.4 ?m diameter. However, for fish in the wild, particle retention is 1933 enhanced when a background of other particles is present, as is typical in 1934 estuarine water columns. It has been proposed that detrital material plays a role 1935 in the retention of these smaller particles into the stomach (Friedland et al, 1984). 1936 Detritus is primarily plant-related structural material that washes out of the 1937 estuaries and coastal marshes and can also include bacteria, fungi, microalgae 1938 and protozoa. Its makeup and quantity in the menhaden stomach reflect the 1939 constituents of the water filtered (Deegan et al, 1990; Lewis & Peters, 1994). 1940 There is a range of opinions on the importance of detritus as a food source. 1941 However, analysis of stable isotope evidence has supported its contribution to 1942 the diet of Gulf menhaden (Deegan et al, 1990; Olsen et al, 2014). 1943 Visual identification has conventionally been used for the analysis of gut 1944 contents of fish based on prey morphology (Hyslop, 1980). The method is time 1945 consuming and requires expertise in taxonomic classification. In addition, 1946 menhaden have a gizzard-like stomach that grinds ingested items to an 1947 amorphous paste making microscopy techniques extremely difficult to use for 1948 identification (Lewis & Peters, 1994). Coupled with extensive digestion in the 1949 stomach, this allows only partial or ambiguous prey specimen identification 1950 (Baker et al, 2014). To understand food web dynamics, the whole dietary breadth 1951 needs to be described. Through DNA metabarcoding methods, organisms in the 1952 stomach content of filter feeders can now be assessed even when diagnostic 1953 taxonomic features are lacking (Pompanon et al, 2012). DNA metabarcoding has 1954 increasingly become the method of choice in characterizing diets of marine 1955 animals (Symondson, 2002; Jarman et al, 2002; Deagle et al, 2009; Pompanon 1956 et al, 2012). Applying DNA techniques to diet identification has increased 1957 identification resolution, particularly in marine systems (Blankenship & Yayanos, 1958 2005; Riemann et al, 2010; Cleary et al, 2012; Jakubavi?i?t? et al, 2017; 1959 Waraniak et al, 2019). Such work has identified trophic linkages within food webs 1960 and determined predator diet breadth and preference. Hanif et al (2020) has 1961 successfully developed a DNA metabarcoding method to characterize the 1962 86 stomach contents in Gulf menhaden using 18S rRNA gene sequences (Hanif et 1963 al, 2020). In the present study, we apply this method to analyze and compare the 1964 eukaryotic prey items of juvenile Gulf menhaden collected from two different 1965 locations within Apalachicola Bay, Florida. 1966 4.3 Methods 1967 4.3.1 Sample collection 1968 Delaware State University (DSU) collaborators collected samples on May 23, 1969 2013 at Two-Mile Channel in Apalachicola Bay, Florida, (TMC01, Lat:29.712467, 1970 Long: -85.01525) (Figure 4.1). Gulf menhaden samples were collected from this 1971 location using a 10-foot cast net. Ten samples from this low salinity site (T1-10) 1972 were used for this study. At that time of year and location, the menhaden were 1973 concentrated in large, slow-moving, and tightly packed schools, allowing all fish 1974 to be collected from the same site within this location. Six weeks later, July 2, 1975 2013, a second collection was made by Florida Fish and Wildlife from the St 1976 Vincent Sound location (SVS). At that time and location, the menhaden were not 1977 so tightly grouped. Fewer fish were collected per cast such that ten fish were 1978 collected from different sites at this location. This resulted in three fish (S1-3) 1979 being collected at site SVS02, one fish (S4) collected at SVS03, three fish (S5-7) 1980 collected at SVS04 and three fish (S8-10) collected at SVS05. These four sites 1981 corresponded to latitude and longitude coordinates 29.6813, -85.206167 1982 (SVS02); 29.700267, -85.17595 (SVS03); 29.686033, -85.1372 (SVS04) and 1983 29.688233, -85.124167 (SVS05), respectively (Figure 4.1). Collections were 1984 done at 1 m depth. Water quality measurements of temperature, salinity, pH, 1985 dissolved oxygen, and Secchi depth were recorded at all sites (Table 4.1). 1986 87 Figure 4.1: Collection sites in Apalachicola Bay The blue circle represents the sole site in Two Mile Channel where all ten samples were collected in May 2013 (TM01, Lat: 29.712467, Long: -85.01525). The red circles represent the sites in St. Vincent Sound where samples were collected in July 2013. A total of ten samples were collected from the four sites SVS02, SVS03, SVS04, SVS05, Lat, Long: 29.6813, -85.206167; 29.700267, -85.17595; 29.686033, -85.1372; 29688233, -85.124167, respectively. 1987 88 Location Sample Site Salinity Temp pH DO Secchi ppt oC T1 TMC01 1.8 27.1 7.5 5.7 0.7 T2 TMC01 1.8 27.1 7.5 5.7 0.7 T3 TMC01 1.8 27.1 7.5 5.7 0.7 T4 TMC01 1.8 27.1 7.5 5.7 0.7 Two Mile T5 TMC01 1.8 27.1 7.5 5.7 0.7 Channel T6 TMC01 1.8 27.1 7.5 5.7 0.7 T7 TMC01 1.8 27.1 7.5 5.7 0.7 T8 TMC01 1.8 27.1 7.5 5.7 0.7 T9 TMC01 1.8 27.1 7.5 5.7 0.7 T10 TMC01 1.8 27.1 7.5 5.7 0.7 S1 SVS02 37.6 28.4 8.0 5.5 0.6 S2 SVS02 37.6 28.4 8.0 5.5 0.6 S3 SVS02 37.6 28.4 8.0 5.5 0.6 S4 SVS03 36.3 27.8 7.9 5.5 0.6 St. S5 SVS04 36.4 28.7 8.0 5.8 0.4 Vincent S6 SVS04 36.4 28.7 8.0 5.8 0.4 Sound S7 SVS04 36.4 28.7 8.0 5.8 0.4 S8 SVS05 36 29 7.8 5.7 0.4 S9 SVS05 36 29 7.8 5.7 0.4 S10 SVS05 36 29 7.8 5.7 0.4 Table 4.1: Water quality measurements from each sample site in Apalachicola Bay. Water quality measurements of temperature, salinity, pH, dissolved oxygen, and turbidity were recorded at the time and place of sample collection using a YSI 556 multiparameter water quality meter. The sites and samples are color coded to be consistent with the colors used in Figures 4.6 and 4.7. 1988 4.3.2 Sample preparation and DNA extraction 1989 Sample preparation, DNA extraction from menhaden stomach contents and 1990 estimation of DNA quality was done as described by Hanif et al (2020) (Chapter 1991 2). 1992 4.3.3 High throughput sequencing 1993 High throughput sequencing was performed on the Illumina MiSeq platform 1994 located in the BioAnalytical Services Laboratory (BAS Lab) at the University of 1995 89 Maryland Center for Environmental Science, Institute of Marine and 1996 Environmental Technology as described by Hanif et al (2020), using the protocol 1997 recommended in the Nextera DNA Library Prep Reference Guide (Illumina 1998 Document #15027987v1). with the exception of sequencing primers. The 1999 sequencing primer set used were 574*F (CGGTAAYTCCAGCTCYV) and 1132R 2000 (CCGTCAATTHCTTYAART) targeting the V4-V5 region of the 18S rRNA gene as 2001 developed by Hugerth et al (2014). 2002 4.3.4 Post sequencing pipeline 2003 Removal of index sequences (called de-multiplexing), base calling and data 2004 quality assessment were performed on the MiSeq instrument. MacQIIME was 2005 used to process and assess the quality of output reads using the recommended 2006 QIIME pipeline for Illumina reads. The MacQIIME script, join_paired_ends.py, 2007 was used to join forward and reverse reads. Paired reads were then filtered for 2008 low quality reads (quality score of <25) and short read length (<200 bp) to be 2009 removed from the library using the split_libraries.py command. Chimeric 2010 sequences were identified de novo using Usearch61 with the script 2011 identify_chimeric_seqs.py. This was followed by identifying sequences containing 2012 PhiX sequences by a BLAST analysis of the library with PhiX sequence. The 2013 resulting file after the removal of chimeric and PhiX sequences was used for 2014 operational taxonomic unit (OTU) (species) picking. OTUs generated from 18S 2015 rRNA gene sequencing were picked and taxonomy was assigned using the 2016 UCLUST method against the Silva 111 Eukaryote-only database (Edgar 2010; 2017 Quast et al, 2013). This was used to construct an OTU table for subsequent 2018 analysis. 2019 4.3.5 Data analysis 2020 Observed OTUs, Shannon index, and Inverse Simpson index metrics were 2021 used to assess alpha diversity. A Bray-Curtis matrix was generated and 2022 visualized using principal coordinate analysis to assess beta diversity. DESeq2 2023 was used to test for significant differentially abundant taxa. All analyses were 2024 done in R primarily using the packages Phyloseq and DESeq2. 2025 90 4.3.6 Stable isotope analysis 2026 Menhaden collected from Apalachicola Bay were immediately placed on dry 2027 ice and frozen until samples were transported back to the lab. Once back at the 2028 laboratory, all menhaden were stored at -80 ?C until preparation for isotopic 2029 analysis. Using a stainless X-Acto knife, slivers of menhaden muscle were place 2030 in tins and oven-dried at 60 ?C for 6 h. The tissue was pulverized using a mortar 2031 and pestle. Approximately 1-1.3 mg of muscle was removed from each sample 2032 and weighed. The prepared samples were sent to the Central Appalachian 2033 Laboratory Stable Isotope Facility (CASIF) of the University of Maryland Center 2034 for Environmental Sciences for ? 13C and ? 15N analysis. 2035 Given the low sample size, the Anderson-Darling normality test was used to 2036 test for normality. This was followed by a test to determine if the variance 2037 between the TMC and SVS sample groups was similar. A student?s t-test was 2038 used to examine significant differences of the means of the ?13C and ?15N values 2039 between TMC and SVS samples. Results were deemed significant at ? <= 0.05. 2040 4.4 Results 2041 4.4.1 Characterization of stomach 18S rRNA gene sequences from menhaden 2042 caught at Two-Mile Channel and St. Vincent Sound 2043 DNA from all TMC samples generated useable reads, however only eight 2044 SVS samples produced reads. Table 4.2 gives a summary of reads per sample. A 2045 total number of 315,874 and 350,223 raw reads were retrieved from TMC and 2046 SVS samples respectively. After post sequencing processing in QIIME, these 2047 numbers were reduced to 315,072 and 349,697 with an average of 31,072 and 2048 34,697 reads per sample. Two of the SVS samples did not produce reads of 2049 sufficient quality and were not included in this calculation or further analysis. 2050 91 Table 4.2: Comparison of the number of the number of raw reads and post- processing reads from Two Mile Channel and St Vincent Sound samples A region of approximately 550 bp encompassing the V4-V5 hypervariable regions of the 18S rRNA genes was targeted for sequencing using the protocol developed by Hugerth (Hugerth et al 2014). The raw sequencing and post-processing reads from each were expressed as reads per sample, mean number of reads per sample and total reads. 2051 There was a high percentage of reads that matched the teleost fish 2052 Tenualosa reevesii after using the SILVA 111 Eukaryote only database for 2053 taxonomic assignment (Hanif et al, 2020). We reasoned that these sequences 2054 represented the 18S rRNA sequence of the menhaden and removed them from 2055 our analysis. Subsequent sequencing of menhaden 18S rRNA (accession # 2056 92 MN335200) showed it to be 99 % identical to that of Tenualosa reevesii. The 2057 relative abundance of this OTU was sample dependent and ranged from 81.6 to 2058 0.3 %. Excluding the reads assigned to T. reevesii did not change the overall 2059 relative abundance pattern of the remaining taxa, except for sample T10. 2060 Binning at 0.03 % divergence resulted in total of 1048 unique OTUs 2061 assignments, with 1035 OTUs shared in samples from each location, 10 unique 2062 to samples from TMC and 3 unique to samples from SVS. There were seven taxa 2063 with greater than 0.36 % representation in the total relative abundance (Figure 2064 4.2, Suppl Table 4.1). All these taxa were represented in the stomachs of fish 2065 caught at both locations, although at very different relative abundances (percent 2066 reads). In samples from both locations the dominant taxa were from the 2067 Stramenopile/Alveolalata/Rhizaria (SAR) clade and the kingdom Metazoa. In 2068 samples from TMC, OTUs were dominated, in descending order, by Alveolata 2069 (alveolates) with a mean representation of 38.5 %, Metazoa 30 % and 2070 Stramenopiles (diatoms) at 27.5 % (Figure 4.2 A, Suppl Table 4.1). DNA from 2071 the stomach samples collected from SVS sites were dominated, in descending 2072 order, by Metazoa at 83.77 %, Stramenopiles (diatoms) at 8.53 % and Alveolata 2073 (alveolates) at 7.2 % (Figure 4.2 B). Representation of Rhizaria was negligible in 2074 samples from both locations. The remaining most abundant OTUs were assigned 2075 to Chloroplastida, Fungi, Cryptomonadales and Foraminifera. Representatives of 2076 all mentioned taxa were present in at least one sample per site. However, the 2077 reads from Cryptomondales and Fungi were more prevalent in samples from the 2078 TMC than the SVS location. Figure 4.3 shows the variation in the relative 2079 abundance of the dominant taxa in each sample. 2080 93 Figure 4.2: Taxonomic composition at the higher taxa level of menhaden stomach contents from Two Mile Channel and St Vincent Sound samples Proportional representation of most abundant higher classification taxa are shown as pie charts: (A) TMC and (B) SVS. 2081 Two-Mile channel Saint Vincent sound 100% 90% 80% Other 70% Cryptomonadales 60% Fungi 50% Chloroplastida 40% Rhizaria 30% Alveolata Stramenopiles 20% Metazoa 10% 0% 1 2 3 4 5 6 7 8 9 10 2 3 4 5 6 7 9 10 Figure 4.3: Variation in taxonomic composition of stomach contents of the major taxa in each sample from Two Mile Channel and St Vincent Sound The relative abundance of major taxa of each sample from the TMC and SVS sites is shown at the major taxa level. 2082 Given that the menhaden stomach contents contain mainly OTUs from the 2083 SAR clade and the kingdom Metazoa, which together reflect 96.44 % and >99 % 2084 of the total reads from TMC and SVS, respectively, I focused my analysis further 2085 to look at the abundances of these taxa at lower classifications. Looking only at 2086 OTUs within the SAR clade that account for 66 % and 15.82 % of reads from the 2087 94 Relative abundance samples from TMC and SVS, respectively, revealed these OTUs belonged to 2088 three taxa; the Stramenopile phylum Bacillariophyta (diatoms), the alveolate 2089 class Dinophyceae (dinoflagellates) and the alveolate phylum Ciliophora (ciliates) 2090 (Figure 4.4). Although dinoflagellates and diatoms are considered to be 2091 phytoplankton, ciliates are non-photosynthetic and are considered to be 2092 microzooplankton. The overall relative abundance from these three taxa is shown 2093 in Suppl Table 4.2 and as relative abundance per sample in Figure 4.4A. In 2094 samples from TMC, ciliates, dinoflagellates and diatoms overall represent 30.46, 2095 18.41 and 42.64 % of the representation from SAR clade OTUs. This means that 2096 overall, phytoplankton have 60 % representation in stomach samples from TMC. 2097 Representatives of all three taxa were present in all samples except S7 and S9 in 2098 which dinoflagellates were absent. In samples from TMC, diatoms were the 2099 largest represented SAR taxon for samples T4, T5, T7, T8, T9, and T10 (Figure 2100 4.4A). For the remainder of the TMC samples, the ciliates were the largest 2101 represented SAR taxon. In samples from SVS, ciliates, dinoflagellates and 2102 diatoms overall represent 27.9 %, 1.23 % and 36.62 % of the representation from 2103 SAR clade OTUs, respectively. Diatoms were the most abundant SAR taxon in 2104 SVS samples except for S6 which was dominated by ciliates (Figure 4.4A). 2105 Figure 4.4B shows the representation of OTUs from the SAR clade 2106 assigned at the genus level. There was a higher diversity of SAR OTUs in TMC 2107 samples than in SVS samples, except for T10 which was dominated by diatoms. 2108 The three most abundant SAR species in TMC samples were the diatoms 2109 Cyclotella, Skeletonema, and the ciliate Oligotrichia accounting for 22 %, 10.41 2110 %, and 12.63 % relative abundance respectively. The ciliates Choreotrichia and 2111 Haptoria had overall relative abundancies of 8 % and 7.31 %, respectively and 2112 the dominant dinoflagellates were Peridiniales and Gyrodinium at 6.53 % and 2113 7.54 % relative abundances, respectively. The diatom Odontella was the least 2114 represented species in TMC samples only appearing in two samples from this 2115 site versus six samples from SVS. As in samples from TMC, Cyclotella was also 2116 the most abundant SAR species in SVS samples with 25 % relative abundance. 2117 In SVS samples however, this was followed by the ciliate Choreotrichia, the 2118 95 diatoms Thalassiosira and Actinoptychus, and an unassigned diatom species 2119 with relative abundancies of 19 %, 12 %, 8.5 % and 11 % respectively. Ciliates 2120 and diatoms have similar representation within the SAR clade in stomach 2121 samples from TMC and SVS but are different species. 2122 Figure 4.4: Variation in taxonomic composition of stomach content OTUs within the SAR clade in each sample from Two Mile Channel and St Vincent Sound The relative abundance of SAR clade OTUs at the level of: (A) ciliates, dinoflagellates and diatoms; (B) at the genus level in each sample from the TMC and SVS sites. 2123 96 Metazoa account for 83.77 % of reads from SVS samples and approximately 2124 70 % of these were copepods (Figure 4.5A, Suppl Table 4.3). If the ciliate 2125 microzooplankton, are included with the mesooplanktonic copepods, zooplankton 2126 have over 90 % representation in stomach samples from SVS. There were six 2127 copepod species identified, although the copepod Vahinius was only found in one 2128 sample from SVS (Figure 4.5B). Stomach samples from SVS were dominated by 2129 Acartia which represented a mean relative abundance of 51 %. Metazoa 2130 accounted for only 30 % mean representation in samples from TMC and 2131 copepods only represented 21 % of these, with 7.3 % mean relative abundance 2132 for Acartia and 12 % for Canuella. In TMC samples, the most highly represented 2133 metazoan was the polychaete worm, Barantolla, from the phylum Annelida, with 2134 20 % relative abundance. This is not apparent in Figure 4.5A because it is 2135 included with other polychaete worms as well nemerteans and a nematode, 2136 Metadesmolaimus. Overall polychaete worms have a 24.13 % relative 2137 abundance and the nemertean worms 4.4 %. A bivalve, Neotrigonia, is present in 2138 TMC samples with a 5.4 % relative abundance. The adult annelid and molluscan 2139 species found had size ranges inconsistent with the size of menhaden gill rakers 2140 so their presence in menhaden stomach are likely to represent eggs or juveniles 2141 or detritus. Also, in stomach samples from TMC was an unassigned rotifer with 8 2142 % overall relative abundance. 2143 97 A) Two-Mile channel Saint Vincent sound 100% 90% 80% 70% 60% other rotifer pseudocoelomate - annelid - nematode - nemertea 50% gastropod bivalve teleost 40% arthropod copepod 30% 20% 10% 0% B) 100% Other Rotifera (unclassified) 90% Metadesmolaimus Gastrotricha pseudocoelomate Zygeupolia rubens annelid 80% Capitella sp. FP-2009 nematode Cirratulus sp. AMW24931 nemertea 70% Carinoma tremaphoros Barantolla lepte 60% Neritina gastropod Donax Nucula 50% Balanus bivalve Neotrigonia 40% Misgurnus anguillicaudatus Osteobrama belangeri teleost 30% Betta pi Danio rerio Janicella 20% arthropod Vahinius Canuella 10% Taeniacanthus copepod Armatobalanus 0% Acartia 1 2 3 4 5 6 7 8 9 10 2 3 4 5 6 7 9 10 Figure 4.5: Variation in taxonomic composition of stomach content OTUs within the Metazoa in each sample from Two Mile Channel and St Vincent Sound The relative abundance of metazoan OTUs at the level of: (A) copepods, rotifers, worms (mix of annelids, nematodes and nemerteans), bivalves, arthropods, gastropods and teleosts; (B) at the genus level in each sample from the TMC and SVS sites. 2144 The relative abundance of the remaining OTUs was low ranging from 3 - 2145 0.01 % except for teleost DNA. Teleost DNA was represented in all samples 2146 except for T1 in which Betta pi, Osteobrama belangeri, and Misgurnus 2147 angillicaudatus were absent. The mean relative abundance of teleosts was 10 % 2148 for both TMC and SVS samples. However, for TMC samples this was inflated by 2149 the high relative abundance of teleost DNA in sample T10. Teleost DNA in this 2150 sample showed a very different relative abundance compared to that in other 2151 98 Relative abundance Relative abundance TMC samples. The closest matches to the teleost DNA was of species that are all 2152 freshwater from S. Asia and commonly found in the freshwater aquarium pet 2153 trade. However, they could be from species without sequences in the SILVA 2154 database. 2155 There were ten OTUs represented in the TMC samples that were undetected 2156 in the SVS samples. Four of these were unassigned eukaryotes. The remainder 2157 were assigned as Goniomonas, a genus of cryptomonads, Discosea, a class of 2158 Amebozoa, Prymnesiophyceae, a class of Haptocyta and Katablepharis, 2159 Leucocryptos and Roombia, all genera of the katablepharid Cryptista. Though 2160 unique to the TMC samples their relative abundance was extremely low, ranging 2161 between 1e-05 to 1e-06 %. In contrast only three OTUs were solely represented 2162 in the SVS samples. One was an unclassified Cryptophyceae, one an assigned 2163 member of a haptophyte family, Pavlovophyceae, and the last was an uncultured 2164 Rhodophyte (red alga). Similarly, all had extremely low relative abundance 2165 ranging between 1e-05 to 1e-06 %. 2166 4.4.2 Alpha and beta diversity analyses show that stomach samples from TMC 2167 are more diverse than those from SVS 2168 The alpha diversity metrics, Observed Species, Shannon and Simpson 2169 inverse measures all show that DNA in stomach samples from TMC was more 2170 diverse than that from SVS samples with respect to both richness and evenness 2171 (Figure 4.6). Furthermore, each metric was shown to be significantly different as 2172 measured by Mann-Whitney test (p-value > 0.05). The Bray-Curtis dissimilarity 2173 test was visualized by principal coordinate analysis (PCoA) (Figure 4.7). There is 2174 a clear separation of samples by sampling location, most likely reflecting the 2175 difference in salinity. TMC samples group closely together with the exception of 2176 one sample. This is in contrast to SVS samples which do not group as closely 2177 together, even for those samples taken at the same site. Testing for significant 2178 differential abundance was done using the DES2Seq package in R. Overall there 2179 were fifty-seven genera whose abundances were found to be significantly 2180 different between samples from the different sampling locations (Figure 4.8). 2181 99 Focusing on the most dominant taxa as shown in Figures 4.4 and 4.5, the 2182 phylum Metazoa, three copepod genera (Taeniacanthus, Acartia, 2183 Armatobalanus), three teleosts (Misgurnus anguillicaudatus, Osteobrama 2184 belangeri, Danio rerio), the annelid Barantolla, the nematode, Metadesmolaimus, 2185 the ribbon worm, Carinoma tremaphoros, the single decapod (Janicella), and a 2186 bivalve (Balanus) had significantly different abundances at the two locations. 2187 Amongst the Stramenopiles there were five diatoms (Odontella, Cyclotella, 2188 Thalassiosira, Skeletonema, Chaetoceros) whose abundances were found to be 2189 significantly different. For the Alveolata, the abundances of all ciliates and 2190 dinoflagellates mentioned in Figure 4.3 were found to be significantly different 2191 between locations. 2192 Observed Shannon InvSimpson 30 4 500 400 20 3 Site 300 SVS02 SVS03 SVS04 SVS05 TMC01 200 2 10 100 1 SVS TMC SVS TMC SVS TMC Figure 4.6: Comparison of the differences in stomach contents from each sample by alpha diversity metrics The alpha diversity metrics Observed Species (OTUs), Shannon, and Simpson inverse measures are shown for all samples from each sampling site to measure OTU richness and evenness. The samples are color coded to reflect the sampling sites and 2193 locations as in Table 4.1. 100 Alpha Diversity Measure 0.25 Site SVS02 SVS03 SVS04 SVS05 0.00 TMC01 SiteName St. Vincent Sound Two?Mile Channel ?0.25 ?0.3 0.0 0.3 0.6 Axis.1 [55.3%] Figure 4.7: Comparison of the differences in stomach contents from each sample by beta diversity analysis Beta diversity of the samples from all sites was determined using Bray-Curtis distance metrics, visualized using Principal Coordinate Analysis (PCoA) plots. 2194 101 Axis.2 [13.6%] Figure 4.8: Significant differential abundance in OTUs from Two Mile Channel and St Vincent Sound Significant differential abundance in chosen OTUs from 57 genera were analyzed using the DES2Seq package in R. 2195 4.4.3 Stable isotope analysis 2196 Carbon and nitrogen stable isotopes (?13C and ?15N) can be used to 2197 evaluate the relative contributions of different food sources and the trophic 2198 position of fish (Anderson et al 1987; Lochman and Phillips 1996; Gu et al 2199 1996a,b; Gamboa-Delgado et al 2008). Because isotope compositions reflect the 2200 organic compounds that have been incorporated into the bodies of consumers, 2201 the measurements of ?13C and ?15N provide l information on the dietary 2202 component assimilated by consumers. Isotope enrichment in consumers takes 2203 place during the assimilation of carbon and nitrogen from the diet (Post, 2002). 2204 The average isotope enrichment during each trophic transfer is considered to be 2205 102 0.5% for ?13C and 3.4% for ?15N. However, many unknown and uncontrolled 2206 factors such as food sources and differences in growth rates may affect the 2207 magnitude of isotope fractionation giving large variations in both stable isotopes 2208 (Post, 2002). 2209 Samples from TMC were lower in both ?13C and ?15N in comparison to those 2210 from SVS. TMC samples ?13C values ranged from -27.31 to -22.25 and ?15N 2211 values ranged from 11.39 to 12.64. SVS samples ?13C values ranged from -24.60 2212 to -21.13 and ?15N values ranged from 12.43 to 13.08. The range of both ?13C 2213 and ?15N values was greater for TMC samples than SVS samples. This resulted 2214 in the mean values of TMC and SVS for ?13C -24.57 and -22.70 and ?15N 12.03 2215 and 12.76 respectively. Though the means are close, the means of the ?13C and 2216 ?15N were shown to be significantly different for both TMC and SVS samples as 2217 determined by t-test (p-value=0.01722 and p-value=0.0002654 respectively, 2218 (Figure 4.9). The increase in ?15N in the muscle of fish from SVS is consistent a 2219 trophic shift. However, the difference in stable isotope composition is not 2220 definitive by itself since system-specific processes can alter local isotope values 2221 and values for plankton and debris from the different sites are not available for 2222 comparison. 2223 13.20 13.00 12.80 12.60 12.40 mean TMS 12.20 mean SVS 12.00 individual TMC samples 11.80 individual SVS samples 11.60 11.40 11.20 -28.00 -27.00 -26.00 -25.00 -24.00 -23.00 -22.00 -21.00 -20.00 -19.00 13 ? C Figure 8F.i gSutarbel e4 i.s9o: tSoptaeb alena ilsyositso opfe f iallnetalysis of muscle Brevoor?tia patronus ? C and ? N values for samples 13C and ?15N levels for isolated muscle filflerotm fr othme TeMacCh asnadm SpVlSe switeerse. Bdoetthe rinmdiinveiddu,a l and mean values are shown for each sample set. TMC samples were shown to be significanly different expressed as both individual and mean values. than SVS samples as determined by t-test (TMC p = 0.01722; SVS p = 0.0002654). 2224 103 ?1 5 N 4.5 Discussion 2225 In the present study, we have compared the stomach contents of juvenile 2226 Gulf menhaden collected from two different locations within Apalachicola Bay, FL 2227 using a DNA metabarcoding method we previously developed (Hanif et al 2020). 2228 MiSeq Illumina high-throughput sequencing was used to analyze DNA amplicons 2229 from the V4-V5 region of the 18S ribosomal (rRNA) gene from the stomachs of 2230 juvenile menhaden. The fish were collected in May 2013 at Two Mile Channel 2231 (TMC), a low salinity estuarine site close to the Apalachicola River estuary, and 2232 six weeks later at St Vincent Sound (SVS), a high salinity site at the western end 2233 of the enclosed Apalachicola Bay. Although water at the sites varied little with 2234 regard to temperature, dissolved oxygen, pH, Secchi measurements, the 2235 salinities were very different; 1.8 ppt at TMC and between 36-37.6 ppt at four 2236 different sites in SVS. I identified 1048 unique OTUs (species) in the stomach 2237 contents, 1035 of which were found in samples from both locations. However, the 2238 stomach contents from fish caught at the two locations were very distinct. In 2239 stomach samples from Two Mile Channel, members of the 2240 Stramenopile/Alveolate/ Rhizaria (SAR) lineage account for 66 % representation 2241 and Metazoa at 30 %. The SAR OTUs were dominated by the diatoms Cyclotella 2242 and Skeletonema, as well as the ciliate Oligotrichia. In contrast, stomach 2243 samples from St. Vincent Sound, members of the Metazoa account for 83.77 % 2244 representation and SAR at 15 %. The metazoans are mainly Acartia copepods. 2245 Since ciliates are considered to be microzooplankton, this means that 2246 phytoplankton have just over 60 % representation in samples from TMC and 2247 zooplankton have over 90 % representation in samples from SVS. However, it 2248 must be remembered that numbers of reads for any species do not correspond to 2249 numbers of organisms for a variety of reasons including copy number of 18S 2250 rRNA genes which vary across species. 2251 Some of our findings match what is known about plankton communities in 2252 Apalachicola Bay. Diatoms are abundant year-round and species that represent 2253 a large fraction of net phytoplankton include Thalassiosira spp, Cyclotella and 2254 104 Skeletonema costatum (Estabrook, 1973). Copepods are the main constituents 2255 of the plankton accounting for 80 % of the plankton population (Putland, 2005). 2256 Thirty-six species of copepods have been identified from the Apalachicola Bay 2257 system with Acartia tonsa being the dominant species in every area. Acartia 2258 tonsa densities averaged over 5,500 numbers per cubic meter copepod naupliar 2259 stages found are generally six to 16 times greater than the number of adults. 2260 Some species expected were not seen. For instance, forty-two species of fish 2261 larvae and thirteen species of planktonic fish eggs have been identified in 2262 ichthyoplankton surveys (Blanchet, 1979), although the only fish sequences we 2263 found matched most closely with exotic cyprinids. The most abundant species 2264 found was the bay anchovy, accounting for over 75 % of all larvae identified and 2265 92 % of all fish eggs collected. Bay anchovy larvae peak during the months of 2266 this collection. 2267 Overall, identification of stomach contents by DNA metabarcoding has shown 2268 that menhaden sampled at TMC had a diet of mainly phytoplankton while those 2269 sampled at SVS had a diet of mainly zooplankton. The fish were an average of 2270 54.1 mm (TL) and 63.4 mm (SL) at TMC and an average of 85.3 mm (TL) and 2271 100.2 mm (SL) (Supplemental Table 4.4). Juvenile menhaden are obligate filter 2272 feeders. As juveniles they live in tidal creeks, marsh and open bay areas where 2273 they filter the water column via their gill rakers. Apalachicola Bay is a productive 2274 estuary located in the northern Gulf of Mexico. The high productivity is a result of 2275 the Apalachicola River delivering freshwater and nutrients to the bay (Livingston 2276 1984, Mortazavi et al 2000a,b, 2001). Apalachicola Bay is a river-dominated 2277 system with the major source of freshwater input coming from the Apalachicola 2278 River. Maximum river flows occur during late winter to early spring months and 2279 are highly correlated with Georgia rainfalls (Meeter 1979). Nutrient input supports 2280 high levels of phytoplankton productivity (Mortazavi et al 2000b) which in turn 2281 supports the Bay?s secondary productivity (Chanton & Lewis 2002). The 2282 migration pattern of juvenile Gulf menhaden in Appalachicola Bay involves the 2283 sequential use of tidal and marsh creeks in early spring, followed by the open 2284 bay areas like SVS later in the summer. Food availability increases early in the 2285 105 year in tidal creeks. High productivity comes from the influx of nutrients with high 2286 spring river flow, the flushing of detritus from the river mouth, coupled with 2287 warmer temperatures than the open bay area. It has been shown that detrital 2288 material can also be used as a food source in juvenile Gulf menhaden using 2289 stable isotope evidence (Deegan et al,1990). This results in rapid growth and 2290 high survival. However, in mid-summer, when food availability in the tidal creeks 2291 begins to decline, the fish move to the open bay area where phytoplankton and 2292 zooplankton are increasing. 2293 Prior to our study, the best evidence for dietary shifts in Gulf menhaden came 2294 from stable isotope analyses, although such studies cannot identify specific diet 2295 items. It has been shown that 13C enrichment can be used to determine the 2296 carbon source in the food web and 15N enrichment can be used to identify the 2297 foraging trophic level (Fry 1988, Vander Zanden et al, 1999). The source of the 2298 isotopes, therefore, provides insight into temporal, spatial, and ontogenetic 2299 variation of the consumer in the local environment. Olsen (Olsen et al, 2014) 2300 examined stable carbon (13C) and nitrogen (15N) isotope ratios traced through 2301 coastal food webs to investigate the trophic level of Gulf menhaden and their role 2302 along the Mississippi coast in the northern Gulf of Mexico ecosystem. On the 2303 basis of this, Olsen et al concluded that the most important dietary item for 2304 juvenile (<100 mm total length) fish was phytoplankton (74.0 % dietary 2305 composition), while that of subadults (100?200 mm) and adults (>200 mm) was 2306 zooplankton (61.6 % for sub-adults and 52.4 % for adults). Juvenile fish also 2307 utilized a larger component of terrestrial-based detritus as the source carbon 2308 than older sub-adult fish farther from the lower parts of the estuary and offshore. 2309 The authors suggested that juvenile menhaden are ?trophically balanced? 2310 between a phytoplanktivore and zooplanktivore with an opportunistic feeding 2311 strategy based on the available food sources but did proportionally consume two 2312 to three times more phytoplankton than larger menhaden. Our results suggest a 2313 somewhat different picture for juveniles that are essentially opportunists. Gulf 2314 menhaden juveniles consume high proportions of zooplankton in mid-summer if 2315 they are available. 2316 106 The Olsen studies noted that these trophic level calculations based on stable 2317 isotope analysis are very sensitive to the ?15N baselines used and were careful 2318 to compare stable isotope levels of menhaden stomach contents to that of four 2319 different size fractions of plankton, as well as stable isotope composition of black 2320 needle rush Juncus roemerianus, the dominant marsh grass. They found that 2321 isotope values for both ?13C and ?15N were significantly different between 2322 plankton size fractions. Changes in stable isotope tracers commonly occur 2323 across estuarine salinity gradients from freshwater to the sea (Fry, 2002). The 2324 tracer gradients reflect the different geochemistries and mixing of freshwater and 2325 seawater, and these bottom-up geochemical influences are recorded in estuarine 2326 food webs in the isotopic compositions of animals. Watershed-level inputs of 2327 freshwater and nutrients can exert strong influences on isotopic values of 2328 estuarine consumers, especially filter feeders that largely depend on 2329 phytoplankton production. Deviations from conservative isotope mixing can 2330 especially with inputs of non-phytoplankton foods such as macrophyte detritus. 2331 This means that the measuring of consumer isotopes may only reflect watershed 2332 nutrient loading. Consistent with this, Chanton & Lewis (2002) using stable 2333 isotope analysis, showed that the estuary consumer diets appeared to vary 2334 depending on their position within the estuary and that this was associated with 2335 riverine inflows. 2336 Our results show that the menhaden juvenile diet is mainly phytoplankton in 2337 fish from TMC in spring and almost entirely zooplankton in fish from SVS in mid- 2338 summer. Furthermore, significant abundance analysis shows significant 2339 differences in the species represented within many taxa at the two locations. 2340 These differences suggest that fish caught at SVS are feeding at a higher trophic 2341 level in. The stable isotope analysis shows an increase in ?15N in the muscle of 2342 fish from SVS, also supporting a difference in trophic level. However, although 2343 suggestive of a trophic shift, the difference in stable isotope composition is not 2344 definitive by itself since system-specific processes can alter local isotope values 2345 without the corresponding local baseline ?15N values. Local dissolved inorganic 2346 nitrogen sources can vary substantially with some upstream sources such as 2347 107 artificial fertilizers being quite light and some local producers like nitrogen-fixing 2348 cyanobacteria being anomalously light as well. However, the results from 18S 2349 rRNA gene sequences have provided a more complete description of the diet of 2350 fish from the two locations as they move away from the estuary to the enclosed 2351 bay, suggesting an early trophic shift in juveniles of this ecologically and 2352 economically important fish. 2353 4.6 Acknowledgements 2354 This work was supported by the National Oceanic and Atmospheric 2355 Administration, Educational Partnership Program award to the Living Marine 2356 Resources Cooperative Science Center (LMRCSC) NA11SEC4810002 and 2357 NA16SEC4810007. I was supported in part by MD SeaGrant Graduate Student 2358 Fellowship, NA10OAR4170072 SA7528129-D and in part as an LMRCSC 2359 Graduate Fellow. Thanks, are extended to Dr. Ryan Woodland for discussions on 2360 stable isotope analysis. 2361 2362 TMC (% SVS (% representation) representation) Metazoa 30.08 83.77 Stramenopiles 27.5 8.53 Alveolata 38.5 7.2 Rhizaria 0.36 0.09 Total SAR 66.36 15.82 Chloroplastida 0.85 1.05 Fungi 0.74 0.078 Cryptomonodales 2.12 - Supplemental Table 4.1: Representation of the most abundant higher taxa from Two Mile Channel and St Vincent Sound samples Mean proportional representation of the most abundant higher taxa, are shown as mean percentage reads from TMC and SVS. 2363 108 SAR ID TMC (% SVS (% representation) representation) Oligohymenophora ciliate 1.09 3.31 Mesodinidae ciliate 1.42 - Choreotrichia ciliate 8.01 18.95 Oligotrichia ciliate 12.63 5.59 Haptoria ciliate 7.31 0.079 All ciliates 30.46 27.9 Peridiniopsis dino 2.7 0.12 Gyrodinium dino 7.54 0.15 Peridiniales dino 2.7 0.12 Thoracosphaaeraceae dino 1.64 0.53 All dinos 14.58 0.92 All alveolates 45 28.82 Actinoptychus diatom 0.039 8.5 Odontella diatom 0.012 1.25 Uncultured diatom diatom 3.3 11.27 Thalassiosira diatom 4.12 12.03 Chaetoceros diatom 2.85 0.79 Cyclotella diatom 21.89 25.47 Skeletonema diatom 10.41 0.23 All diatoms 42.62 59 Supplemental Table 4.2: Taxonomic composition of the SAR clade at the genus level of stomach contents from Two Mile Channel and St Vincent Sound The mean proportional representation of SAR OTUs (species) at the genus level from TMC and SVS sites, as measured by mean percentage reads. Genera are grouped as ciliates, dinoflagellates and diatoms. 2364 109 Metazoa ID TMC (% SVS (% representation) representation) Acartia copepod 7.3 51.2 Balanus copepod 0.95 4.99 Armatobalanus copepod 5.2 Taeniacanthus copepod 4.4 Canuella copepod 12.22 Janicella copepod all copepods 21.2 66.4 shrimp 0.03 0.045 teleosts 9.9 10 Neotrigonia bivalve 7.2 - Neritina gastropod 1 - Barantolla Annelid:polychaete 20 0.8 Capitella annelid:polychaete 1.72 0.011 Cirratulus Annelid:polychaete 2.41 0.028 all polychaetes 24.13 0.839 Carinioma nemertean 3.34 0.03 Zygeupolia nemertean 1.05 - Gastrotricha nemertean 0.019 0.88 all nemertean worms 4.4 % 0.91 % Mesodesmo nematode 0.019 0.39 all rotifers 7.98 0.08 Danio rerio FW cyprinid from S. 4.73 5.25 Asia Betta pi FW perciform from 2.85 2.06 Thailand & Malaysia Osteobrama FW cyprinid from 1.53 1.71 belangeri India and Myanmar Misgurnus FW cyprinid from 0.86 1.02 anguillicaudatus India & Myanmar all fish 9.97 10.04 other 32.69 22.9 Supplemental Table 4.3: Taxonomic composition of Metazoa at the genus level of stomach contents from Two Mile Channel and St Vincent Sound The mean proportional representation of metazoan OTUs (species) at the genus level from TMC and SVS sites, as measured by mean percentage reads. Genera are grouped as copepods, rotifers, annelids, nematodes and nemerteans, bivalves, arthropods, gastropods and teleosts. 2365 110 Location Site Sample Date TL (mm) SL (mm) T1 48 54 T2 55 61 T3 61 70 T4 55 60 Two Mile T5 53 65 TMC01 T6 May 2013 Channel 50 62 T7 54 65 T8 49 58 T 9 71 84 T10 45 55 S1 83 103 SVS02 S2 86 97 S3 100 106 SVS03 S4 106 122 St. S5 85 105 Vincent S6 July 2013 SVS04 74 88 Sound S7 70 82 S8 84 99 SVS05 S9 85 100 S10 80 100 Supplemental Table 4.4: Length of fish caught at TMC and SVS: TMC avg TL = 54.1 mm; TMC avg SL = 63.4 mm; SVS avg TL = 85.3 mm; SVS avg SL = 100.2 mm 2366 2367 2368 111 Chapter 5: Future steps and final thoughts 2369 5.1 Limitations 2370 I have identified over a thousand potential diet items in the stomachs of Gulf 2371 menhaden using DNA metabarcoding without even considering any of the 2372 prokaryotic species as a food source. This seems like a major leap from 2373 describing the menhaden diet as "mainly phytoplankton" or "mainly zooplankton". 2374 An ongoing challenge for ecological studies has been the collection of data with 2375 high precision and accuracy at a sufficient scale to detect effects relevant to 2376 management of critical global change processes (like climate change). If we 2377 know that a forage fish like Gulf menhaden operate at the "wasp-waist" of an 2378 ecosystem, we need to know what they are eating, but also everything that is 2379 above and below. Biased observations of stomach content looking only at a 2380 narrow number of species expected to be present or drawing inferences from the 2381 small portion of the stomach contents that can be identified visually, are unable 2382 to describe the biodiversity of the diet (Lindenmayer & Likens, 2011). 2383 The current study had its limitations. For instance, although I had suitable 2384 fish from an interesting, productive estuary, I did not have samples of the water 2385 they came from to verify that these filter-feeding fish really did "sample the 2386 environment". A collaborator on the "megaproject" helpfully provided me with 2387 stable isotope analysis on muscle fillets from the very fish I used. However, I did 2388 not have the corresponding data on different size fractions of plankton, or detritus 2389 from the river mouth of off the marshes. Although this project on Gulf menhaden 2390 represented a large collaboration of interested parties, it foundered somewhat on 2391 adequate experimental design. Unfortunately, neither I nor my advisor had been 2392 able to give our input on the experimental design. Nevertheless, it is clear that 2393 the difference in stomach content in fish from TMC and those from SVS is 2394 startling and the percent representation of zooplankton in stomach samples from 2395 TMC is unexpectedly compared with older studies. Zooplankton have over 90 % 2396 representation in samples from SVS when the microzooplankton is folded in. In 2397 the samples from TMC, total zooplankton representation is almost 40 %. Beyond 2398 112 this, within each major taxa, whatever the representation, the species 2399 composition is quite different in the stomachs of fish from the two locations. 2400 Overall this is very different from the picture of juveniles consuming two to three 2401 times more phytoplankton than larger juveniles that have left the immediate tidal 2402 estuary areas as deduced from Olsen?s impressive stable isotope analysis 2403 (Olsen et al, 2014). I am left instead seeing juvenile menhaden as opportunistic 2404 feeders and this would seem to be a great strategy for a fish that spends its 2405 summer months in an enclosed bay with the possibility of plankton blooms and 2406 crashes dependent on factors like the inflow of water from the river. 2407 Apalachicola Bay, as a National Estuarine Research Reserve has been the 2408 subject of many research studies from both federal scientist and the faculty and 2409 students from surrounding academic institutions. I imagined that in such a well- 2410 studied estuary there might have been a comprehensive catalog of picoplankton 2411 as well as micro and mesozooplankton with information on seasonal variations. 2412 However, such in depth descriptions of the composition of planktonic species 2413 over the years and seasons do not seem to be available for Apalachicola Bay as 2414 they are for other important water bodies like the Great Lakes. 2415 5.2 How should we convert sequence reads to dietary data? 2416 When high throughput sequencing first became available, the potential 2417 applications in diet studies were clear and the methods were quickly embraced 2418 by the community (Deagle et al, 2009; Valentini et al, 2009). Studies reported by 2419 Deagle et al (2019) indicate that using relative read abundance (RRA) can 2420 provide an accurate overview of population-level diet, although using read counts 2421 as an indication of biomass in samples is more controversial. Relative read 2422 abundances are sensitive to DNA recovery biases, differential affinity of some 2423 sequences for the ?universal? primers used and differences in gene copy number 2424 between species (Alberdi et al, 2018). 2425 5.3 Studying ecosystems with DNA metabarcoding 2426 A major hurdle for many identification workflows is the time-consuming and 2427 challenging process of sorting and identifying organisms. The rapid development 2428 113 of DNA metabarcoding as a biodiversity observation tool provided a potential 2429 solution. Due to the high resolution and prey detection capacity DNA 2430 metabarcoding has been increasingly used to address ecological questions 2431 grounded in dietary relationships. One of the advantages of DNA-based 2432 techniques for prey identification purposes is that successful amplification can be 2433 achieved in samples that are usually not in optimal condition (i.e. feces and gut 2434 contents) as it only requires a small amount of tissue for DNA extraction 2435 (Teletchea, 2009). However, although the approach is certainly feasible, the "big 2436 data" generated is actually REALLY BIG such that adequate, thoughtful analysis 2437 can take a long time. However, to develop a picture of the whole ecosystem and 2438 how it can change over the years and seasons, accurate and reliable estimates 2439 of biodiversity are essential. This can feed into successful ecosystem 2440 management as well as shape environmental policy (Hooper et al 2005; Rees et 2441 al 2004). Monitoring biodiversity using DNA metabarcoding as a long term 2442 undertaking by a dedicated team could go a long way to providing this. An 2443 estuary such as Apalachicola Bay would be ideal site for this. Estuaries are 2444 highly productive systems that provide food, shelter and nursery habitats for 2445 greater density, survival rates and growth of juvenile fish (Beck et al, 2001; Kraus 2446 & Secor, 2005). Such a project could help provide data on the effects of climate 2447 change and perhaps help to project its impacts. It could monitor the effects of 2448 environmental catastrophes such as the Deepwater Horizon oil spill or follow the 2449 impacts of diverting water from the Apalachicola Rivers. It would allow us to ask 2450 questions such as: will climate-driven changes in planktonic species composition 2451 act as bottom-up regulators of productivity? 2452 5.4 Use of filter feeders as environmental biomonitors in environmental 2453 metabarcoding studies 2454 Environmental DNA metabarcoding and high throughput sequencing focuses 2455 on detecting organisms from the DNA trace they leave in the environment is a 2456 molecular biodiversity assessment method (Taberlet, et al, 2012; Thomsen et al, 2457 2012). In recent years the use of characterizing this environmental DNA (eDNA) 2458 114 using high throughput sequencing technology to study marine ecology has 2459 greatly increased. Various uses of this method and technology include monitoring 2460 marine fish biodiversity, detection of invasive species, effects of introducing or 2461 slowing conservation effort, planktonic identification and biomonitoring for rare 2462 taxa (Thomsen et al, 2012; Bohmann et al 2014, Cristescu et al, 2018). 2463 There is an increasing amount of work showing that the metabarcoding of 2464 eDNA samples is more effective in determining local aquatic community structure 2465 than traditional net surveys (Siegenthaler et al, 2019; Boussarie et al, 2018; 2466 Thomsen et al, 2012; Valentini et al, 2016). Some advantages to this method are 2467 the identification of small cryptic or decomposed organisms, reduced cost and 2468 effort for analysis, and independence from the developmental stage of organisms 2469 (Chariton et al, 2015; Hajibabaei et al, 2011; Lejzerowicz et al, 2015; Leray & 2470 Knowlton, 2015). However, one of the drawbacks of identification through DNA 2471 metabarcoding can be the inability to determine whether a DNA sequence comes 2472 from an egg, larva, juvenile or adult (Valentini et al 2016). 2473 Environmental DNA extracted from water samples usually integrates 2474 information over large spatial scales but has a low temporal resolution due to the 2475 high dispersion and the low persistence of DNA in sea water (Barnes & Turner, 2476 2016; Thomsen et al., 2012). There has been increasing use of filter feeding 2477 species as environmental samplers constituting a valuable and effective method 2478 for biomonitoring. Leeches and carrion flies have been used as biodiversity 2479 sampling tools in studies examining mammal biodiversity in terrestrial habitats 2480 (Calvignac?Spencer et al, 2013; Schnell et al, 2015, 2012). In the marine 2481 environment shrimp and sponges have been used to determine marine 2482 biodiversity (Siegenthaler et al, 2019; Mariani et al, 2019). In our study looking at 2483 the 18S rRNA gene sequencing results in Gulf menhaden, we find organisms 2484 that have not been identified in previous studies. For example, the shrimp, 2485 bivalve, and worm genera have not previously been reported in the stomach 2486 contents of menhaden. Given that the adult organisms are too large to be filtered 2487 via menhaden gill rakers, we conclude that some evidence of them (i.e., eggs, 2488 115 larvae, or decaying body part) is present in the water as part of marine snow or 2489 detritus. For this reason, we propose that menhaden should be added to the list 2490 of environmental biomonitors, using stomach contents as a proxy for ecosystem 2491 health. However, more research with menhaden is needed to verify that their 2492 stomach content accurately reflects what is in the water column. The area 2493 ?sampled? by a school of menhaden will likely provide information on a larger 2494 spatial scale than acquired by a water samples since the fish actively move 2495 around and shows seasonal movements. The ?sampled biodiversity? is naturally 2496 encapsulated in the stomachs of menhaden, from netting the fish all the way to 2497 DNA extraction in the laboratory. It represents a significant way to streamline and 2498 by?pass many of the fastidious steps required to reduce degradation and 2499 contamination when sampling water; a fact that is often understated in eDNA 2500 research. 2501 5.5 Accounting for functional ecological importance/significance of the stomach 2502 microbiota 2503 Microbial ecology using DNA metabarcoding focuses on identification of 2504 bacterial species but does not, by itself, provide insight of their function. There 2505 are even fewer studies on the diversity of microeukaryotic organisms such as 2506 diatoms, dinoflagellates and ciliates and their function. In the face of climate 2507 change more work needs to be done to understand how changing environments 2508 will affect the marine microbiota and how that will reflect marine processes, as 2509 well as how changes in the stomach microbiota of fish and other aquatic 2510 organisms. Will this result in a loss or gain in function? How will organisms cope 2511 with new microbial residence? Could new microbial residences aid with the 2512 adjustment in a new climate in terms of heat stress, disease, salinity changes, pH 2513 changes, etc. 2514 Given that fish have a more intimate interaction with the surrounding 2515 microbiota than land animals, interactions between the aquatic microbial 2516 community and the fish microbiota need further attention. My study saw 2517 differences in the stomach microbiota with respect to sampling location. Although 2518 116 profiling phylogenetic marker genes, such as the 16S rRNA gene, is a key tool 2519 for studies of microbial communities, it does not provide direct evidence of 2520 metabolic or functional capabilities of a microbial community. However, there is a 2521 computational approach to predict the functional composition of a microbial 2522 community from the OTUs found and a database of reference genomes. 2523 PICRUSt uses an extended ancestral-state reconstruction algorithm to predict 2524 which gene families are present and can combine gene families to estimate the 2525 composite functional capacity of a community (Langille et al, 2013; Douglas et al, 2526 2018). Phylogenetic trees based on 16S rRNA sequences closely resemble 2527 clusters obtained based on shared gene content, and researchers often infer 2528 properties of uncultured organisms from cultured relatives. This 'predictive 2529 metagenomic' approach could provide useful insights into the thousands of 2530 uncultivated microbial communities for which only marker gene surveys are 2531 currently available. 2532 5.6 The evolution of DNA metabarcoding analysis methods 2533 One limitation to DNA metabarcoding is simply the amount of data needed to 2534 be analyzed from a high throughput sequencing run. Depending on the number 2535 of samples reads can easily be in the order of 1 x 106. This poses many 2536 challenges in trying to process this data. This is further complicated by the 2537 unconventional type of data in that often is highly dimensional with the number of 2538 taxa much greater than the number of samples. This leads to many debates on 2539 best practices when analyzing such data and what inferences can be made from 2540 the subsequent results, for example the use of model-based for abundance 2541 analysis, when to rarefy, the use of normalization and differential abundance 2542 strategies, and taxonomic assignment (Tsilimigras & Fodor 2016; McMurdie & 2543 Holmes 2014; Weis et al, 2017; Hui 2016; White et al, 2009; Angiuoli et al, 2011, 2544 Caporaso et al, 2010). As new methods for analysis are constantly being 2545 developed it may be difficult for the analyzer to keep up. For example, during the 2546 course of this study several updates and methods for post sequencing analysis 2547 as well improved taxonomic assignment became available (Callahan et al, 2016). 2548 117 As these new and improved methods become available more insight can be 2549 gained from the same data. Therefore, it is imperative that sequence data be 2550 securely stored with services such as those offered by NCBI. 2551 A continuing limitation of high throughput sequencing is taxonomic database 2552 curation. Given that taxonomic assignment is based on a curated database the 2553 accuracy of the assignment is only as good as the database itself. Two issues 2554 arise when using a database. The first is how the taxa are ranked and the other 2555 is the completeness of the database. The taxonomic ranks of organisms can 2556 change with new research of that particular organism. This can pose an issue 2557 when looking at previous studies. For example, the SILVA taxonomic database 2558 used to assign taxonomy was updated during this study. In doing so, new taxa 2559 were added and some the taxonomic ranks were adjusted. This update led to the 2560 ability to determine the sequence of menhaden 18S rRNA which has been 2561 deposited in NCBI?s GeneBank. This in turn allowed me to determine the amount 2562 of menhaden DNA contamination in the stomach samples. It is important to 2563 maintain a well curated taxonomic database as the use of DNA metabarcoding in 2564 biodiversity studies are going to increase. Perhaps for the investigator one way 2565 around this would be to develop a database specific to his/her study, an 2566 approach not without its own problems. 2567 2568 118 Bibliography 2569 Able, KW. 2005. A re-examination of fish estuarine dependence: evidence for 2570 connectivity between estuarine and ocean habitats. Estuarine, Coastal and Shelf 2571 Science 64: 5?17. 2572 Ahrenholz, DW. 1981. Recruitment and exploitation of Gulf menhaden, 2573 Brevoortia patronus. Fish Bull US 79: 325?36. 2574 Ahrenholz, DW. 1991. Recruitment and exploitation of Gulf menhaden, 2575 Brevoortia patronus. Fishery Bulletin-United States,53, 3-19. 2576 Alder, J, Campbell, B, Karpouzi, V, Kaschner, K, & Pauly, D. 2008. Forage fish: 2577 from ecosystems to markets. Annual Rev Environ & Resources 33: 153. 2578 Amann, RI, Ludwig, W, & Schleifer, KH 1995. Phylogenetic identification and in 2579 situ detection of individual microbial cells without cultivation. Microbiol Rev 59: 2580 143?69. 2581 Angiuoli et al 2011 2582 Anderson, C, & Cabana, G. 2006. Does ?15N in river food webs reflect the 2583 intensity and origin of N loads from the watershed. Sci Total Environ, 367(2-3), 2584 968-978. 2585 Anderson, JD. 2007. Systematics of the North American menhadens: molecular 2586 evolutionary reconstructions in the genus Brevoortia (Clupeiformes: Clupeidae). 2587 Fishery Bulletin 105: 368?78. 2588 Arthur, SC 1931. The birds of Louisiana. Department of Conservation. 2589 Ashelford, KE, Weightman, AJ & Fry, JC. 2002. PRIMROSE: a computer 2590 program for generating and estimating the phylogenetic range of 16S rRNA 2591 oligonucleotide probes and primers in conjunction with the RDP-II database. 2592 Nucleic Acids Res 30: 3481-3489. 2593 Baker, R, Buckland, A, & Sheaves, M. 2014. Fish gut content analysis: robust 2594 measures of diet composition. Fish and Fisheries 15: 170?77. 2595 Bakun, A, Babcock, EA, & Santora, C. 2009. Regulating a complex adaptive 2596 system via its wasp-waist: grappling with ecosystem-based management of the 2597 New England herring fishery. ICES J Mar Sci 66: 1768?75. 2598 Bargiela R, Gertler, C, Magagnini, M., Mapelli, F, Chen, J, Daffonchio, D, 2599 Golyshin, PN, & Ferrer, M. 2015. Degradation network reconstruction in uric acid 2600 & ammonium amendments in oil-degrading marine microcosms guided by 2601 metagenomic data. Front Microbiol 6: 1270. 2602 Barnes, MA, & Turner, CR. 2016. The ecology of environmental DNA and 2603 implications for conservation genetics. Conservation Genetics, 17, 1?17. 2604 Beck, MW, Heck, Kl, Able, KW, Childers, Dl, Eggleston, DB, Gillanders, B, 2605 Weinstein, MP (2001). The identification conservation, and management of 2606 119 estuarine and marine nurseries for fish and invertebrates. BioScience, 51, 633? 2607 641. 2608 Bellemain, E., Carlsen, T., Brochmann, C., Coissac, E., Taberlet, P., & Kauserud, 2609 H. 2010. ITS as an environmental DNA barcode for fungi: an in silico approach 2610 reveals potential PCR biases. BMC Microbiol 10: 1. 2611 Berthold, M. R., Cebron, N., Dill, F., Gabriel, T. R., Kotter, T., Meinl, T., Ohl, P., 2612 Thiel, K., & WisIdel, B. 2009. KNIME: the Konstanz information miner. SIGKDD 2613 Explor Newsl 11: 26?31. 2614 Blanchet, RH 1978. The distribution and abundance of ichthyoplankton in the 2615 Apalachicola Bay, Florida, area. M.S. Thesis. Florida State University, 2616 Tallahassee. 2617 Blankenship, LE, & Yayanos, AA. 2005. Universal primers and PCR of gut 2618 contents to study marine invertebrate diets. Mol Ecol 14: 891?99. 2619 Blankenship, L., & Levin, LA. (2007). Extreme food webs: Foraging strategies 2620 and diets of scavenging amphipods from the ocean?s deepest 5 kilometers. 2621 Limnology and Oceanography, 52(4), 1685-1697. 2622 Bohmann, K., Monadjem, A., Lehmkuhl Noer, C., Rasmussen, M., Zeale, M. R., 2623 Clare, E. et al. (2011). Molecular diet analysis of two African free-tailed bats 2624 (Molossidae) using high throughput sequencing. PLoS One, 6, e21441. 2625 Bohmann, K, Evans, A, Gilbert, P, Thomas, P, Carvalho, GR, Creer, S, Knapp, 2626 D, Yu, W, Bruyn, M. 2014. Environmental DNA for wildlife biology and 2627 biodiversity monitoring. Trends in Ecology & Evolution 29: 358-367. 2628 Bohmann et al 2017. Next-Generation Global Biomonitoring: Large-scale, 472 2629 Automated Reconstruction of Ecological Networks. Trends in Ecology & 2630 Evolution. 2631 Bolyen E, Rideout, JR, Dillon, MR, Bokulich NA, Abnet, CC, Al-Ghalith, GA, 2632 Alexander, H, Alm, EJ, Arumugam, M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, 2633 Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodr?guez AM, 2634 Chase J, Cope EK, Da Silva R, Diener C, Dorrestein PC, Douglas GM, Durall 2635 DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, 2636 Gibbons SM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, 2637 Holste H, HuttenhoIr C, Huttley GA, Janssen S, Jarmusch AK, Jiang L, Kaehler 2638 BD, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, 2639 Kreps J, Langille MGI, Lee J, Ley R, Liu YX, Loftfield E, Lozupone C, Maher M, 2640 Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan 2641 SC, Morton JT, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, 2642 Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers 2643 A, Robeson MS 2nd, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song 2644 SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, 2645 Turnbaugh PJ, Ul-Hasan S, van der Hooft JJJ, Vargas F, V?zquez-Baeza Y, 2646 Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Iber KC, 2647 Williamson CHD, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R 2648 120 Boom, R., Sol, C. J., Salimans, M. M., Jansen, C. L., Irtheim-van Dillen, P. M., & 2649 van der Noordaa, J. 1990. Rapid and simple method for purification of nucleic 2650 acids. J Clin Microbiol 28: 495?503. 2651 Bradley, I. M., A. J. Pinto, and J. S. Guest. 2016. Design and evaluation of 2652 Illumina MiSeq-Compatible, 18S rRNA gene-specific primers for improved 2653 characterization of mixed phototrophic communities. Appl Environ Microbiol 82: 2654 5878-5891. 2655 Brown-Peterson, N. J., Leaf, R. T., Schueller, A. M., & Andres, M. J. 2017. 2656 Reproductive dynamics of gulf menhaden (Brevoortia patronus) in the northern 2657 Gulf of Mexico: effects on stock assessments. Fishery Bulletin 115: 284?99. 2658 Bundy, A., Boldt, J. L., & Cook, A. M. 2012. Relative importance of fisheries, 2659 trophodynamic and environmental drivers in a series of marine ecosystems. Mar. 2660 Ecol. 2661 Bush, A, Compson, ZG, Monk, W, Porter, TM, Steves, R, Emilson, E, Gagne, N, 2662 Hajlbabael, M, Roy, M & Baird, DJ. 2019 Studying ecosystems with DNA 2663 metabarcoding: lessons from aquatic macroinvertebrates. Front Ecol, 10.3389. 2664 https://www.frontiersin.org/articles/10.3389/fevo.2019.00434/fullA 2665 Caporaso JG. 2019 Reproducible, interactive, scalable and extensible 2666 microbiome data science using QIIME 2. Nat Biotechnol 37:852-857. 2667 Callahan, J. Benjamin, McMurdie, J. Paul, Rosen, J. Michael, Han, W. Andrew, 2668 Johnson, Jo. A. Amy, Holmes, P. Susan. 2016. DADA2: High-Resolution sample 2669 inference from Illumina amplicon data. Nature Methods 13: 281-583. 2670 Calvignac?Spencer, Leendertz, FH, Gilbert, MT, & Schubert, G. 2013. An 2671 invertebrate stomach's view on vertebrate ecology: Certain invertebrates could 2672 be used as ?vertebrate samplers? and deliver DNA?based information on many 2673 aspects of vertebrate ecology. BioEssays, 35: 1004-1013. 2674 Caporaso JG, Kuczynski, J, Stombaugh, J, Bittinger, K, Bushman, FD, Costello, 2675 EK, Fierer, N, Pena, AG, Goodrich, JK, & Gordon, JI. 2010. QIIME allows 2676 analysis of high-throughput community sequencing data. Nature Methods 7: 2677 335?36. 2678 Carreon-Martinez, L, & Heath, DD. 2010. Revolution in food web analysis and 2679 trophic ecology: diet analysis by DNA and stable isotope analysis. Mol Ecol 19: 2680 25?2. 2681 Carreon-Martinez, L, Johnson, TB, Ludsin, SA, & Heath, DD. 2011. Utilization of 2682 stomach content DNA to determine diet diversity in piscivorous fishes. J Fish Biol 2683 78: 1170?82. 2684 Chanton, JP. & Lewis, FG. 1999. Plankton and dissolved inorganic carbon 2685 isotopic composition in a river-dominated estuary: Appalachicola Bay, Florida. 2686 Estuaries 22:575-583. 2687 121 Chariton, AA, Stephenson, S, Morgan, MJ, Steven, AD, Colloff, MJ, Court, LN & 2688 Hardy, CM. 2015. Metabarcoding of benthic eukaryote communities predicts the 2689 ecological condition of estuaries. Env Pollution, 203: 165-174. 2690 Christmas, J. Y., & Gunter, G. 1960. Distribution of menhaden, genus Brevoortia, 2691 in the Gulf of Mexico. Transactions of the American Fisheries Society 89: 338-43. 2692 Christmas, JY., McBee, JT., Waller, RS., and Sutter, FC. 1982. Habitat Suitability 2693 Index Models: Gulf Menhaden. U.S. Department of the Interior Fish and Wildlife 2694 Service, FWS/OBS-82/10.23. p. 31. 2695 Christmas, J., & Waller, RS. 1975. Location and time of menhaden spawning in 2696 the Gulf of Mexico. Gulf Coast Research Laboratory. 2697 Claassen, S., duToit, E., Kaba, M., Moodley, C., Zar, H. J., AND Nicol, M. P. 2698 2013. A comparison of the efficiency of five different commercial DNA extraction 2699 kits for extraction of DNA from faecal samples. Journal of microbiological 2700 methods 94: 103?10. 2701 Cleary, AC, Durbin, EG, & Rynearson, TA. 2012. Krill feeding on sediment in the 2702 Gulf of Maine (North Atlantic). Mar Ecol Progr Ser 455: 157?72. 2703 Clements, K. D., Angert, E. R., Montgomery, W. L., AND Choat, J. H. 2014. 2704 Intestinal microbiota in fishes: what?s known and what?s not. Molecular ecology 2705 23: 1891?98. 2706 Clooney AG, Fouhy F, Sleator RD, O' Driscoll A, Stanton C, Cotter PD, Claesson 2707 MJ. 2016 Comparing Apples and Oranges: Next Generation Sequencing and Its 2708 Impact on Microbiome Analysis PLoS One, e0148028. 2709 Colston TJ, & Jackson, CR. 2016. Microbiome evolution along divergent 2710 branches of the vertebrate tree of life: what is known & unknown. Molecular 2711 ecology 25: 3776?800. 2712 Combs, R. M. 1969. Embryogenesis, Histology, and Organology of the Ovary of 2713 Brevoortia patronus. Gulf and Caribbean Research 2: 333?434. 2714 Corse, E., Megl?cz, E., Archambaud, G., Ardisson, M., Martin, J. F., Tougard, C., 2715 Chappaz, R., & Dubut, V. 2017. A from-benchtop-to-desktop workflow for 2716 validating HTS data and for taxonomic identification in diet metabarcoding 2717 studies. Mol Ecol Resour 17: e146?59. 2718 Craig, C, Kimmerer, WJ, Cohen, CS. 2013. A DNA-based method for 2719 investigating feeding by copepod nauplii J. Plankton Res. (2014) 36(1): 271?275. 2720 Cristescu, Melania E. and Hebert, Paul D. N., Uses and Misuses of 2721 Envrionmental DNA in Biodiversity Science and Conservation 2018. Annual 2722 Review of Ecology, Evolution, and Systematics. 49:209-230. 2723 Cury, P., Bakun, A., Crawford, R. J. M., Jarre, A., Qui?ones, R. A., Shannon, L. 2724 J., & Verheye, H. M. 2000. Small pelagics in upwelling systems: patterns of 2725 interaction and structural changes in wasp-waist ecosystems. ICES Journal of 2726 Marine Science: Journal du Conseil 57: 603?18. 2727 122 Dahlberg, M. D. 1970. Atlantic and Gulf of Mexico menhadens, genus Brevoortia 2728 (Pisces: Clupeidae). Ph.D. thesis, University of Florida. 2729 Dailey et al, 2008 2730 Deagle, B. E., Tollit, D. J., Jarman, S. N., Hindell, M. A., Trites, A. W., & Gales, 2731 N. J. (2005). Molecular scatology as a tool to study diet: analysis of prey DNA in 2732 scats from captive Steller sea lions. Mol Ecol, 14(6), 1831-1842. 2733 Deagle, B. E., Gales, N. J., Evans, K., Jarman, S. N., Robinson, S., Trebilco, R. 2734 et al. (2007). Studying seabird diet through genetic analysis of faeces: a case 2735 study on macaroni penguins (Eudyptes chrysolophus). PLoS One, 2(9), e831. 2736 Deagle, B. E., Kirkwood, R., & Jarman, S. N. (2009). Analysis of Australian fur 2737 seal diet by pyrosequencing prey DNA in faeces. Mol Ecol, 18(9), 2022-2038. 2738 Deagle, B. E., Chiaradia, A., McInnes, J., & Jarman, S. N. (2010). 2739 Pyrosequencing faecal DNA to determine diet of little penguins: is what goes in 2740 what comes out. Conservation Genetics, 11(5), 2039-2048. 2741 Deagle, B. E., Thomas, A. C., Shaffer, A. K., Trites, A. W., & Jarman, S. N. 2742 (2013). Quantifying sequence proportions in a DNA-based diet study using Ion 2743 Torrent amplicon sequencing: which counts count. Mol Ecol Resour, 13(4), 620- 2744 633. 2745 Deagle, BE, Jarman, SN, Coissac, E, Pompanon, F, & Taberlet, P. (2014). DNA 2746 metabarcoding and the cytochrome c oxidase subunit I marker: not a perfect 2747 match. Biol Lett, 10(9). 2748 Deagle, B. E., Clarke, L. J., Kitchener, J. A., Polanowski, A. M., & Davidson, A. T. 2749 (2018). Genetic monitoring of open ocean biodiversity: An evaluation of DNA 2750 metabarcoding for processing continuous plankton recorder samples. Mol Ecol 2751 Resour, 18(3), 391-406. 2752 Deagle, BE, Thomas, AC, McInnes, JC, Clarke, LJ., Vesterinen, EJ., Clare, EL. 2753 et al. (2019). Counting with DNA in metabarcoding studies: How should I convert 2754 sequence reads to dietary data. Mol Ecol, 28(2), 391-406. 2755 Deegan, LA. 1986. Changes in body composition and morphology of young-of- 2756 the-year gulf menhaden, Brevoortia patronus Goode, in Fourleague Bay, 2757 Louisiana. Journal of Fish Biology 29: 403?15. 2758 Deegan, LA 1993. Nutrient and energy transport between estuaries and coastal 2759 marine ecosystems by fish migration. Canadian Journal of Fisheries and Aquatic 2760 Sciences 50: 74?79. 2761 Deegan, LA, Peterson, BJ, & Portier, R. 1990. Stable isotopes & cellulase activity 2762 as evidence for detritus as a food source for juvenile Gulf menhaden. Estuaries 2763 13: 14?19. 2764 Deegan, LA, & Thompson, BA. 1987. Growth rate and life history events of 2765 Young?of?the?Year Gulf menhaden as determined from otoliths. Transacs 2766 American Fish Soc, 116: 663-667. 2767 123 Degnan, PH, & Ochman, H. 2016. Illumina-based analysis of microbial 2768 community diversity. ISME J 6: 183?94. 2769 De Barba, M., Miquel, C., Boyer, F., Mercier, C., Rioux, D., Coissac, E., & 2770 Taberlet, P. 2014. DNA metabarcoding multiplexing and validation of data 2771 accuracy for diet assessment: application to omnivorous diet. Mol Ecol Resour 2772 14: 306?23. 2773 Di Gennaro P., Terreni, P., Masi, G., Botti, S., De Ferra, F., & Bestetti, G. 2010. 2774 Identification & characterization of genes involved in naphthalene degradation in 2775 Rhodococcus opacus R7. Applied Microbiol & Biotechnol 87: 297?308. 2776 Djurhuus, A., K. Pitz, N. A. Sawaya, J. Rojas-M?rquez, B. Michaud, E. Montes, 2777 F. Muller-Karger, & M. Breitbart. 2018. Evaluation of marine zooplankton 2778 community structure through environmental DNA metabarcoding. Limnol 2779 Oceanogr Methods 16: 209-221. 2780 Douglas, GM, Belko, RG, Langille, MG. 2018. Predicting the Functional Potential 2781 of the Microbiome from Marker Genes Using PICRUSt. Methods Mol Biol, 2782 1849169-177. 2783 Dunshea, G., Barros, N. B., Ills, R. S., Gales, N. J., Hindell, M. A., & Jarman, S. 2784 N. (2008). Pseudogenes and DNA-based diet analyses: a cautionary tale from a 2785 relatively Ill sampled predator-prey system. Bull Entomol Res, 98(3), 239-248. 2786 Durbin A. G., & Durbin, E. G. 1975. Grazing rates of the Atlantic menhaden, 2787 Brevoortia tyrannus, as a function of particle size & concentration. Mar Biol 33: 2788 265?77. 2789 Edgar, R. C. 2010. Search & clustering orders of magnitude faster than BLAST. 2790 Bioinformatics 26: 2460?61. 2791 Egerton S., Culloty, S., Whooley, J., Stanton, C., & Ross, R. P. 2018. The Gut 2792 Microbiota of Marine Fish. Front Microbiol 9: 873. 2793 Eichmiller JJ, Miller LM, Sorensen PW. 2016. Optimizing techniques to capture 2794 and extract environmental DNA for detection and quantification of fish. Mol Ecol 2795 Resour. 16:56-68. 2796 Essington, T. E., Moriarty, P. E., Froehlich, H. E., Hodgson, E. E., Koehn, L. E., 2797 Oken, K. L., Siple, M. C., AND Stawitz, C. C. 2015. Fishing amplifies forage fish 2798 population collapses. Proc Natl Acad Sci U S A 112: 6648?52. 2799 Estabrook, R. H. (1973). Phytoplankton of Apalachicola Bay. Ph.D thesis, Florida 2800 State University. 2801 Etzold, D. J., and J. Y. Christmas. 1979. A Mississippi marine finfish 2802 management plan. MS-AL Sea Grant Consortium MASGP-78-046 1?12. 2803 Fadrosh D. W., Ma, B., Gajer, P., Sengamalay, N., Ott, S., Brotman, R. M., & 2804 Ravel, J. 2014. An improved dual-indexing approach for multiplexed 16S rRNA 2805 gene sequencing on the Illumina MiSeq platform. Microbiome 2: 6. 2806 124 Felske, A., & Akkermans, A. D. L. (1998). Spatial homogeneity of abundant 2807 bacterial 16S rRNA molecules in grassland soils. Microbial Ecology, 36(1), 31-36. 2808 Fierer, EK, Pena, N, Goodrich, AG, & Gordon, J. I. 2010. QIIME allows analysis 2809 of high-throughput community sequencing data. Nature Methods 7: 335?36. 2810 Fillbrunn, A., Dietz, C., Pfeuffer, J., Rahn, R., Landrum, G. A., & Berthold, M. R. 2811 2017. KNIME for reproducible cross-domain analysis of life science data. J 2812 Biotechnol 261: 149?56. 2813 Folmer, O., Black, M., Hoeh, W., Lutz, R., & Vrijenhoek, R. 1994. DNA primers 2814 for amplification of mitochondrial cytochrome c oxidase subunit I from diverse 2815 metazoan invertebrates. Mol Mar Biol Biotechnol 3: 294?99. 2816 Fore, P. L. 1970. Oceanic distribution of eggs and larvae of the Gulf menhaden. 2817 Forin-Wiart, M. A., Poulle, M. L., Piry, S., Cosson, J. F., Larose, C., & Galan, M. 2818 2018. Evaluating metabarcoding to analyse diet composition of species foraging 2819 in anthropogenic landscapes using Ion Torrent and Illumina sequencing. Sci Rep 2820 8: 17091. 2821 Franklin, 2007 2822 Franklin, H. B. 2011. The most important fish in the sea: Menhaden and America. 2823 Island Press, 280 pp. 2824 Friedland, K. D., Haas, L. W., & Merriner, J. V. 1984. Filtering rates of the 2825 juvenile Atlantic menhaden Brevoortia tyrannus (Pisces: Clupeidae), with 2826 consideration of the effects of detritus & swimming speed. Marine Biology 84: 2827 109?17. 2828 Friedland, K. D., AND Haas, L. W. 1988. Emigration of Juvenile Atlantic 2829 Menhaden, Brevoortia tyrannus (Pisces, Clupeidae), from the York River 2830 Estuary. Estuaries 11: 45?50. 2831 Friedland, K. D., Ahrenholz, D. W., Smith, J. W., Manning, M., & Ryan, J. 2006. 2832 Sieving functional morphology of the gill raker feeding apparatus of Atlantic 2833 menhaden. J Exp Zool A Comp Exp Biol 305: 974?85. 2834 Fry, B. 1988. Food web structure on Georges Bank from stable C, N, and S 2835 isotopic compositions. Limnology and Oceanography 33:1182?1190. 2836 Fry, B 2002 Conservative Mixing of Stable Isotopes Across Estuarine Salinity 2837 Gradients: A Conceptual Framework for Monitoring Watershed Influences on 2838 Downstream Fisheries Production. Estuaries, 25: 264-271. 2839 Gamboa-Delgado, J., Canavate, J. P., Zerolo, R., & Vay, L. (2008). Natural 2840 carbon stable isotope ratios as indicators of the relative contribution of live and 2841 inert diets to growth in larval Senegalese sole (Solea senegalensis). Aquaculture. 2842 Geers, T. M., Pikitch, E. K., AND Frisk, M. G. 2016. An original model of the 2843 northern Gulf of Mexico using Ecopath with Ecosim and its implications for the 2844 effects of fishing on ecosystem structure and maturity. Deep-Sea Res II 129: 2845 319?31. 2846 125 Gertler C., Bargiela, R., Mapelli, F., Han, X., & Chen, J. 2015. Conversion of uric 2847 acid into ammonium in oil-degrading marine microbial communities: a possible 2848 role of Halomonads. Microbial Ecol 70: 724?40. 2849 Ghanbari M., Kneifel, W., & Domig, K. J. 2015. A new view of the fish gut 2850 microbiome: advances from next-generation sequencing. Aquaculture 448: 464? 2851 75. 2852 Gl?sner, A. H., Reischl, J., AND Gessner, U. 2015. Analyses of Intestinal 2853 Microbiota: Culture versus Sequencing. ILAR J 56: 228?40. 2854 Givens C. E., Ransom, B., Bano, N., & Hollibaugh, J. T. 2015. Comparison of the 2855 gut microbiomes of 12 bony fish & 3 shark species. Mar Ecol Progr 518: 209?23. 2856 Go??biewski, M., AND Tretyn, A. 2019. Generating amplicon reads for microbial 2857 community assessment with next-generation sequencing. J Appl Microbiol doi: 2858 10.1111. 2859 Goode, G.B. 1878. A revision of the American species of the genus Brevoortia, 2860 with a description of a new species from the Gulf of Mexico. Proceedings of the 2861 U.S. National Museum 1:30-42. 2862 Gossling, J., & Slack, J. M. 1974. Predominant gram-positive bacteria in human 2863 feces: numbers, variety, and persistence. Infect Immun 9: 719?29. 2864 Grabowski, P., & Rappsilber, J. 2019. A Primer on Data Analytics in Functional 2865 Genomics: How to move from data to insight. Trends Biochem Sci 44: 21?32. 2866 Greenstone, M. H., Iber, D. C., Coudron, T. A., Payton, M. E., AND Hu, J. S. 2867 2012. Removing external DNA contamination from arthropod predators destined 2868 for molecular gut-content analysis. Mol Ecol Resour 12: 464?69. 2869 Gu, B., Schell, D. M., & Huang, X. (1996). Stable isotope evidence for dietary 2870 overlap between two planktivorous fishes in aquaculture ponds. Can J. 2871 Gutierrez T., Singleton, D. R., Berry, D., Yang, T., Aitken, M. D., & Teske, A. 2872 2013. Hydrocarbon-degrading bacteria enriched by the Deepwater Horizon oil 2873 spill identified by cultivation & DNA-SIP. The ISME journal 7: 2091. 2874 Gunter, G. 1945. Studies on marine fishes of Texas. Austin, Publications of the 2875 Institute of Marine Science, University of Texas, Austin, Texas. 1(1):1-190. 2876 Hajibabaei et al, 2011. Assessing biodiversity of a freshwater benthic 568 2877 macroinvertebrate community through non-destructive environmental barcoding 2878 of DNA 569 from preservative ethanol. 2879 Hanif A., White, J., Place, A. R., & Jagus, R. 2020. Methodology for the 2880 identification of stomach contents in the filter feeding fish (Brevoortia patronus) 2881 using DNA metabarcoding. Limnol & Oceano Methods submitted 2882 Han, Z., Sun, J., Lv, A., & Wang, A. 2019. Biases from different DNA extraction 2883 methods in intestine microbiome research based on 16S rDNA sequencing: a 2884 case in the koi carp, Cyprinus carpio var. Koi. Microbiology, 8: e00626. 2885 126 Harms-Tuohy, C. A., Schizas, N. V., & Appeldoorn, R. S. (2016). Use of DNA 2886 metabarcoding for stomach content analysis in the invasive lionfish Pterois 2887 volitans in Puerto Rico. Marine Ecology Progress Series, 558, 181-191. 2888 Harper, G. L., Sheppard, S. K., Harwood, J. D., Read, D. S., Glen, D. M., 2889 Bruford, M. W. et al. (2006). Evaluation of temperature gradient gel 2890 electrophoresis for the analysis of prey DNA within the guts of invertebrate 2891 predators. Bulletin of Entomological Research, 96(3), 295-304. 2892 Harris, T. W., Lee, R., Schwarz, E., Bradnam, K., Lawson, D., Chen, W. et al. 2893 (2003). WormBase: a cross-species database for comparative genomics. Nucleic 2894 Acids Res, 31(1), 133-137. 2895 Hiergeist, A., Gl?sner, J., Reischl, U., & Gessner, A. (2015). Analyses of 2896 intestinal microbiota: culture versus sequencing. ILAR journal, 56(2), 228-240. 2897 Hoffmann, C., Dollive, S., Grunberg, S., Chen, J., Li, H., Wu, G. D., Lewis, J. D., 2898 & Bushman, F. D. 2013. Archaea and fungi of the human gut microbiome: 2899 correlations with diet and bacterial residents. PLoS One 8: e66019. 2900 Holechek, J. L., Vavra, M., AND Pieper, R. D. 1982. Botanical composition 2901 determination of range herbivore diets: a review. Journal of Range Management 2902 309?15. 2903 Hooper et al, 2005 2904 Houde, E. D., & Fore, P. L. 1973 Guide to the identity of eggs and larvae of some 2905 Gulf of Mexico clupeid fishes. Florida Department of Natural Resources, Marine 2906 Research Laboratory Leaflet Series 4, (Part1, No. 23), 14 pp. 2907 Hugerth, L. W., Muller, E. E. L., Hu, Y. O. O., Lebrun, L. A. M., Roume, H., 2908 Lundin, D., Wilmes, P., & Andersson, A. F. 2014. Systematic design of 18S rRNA 2909 gene primers for determining eukaryotic diversity in microbial consortia. PLoS 2910 One 9: e95567. 2911 Hui 2016 2912 Hyslop, E. J. 1980. Stomach contents analysis-a review of methods and their 2913 application. J Fish Biol 17: 411?29. 2914 Ingerson-Mahar, J. 2002. Relating diet and morphology in adult carabid beetles. 2915 The Agroecology of Carabid Beetles 111?36. 2916 Illumina Document #15027987v1 2917 Jakubavi?i?t?, E., Bergstr?m, U., Ekl?f, J. S., Haenel, Q., & Bourlat, S. J. 2017. 2918 DNA metabarcoding reveals diverse diet of the three-spined stickleback in a 2919 coastal ecosystem. PLoS One 12: e0186929. 2920 Jami M., Ghanbari, M., Kneifel, W., & Domig, K. J. 2015. Phylogenetic diversity & 2921 biological activity of culturable Actinobacteria isolated from freshwater fish gut 2922 microbiota. Microbiol Res 175: 6?15. 2923 127 Jarman, S. N., Deagle, B. E., & Gales, N. J. (2004). Group?specific polymerase 2924 chain reaction for DNA?based analysis of species diversity and identity in dietary 2925 samples. Molecular Ecology, 13:1313-1322. 2926 Jarman, S. N., Redd, K. S., & Gales, N. J. 2006. Group-specific primers for 2927 amplifying DNA sequences that identify Amphipoda, Cephalopoda, 2928 Echinodermata, Gastropoda, Isopoda, Ostracoda and Thoracica. Molecular 2929 Ecology Notes 6: 268?71. 2930 Jiang, B., Sun, J., Lv, A., Hu, X., Shi, H., Sung, Y., Wang, Q., & Wang, Y. 2019. 2931 Impact of DNA extraction methods on the observed microbial communities from 2932 the intestinal flora of the penaeid shrimp Litopenaeus vannamei. FEMS Microbiol 2933 Lett 366: 10.1093. 2934 Jo, H., Ventura, M., Vidal, N., & Gim, J. S. (2016). Discovering hidden 2935 biodiversity: the use of complementary monitoring of fish diet based on DNA 2936 barcoding in freshwater ecosystems. Ecol & Evoln 6: 219-232. 2937 Jordan et al, 2005 2938 Judy, MH, & Lewis, R. 1983. Distribution of eggs and larvae of Atlantic 2939 menhaden, Brevoortia tyrannus, along the Atlantic coast of the United States. US 2940 Department of Commerce, National Oceanic and Atmospheric Administration, 2941 National Marine Fisheries Service. 2942 June, FC, & Carlson, FT 1971. Food of young Atlantic menhaden, Brevoortia 2943 tyrannus, in relation to metamorphosis. Fish Bull 68: 493?512. 2944 Kendall, A. W., AND Reintjes, J. W. 1975. GEOGRAPHIC AND 2945 HYDROGRAPHIC DISTRIBUTION OF ATLANTIC MENHADEN EGGS AND 2946 LARVAE ALONG MIDDLE ATLANTIC COAST FROM RV DOLPHIN CRUISES, 2947 1965-66. Fishery Bulletin 73: 317?35. 2948 Ketchum, R. N., Dieng, M. M., Vaughan, G. O., Burt, J. A., & Idaghdour, Y. 2016. 2949 Levels of genetic diversity and taxonomic status of Epinephelus species in United 2950 Arab Emirates fish markets. Mar Pollut Bull 105: 540?45. 2951 King, R. A., Read, D. S., Traugott, M., & Symondson, W. O. (2008). Molecular 2952 analysis of predation: a review of best practice for DNA-based approaches. Mol 2953 Ecol, 17(4), 947-963. 2954 King G. M., Judd, C., Kuske, C. R., & Smith, C. 2012. Analysis of stomach & gut 2955 microbiomes of the eastern oyster (Crassostrea virginica) from coastal Louisiana, 2956 USA. PLoS One 7: e51475. 2957 King, G. M., Smith, C., Tolar, B., & Hollibaugh, J. T. (2013). Analysis of 2958 composition and structure of coastal to mesopelagic bacterioplankton 2959 communities in the northern Gulf of Mexico. Front Microbiol. 3: 438. 2960 Klindworth, A., E. Pruesse, T. SchIer, J. Peplies, C. Quast, M. Horn, & F. O. 2961 Gl?ckner. 2012. Evaluation of general 16S ribosomal RNA gene PCR primers for 2962 classical and next-generation sequencing-based diversity studies. Nucleic Acids 2963 Research 41: 1-11. 2964 128 Koo et al, 2015 2965 Kostka J. E., Prakash, O., Overholt, W. A., Green, S. J., Freyer, G., Canion, A., 2966 Delgardio, J., Norton, N., Hazen, T. C., & Huettel, M. 2011. Hydrocarbon- 2967 degrading bacteria & the bacterial community response in Gulf of Mexico beach 2968 impacted by the Deepwater Horizon oil spill. Appl Environ Microbiol 77: 7962?74. 2969 Kraus, RT, & Secor, D. H. 2005. Application of the nursery-role hypothesis to an 2970 estuarine fish. Marine Ecology Progress Series 291: 301. 2971 Kroger, R. L., AND Guthrie, J. F. 1973. Migrations of tagged juvenile Atlantic 2972 menhaden. Transactions of the American Fisheries Society 102: 417?22. 2973 Lagier, J.-C., Million, M., Hugon, P., Armougom, F.,& Raoult, D. 2012. Human 2974 gut microbiota: repertoire and variations. Front Cell Infect Microbiol 2: 136. 2975 Langille, M. G. and others. 2013. Predictive functional profiling of microbial 2976 communities using 16S rRNA marker gene sequences. Nat Biotechnol 31: 814? 2977 21. 2978 Lewis, V. P., & Peters, D. S. (1994). Diet of juvenile and adult Atlantic menhaden 2979 in estuarine and coastal habitats. Transactions of the American Fisheries 2980 Society, 123(5), 803-810. 2981 Lassuy, 1 D. R. 1983. Species profiles-life histories & environmental 2982 requirements (Gulf of Mexico): Gulf menhaden. repositoriestdlorg. 2983 Lejzerowicz, F, Esling, P, Pillet, L, Wilding, TA, Black & Pawlowski, J. 2015. 2984 High-throughput sequencing and morphology perform equally well for benthic 2985 monitoring of marine ecosystems. Sci Report, 5: 13932. 2986 Leray, M, & Knowlton, N. 2015. DNA barcoding and metabarcoding of 2987 standardized samples reveal patterns of marine benthic diversity. PNAS 112: 2988 2076?81. 2989 Leray, M, Yang, JY, Meyer, CP, Mills, SC, Agudelo, N, RanIz, V, Boehm, JT & 2990 Machida, RJ. 2013. A new versatile primer set targeting a short fragment of the 2991 mitochondrial COI region for metabarcoding metazoan diversity: application for 2992 characterizing coral reef fish gut contents. Front Zool 10: 34. 2993 Lessa, EP, & Applebaum, G. (1993). Screening techniques for detecting allelic 2994 variation in DNA sequences. Molecular ecology, 2, 119-129. 2995 Lewis, RM, & Roithmayr, CM. 1981. Spawning and sexual maturity of gulf 2996 menhaden, Brevoortia patronus. Fishery Bulletin 78(4):947-951. 2997 Lewis VP & Peters, DS 1994. Diet of juvenile & adult Atlantic menhaden in 2998 estuarine & coastal habitats. Transactions of the American Fisheries Society 123: 2999 803?10. 3000 Li J, Lawson Handley LJ, Read DS, H?nfling B.2018. The effect of filtration 3001 method on the efficiency of environmental DNA capture and quantification via 3002 metabarcoding. Mol Ecol Resour. doi: 10.1111/1755-0998.12899. 3003 129 Liljestrand, EM, Wilberg, MJ, AND Schueller, AM. 2019. Estimation of movement 3004 and mortality of Atlantic menhaden during 1966?1969 using a Bayesian multi- 3005 state mark-recovery model. Fisheries Research 210: 204?13. 3006 Lindeque, PK, Parry, HE, Harmer, RA., Somerfield, PJ, & Atkinson, A. 2013. 3007 Next generation sequencing reveals the hidden diversity of zooplankton 3008 assemblage. PLoS One 8: e81327. 3009 Lindenmeyer, D. B., AND Likens, G. E. 2011. Direct measurement versus 3010 surrogate indicator species for evaluating environmental change and biodiversity 3011 loss. Ecosystems, 3012 Livingston, RJ, X. Niu, GF Lewis III, & Woodsum, GC. 1997. Freshwater input to 3013 a Gulf estuary: Long-term control of trophic organization. Ecological Applications 3014 7:277-299. 3015 Livingston, RJ. 2007. Phytoplankton bloom effects on a gulf estuary: water 3016 quality changes and biological response. Ecological Applications 17: S110?28 3017 Lochman, R, & Phillips, H. (1996). Stable isotopic evaluation of the relative 3018 assimilation of natural and artificial foods by golden shiners Notemigonus 3019 crysoleucas in ponds. Journal of the World Aquaculture 27:168-177. 3020 Lozupone, C. A., M. Hamady, S. T. Kelley, and R. Knight. 2007. Quantitative and 3021 qualitative beta diversity measures lead to different insights into factors that 3022 structure microbial communities. Appl Environ Microbiol 73: 1576-1585. 3023 Lynch, PD, Brush, MJ, Condon, ED, & Latour, RJ. 2010. Net removal of nitrogen 3024 through ingestion of phytoplankton by Atlantic menhaden Brevoortia tyrannus in 3025 Chesapeake Bay. Marine Ecology Progress Series 401: 195?209. 3026 Majaneva, M, Diserud, OH, Eagle, SHC, Bostr?m, E, Hajibabaei, M, Ekrem T. 3027 2018 Environmental DNA filtration techniques affect recovered biodiversity. Sci 3028 Rep 8:4682. 3029 Mariani, S, Baillie, C, Colosimo, G, Riesgo. 2019 Sponges as natural 3030 environmental DNA samplers. Current Biology 29: R395-R402. 3031 Maroneze, D. M., Tupinambas, T. H., Alves, C. B., & Vieira, F. (2011). Fish as 3032 ecological tools to complement biodiversity inventories of benthic 3033 macroinvertebrates. Hydrobiologia 3034 Maroneze, D. M., Tupinambas, T. H., Alves, C. B., & Vieira, F. (2011). Fish as 3035 ecological tools to complement biodiversity inventories of benthic 3036 macroinvertebrates. Hydrobiologia DOI 10.1007/s10750-011-0747-8. 3037 Martin, D. L., Ross, R. M., Quetin, L. B., & Murray, A. E. 2006. Molecular 3038 approach (PCR-DGGE) to diet analysis in young Antarctic krill Euphausia 3039 superba. Marine Ecology Progress Series 319: 155?65. 3040 McMurdie, PJ, & Holmes, S. 2014. Waste not, want not: why rarefying 3041 microbiome data is inadmissible. PLoS Comput Biol 10: e1003531. 3042 130 Meeter D. A., Livingston, R. J., & Woodsum, G. C. 1979. Ecological processes in 3043 coastal & marine systems. 315?38. 3044 Menhaden Advisory Commission. 2015. The Mehaden Fishery of the Gulf of 3045 Mexico: a regional management plan. Gulf States Marine Fisheries Commission. 3046 Meusnier, I., Singer, G. A. C., Landry, J.-F., Hickey, D. A., Hebert, P. D. N., & 3047 Hajibabaei, M. 2008. A universal DNA mini-barcode for biodiversity analysis. 3048 BMC Genomics 9: 214. 3049 Minello, TJ, & Webb, JW. 1997. Use of natural and created Spartina alterniflora 3050 salt marshes by fishery species and other aquatic fauna in Galveston Bay, 3051 Texas, USA. Mar EcolProgr Ser 151: 165?79. 3052 Moortazavi, B, Iverson, RL, Landing, WM, & Huang, W. 2000. Phosphorus 3053 budget of Apalachicola Bay: a river-dominated estuary in the northeastern Gulf of 3054 Mexico. Mar Ecol Progr Ser 198: 33?42. 3055 Moortazavi, B, Iverson, RL, Landing, WM, Lewis, FG, & Huang, W. 2000. Control 3056 of phytoplankton production and biomass in a river-dominated estuary: 3057 Apalachicola Bay, Florida, USA. Mar Ecol Progr Ser 198: 19-31. 3058 Moortazavi, B, Iverson, RL, & Huang, W. 2000. 2001. Dissolved organic nitrogen 3059 and nitrate in Apalachicola Bay, Florida: spatial distributions and monthly 3060 budgets. Mar Ecol Progr Ser 214: 79-91. 3061 Moreby, S. J. 1988. An aid to the identification of arthropod fragments in the 3062 faeces of gamebird chicks (Galliformes). Ibis 130: 519?26. 3063 M?hling, M., J. Woolven-Allen, J. C. Murrell, & I. Joint. 2008. Improved group- 3064 specific PCR primers for denaturing gradient gel electrophoresis analysis of the 3065 genetic diversity of complex microbial communities. ISME J 2: 379-392. 3066 Murray, D. C., Bunce, M., Cannell, B. L., Oliver, R., Houston, J., White, N. E. et 3067 al. (2011). DNA-based faecal dietary analysis: a comparison of qPCR and high 3068 throughput sequencing approaches. PLoS One, 6(10), e25776. 3069 Muyzer, G., De Waal, E. C., & Uitterlinden, A. G. (1993). Profiling of complex 3070 microbial populations by denaturing gradient gel electrophoresis analysis of 3071 polymerase chain reaction-amplified genes coding for 16S rRNA. Appl. Environ. 3072 Microbiol., 59(3), 695-700. 3073 Nelson, W. R., Ingham, M. C., AND Schaaf, W. E. 1977. LARVAL TRANSPORT 3074 AND YEAR-CLASS STRENGTH OF ATLANTIC MENHADEN, BREVOORTIA- 3075 TYRANNUS. Fishery Bulletin 75: 23?41. 3076 Nicholson, WR. 1978. Gulf menhaden, Brevoortia patronus, purse seine fishery: 3077 Catch, fishing activity, and age and size composition, 1964-73, (ed.), US 3078 Department of Commerce, National Oceanic and Atmospheric Administration, 3079 National Marine Fisheries Service. 3080 Nicholson, N, 1978. Gulf menhaden, Brevoortia patronus, purse seine fishery: 3081 Catch, fishing activity, and age and size composition, 1964-73. US Department of 3082 131 Commerce, National Oceanic and Atmospheric Administration, National Marine 3083 Fisheries Service, Technical Report SSRF-722. 8 pp. 3084 NOAA-NMFS. 2015. Forecast for the 2015 Gulf and Atlantic Menhaden Purse- 3085 Seine Fisheries and Review of the 2014 Fishing Seaso. Gulf and Atlantic 3086 Forecast 2015. 3087 O?Farrell, H., Gruss, A., Sagarese, S. R., Babcock, E. A., & Rose, K. A. 2017. 3088 Ecosystem modeling in the Gulf of Mexico: current status and future needs to 3089 address ecosystem-based fisheries management and restoration activities. Rev 3090 Fish Biol Fisheries 27: 587?614. 3091 Olsen, Z., Fulford, R., Dillon, K., & Graham, W. 2014. Trophic role of gulf 3092 menhaden Brevoortia patronus examined with carbon and nitrogen stable 3093 isotope analysis. Marine Ecology Progress Series 497: 215?27. 3094 PACHECO, A. L., AND G. C. GRANT. 1965. Studies of the Early Life History of 3095 Atlantic Menhaden in Estuarine Nurseries: Part I, Seasonal Occurrence of 3096 Juvenile Menhaden and Other Small Fishes in a Tributary Creek of Indian River, 3097 Delaware, 1957-58. US Fish and Wildlife Service. 3098 Peck, J. I. 1893. On the food of menhaden. Bull US Fish Comm. p113. 3099 Pendleton, B. A. G. 1989. Proceedings of the Northeast Raptor Management 3100 Symposium and Workshop. Institute for Wildlife Research, National Wildlife 3101 Federation. 3102 Peters D. S., & Schaaf, W. E. 1981. Food requirements & sources for juvenile 3103 Atlantic menhaden. Transactions of the American Fisheries Society 110: 317?24. 3104 Peterson, B. J., & Fry, B. (1987). Stable isotopes in ecosystem studies. AnnRev 3105 Ecol Systemat. 18: 293-320. 3106 Pikitch, E., Boersma, P. D., Boyd, I. L., Conover, D. O., Cury, P., Essington, T., 3107 Heppell, S. S., Houde, E. D., Mangel, M., & Pauly, D. 2012. Little fish, big impact: 3108 managing a crucial link in ocean food Ibs. Lenfest Ocean Program, Washington, 3109 DC 108: csiro:EP12435. 3110 Pikitch, E. K., Rountos, K. J., Essington, T. E., Santora, C., Pauly, D., Watson, 3111 R., Sumaila, U. R., Boersma, P. D., Boyd, I. L., AND Conover, D. O. 2014. The 3112 global contribution of forage fish to marine fisheries and ecosystems. Fish and 3113 Fisheries 15: 43?64. 3114 Plag?nyi, ?. E., AND Essington, T. E. 2014. When the SURFs up, forage fish are 3115 key. Fisheries Research 159: 68?74. 3116 Pollock, J., Glendinning, L., WisedchanIt, T., & Watson, M. 2018. The Madness 3117 of Microbiome: Attempting to find consensus ?Best Practice? for 16S microbiome 3118 studies. Appl Environ Microbiol 84: e02627-17. 3119 Pompanon, F., Deagle, B. E., Symondson, W. O. C., Brown, D. S., Jarman, S. 3120 N., & Taberlet, P. 2012. Who is eating what: diet assessment using next 3121 generation sequencing. Molecular Ecology 21: 1931?50. 3122 132 Potter, IC, Tweedley, JR, Elliott, M, AND Whitfield, AK. 2015. The ways in which 3123 fish use estuaries: a refinement and expansion of the guild approach. Fish and 3124 Fisheries 16: 230?39. 3125 Post, DM. 2002. Using stable isotopes to estimate trophic position: models, 3126 methods, and assumptions. Ecology 83: 703?18. 3127 Powell (1993). M. (2002). Using stable isotopes to estimate trophic position: 3128 models, methods, and assumptions. Ecology, 83, 703-718. 3129 Pruesse, E., C. Quast, K. Knittel, B. M. Fuchs, W. Ludwig, J. Peplies, & F. O. 3130 Gl?ckner. 2007. SILVA: a comprehensive online resource for quality checked 3131 and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids 3132 Res 35: 7188-7196. 3133 Putland, J. N. (2005). Ecology of phytoplankton, Acartia tonsa, and 3134 microzooplankton in Apalachicola Bay, Florida. Florida State University. 3135 Putland, J. N., & Iverson, R. L. (2007). Ecology of Acartia tonsa in Apalachicola 3136 Bay, Florida, and implications of river water diversion. Mar Ecol Progr Series 340, 3137 173-187. 3138 Putland, J. N., & Iverson, R. L. (2007). Microzooplankton: major herbivores in an 3139 estuarine planktonic food Ib. Mar Ecol Progr Series 345, 63-73. 3140 Quail, M. A., Smith, M., Coupland, P., Otto, T. D., Harris, S. R., Connor, T. R., 3141 Bertoni, A., SIrdlow, H. P., & Gu, Y. 2012. A tale of three next generation 3142 sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and 3143 Illumina MiSeq sequencers. BMC Genomics 13: 341. 3144 Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., SchIer, T., Yarza, P., Peplies, J., 3145 & Gl?ckner, F. O. 2013. The SILVA ribosomal RNA gene database project: 3146 improved data processing and Ib-based tools. Nucleic Acids Res 41: D590?6. 3147 Rajili?-Stojanovi?, M., Smidt, H., & de Vos, W. M. 2007. Diversity of the human 3148 gastrointestinal tract microbiota revisited. Environ Microbiol 9: 2125?36. 3149 Reid, G. K. 1955. A summer study of the biology and ecology of East Bay, 3150 Texas. Part II. The fish fauna of East Bay, the Gulf beach, and summary. tamug- 3151 irtdlorg 3152 Reintjes , JW. 1962 Development of eggs and yolk-sac larvae of yellowfin 3153 menhaden. U.S. Dep. Inter., Fish Wildl. Serv., Fish Bull. 62: 93-102 3154 Reintjes, JW. 1970. The Gulf menhaden and our changing estuaries. Proc Gulf 3155 Carrib Fish Inst 13: 87?90. 3156 Riemann, LH, Alfredsson, MM, Hansen, TD, Als, TG, Nielsen, P, Munk, K, 3157 Aarestrup, GE, Maes, H, Sparholt, MI, Petersen, M, Bachler & Castonguay, M. 3158 2010. Qualitative assessment of the diet of European eel larvae in the Sargasso 3159 Sea resolved by DNA barcoding. Biol Lett 6: 819?22. 3160 133 Riccioni, G., Stagioni, M., Piccinetti, C., AND Libralato, S. 2018. A metabarcoding 3161 approach for the feeding habits of European hake in the Adriatic Sea. Ecol Evol 3162 8: 10435?47. 3163 Robinson , K., Ruzicka, J., Hernandez, F., Graham, W., Decker, M., Brodeur, R., 3164 Sutor, M., 2015. Evaluating energy flows through jellyfish and gulf menhaden 3165 (Brevoortia patronus) and the effects of fishing on the northern Gulf of Mexico 3166 ecosystems. ICES Journal of Marine Science. 72 10.1093/icesjms/fsv088. 3167 Roeselers G., Mittge, E. K., Stephens, W. Z., Parichy, D. M., Cavanaugh, C. M., 3168 Guillemin, K., & Rawls, J. F. 2011. Evidence for a core gut microbiota in the 3169 zebrafish. ISME J 5: 1595?608. 3170 Roithmayr, C. M., & Waller, R. A. 1963. Seasonal occurrence of Brevoortia 3171 patronus in the northern Gulf of Mexico. Transactions of the American Fisheries 3172 Society 92: 301?2. 3173 Rollo, F., Ubaldi, M., Ermini, L., & Marota, I. (2002). Otzi?s last meals: DNA 3174 analysis of the intestinal content of the Neolithic glacier mummy from the Alps. 3175 Proc Natl Acad Sci U S A, 99(20), 12594-12599. 3176 Sagarese, SR, Nuttall, MA, Geers, TM, Lauretta, MV, Walter, JF, Serafy, JE. 3177 2016. Quantifying the trophic importance of Gulf menhaden within the Northern 3178 Gulf of Mexico ecosystem. Mar & Coastal Fish, 8:23-45. 3179 Schiebelhut, LM, Abboud, SS, G?mez Daglio, LE, Swift, HF, & Dawson, MN. 3180 2017. A comparison of DNA extraction methods for high-throughput DNA 3181 analyses. Mol Ecol Resour 17: 721?29. 3182 Schloss, PD, Istcott, SL, Ryabin, T, Hall, JR, Hartmann, M, Hollister, EB, 3183 Lesniewski RA, Oakley BB, Parks DH, Robinson CJ, Sahl JW, Stres B, 3184 Thallinger GG, Van Horn DJ, Iber, CF 2009. Introducing mothur: Open- 3185 source,platform-independent, community-supported software for describing and 3186 comparing microbial communities. Applied and Environ Microbiol, 75, 7537? 3187 7541. 3188 Schrader, C, Schielke, A, Ellerbroek, L, & Johne, R. 2012. PCR inhibitors - 3189 occurrence, properties and removal. J Appl Microbiol 113: 1014?26. 3190 Schnell, IB., Bohmann, K, & Gilbert, MTP. 2015. Tag jumps illuminated?reducing 3191 sequence?to?sample misidentifications in metabarcoding studies. Molecular 3192 ecology resources 15: 1289?303. 3193 Schnell, IB, Thomsen, PF, & Gilbert, MTP. 2012. Screening mammal biodiversity 3194 using DNA from leeches. Curr Biol, 22: R262-R263. 3195 SEDAR, 2103. SEDAR 34A ? Gulf of Mexico Menhaden Stock Assessment 3196 Report. SEDAR, North Charleston, SC 422 p. 3197 SEDAR63, 2018. SEDAR 63 ? Gulf of Mexico Menhaden Stock Assessment 3198 Report. SEDAR, North Charleston, SC 3199 134 Shaw, RF, Cowan, JH, & Tillman, TL. 1985. Distribution and density of Brevoortia 3200 patronus (Gulf menhaden) eggs and larvae in the continental shelf waters of 3201 western Louisiana. Bulletin of Mar Sci 36: 96?103. 3202 Siegenthaler, A, Wangensteen, OS, Soto, AZ, Benvenuto, C, Corrigan, L, & 3203 Mariani, S. (2018). Metabarcoding of shrimp stomach content: Harnessing a 3204 natural sampler for fish biodiversity monitoring. Mol. Ecol. Resour. 19, 206?220. 3205 Simon M, Scheuner, C, Meier-Kolthoff, JP, Brinkhoff, T, Wagner-D?bler, I, 3206 Ulbrich, M, Klenk, HP, Schomburg, D, Petersen, J, & G?ker, M. 2017. 3207 Phylogenomics of Rhodobacteraceae reveals evolutionary adaptation to marine 3208 & non-marine habitats. ISME J 11: 1483?99. 3209 Simmons, EG. 1967. The Texas menhaden fishery. Texas Parks and Wildlife 3210 Department. 3211 Simmons, EG, & Breuer, JP. 1964. edition. The Texas menhaden fishery. Texas 3212 Parks and Wildlife Department Bulletin. 3213 Simpson, C. A., Bi, H., Liang, D., Wilberg, M. J., Schueller, A. M., Nesslage, G. 3214 M., & Walsh, H. J. 2017. Spawning locations and larval dispersal of Atlantic 3215 Menhaden during 1977?2013. ICES Journal of Marine Science 74: 1574?86. 3216 Simpson et al, 2016 3217 Soininen, EM, Valentini, A, Coissac, E, Miquel, C, Gielly, L, Brochmann, C. et al. 3218 (2009). Analysing diet of small herbivores: the efficiency of DNA barcoding 3219 coupled with high-throughput pyrosequencing for deciphering the composition of 3220 complex plant mixtures. Front Zool, 6, 16. 3221 Song W, Li, L, Huang, H, Jiang, ., Zhang, F, Chen, X, Zhao, M, & Ma, L. 2016. 3222 The gut microbial community of Antarctic fish detected by 16S rRNA gene 3223 sequence analysis. Biomed Res Int 2016: 3241529. 3224 Sullam KE, Essinger, SD, Lozupone, C., O?Connor, MP, Rosen, GL, Knight, R, 3225 Kilham, SS, & Russell, JA 2012. Environmental & ecological factors that shape 3226 the gut bacterial communities of fish: a meta-analysis. Mol Ecol 21: 3363?78. 3227 Suttkus, RD 1956. Early life history of the Gulf menhaden, Brevoortia patronus. 3228 Louisiana Transactions of the North American Wildlife Conference 21: 390?406. 3229 Symondson, W. O. C. 2002. Molecular identification of prey in predator diets. Mol 3230 Ecol 11: 627?41 3231 Taberlet, P, Coissac, E, Pompanon, F, Brochmann, C, Willerslev, E. 2012. 3232 Towards next-generation biodiversity assessment using DNA metabarcoding. 3233 Mol. Ecol, 21: 2045-2050. 3234 Tarnecki A., Burgos, FA, Ray, CL, & Arias, CR. 2017. Fish intestinal microbiome: 3235 diversity and symbiosis unravelled by metagenomics. J Appl Microbiol 123: 2?17. 3236 Telechea, 2009 3237 135 Thomsen, PF, Kielgast, J, Iversen, LL, M?ller, PR, Rasmussen, M, & Willerslev, 3238 E. 2012. Detection of a diverse marine fish fauna using environmental DNA from 3239 seawater samples. PLoS ONE, 7, e41732. 3240 Tsilimigras, MC and Fodor, AA. 2016. Compositional data analysis of the 3241 microbiome: fundamentals, tools, and challenges. Ann Epidemiol., 26(5): 330-5. 3242 Tuoinambas, T. H., Callisto, M., & Santos, G. B. (2007). Benthic 3243 macroinvertebrate assemblages structure in two headwater streams, south- 3244 eastern Brazil. Rev Brasileira de Zool, 24. 3245 Turner, WR. 1969. Life history of menhadens in the eastern Gulf of Mexico. 3246 Transactions of the American Fisheries Society 98: 216?24. 3247 Ushio, M. 2019. Use of a filter cartridge combined with intra?cartridge bead? 3248 beating improves detection of microbial DNA from water samples. Methods in 3249 Ecology & Evolution 10: 1142?56. 3250 Valentini, A, Miquel, C, Nawaz, MA, Bellemain, E, Coissac, E, Pompanon, F, 3251 Gielly, L, Cruaud, C, Nascetti, G, Wincker, P, Swenson, JE, Taberlet, P, 2009. 3252 New perspectives in diet analysis based on DNA barcoding and parallel 3253 pyrosequencing: the trnL approach. Mol. Ecol. Resour. 9: 51e60. 3254 Van Valkenburg, SD, Jones, JK, & Heinle, DR. 1978. A comparison by size class 3255 and volume of detritus versus phytoplankton in Chesapeake Bay. Estuarine and 3256 Coastal Marine Science 6: 569?82. 3257 Vander Zanden, MJ, & Rasmussen, JB. 1999. Primary consumer ?15N and 3258 ?13C and the trophic position of aquatic consumers. Ecology 80: 1395?1404. 3259 Vanysacker, L, Declerck, SA, Hellemans, B, De Meester, L, Vankelecom, I, & 3260 Declerck, P. 2010. Bacterial community analysis of activated sludge: an 3261 evaluation of four commonly used DNA extraction methods. Applied Microbiology 3262 and Biotechnology 88: 299?307. 3263 Vaughan, D. S., Shertzer, K. W., AND Smith, J. W. 2007. Gulf menhaden 3264 (Brevoortia patronus) in the US Gulf of Mexico: fishery characteristics and 3265 biological reference points for management. Fisheries Research 83: 263?75. 3266 Walter J. M., Bagi, A., & Pampanin, D. M. 2019. Insights into the potential of the 3267 Atlantic cod gut microbiome as biomarker of oil contamination in the marine 3268 environment. Microorganisms 7: 209. 3269 Wang, Y, Tian, RM, Gao, ZM, Bougouffa, S, & Qian, P-Y. 2014. Optimal 3270 eukaryotic 18S and universal 16S/18S ribosomal RNA primers and their 3271 application in a study of symbiosis. PLoS One 9: e90053. 3272 Waraniak, JM, Marsh, TL, & Scribner, KT. 2019. 18S rRNA metabarcoding diet 3273 analysis of a predatory fish community across seasonal changes in prey 3274 availability. Ecol Evol 9: 1410?30. 3275 Ward, DM., Weller, R, & Bateson, MM. 1990. 16S rRNA sequences reveal 3276 numerous uncultured microorganisms in a natural community. Nature 345: 63? 3277 65. 3278 136 Weigand, H, A.J. Beermann, F. ?iampor, F. O. Costa, Z. Csabai, S. Duarte, M. 3279 F. Geiger, M. Grabowski, F. Rimet, B. Rulik, M. Strand, N. Szucsich, A. M. 3280 Iigand, E. Willassen, S. A. Wyler, A. Bouchez, A. Borja, Z. ?iamporov?- 3281 Za?ovi?ov?, S. Ferreira, K. B. Dijkstra, U. Eisendle, J. Freyhof, P. Gadawski, W. 3282 Graf, A. Haegerbaeumer, B. B. van der Hoorn, B. Japoshvili, L. Keresztes, E. 3283 Keskin, F. Leese, J. N. Macher, T. Mamos, G. Paz, V. Pe?i?, D. M. Pfannkuchen, 3284 M. A. Pfannkuchen, B. W. Price, B. Rinkevich, M. A. L. Teixeira, G. V?rb?r?, & T. 3285 Ekrem. 2019. DNA barcode reference libraries for the monitoring of aquatic biota 3286 in Europe: Gap-analysis and recommendations for future work. Sci Total Environ 3287 678: 499-524. 3288 Weisburg, WG, Barns, SM, Pelletier, DA, & Lane, DJ. 1991. 16S ribosomal DNA 3289 amplification for phylogenetic study. J Bacteriol 173: 697?703. 3290 Weis, S., Xu, ZZ,. Peddada, S., et al. 2017. Normalization and microbial 3291 differential abundance strategies depend upon data characteristics. Microbiome 3292 5, 27. 3293 White, JR, Nagarajan, N, & Pop, M. 2009. Statistical methods for detecting 3294 differentially abundant features in clinical metagenomic samples. PLoS Comput 3295 Biol 5: e1000352. 3296 Wustman, BA., Lind, J., Wetherbee, R., Gretz, MR., 1998. Extracellular Matrix 3297 Assembly in Diatoms (Bacillariophyceae). Plant Physiology Apr. 116(4) 1431- 3298 1441. 3299 Yarza, P, Yilmaz, P, Pruesse, E, Gl?ckner, FO, Ludwig, W, Schleifer, KH. et al. 3300 (2014). Uniting the classification of cultured and uncultured bacteria and archaea 3301 using 16S rRNA gene sequences. Nat Rev Microbiol, 12(9), 635-645. 3302 Zarrinpar, A, Chaix, A, Yooseph, S, & Panda, S. (2014). Diet and feeding pattern 3303 affect the diurnal dynamics of the gut microbiome. Cell Metab, 20(6), 1006-1017. 3304 Zeale, MR, Butlin, RK, Barker, GL, Lees, DC, & Jones, G. (2011). Taxon-specific 3305 PCR for DNA barcoding arthropod prey in bat feces. Mol Ecol Resour, 11(2), 3306 236-244. 3307 3308 137