P677 - MiFish


Project Aim

"The goal would be to be able to name as many OSUs as possible down to species level: Since we have species level in the traditional data lists and we hope that eDNA locally detects the same fish as the traditional method, possibly some more (DNA transport, false negatives in electric fishing). On the one hand, we do not expect so many species, and on the other hand, the differences between their environmental requirements are relatively large, so a good annotation would be important as we link communities to water quality in a second step. So, as always, both would be important: as many taxonomic annotations as possible, and as trustworthy ones as possible."
- Jeanine Brantschen

Overview

  • Data Summary: MiSeq run200720 (12S) overview, data summary, and a description of the data processing.
  • Primer Evaluation: Primer information, expected amplicon size, and taxa coverage.
  • MiFish Reference: Sanger sequenced 12S rRNA reference collection from Swiss fish.
  • Clustering: Classical approach with UPARSE and UNOISE clustering.
  • ZOTU Annotation: ZOTU annotation predictions with the SINTAX classifier and available references (MitoFish and MIDORI).
  • Data Analysis: Data interpretation based on the ZOTU count table with taxonomic annotation.
  • OSU Mapping: Mapping cleaned reads to MiFish reference.
  • Phylogenetic Comparison: MiFish and ZOTUs.

Summary

  • The clustering approach worked, but the number of ZOTUs (7309) was high. There are a few dominant ZOTUs, but the majority are rare.
  • The annotation with the MitoFish reference and the SINTAX classifier worked but produced few annotations, and few of them reached deep taxonomic ranks. Increasing the diversity of the reference helped us understand the problems with the previous annotation, but it did not improve annotation depth. Mixing different references can be problematic because of inconsistent nomenclature and increased redundancy.
  • Co-amplification of non-fish targets is difficult to prevent. Most of the ZOTUs (71%) are assigned to bacteria rather than eukaryotes. We also found evidence of human, cattle, pig, and a few other eukaryotes. These species do not have perfect primer sites and amplify only with at least three mismatches.
  • The OSU approach worked, but back-mapping efficiency varies (50% - 80%) depending on the parameters used. The results show that identity (ID) has a bigger influence on efficiency than query coverage (QC). Although more reads map to the OSUs if we lower ID, the fish-species composition does not change much.
  • We can also increase back-mapping efficiency by including known fish species (cod from the positive samples) and some possible non-fish targets (e.g. human) in the reference.
  • The top ZOTU sequences are identical to the top OSUs, but not all top ZOTUs have species-level annotations. A possible explanation is the lack of local species in the existing references combined with the high diversity of closely related but non-native species.
  • The quality and depth of the annotation/association depend on the reference. The more diverse and specific it is, the better the annotation. The ideal reference should include only possible target species, and for each species we would need many sequences from multiple individuals (sequence diversity).
  • There are limits to both approaches. The 12S amplicon is ~170 nt long, so its resolving power is limited. For example, within the genus Salmo we are not able to distinguish Salmo salar from the other Salmo species (e.g. Salmo trutta, Salmo carpio, Salmo labrax, Salmo marmoratus).
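
The dependence of back-mapping efficiency on the identity (ID) and query coverage (QC) thresholds can be illustrated with a minimal Python sketch. This is not the pipeline used for this project: the `Hit` record, the `mapping_efficiency` function, the threshold pairs, and the toy hits are all hypothetical and chosen only to show how the two filters interact.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One read-to-reference alignment (hypothetical record layout)."""
    read_id: str
    target: str
    identity: float        # percent identity of the alignment (0-100)
    query_coverage: float  # percent of the read covered by the alignment (0-100)


def mapping_efficiency(hits, total_reads, min_id, min_qc):
    """Fraction of all reads with at least one hit passing both thresholds."""
    mapped = {
        h.read_id
        for h in hits
        if h.identity >= min_id and h.query_coverage >= min_qc
    }
    return len(mapped) / total_reads if total_reads else 0.0


# Toy data, invented for illustration only; a fifth read has no hit at all.
hits = [
    Hit("r1", "Salmo_trutta", 99.4, 100.0),
    Hit("r2", "Salmo_trutta", 97.5, 100.0),
    Hit("r3", "Gadus_morhua", 98.2, 85.0),
    Hit("r4", "Homo_sapiens", 91.0, 100.0),
]
total_reads = 5

# Sweep a few (ID, QC) threshold pairs and report the efficiency for each.
for min_id, min_qc in [(99, 100), (97, 100), (97, 80), (90, 80)]:
    eff = mapping_efficiency(hits, total_reads, min_id, min_qc)
    print(f"ID>={min_id} QC>={min_qc}: {eff:.0%} of reads mapped")
```

A sweep like this makes it easy to see which threshold drives efficiency on real data, and the same filtered hits can then be tallied per target to check whether the species composition changes.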

Feedback

Let me know if you have questions or requests, or if you spot errors. This is all work in progress, and constructive comments are welcome. Thanks! ➜ jean-claude.walser-at-env.ethz.ch.