
MIDORI - Teleo

In-silico PCR

  • Primer: Teleo
  • Reference: MIDORI-based and fish-related fragments

for E in {0..3}
do
  # ${u}: assumed to hold the path to the USEARCH executable
  # PCR (expected amplicon size: 63 nt)
  ${u} -search_pcr MIDORI_FishHits.fa \
       -db Primer_Teleo.fa -strand both \
       -maxdiffs ${E} \
       -minamp 20 \
       -maxamp 200 \
       -pcrout MIDORI_Teleo_e${E}.hits \
       -ampout MIDORI_Teleo_e${E}.fa
done

PCR Fragment-Length Distribution:

# N(e0) = 4,064
#  80  1
#  90 ***************************** 1,307
# 100 ************************************************************ 2,747
# 110  5
# 120  3
# 130  0
# 140  0
# 150  0
# 160  1

# N(e1) = 5,751
#  80  7
#  90 **************************** 1,797
# 100 ************************************************************ 3,898
# 110  19
# 120  29
# 130  0
# 140  0
# 150  0
# 160  1

# N(e2) = 6,337
#  80  8
#  90 ******************************* 2,161
# 100 ************************************************************ 4,119
# 110  19
# 120  29
# 130  0
# 140  0
# 150  0
# 160  1

# N(e3) = 6,382
#  80  8
#  90 ******************************** 2,182
# 100 ************************************************************ 4,143
# 110  19
# 120  29
# 130  0
# 140  0
# 150  0
# 160  1

# Note: error == mismatch
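
The distributions above can be recomputed from the amplicon FASTA files. The following is a minimal sketch, not necessarily the tool that produced the histograms: `length_hist` is a hypothetical helper, and labelling each 10-nt bin by its lower bound is an assumption, so the bin labels may be shifted relative to the listing above.

```shell
# Sketch: bin amplicon lengths from a FASTA file into 10-nt bins.
# Assumption: each bin is labelled by its lower bound, floor(len/10)*10.
# Usage: length_hist MIDORI_Teleo_e0.fa
length_hist() {
  awk '
    # On each header, close the previous record and reset the length counter
    /^>/ { if (len > 0) bins[int(len / 10) * 10]++; len = 0; next }
    # Sequence line: accumulate length (handles multi-line records)
         { len += length($0) }
    END  {
      if (len > 0) bins[int(len / 10) * 10]++
      for (b in bins) printf "%d\t%d\n", b, bins[b]
    }
  ' "$1" | sort -n
}
```

Each output line holds the bin label and the number of amplicons in that bin, separated by a tab.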

Problematic Hits

We remove PCR hits that have a primer mismatch within the last two positions at the 3'-end. This choice was made for reproducibility with Thang et al. (2020).

for E in {0..3}
do
  echo "Mis-Match: ${E}"
  # Keep hits whose last two primer positions (3'-end) contain no mismatch letter
  awk -F"\t" '{if($5 == "Teleo-F" && $8 == "Teleo-R" && substr($7,length($7)-1,2) !~ "[ATCG]" && substr($10,length($10)-1,2) !~ "[ATCG]") print ">"$1"\n"$12}' MIDORI_Teleo_e${E}.hits > MIDORI_Teleo_e${E}_clean.fa
  # Collect hits with a mismatch letter in the last two positions of either primer
  awk -F"\t" '{if($5 == "Teleo-F" && $8 == "Teleo-R" && (substr($7,length($7)-1,2) ~ "[ATCG]" || substr($10,length($10)-1,2) ~ "[ATCG]")) print ">"$1"\n"$12}' MIDORI_Teleo_e${E}.hits > MIDORI_Teleo_e${E}_out.fa
done
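
To make the 3'-end test easier to follow, here is a small sketch that applies the same check to a single primer alignment string. `three_prime_ok` is a hypothetical helper, and the toy strings are placeholders. It assumes, as the filter above implies, that a mismatched position is shown as an A, C, G or T letter and that any other character marks a matching position.

```shell
# Sketch: classify a primer alignment string by its last two (3'-end) positions.
# Assumption: a mismatch appears as A/C/G/T; any other character is a match.
three_prime_ok() {
  awk -v s="$1" 'BEGIN {
    tail = substr(s, length(s) - 1, 2)              # last two characters
    print ((tail ~ /[ATCG]/) ? "out" : "clean")     # same regex as the filter
  }'
}

three_prime_ok "..A..............."   # mismatch away from the 3'-end: clean
three_prime_ok ".................G"   # mismatch at the last position: out
```

A hit is kept only when both primer strings (columns 7 and 10 of the `.hits` file) pass this test.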

Results (sequence counts):

## Clean Hits
cfa MIDORI_Teleo_e*_clean.fa
# MIDORI_Teleo_e0_clean.fa: 4,064
# MIDORI_Teleo_e1_clean.fa: 5,708
# MIDORI_Teleo_e2_clean.fa: 6,280
# MIDORI_Teleo_e3_clean.fa: 6,319

## Removed Hits
cfa MIDORI_Teleo_e*_out.fa
# MIDORI_Teleo_e0_out.fa:  0
# MIDORI_Teleo_e1_out.fa: 43
# MIDORI_Teleo_e2_out.fa: 57
# MIDORI_Teleo_e3_out.fa: 63

## Removed Records (mismatches at the 3'-end):
* Species: Acanthaphritis unoorum (ID:270607)
* Species: Amblygaster sirm (ID:997022)
* Species: Anguilla mossambica (ID:48164)
* Species: Apteronotus albifrons (ID:36673)
* Species: Apteronotus rostratus (ID:1479096)
* Species: Ariosoma balearicum (ID:182421)
* Species: Barilius bendelisis (ID:209118)
* Species: Barilius malabaricus (ID:1982766)
* Species: Betadevario ramachandrani (ID:794813)
* Species: Cabdio morar (ID:1504033)
* Species: Callanthias japonicus (ID:270594)
* Species: Carassius carassius (ID:217509)
* Species: Chelon dumerili (ID:693640)
* Species: Chelon planiceps (ID:1111461)
* Species: Cynoglossus puncticeps (ID:435148)
* Species: Danio dangila (ID:127599)
* Species: Danio margaritatus (ID:487618)
* Species: Danionella dracula (ID:623740)
* Species: Danio nigrofasciatus (ID:144739)
* Species: Danio rerio (ID:7955)
* Species: Devario chrysotaeniatus (ID:496980)
* Species: Devario devario (ID:46781)
* Species: Devario laoensis (ID:437613)
* Species: Echiichthys vipera (ID:94984)
* Species: Electrona carlsbergi (ID:123328)
* Species: Esomus metallicus (ID:353259)
* Species: Etheostoma tuscumbia (ID:54347)
* Species: Ictalurus pricei (ID:64534)
* Species: Luciosoma bleekeri (ID:643386)
* Species: Luciosoma setigerum (ID:487620)
* Species: Microdevario kubotai (ID:857698)
* Species: Microdevario nana (ID:487621)
* Species: Microrasbora erythromicron (ID:432391)
* Species: Microrasbora rubescens (ID:451687)
* Species: Nannocharax schoutedeni (ID:1104209)
* Species: Opsaridium microlepis (ID:1346795)
* Species: Opsaridium ubangiense (ID:643343)
* Species: Oxydoras niger (ID:238584)
* Species: Paraplagusia bilineata (ID:1148453)
* Species: Paraplagusia blochii (ID:366904)
* Species: Promethichthys prometheus (ID:349644)
* Species: Raiamas buchholzi (ID:857649)
* Species: Raiamas guttatus (ID:75360)
* Species: Raiamas senegalensis (ID:516811)
* Species: Raiamas steindachneri (ID:1001924)
* Species: Rineloricaria stewarti (ID:1748020)
* Species: Salmostoma bacalia (ID:497986)
* Species: Sigmops gracilis (ID:48457)
* Species: Sillaginopsis panijus (ID:270580)
* Species: Squalomugil nasutus (ID:1040953)
* Species: Trachinus draco (ID:56737)
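
The reported counts are internally consistent: for each mismatch level, clean plus removed hits equal the amplicon total (for example 5,708 + 43 = 5,751 for e1 and 6,319 + 63 = 6,382 for e3). The sketch below automates that check. `count_fa` is a hypothetical stand-in for the `cfa` helper used above, and the check assumes the `-ampout` FASTA holds one record per hit.

```shell
# Sketch: check that clean + removed hits add up to the amplicon total.
# count_fa is a hypothetical stand-in for cfa: it counts FASTA header lines.
count_fa() { grep -c '^>' "$1" || true; }

check_split() {   # args: total.fa clean.fa out.fa
  total=$(count_fa "$1"); clean=$(count_fa "$2"); out=$(count_fa "$3")
  if [ "$((clean + out))" -eq "$total" ]; then
    echo "OK: $clean + $out = $total"
  else
    echo "MISMATCH: $clean + $out != $total"
  fi
}

# Usage:
# for E in 0 1 2 3; do
#   check_split MIDORI_Teleo_e${E}.fa MIDORI_Teleo_e${E}_clean.fa MIDORI_Teleo_e${E}_out.fa
# done
```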

Summary

Step-by-step
for E in 0 1 2 3
do
  # Class
  awk -F",c:" '{if($1 ~ />/) print $2}' MIDORI_Teleo_e${E}_clean.fa |\
  awk -F",|_" '{if(length($1)>0) print $1" (TaxID:"$2")"}' |\
  sort -u > MIDORI_Teleo_e${E}_Class.txt
  # Order
  awk -F",o:" '{if($1 ~ />/) print $2}' MIDORI_Teleo_e${E}_clean.fa |\
  awk -F",|_" '{if(length($1)>0) print $1" (TaxID:"$2")"}' |\
  sort -u > MIDORI_Teleo_e${E}_Order.txt
  # Family
  awk -F",f:" '{if($1 ~ />/) print $2}' MIDORI_Teleo_e${E}_clean.fa |\
  awk -F",|_" '{if(length($1)>0) print $1" (TaxID:"$2")"}' |\
  sort -u > MIDORI_Teleo_e${E}_Family.txt
  # Genus
  awk -F",g:" '{if($1 ~ />/) print $2}' MIDORI_Teleo_e${E}_clean.fa |\
  awk -F",|_" '{if(length($1)>0) print $1" (TaxID:"$2")"}' |\
  sort -u > MIDORI_Teleo_e${E}_Genus.txt
done
./PrintSummary.sh MIDORI_Teleo # Create Simple Summary Report including WebLogo
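
The four rank blocks above differ only in the field prefix and the output name, so they can be folded into one loop. This is a sketch of an equivalent formulation, not the original script: `rank_lists` is a hypothetical helper, and it assumes the FASTA headers carry `,c:Name_TaxID,o:Name_TaxID,...` fields, as the awk field separators above imply.

```shell
# Sketch: one loop over rank prefixes instead of four copied blocks.
# Assumption: headers contain ",c:Name_TaxID,o:Name_TaxID,f:...,g:..." fields.
rank_lists() {   # arg: file prefix, e.g. MIDORI_Teleo_e0
  for pair in c:Class o:Order f:Family g:Genus
  do
    key=${pair%%:*}     # rank prefix used in the header (c, o, f, g)
    name=${pair##*:}    # rank name used in the output file name
    awk -F",${key}:" '/^>/ {print $2}' "${1}_clean.fa" |
    awk -F",|_" 'length($1) > 0 {print $1" (TaxID:"$2")"}' |
    sort -u > "${1}_${name}.txt"
  done
}

# Usage:
# for E in 0 1 2 3; do rank_lists MIDORI_Teleo_e${E}; done
```

Keeping the prefix and the rank name together in one list avoids editing four near-identical pipelines when a rank is added or a header field changes.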