Count (OTU) Table - GDC First Aid Kit

OTU/Count Table (Matrix)

	Sample A	Sample B	Sample C	Mock	Negative	Taxa
ZOTU1	5000	24	2	0	3	KCOFGS
ZOTU2	1202	546	34	734	0	KCOFGS
ZOTU3	560	899	124	0	0	KCOFGS
ZOTU4	12	356	5461	0	0	KCOFGS
ZOTU5	0	0	65	0	0	KCOFGS
ZOTU6	0	5	0	0	1	KCOFGS
ZOTU7	0	34	0	0	2	KCOFGS
ZOTU8	0	0	453	1432	0	KCOFGS
ZOTU9	1	0	0	243	12	KCOFGS
ZOTU10	0	0	0	546	0	KCOFGS
...

cat e_OTU/*_ZOTU_Count.summary

compositional (multiple parts of non-negativ numbers)
high dimensional (few data points and many features)
underdetermined (the number of OTUs is much greater than the number of samples)
overdispersed (variance of the counts of reads is larger than expected)
zero-inflated (many zeros)

There are various approaches to correcting for one or more of the data structure difficulties described above.

Schloss (2023) Rarefaction is currently the best approach to control for uneven sequencing effort in amplicon sequence analyses. bioRxiv 2023.06.23.546313.