Data Sets
These are the data sets needed during the workshop.
Download Files
You need an R function to download files from the internet. You can use either download.file from the package utils or curl_download from the package curl. You only need one, and the choice is yours.
## Your current (working) directory
getwd()

## Change working directory (if needed)
# setwd("my/folder")

## Create new working directory (if needed)
# dir.create("MDA"); setwd("MDA")
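When a script is rerun, it can be convenient to skip files that are already on disk. The following is a minimal sketch under stated assumptions, not part of the workshop material: `download_if_missing` is a hypothetical helper name, and `mode = "wb"` is added only to make the binary transfer explicit.

```r
## Hypothetical helper (not part of the workshop scripts):
## download a file only if it is not already on disk.
download_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    message("Already present: ", destfile)
    return(invisible(destfile))
  }
  ## mode = "wb" requests a binary transfer explicitly
  status <- utils::download.file(url, destfile = destfile, mode = "wb")
  ## download.file() returns 0 on success
  if (status != 0) stop("Download failed: ", url)
  invisible(destfile)
}

## Usage (URL taken from the workshop data below):
# download_if_missing("https://www.gdc-docs.ethz.ch/MDA/data/Wassermann_2019.RData",
#                     "Wassermann_2019.RData")
```

Because the helper returns early when the file exists, delete the local copy first if you want a fresh download.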
An Apple A Day
## Create new working directory
dir.create("Apple"); setwd("Apple")

## Download data (R data image file)
apple.url <- "https://www.gdc-docs.ethz.ch/MDA/data/Wassermann_2019.RData"
utils::download.file(apple.url, destfile = "Wassermann_2019.RData")

## Alternative download (curl has more and better options)
# curl::curl_download(apple.url, "Wassermann_2019.RData")

## Verify loaded data (the image file contains an object named d)
library(phyloseq)  # needed to print the phyloseq object properly
load("Wassermann_2019.RData")
apple <- d
apple
# phyloseq-class experiment-level object
# otu_table()   OTU Table:         [ 3128 taxa and 48 samples ]
# sample_data() Sample Data:       [ 48 samples by 7 sample variables ]
# tax_table()   Taxonomy Table:    [ 3128 taxa by 7 taxonomic ranks ]
# phy_tree()    Phylogenetic Tree: [ 3128 tips and 3127 internal nodes ]

setwd("../")
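Since load() restores objects under the names they were saved with (here, d), it can be useful to look inside an image file before loading it into the global workspace. A minimal sketch, assuming nothing beyond base R; `inspect_rdata` is a hypothetical helper name, not a function from the workshop.

```r
## Hypothetical helper: load an R data image into its own environment
## and report the names and classes of the objects it contains.
inspect_rdata <- function(path) {
  env <- new.env()
  obj_names <- load(path, envir = env)  # load() returns the object names
  classes <- vapply(obj_names,
                    function(n) class(get(n, envir = env))[1],
                    character(1))
  data.frame(object = obj_names, class = classes, row.names = NULL)
}

## Usage (assumes the file from the step above is in the working directory):
# inspect_rdata("Wassermann_2019.RData")
```

Loading into a separate environment keeps existing objects in your workspace from being overwritten by same-named objects in the file.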
Food-Specific Bacterial Communities
## Create new working directory
dir.create("Food"); setwd("Food")

## Download data (zip archive)
food.url <- "https://www.gdc-docs.ethz.ch/MDA/data/Chaillou2015.zip"
utils::download.file(food.url, destfile = "Chaillou2015.zip")

## Extract the archive, then delete it
unzip("Chaillou2015.zip")
file.remove("Chaillou2015.zip")

list.files()
# "chaillou.biom"
# "Chaillou2015b.pdf"
# "otu_table.tsv"
# "sample_data.tsv"
# "sequences.fasta"
# "tax_table.tsv"
# "tree.nwk"

setwd("../")
Let us check that we have what we need.
## Function twee - a plain-text listing of directories,
## similar to tree in Linux
dir.create("Scripts"); setwd("Scripts")
twee.url <- "https://www.gdc-docs.ethz.ch/MDA/scripts/twee.R"
utils::download.file(twee.url, destfile = "twee.R")
source("twee.R")
setwd("../")

## Show directories
twee("MDA/")
If everything worked, you should see the following directories and files:
MDA/
├── Apple
│   └── Wassermann_2019.RData
└── Food
    ├── Chaillou2015b.pdf
    ├── Chaillou2015.zip
    ├── chaillou.biom
    ├── otu_table.tsv
    ├── sample_data.tsv
    ├── sequences.fasta
    ├── tax_table.tsv
    └── tree.nwk
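The same check can be done programmatically. This is a minimal sketch using only base R; `missing_files` is a hypothetical helper name, and the list of expected paths is an assumption based on the data files named on this page.

```r
## Hypothetical helper: report which expected files are missing under `root`.
missing_files <- function(root, expected) {
  paths <- file.path(root, expected)
  expected[!file.exists(paths)]
}

## Data files named on this page (relative to the MDA folder)
expected <- c("Apple/Wassermann_2019.RData",
              "Food/chaillou.biom",
              "Food/otu_table.tsv",
              "Food/sample_data.tsv",
              "Food/sequences.fasta",
              "Food/tax_table.tsv",
              "Food/tree.nwk")

## Usage (assumes the current working directory is the MDA folder):
# missing_files(".", expected)   # character(0) means nothing is missing
```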
Load and Adjust Food Data
## Load phyloseq (provides import_biom, sample_data, sample_names)
library(phyloseq)

## Import from biom (paths assume that Food is the working directory)
biomfile <- "chaillou.biom"
treefile <- "tree.nwk"
food <- import_biom(biomfile, treefile, parseFunction = parse_taxonomy_greengenes)

## Verify import
food
# otu_table()   OTU Table:         [ 508 taxa and 64 samples ]
# sample_data() Sample Data:       [ 64 samples by 3 sample variables ]
# tax_table()   Taxonomy Table:    [ 508 taxa by 7 taxonomic ranks ]
# phy_tree()    Phylogenetic Tree: [ 508 tips and 507 internal nodes ]

## Summary
microbiome::summarize_phyloseq(food)

## Variables
sample_data(food)

## French to English
dictionary <- c("BoeufHache"      = "Ground_Beef",
                "VeauHache"       = "Ground_Veal",
                "MerguezVolaille" = "Poultry_Sausage",
                "DesLardons"      = "Bacon_Dice",
                "SaumonFume"      = "Smoked_Salmon",
                "FiletSaumon"     = "Salmon_Fillet",
                "FiletCabillaud"  = "Cod_Fillet",
                "Crevette"        = "Shrimp")
## as.character() ensures the lookup is by name, even if EnvType is a factor
env_type <- as.character(sample_data(food)$EnvType)
sample_data(food)$EnvType <- factor(dictionary[env_type], levels = dictionary)

## Add sample ID
sample_data(food)$SID <- sample_names(food)

## Save / Load
# save.image("food.Rdata")
# load("food.Rdata")
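The French-to-English step relies on indexing a named character vector by name. The sketch below shows the idea on toy values, so it runs without phyloseq; the three-element `env_type` vector is a made-up stand-in for sample_data(food)$EnvType, and only two dictionary entries are used.

```r
## Toy stand-in for sample_data(food)$EnvType (illustrative values only)
env_type <- factor(c("Crevette", "BoeufHache", "Crevette"))

## Named vector: names are the French labels, values the English ones
dictionary <- c("BoeufHache" = "Ground_Beef",
                "Crevette"   = "Shrimp")

## Look up by name; as.character() avoids indexing by the factor's
## integer codes, and levels = fixes the order of the new factor
recoded <- factor(unname(dictionary[as.character(env_type)]),
                  levels = unname(dictionary))
recoded
```

Passing the dictionary values as levels keeps the categories in the order they were written, which is typically what later plots and tables will use.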