Data Sets
These are the data sets needed during the workshop.
Download Files
You need an R function to download files from the internet. You can use either download.file from the utils package or curl_download from the curl package. You only need one of them; the choice is yours.
## Your current (working) directory
getwd()

## Change working directory (if needed)
# setwd("my/folder")

## Create new working directory (if needed)
# dir.create("MDA"); setwd("MDA")
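Since either function does the job, the choice can also be made at run time. The following is a minimal sketch and not part of the workshop material: fetch_file is a hypothetical helper that skips files that already exist, uses curl::curl_download when the curl package is installed, and falls back to utils::download.file otherwise.

```r
## Hypothetical helper (not part of the workshop scripts)
fetch_file <- function(url, destfile, overwrite = FALSE) {
  ## Skip the download if the file is already there
  if (file.exists(destfile) && !overwrite) {
    message("Skipping existing file: ", destfile)
    return(invisible(destfile))
  }
  if (requireNamespace("curl", quietly = TRUE)) {
    ## curl is installed: use it
    curl::curl_download(url, destfile, mode = "wb")
  } else {
    ## Fallback: base R, binary mode so data files are not altered
    utils::download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

## Usage with a URL from this page
# fetch_file("https://www.gdc-docs.ethz.ch/MDA/data/Wassermann_2019.RData",
#            "Wassermann_2019.RData")
```

Calling the helper a second time with the same destination does nothing unless overwrite = TRUE is set, which is convenient when a script is re-run.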
An Apple A Day
## Create new working directory
dir.create("Apple"); setwd("Apple")

## Download data (R data image file)
apple.url <- "https://www.gdc-docs.ethz.ch/MDA/data/Wassermann_2019.RData"
utils::download.file(apple.url, destfile = "Wassermann_2019.RData")

## Alternative download (curl has more and better options)
# curl::curl_download(apple.url, "Wassermann_2019.RData")

## Verify loaded data (the image stores the object under the name d)
load("Wassermann_2019.RData")
apple <- d
apple
# phyloseq-class experiment-level object
# otu_table()   OTU Table:         [ 3128 taxa and 48 samples ]
# sample_data() Sample Data:       [ 48 samples by 7 sample variables ]
# tax_table()   Taxonomy Table:    [ 3128 taxa by 7 taxonomic ranks ]
# phy_tree()    Phylogenetic Tree: [ 3128 tips and 3127 internal nodes ]

setwd("../")
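load() restores objects under the names they were saved with, which is why the phyloseq object appears as d here. If you are unsure what an image file contains, one option is to load it into a separate environment first. A minimal sketch using base R only; peek_rdata is a hypothetical helper name.

```r
## Hypothetical helper: list the objects stored in an .RData file
## without touching the global workspace
peek_rdata <- function(file) {
  env <- new.env()
  obj_names <- load(file, envir = env)  # load() returns the object names
  ## Named vector: object name -> first class of that object
  vapply(obj_names, function(n) class(get(n, envir = env))[1], character(1))
}

## Usage (after the download)
# peek_rdata("Wassermann_2019.RData")
```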
Food-Specific Bacterial Communities
## Create new working directory
dir.create("Food"); setwd("Food")

## Download data (zip archive)
food.url <- "https://www.gdc-docs.ethz.ch/MDA/data/Chaillou2015.zip"
utils::download.file(food.url, destfile = "Chaillou2015.zip")

## Unpack the archive, then delete it
unzip("Chaillou2015.zip")
file.remove("Chaillou2015.zip")

list.files()
# "chaillou.biom"
# "Chaillou2015b.pdf"
# "otu_table.tsv"
# "sample_data.tsv"
# "sequences.fasta"
# "tax_table.tsv"
# "tree.nwk"

setwd("../")
Let us check whether we have everything we need.
## Function twee - a plain text listing of directories
## similar to tree in Linux
dir.create("Scripts"); setwd("Scripts")
twee.url <- "https://www.gdc-docs.ethz.ch/MDA/scripts/twee.R"
utils::download.file(twee.url, destfile = "twee.R")
source("twee.R")
setwd("../")

## Show directories
twee("MDA/")
If everything worked, you should see the following directories and files:
MDA/
├── Apple
│   └── Wassermann_2019.RData
└── Food
    ├── Chaillou2015b.pdf
    ├── Chaillou2015.zip
    ├── chaillou.biom
    ├── otu_table.tsv
    ├── sample_data.tsv
    ├── sequences.fasta
    ├── tax_table.tsv
    └── tree.nwk
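The same check can be done without twee. A minimal sketch in base R, assuming the MDA directory sits in the current working directory; missing_files is a hypothetical helper, and the expected file names are the data files from the listing.

```r
## Hypothetical helper: return the expected files that are not present
missing_files <- function(root, expected) {
  ## Paths are returned relative to root, e.g. "Food/tree.nwk"
  found <- list.files(root, recursive = TRUE)
  setdiff(expected, found)
}

## Data files used in the workshop (names taken from the listing)
expected <- c(
  "Apple/Wassermann_2019.RData",
  "Food/chaillou.biom",
  "Food/otu_table.tsv",
  "Food/sample_data.tsv",
  "Food/sequences.fasta",
  "Food/tax_table.tsv",
  "Food/tree.nwk"
)

## Usage: an empty result (character(0)) means nothing is missing
# missing_files("MDA", expected)
```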
Load and Adjust Food Data
## import_biom(), sample_data() and sample_names() come from phyloseq
library(phyloseq)

## Import from biom (run inside the Food directory, where the files were unpacked)
biomfile <- "chaillou.biom"
treefile <- "tree.nwk"
food <- import_biom(biomfile, treefile, parseFunction = parse_taxonomy_greengenes)

## Verify Import
food
# otu_table()   OTU Table:         [ 508 taxa and 64 samples ]
# sample_data() Sample Data:       [ 64 samples by 3 sample variables ]
# tax_table()   Taxonomy Table:    [ 508 taxa by 7 taxonomic ranks ]
# phy_tree()    Phylogenetic Tree: [ 508 tips and 507 internal nodes ]

## Summary
microbiome::summarize_phyloseq(food)

## Variables
sample_data(food)

## French to English
dictionary <- c("BoeufHache"      = "Ground_Beef",
                "VeauHache"       = "Ground_Veal",
                "MerguezVolaille" = "Poultry_Sausage",
                "DesLardons"      = "Bacon_Dice",
                "SaumonFume"      = "Smoked_Salmon",
                "FiletSaumon"     = "Salmon_Fillet",
                "FiletCabillaud"  = "Cod_Fillet",
                "Crevette"        = "Shrimp")

## as.character() makes sure the lookup is by name even if EnvType is a factor
env_type <- as.character(sample_data(food)$EnvType)
sample_data(food)$EnvType <- factor(dictionary[env_type], levels = dictionary)

## Add Sample ID
sample_data(food)$SID <- sample_names(food)

## Save / Load
# save.image("food.Rdata")
# load("food.Rdata")
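The translation step relies on indexing a named character vector by name. A small self-contained sketch with made-up sample labels (toy data, not the food data set) shows the pattern, and why the labels should be character rather than factor when they are used as an index.

```r
## Toy example (made-up sample labels)
dictionary <- c("SaumonFume" = "Smoked_Salmon",
                "Crevette"   = "Shrimp")
env_type <- c("Crevette", "SaumonFume", "Crevette")

## Lookup by name: each French label is replaced by its English value
translated <- factor(unname(dictionary[env_type]), levels = dictionary)
translated
levels(translated)

## Caution: a factor index is treated as integer codes, not as labels
f <- factor(env_type)                # levels sort to: Crevette, SaumonFume
unname(dictionary[f])                # positions 1, 2, 1 -> wrong translation
unname(dictionary[as.character(f)])  # lookup by name -> correct translation
```

Passing levels = dictionary keeps the factor levels in the order of the dictionary rather than in alphabetical order.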