Data Transfer
To transfer data to and from Euler you need a secure file transfer program or a Linux command such as scp or rsync. For smaller files the graphical tools Cyberduck or FileZilla can be used. To transfer large data sets (> 1 GB) we suggest using scp or rsync. After a failure, rsync will only resend the files that did not arrive properly.
Connections from the compute nodes to servers outside ETH can be made via the ETH proxy server:
module load eth_proxy
Small files - via a GUI using Cyberduck
The first time you transfer data you need to create a new server bookmark in Cyberduck.
- Select "Bookmark" -> "New Bookmark" from the menu
- Select SFTP protocol.
- Enter a nickname (e.g. Euler), the server address (euler.ethz.ch) and your <USER>. Do not change the port number.
- Close the window by clicking the red dot, top left corner.
- You have now created a new server bookmark and can connect to the server by double-clicking it.
Configuration of Cyberduck
The next time you can simply connect by double‐clicking the bookmark.
To access scratch, GDC home and GDC projects we recommend creating symbolic links.
Use these symbolic links only for navigation, not, for example, to run commands.
ln -s /cluster/work/gdc/people/<USER> gdc_home
ln -s /cluster/scratch/<USER> scratch
ln -s /cluster/work/gdc/shared/p999 p999
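The behaviour of such links can be tried safely before creating them on Euler. The following is a minimal sketch that uses temporary folders as stand-ins for the home directory and the scratch folder. The variable names and the file `example.txt` are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: create a symbolic link, inspect it, and remove it again.
# Temporary folders stand in for the real locations on Euler.

home_dir=$(mktemp -d)      # stands in for your home directory
target_dir=$(mktemp -d)    # stands in for /cluster/scratch/<USER>
printf 'data\n' > "$target_dir/example.txt"

# Create the link "scratch" inside the stand-in home directory
ln -s "$target_dir" "$home_dir/scratch"

# Show where the link points
link_target=$(readlink "$home_dir/scratch")
ls -l "$home_dir/scratch"

# Removing the link itself (no trailing slash) leaves the target untouched
rm "$home_dir/scratch"
ls "$target_dir"
```

Removing a link with `rm` only deletes the link, so the files in the target folder stay in place.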
MobaXterm users have an SFTP client already integrated.
Large files - via the terminal
Use scp to send a file from your computer to Euler
scp local-file <USER-ID>@euler.ethz.ch:/cluster/scratch/<USER-ID>/folder
Or pull data from Euler
scp <USER-ID>@euler.ethz.ch:/cluster/scratch/<USER-ID>/folder/remote-file local-folder
For many files, such as when uploading raw data, we recommend rsync, since an interrupted transfer can be continued by running the same command again.
rsync -av local-file <USER-ID>@euler.ethz.ch:/cluster/scratch/<USER-ID>/folder
- Do not forget the ":" at the end of the server address (euler.ethz.ch:). The rsync option "-a" invokes "archive" mode, which copies directories recursively and preserves permissions and modification times. By default rsync skips files that already exist on the remote side with the same name, size and modification time, overwrites files that differ, and does not remove files from an existing remote folder if they are not present in the local-dir on the sending computer. If you need to transfer huge files or directories, be aware that it can take several minutes per GB.
- For more options use the help option (e.g. rsync -h or scp -h) or have a look at the manual (e.g. man rsync).
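What a rerun of rsync actually transfers can be checked with a small local experiment. The sketch below is only an illustration: two temporary folders stand in for your computer and for the folder on Euler, the file names are made up, and the option "-i" (itemize changes) is added to list what is sent.

```shell
#!/usr/bin/env bash
# Sketch: rerunning "rsync -a" only sends files that are missing or changed.
# Local temporary folders stand in for the sending computer and for Euler.

src=$(mktemp -d)
dst=$(mktemp -d)
printf 'a\n' > "$src/reads_1.fq.gz"            # placeholder files, not real data
printf 'b\n' > "$src/reads_2.fq.gz"
printf 'old result\n' > "$dst/only_on_destination.txt"

# -a: archive mode, -i: itemize changes.
# Itemized lines starting with "<f" or ">f" are files that were transferred.
first_run=$(rsync -ai "$src"/ "$dst"/ | grep -c '^[<>]f')

# Second run without any change: nothing needs to be sent
second_run=$(rsync -ai "$src"/ "$dst"/ | grep -c '^[<>]f')

# Change one file (its size differs now) and run again
printf 'more\n' >> "$src/reads_2.fq.gz"
third_run=$(rsync -ai "$src"/ "$dst"/ | grep -c '^[<>]f')

echo "files sent: first=$first_run second=$second_run third=$third_run"

# The file that exists only on the destination has not been removed
ls "$dst"
```

Against Euler, a dry run such as `rsync -avn local-dir <USER-ID>@euler.ethz.ch:/cluster/scratch/<USER-ID>/folder` lists what would be transferred without copying anything.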
Verify your data after transfer.
Compare md5sums before and after transfer to check file integrity. Sequencing facilities normally provide md5sum values (e.g. a list with the md5sum for each file).
#Generate list with md5sums
md5sum *fq.gz > md5sums.txt
#Verify md5sums
md5sum --check md5sums.txt
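The two commands above can be tried on throwaway files first. This is a minimal sketch, assuming GNU md5sum: the file names are hypothetical, the files are small placeholders rather than real sequencing data, and `cp` stands in for the actual transfer with scp or rsync.

```shell
#!/usr/bin/env bash
# Sketch: verify files with md5sum after a (simulated) transfer.
# All paths and file names are hypothetical.

src=$(mktemp -d)   # stands in for the sending computer
dst=$(mktemp -d)   # stands in for the folder on Euler

# Create two small dummy files (placeholders, not real gzip data)
printf 'read data 1\n' > "$src/sample_1.fq.gz"
printf 'read data 2\n' > "$src/sample_2.fq.gz"

# Generate the checksum list before the transfer
(cd "$src" && md5sum *fq.gz > md5sums.txt)

# Simulated transfer; in practice this would be scp or rsync
cp "$src"/*fq.gz "$src/md5sums.txt" "$dst"/

# Verify after the transfer; --quiet prints only files that fail
if (cd "$dst" && md5sum --check --quiet md5sums.txt); then
    intact_status="OK"
else
    intact_status="FAILED"
fi

# Simulate a damaged file and verify again
printf 'garbage' >> "$dst/sample_2.fq.gz"
if (cd "$dst" && md5sum --check --quiet md5sums.txt 2>/dev/null); then
    damaged_status="OK"
else
    damaged_status="FAILED"
fi

echo "intact copy:  $intact_status"
echo "damaged copy: $damaged_status"
```

Because `md5sum --check` returns a non-zero exit status when any file fails, the check can be used directly in an `if` statement or a job script.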