News Archive

12.11.24: WebGUI

At the moment the WebGUI does not provide the memory usage correctly. The Cluster Support is working on the issue. You can use reportseff --user $USER --since d=1 instead.

06.11.24: Slurm update

The Slurm update has been finished and the queues are open again. Here is the current information.

30.10.24: VPN

The VPN is working again for all users. Users with an ETH guest account will need to use staff-net in the future.

20.08.24: DNS outage

Due to an ETH-wide DNS outage, jobs may have crashed between 10:25 and about 13:00. Euler is now back in normal mode.

12.08.24: Work file system

We are still experiencing problems with the work file system. Please use your personal scratch ($SCRATCH) whenever possible, which is best practice anyway (more information).

Ubuntu Software Stack

On Thursday 27 June at 07:00, Ubuntu will become the new default operating system of Euler. Most of the important tools have been reinstalled, but not yet tested. If you find any bugs, please let us know. Find more information on using the new software stack here.

19.06.24: Power outage

Due to a power outage at the CSCS data centre early in the morning, Euler is offline. All jobs running on Euler have been lost. The Cluster Support is working to bring the cluster back online (more information).

June Maintenance

Euler will be down from 4 - 6 June 2024. More information is provided here.

Ubuntu Software Stack

Euler's operating system will be upgraded to Ubuntu in summer, which will require a complete reinstallation of the software stack. Tools will be reinstalled by priority and we will do our best to reinstall your tools as soon as possible. We recommend that you plan your migration well in advance. We will keep you informed about the changes here. More information can be found here.

04.03.24: Slurm down

Slurm is currently blocked. The issue has been fixed.

22.02.24: Reduced capacity

Waiting times in Slurm are currently high due to a system issue that forces us to restart nodes. The issue should be resolved in the next few days. More information is provided here.

Maintenance of the file system

On Tuesday 13 February, the file system may be unavailable in the morning and queues will remain closed for a couple of hours.

MFA for VPN

Multifactor authentication (MFA) will also be extended to VPN connections as of 17 January 2024. When logging into the VPN, you should enter your one-time password in the "Second password" input field. Find the relevant information here.

Cisco VPN

Please use only the Cisco Secure VPN Client to connect to the ETH VPN from now on. The native clients are no longer supported. Find the relevant information here.

Web GUI for Slurm jobs

The Cluster Support has released a beta version of a web interface that allows you to check the status and efficiency of your jobs on Euler. The GUI is quite useful for monitoring your jobs. Any feedback can be sent to us.

Saturday 7 October 2023: Infrastructure incident

On Saturday 7 October 2023 around 19:00, a number of compute nodes went down due to an infrastructure incident at CSCS. The affected compute nodes will be brought back into operation as soon as possible.

Euler maintenance

Euler will be offline from 24 - 27 October 2023. As usual, the queues will be progressively inactivated. Find more information here.

17.08.2023: Network issues

Euler is suffering from network problems. The Cluster Support is working on this problem.

Planned maintenance

4-11 August: The water cooling system will be replaced and Euler will be offline. As usual, the queues will be progressively inactivated. Find more information here.

03.07.2023: Outage

There was an outage on 3 July 2023 in the CSCS datacenter. Running jobs were killed. Queues will be progressively re-opened. More information will be provided here.

04.05.23: Transition to Slurm

Most compute nodes are now under Slurm control. We now recommend switching to Slurm and have updated our manual accordingly. LSF nodes are still available if you want to use your old scripts. To port your scripts, take a look at this section. Every now and then we see problems related to Slurm, so don’t hesitate to contact us.

04.05.23: Planned maintenance

Second week of June: The storage system will be upgraded and the file system will not be accessible for 1-2 days.
Second week of August: The water cooling system will be replaced. Euler will be unavailable for one week.

09.02.23: Power outage

There was a power outage in the CSCS datacenter. All running jobs crashed. Most of the queues are open again. Find more information here.

News: 15.09.22

Euler will be offline from Thursday, 18 October until Monday, 24 October 2022 due to maintenance. Batch queues will be progressively inactivated prior to the maintenance. More information can be found here.

News: 13.05.22

The problem with the file system has been fixed. Euler is operating normally again.

News: 12.05.22

There is a problem with the file system, and /cluster/work is only accessible through the login nodes, and the job queues are inactive. The storage controller has been replaced and integrity tests are in progress. The Cluster Support is working diligently to get the job queues back online Friday morning. More information can be found here.

News: 09.05.22

Transition from LSF to Slurm. The Scientific IT Services decided last year to phase out LSF in favour of Slurm. The most important information, including the new commands, will be posted here. The transition should be finished by September 2022. Some Slurm nodes will be available in summer. Find more information here.

News: 23.03.22

Tonight the queues are suspended and the file system is partially inaccessible.

News: 21.03.22

Euler will not be accessible from Tuesday 5 to Thursday 7 April 2022 due to global maintenance. Batch queues will be progressively inactivated prior to the maintenance. More information can be found here.

News: 10.03.22

We currently have a problem with the file system. The Cluster Support might already be working on this problem. The problem has been fixed.

News: 26.01.22

On 1 February 2022 the new software stack will become the default. If you would like to use the old GDC software stack, use lmod2env, change to the old stack permanently using set_software_stack.sh old, or add source /cluster/apps/local/lmod2env.sh to your submission scripts. Find more information here.
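The last option could look roughly like the following submission script. This is only a sketch: the job name and run time are example values, the check for the file and the STACK variable are additions for illustration, and only the path to lmod2env.sh comes from the note above.

```bash
#!/bin/bash
# Hypothetical job name and run time limit (example values)
#BSUB -J old_stack_example
#BSUB -W 4:00

# Path taken from the news entry above
LMOD2ENV=/cluster/apps/local/lmod2env.sh

if [ -r "$LMOD2ENV" ]; then
    # Switch this job to the old GDC software stack
    source "$LMOD2ENV"
    STACK="old"
else
    # Outside the cluster the file does not exist; keep the default stack
    echo "lmod2env.sh not found; staying on the default stack" >&2
    STACK="default"
fi

echo "Software stack for this job: $STACK"
# Load and run your tools below this line
```

Sourcing the file inside the script affects only that job, so other jobs keep using the new default stack.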

News: 03.01.22

A happy, healthy and successful New Year.

News: 07.12.21

Euler will operate during the holidays (17.12.2021 - 02.01.2022) but the capacity will be reduced. Nobody is on duty in case something happens. We, therefore, kindly ask you to play nice on Euler: Do not submit too many or large jobs, do not leave any jobs unattended and be careful when generating a lot of new data or inodes on GDC home or GDC projects.

News: 29.11.21

We are experiencing scaling issues with some of the new nodes. If your job is not scaling (check with bbjobs), use #BSUB -R "select[nthreads==2]" to use only old nodes, which are not affected by this issue.

News: 23.11.21

There are network issues with the Euler VII nodes. If you have problems with jobs that get stuck, add the line #BSUB -R "model==EPYC_7742" to your submission script. The problem has been fixed.

News: 25.10.21

We have started to port our GDC software stack to the new Lmod system. It might take a couple of weeks until all tools are available in the new software stack. Find more information here. The old GDC stack will remain.

News: 20.10.21

We are experiencing performance issues with our file system (GDC home and GDC projects). To keep the impact on the file system as low as possible, work on the scratch (${SCRATCH} or ${TMPDIR}) whenever possible.
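A minimal sketch of staging input data on the scratch before an analysis, assuming a bash shell. The function name stage_to_scratch and the fallback to /tmp are hypothetical additions; only ${SCRATCH} and ${TMPDIR} come from the note above.

```bash
#!/bin/bash
# Copy an input directory to the scratch and print the new location.
# Uses ${SCRATCH} if set, otherwise ${TMPDIR}, otherwise /tmp (assumption).
stage_to_scratch() {
    local src="$1"
    local base="${SCRATCH:-${TMPDIR:-/tmp}}"
    local dest="${base}/$(basename "$src")"

    mkdir -p "$dest" || return 1
    # Copy the contents of src (including hidden files) into dest
    cp -r "$src"/. "$dest"/ || return 1
    echo "$dest"
}
```

It could be called as workdir=$(stage_to_scratch /path/to/input), where the input path is a placeholder, after which the analysis reads from "$workdir" instead of GDC home or GDC projects.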

News: 09.08.21

The maintenance has been finished and the login nodes are online again. The queues will be progressively activated in the next couple of hours.

News: 20.08.21

Euler will be offline from Tuesday, 7 September 2021 until Thursday, 9 September 2021 due to maintenance. Batch queues will be progressively inactivated prior to the maintenance. More information can be found here.

News: 09.08.21

We have lately seen an increase in job failures due to read errors in IO-intensive jobs (e.g. mapping) on the old compute nodes. The newer nodes seem to be less affected by this problem. To use only the newer nodes, add this line to your submission script: #BSUB -R "model==EPYC_7742".
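In a complete LSF submission script the line might sit among the other directives as sketched below. The job name, core count and run time are example values, and the use of LSB_DJOB_NUMPROC to read the allocated core count is an assumption about the LSF environment; only the -R line is taken from the note above.

```bash
#!/bin/bash
# Hypothetical job name
#BSUB -J mapping_example
# Number of cores and run time limit in hours:minutes (example values)
#BSUB -n 4
#BSUB -W 4:00
# Only run on the newer nodes (line from the news entry above)
#BSUB -R "model==EPYC_7742"

# LSB_DJOB_NUMPROC is expected to be set by LSF inside a job; default to 1 elsewhere
THREADS="${LSB_DJOB_NUMPROC:-1}"

echo "Running on $(uname -n) with ${THREADS} core(s)"
# Start the IO-intensive tool here, e.g. with ${THREADS} threads
```

With LSF, a script containing #BSUB lines is usually submitted by redirecting it into bsub, for example bsub < job.sh, where job.sh is a placeholder file name.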

News: 09.06.21

Euler is online again. Some users had login problems. If the problem remains in the next couple of days let us know.

News: 26.05.21

Euler will be offline on June 8 2021 due to storage maintenance. Batch queues will be inactivated progressively before the maintenance.

News: 25.05.21

The filesystem issues have been solved. Euler is operating as usual.

News: 10.05.21

All batch queues are active again. There are still sporadic issues with the scratch. The Cluster Support is working on it.

News: 06.05.21

There still seem to be problems with the filesystem, as reading and writing are very slow. Please do not start too many jobs at the moment. If the CPU usage of your jobs is very low (check with bbjobs), please stop and restart them. The Cluster Support might already be working on this issue.

News: 29.04.21

Do not install conda environments on GDC home or GDC projects. Find more information here.

News: 06.04.21

The scratch and GDC home/projects are now separate and independent Lustre filesystems. We have noticed higher read and write performance for several tools. Therefore, copy your data to the scratch (/cluster/scratch/user-id) first to speed up your analysis. During the migration of your scratch, which can take a couple of hours, you might not be able to access it, submit jobs or log in to Euler. Please be patient and try again the next day. If the problem remains or you have issues with the new filesystem, let me know. Find more information here.

News: 24.02.21

There will be a short network maintenance on Euler between 18:00 and 20:00. Login to Euler will not be possible and queues will be inactivated.

News: 02.02.21

There was a short interruption of the file system on Euler. Now it seems to work again.

News: 26.01.21

Euler operates as usual.

News: 05.01.21

A happy, healthy and successful New Year. Euler operates but the queuing time might be a bit longer at the moment. Please be patient.

News: 15.12.20

Euler will operate during the holidays but the capacity will be reduced. Nobody is on duty in case something happens. We, therefore, kindly ask you to play nice on Euler: Do not submit too many or large jobs and do not leave any jobs unattended. And as always, we urge you to check disk space and optimise your jobs.

News: 28.10.20

All queues are active again.

News: 27.10.20

The 4 hours queue is again active, the other queues will be progressively activated.

News: 26.10.20

Due to a power outage in the CSCS datacentre, the submission queues are inactive. All running jobs crashed this morning at 8 am. Find the latest information here.

News: 05.10.20

Part of the compute nodes in Euler will be OFFLINE from 8:00 pm on Thursday, 8 October 2020 due to the installation of new power lines for the Euler VII expansion. Computing capacity will thus be reduced. Longer queues might be affected even earlier.

News: 28.09.20

Euler is fully operational.

News: 07.09.20

We have noticed scattered file reading issues on GDC home and GDC projects. If you notice similar issues let us know.

News: 04.08.20

Euler is very busy at the moment, especially the big- and the ultramem nodes are full. I kindly ask you to be patient and optimise your jobs as much as possible.

News: 25.06.20

LSF reports the used memory including the cache, which can sometimes be much higher than the memory effectively used. For tools like BWA or angsd you can normally request much less memory than what bbjobs tells you. I have added a list of such tools in the running jobs section. Let me know if you see more such tools.

News: 17.06.20

Most of the features are working again on Euler. In case you like to connect to an external source (e.g. install R packages or use wget) always load the ETH proxy module (module load eth_proxy).

News: 08.06.20

The 4 h and the 24 h queues are open, the 120 h queue will be opened today. You still need to be patient until the GDC share reaches full capacity again. Due to new security protocols, all outgoing connections are blocked. HTTP(S) connections can be made via the ETH proxy server (module load eth_proxy). If you still have problems, let us know. Some users have login problems; find more information here.

News: 02.06.20

Euler is online again. The compute nodes will be progressively activated. You need to remove the old host keys to be able to log in (how-to). Many people want to submit jobs at the moment, so you need to be patient until the GDC share reaches full capacity again. If you notice any problems with established scripts or tools, let me know. Remember, all compute nodes have been set up from scratch.

News: 26.05.20

The reinstallation of Euler is taking more time than expected. Euler should hopefully be reopened towards the end of this week.

News: 25.05.20

Euler will be online again on Wednesday. Before you log in, reset your ETH LDAP password. If you are using SSH keys, they need to be renewed. Find more information here and here.

News: 20.05.20

The Cluster Support is reinstalling the clusters from scratch. If all goes well, Euler should be open again at the beginning of next week.

News: 15.05.2020

After receiving information about attacks on several European HPC infrastructures, it was discovered that some of our HPC systems have been compromised. The cluster will most likely remain offline for several days, possibly weeks.

News: 13.05.20

Currently an unexpectedly high number of jobs do not start properly (e.g. no CPU usage at all) and the jobs cannot be killed. The crashed nodes will be rebooted 1-2 days later and then the jobs will disappear from the LSF system (bjobs). We are in contact with the Cluster Support to fix this problem. We are sorry for any inconvenience.

News: 01.05.20

We have finished the migration to the new file system "work". Please make sure that you have updated all your scripts, symbolic links and bashrc files. Our own databases, our own software stack and the scripts can still be found in /cluster/project/gdc/shared.