Services: Login nodes, Compute nodes, Lustre /global/scratch, NFS /home
Cooling in HPCC is working, so the outage is over and Grex is operational.
Unfortunately, due to the heat outside the datacentre and the work inside it, we were unable to keep the environment cool enough to run even the storage and login nodes. So Grex is fully powered down again.
The electrical part of the update is over. We have powered up the Grex login nodes and storage, so users can SSH in and access their data. Compute does not work yet, because the water cooling is still being worked on. So no jobs will start, and the OOD Web portal on zebu is also down.
There is a planned outage on Grex in effect now. During this outage, Physical Plant will work on HPCC power and cooling, and the entire Grex system will be powered down.
Users will not have access to any Grex services (compute, storage, and Web portal) during the outage, and all running jobs will be terminated.
Should you have any questions about the upcoming Grex outage, please do not hesitate to contact us at support@tech.alliancecan.ca!
Thank you for your patience,
Your Grex HPC team.
Last updated: July 18, 2024 at 11:49 PM UTC