
Grex has a problem with the external network and login nodes

June 23, 2022 at 7:00 PM UTC

Network, Login nodes

Resolved after 14h 30m of downtime. June 24, 2022 at 9:30 AM UTC

Update Jun 24, 10AM

Finally, all Ethernet switches are powered up, and all Grex login nodes are available. Running jobs and storage were not affected during the outage. Grex should be fully operational now.

If you have questions or concerns, please don’t hesitate to contact us at support@computecanada.ca, mentioning Grex in the subject line.

Update Jun 24, 8AM

This partial outage was caused by a faulty UPS that fed some of the Grex network switches. Power to most of the switches has now been re-routed, so jobs are running normally, but yak.hpc.umanitoba.ca is currently the only host users can connect to.

The legacy login nodes of grex.westgrid.ca are on, but external network access to them is still unavailable. Please use Yak to connect for now.

Grex network, management VMs and login nodes are down

We are investigating the issue. Access to Grex is not possible, but running jobs and storage seem to be largely unaffected.

If you have questions or concerns, please don’t hesitate to contact us at support@computecanada.ca, mentioning Grex in the subject line.

Last updated: June 24, 2022 at 2:55 PM UTC