All Systems Operational

Apply for New Accounts Operational
Job Submission Operational
Running Slurm Jobs Operational
Pending Slurm Jobs in the Queue Operational
Rāpoi: Login node Operational
98.39 % uptime over the past 90 days
Storage Infrastructure Operational
Rāpoi: Compute Infrastructure Operational
Apr 9, 2025

No incidents reported today.

Apr 8, 2025

No incidents reported.

Apr 7, 2025

No incidents reported.

Apr 6, 2025

No incidents reported.

Apr 5, 2025

No incidents reported.

Apr 4, 2025

No incidents reported.

Apr 3, 2025

No incidents reported.

Apr 2, 2025

No incidents reported.

Apr 1, 2025

No incidents reported.

Mar 31, 2025

No incidents reported.

Mar 30, 2025

No incidents reported.

Mar 29, 2025

No incidents reported.

Mar 28, 2025

No incidents reported.

Mar 27, 2025

No incidents reported.

Mar 26, 2025
Resolved - The HPC system has been recommissioned and is available for use.
The quicktest partition now has: amd01n01, amd01n02, amd01n03, amd01n04
Intel machines (itl02n[01-04]) have been retired.
The remaining partitions have been restored to the state they were in before we moved to reduced capacity:
gpu has gpu[01-03],
longrun has bigtmp[01-02],
bigmem has high[01-04], and
parallel has the AMD and spj nodes.

Mar 26, 00:21 UTC
Update - The DS team has completed the HPC system relocation to the off-campus site.
Testing will be conducted tomorrow morning, and an update will be provided by midday.

Mar 25, 04:03 UTC
Update - We are continuing to monitor for any further issues.
Mar 25, 01:41 UTC
Update - We are continuing to monitor for any further issues.
Mar 25, 01:39 UTC
Update - The HPC relocation is tomorrow (20 Mar). The DS team will shut down power for the move. Testing will occur on Tuesday (25 Mar). Please contact the support team with any questions.
Mar 18, 22:23 UTC
Update - The Rāpoi HPC system will be unavailable from Thursday, March 20th, until the week of March 24th due to relocation.

We expect to begin testing on Tuesday, March 25th, and based on the results, we will determine when the system can be reopened for general use.

Please set time limits on any new job submissions so that the jobs finish before March 20.
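
As a rough illustration of choosing such a time limit, the sketch below computes the longest Slurm `--time` value that still ends before a shutdown. The function name `slurm_time_limit`, the one-hour safety margin, and the assumption that the shutdown begins at 00:00 local time on March 20 are all hypothetical; the notice above does not state an exact shutdown time.

```python
from datetime import datetime, timedelta


def slurm_time_limit(now: datetime, shutdown: datetime,
                     margin: timedelta = timedelta(hours=1)) -> str:
    """Return the longest time limit, in Slurm's D-HH:MM:SS format,
    that lets a job started at `now` finish `margin` before `shutdown`.

    Hypothetical helper: the margin and the shutdown time are assumptions.
    """
    remaining = shutdown - now - margin
    if remaining <= timedelta(0):
        raise ValueError("no time left before the shutdown")
    total_seconds = int(remaining.total_seconds())
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


# Assumed shutdown time: midnight at the start of March 20 (not stated in the notice).
assumed_shutdown = datetime(2025, 3, 20, 0, 0)
limit = slurm_time_limit(datetime(2025, 3, 17, 9, 0), assumed_shutdown)
print(limit)  # 2-14:00:00
```

The resulting string can be passed to `sbatch --time=...` or placed in a `#SBATCH --time=...` line of the job script. Time spent waiting in the queue is not accounted for here, so a job that starts later than `now` would need a shorter limit.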

We will provide updates as necessary.

Feb 28, 01:51 UTC
Update - Status - relocation:
The team at Digital Solutions is making progress on the migration. The logistics of quotes, approvals, and insurance have been completed. Currently, they are waiting for the off-campus facility provider to get the racks ready with power and network cabling.
Status - down nodes:
Because summer humidity levels may exceed 80% RH, we are unable to restart the parallel nodes at this time.
Planned:
Users will receive at least 10 days' notice before the cluster is shut down for relocation. The migration will result in several days of complete system outage while the DS team de-racks, transports, and re-racks all equipment.
Potential delays:
A change freeze is planned for the first week of Trimester 1 (starting 24th Feb), which is typically a high-demand period. Depending on operational workload and incident response, this may further impact migration timelines.
We appreciate your patience and will provide updates as soon as firm migration dates are confirmed. Please reach out with any concerns.

Feb 19, 21:58 UTC
Monitoring - The compute infrastructure is running with limited capacity. Nodes will be moved to a new off-campus facility, but the schedule hasn’t been announced yet.
Feb 10, 21:34 UTC