64 posts tagged with "maintenance"

(Restored) [Outage] May 22, 2025: Slurm Outage in General Analysis Division on Thursday, May 22, 2025

At 02:54 on Thursday, May 22, 2025 (all times are in 24-hour format), the Slurm management server for the general analysis division experienced a service outage.

The outage was caused by insufficient memory on the compute node hosting the Slurm management server.

Recovery procedures were completed at 10:34 on the same day, and job submission has since resumed.

Scope of Impact

Power Outage on April 7, 2025

A brief power outage occurred in Mishima City around 13:55, lasting approximately one minute.
Thanks to the UPS, the supercomputer itself did not lose power, but network connectivity was lost for about 5 minutes.
Other impacts are currently under investigation.

(Ended) [Outage] September 4, 2024: Emergency maintenance of Lustre8

The restoration work was completed at 16:32 on Wednesday, 4 September 2024.

Login to Lustre8 is also available again.

At 11:55 on Wednesday, 4 September, the MDT (Metadata Target) of the Lustre8 high-speed storage system in the Personal Genome Analysis division failed. As a result, Lustre8 is currently unavailable for reading and writing in the Personal Genome Analysis division.