65 posts tagged with "maintenance"

(Resolved) [Outage] July 29, 2025: SSH Connection Disruption on General Analysis Gateway Node, 29 July 2025

At 10:48 (24-hour format) on Tuesday, 29 July 2025, SSH connections to the General Analysis Division Gateway Node (gw.ddbj.nig.ac.jp) became unavailable.

The cause was a memory shortage on the gateway node, triggered by a user program. The shortage stopped the sssd process, which made SSH connections to the gateway node unavailable.

To address the issue, the sssd service was restarted at 12:19 on the same day, and the service has since been fully restored.

If you are unable to log in to gw.ddbj.nig.ac.jp, try logging in via the alternative gateway node, gw2.ddbj.nig.ac.jp. The login process is the same for both nodes.

For more information on login procedures, refer to “How to Login to the Gateway node (The general analysis division)”.
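The fallback described above (try gw.ddbj.nig.ac.jp first, then gw2.ddbj.nig.ac.jp) can be sketched as a small script. This is only an illustration, not an official tool: the function names `ssh_login_works` and `pick_gateway` are hypothetical, and the sketch assumes that key-based SSH authentication and your user name are already configured (for example in `~/.ssh/config` with a running `ssh-agent`), so that a non-interactive login can succeed without a prompt.

```python
import subprocess
from typing import Callable, Iterable, Optional

# Gateway host names taken from the announcement, in order of preference.
GATEWAYS = ["gw.ddbj.nig.ac.jp", "gw2.ddbj.nig.ac.jp"]


def ssh_login_works(host: str, timeout_s: int = 10) -> bool:
    """Return True if a non-interactive SSH login to `host` succeeds.

    Assumption: key-based authentication is already set up. BatchMode=yes
    makes ssh fail instead of prompting for a password or passphrase.
    """
    cmd = [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", f"ConnectTimeout={timeout_s}",
        host,
        "true",  # run a no-op command on the remote side, then exit
    ]
    try:
        # The extra subprocess timeout covers a login that hangs after
        # the TCP connection has been established.
        result = subprocess.run(cmd, capture_output=True, timeout=timeout_s + 5)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def pick_gateway(
    hosts: Iterable[str] = GATEWAYS,
    probe: Callable[[str], bool] = ssh_login_works,
) -> Optional[str]:
    """Return the first host for which `probe` succeeds, or None."""
    for host in hosts:
        if probe(host):
            return host
    return None
```

Calling `pick_gateway()` returns the first gateway that accepts a login, or `None` if neither does. The probe attempts a real login rather than only checking that port 22 is open, because in an incident like this one the SSH port may still accept connections while authentication fails.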

Scope of Impact

(Restored) [Outage] May 22, 2025: Slurm Outage in General Analysis Division on Thursday, May 22, 2025

At 02:54 on Thursday, May 22, 2025 (24-hour format; all times below are in 24-hour format), the Slurm management server for the general analysis division encountered a service outage.

The cause of the issue was insufficient memory on the compute node hosting the Slurm management server.

Recovery procedures were completed at 10:34 on the same day, and job submission has since resumed.

Scope of Impact

Power Outage on April 7, 2025

A brief power outage occurred in Mishima City around 13:55, lasting approximately one minute.
Thanks to the UPS, the supercomputer itself did not lose power, but network connectivity was lost for about 5 minutes.
Other impacts are currently under investigation.

(Ended) [Outage] September 4, 2024: Emergency maintenance of Lustre8

The restoration work was completed at 16:32 on Wednesday, 4 September 2024.

You can now log in to Lustre8 again.

At 11:55 on Wednesday, 4 September, the MDT (Metadata Target) of the Lustre8 high-speed storage system in the Personal Genome Analysis division failed, leaving the file system unavailable for reading and writing in that division.