Maintenance and Outages
|Status||Type||Title||Start of Outage||Anticipated End of Outage||Resolution|
|Resolved||Outage||Intermittent difficulties logging in||
Replaced failed rack CDU.
|Resolved||Outage||Various systems and services|
New kernel was hanging on boot-up. Seemed related to InfiniBand (mlx5_core driver).
Booted with the prior kernel, which is working for now. Future kernel updates will need to be tested to see whether the hang is resolved.
The system partition table was recovered. Data appears to be intact. The node is available for use.
|Resolved||Maintenance||Storage battery replacement||
Battery replaced. fgstor01 and fgstor02 both have connections to storage controller B, so both had to be rebooted to recover from the disconnected eSCSI devices.
|Resolved||Outage||Tango cluster unavailable||
DHCP service had not started automatically.
|Resolved||Outage||Power outage March 8, 2020|
|Ongoing||Outage||Victor v-016 reserved|
|Outage||Nodes Reserved on Romeo, Juliet|