SD 06.07.2020

2020-06-09 12:04:59 +02:00
parent 0100bc8790
commit 6585572f05
2 changed files with 24 additions and 24 deletions

@ -28,9 +28,9 @@ but jobs already queued on the partition may be allocated to nodes and run.
Unless explicitly specified, the default draining policy for each partition will be the following:
* The **daily** and **general** partition will be soft drained 24h before the downtime.
* The **daily** and **general** partitions will be soft drained 12h before the downtime.
* The **hourly** partition will be soft drained 1 hour before the downtime.
* The **gpu** partition will be soft drained 1 hour before the downtime.
* The **gpu** and **gpu-short** partitions will be soft drained 1 hour before the downtime.
Finally, **remaining running jobs will be killed** by default when the downtime starts. In some rare cases, jobs will
just be *paused* and then *resumed* once the downtime has finished.
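
As a rough illustration of the default policy, the sketch below computes when each partition would start soft draining for a given downtime start. It uses the updated values from this change (12h for **daily** and **general**, 1h for **hourly**, **gpu** and **gpu-short**) and the 06.07.2020 8am start of the next Scheduled Downtime. The names `DRAIN_HOURS_BEFORE_SD` and `drain_start` are hypothetical helpers invented for this example; they are not part of Slurm or of the Merlin tooling.

```python
from datetime import datetime, timedelta

# Hours of soft drain before the downtime, per partition.
# Values taken from the default policy described above; the mapping
# itself is an illustrative assumption, not an existing config file.
DRAIN_HOURS_BEFORE_SD = {
    "general": 12,
    "daily": 12,
    "hourly": 1,
    "gpu": 1,
    "gpu-short": 1,
}


def drain_start(partition: str, downtime_start: datetime) -> datetime:
    """Return the time at which soft draining of `partition` begins."""
    return downtime_start - timedelta(hours=DRAIN_HOURS_BEFORE_SD[partition])


if __name__ == "__main__":
    sd_start = datetime(2020, 7, 6, 8, 0)  # 06.07.2020 8am
    for name in DRAIN_HOURS_BEFORE_SD:
        start = drain_start(name, sd_start)
        print(f"{name:10s} soft drain from {start:%d.%m.%Y %H:%M}")
```

A job submitted to a partition after its drain start would only be expected to run if it can finish before the downtime begins, since remaining running jobs are killed by default.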
@ -41,8 +41,8 @@ The following table contains a summary of the draining policies during a Schedul
| **Partition** | **Drain Policy** | **Default Drain Type** | **Default Job Policy** |
|:---------------:| -----------------:| ----------------------:| --------------------------------:|
| **general** | 24h before the SD | soft drain | Kill running jobs when SD starts |
| **daily** | 24h before the SD | soft drain | Kill running jobs when SD starts |
| **general** | 12h before the SD | soft drain | Kill running jobs when SD starts |
| **daily** | 12h before the SD | soft drain | Kill running jobs when SD starts |
| **hourly** | 1h before the SD | soft drain | Kill running jobs when SD starts |
| **gpu** | 1h before the SD | soft drain | Kill running jobs when SD starts |
@ -52,9 +52,8 @@ The following table contains a summary of the draining policies during a Schedul
The table below shows a description for the next Scheduled Downtime:
| From | To | Affected Service/s | Description |
| :------------: | --------- | :----------------------------- | :----------------------------------------------------------------- |
| *04.05.2020 8h* | *04.05.2020 10h* | Merlin Login nodes | Outage. YFS (AFS) update and reboot |
| *04.05.2020 8h* | *04.05.2020 17h* | Merlin5 computing nodes | Outage. O.S. update, OFED drivers update, YFS (AFS) update. |
| From | To | Service | Description |
| ---------------- | ---------------- |:------------:|:---------------------------------------------------------------|
| 06.07.2020 8am | 06.07.2020 6pm | All services | Upgrade: GPFSv5.0.4-2,OFEDv5.0,YFSv0.195,RHEL7.7,Slurmv19.05.7 |
* **Note**: An e-mail will be sent when the login nodes are fully available. An e-mail will be sent when Merlin5 computing nodes are back.
* **Note**: An e-mail will be sent when the services are fully available.

@ -10,21 +10,22 @@ permalink: /merlin6/past-downtimes.html
## Past Downtimes: Log Changes
### SD: 04.11.2019
*Pending...*
| From | To | Affected Service/s | Description |
| :------------: | --------- | :----------------------------- | :----------------------------------------------------------------- |
| *04.05.2020 8h* | *04.05.2020 10h* | Merlin Login nodes | Outage. YFS (AFS) update v0.194 and reboot |
| *04.05.2020 8h* | *04.05.2020 17h* | Merlin5 computing nodes | Outage. O.S. update, OFED drivers update, YFS (AFS) update. |
### SD: 02.09.2019
| Service | Clusters | Description | Exceptions |
|:------------:|:---------------:|:--------------------------------------------------------------|:-------------------------------------------:|
| GPFS | merlin5,merlin6 | v5.0.2-3 -> v5.0.3-2 | |
| O.S. | merlin5 | RHEL7.4 (rhel-7.4) -> RHEL7.6 (prod-00048) | merlin-g-40, still running RHEL7.4\* |
| O.S. | merlin6 | RHEL7.6 (prod-00030) -> RHEL7.6 (prod-00048) | |
| Infiniband | merlin5 | OFED v4.4 -> v4.6 | merlin-g-40, still running OFED v4.4\* |
| Infiniband | merlin6 | OFED v4.5 -> v4.6 | |
| PModules | merlin5,merlin6 | PModules v1.0.0rc4 -> v1.0.0rc5 | |
| AFS(YFS) | merlin5 | OpenAFS v1.6.22.2-236 -> YFS v188 | merlin-g-40, still running OpenAFS\* |
| AFS(YFS) | merlin6 | YFS v186 -> YFS v188 | |
| O.S. | merlin5 | RHEL7.4 -> RHEL7.6 (prod-00048) | |
| Slurm | merlin5,merlin6 | Slurm v18.08.6 -> v18.08.8 | |
| From | To | Service | Clusters | Description | Exceptions |
| ---------------- | ---------------- |:------------:|:---------------:|:--------------------------------------------------------------|:-------------------------------------------:|
| 02.09.2019 | 02.09.2019 | GPFS | merlin5,merlin6 | v5.0.2-3 -> v5.0.3-2 | |
| 02.09.2019 | 02.09.2019 | O.S. | merlin5 | RHEL7.4 (rhel-7.4) -> RHEL7.6 (prod-00048) | merlin-g-40, still running RHEL7.4\* |
| 02.09.2019 | 02.09.2019 | O.S. | merlin6 | RHEL7.6 (prod-00030) -> RHEL7.6 (prod-00048) | |
| 02.09.2019 | 02.09.2019 | Infiniband | merlin5 | OFED v4.4 -> v4.6 | merlin-g-40, still running OFED v4.4\* |
| 02.09.2019 | 02.09.2019 | Infiniband | merlin6 | OFED v4.5 -> v4.6 | |
| 02.09.2019 | 02.09.2019 | PModules | merlin5,merlin6 | PModules v1.0.0rc4 -> v1.0.0rc5 | |
| 02.09.2019 | 02.09.2019 | AFS(YFS) | merlin5 | OpenAFS v1.6.22.2-236 -> YFS v188 | merlin-g-40, still running OpenAFS\* |
| 02.09.2019 | 02.09.2019 | AFS(YFS) | merlin6 | YFS v186 -> YFS v188 | |
| 02.09.2019 | 02.09.2019 | O.S. | merlin5 | RHEL7.4 -> RHEL7.6 (prod-00048) | |
| 02.09.2019 | 02.09.2019 | Slurm | merlin5,merlin6 | Slurm v18.08.6 -> v18.08.8 | |