SD 06.07.2020
@@ -28,9 +28,9 @@ but jobs already queued on the partition may be allocated to nodes and run.

Unless explicitly specified, the default draining policy for each partition will be the following:

* The **daily** and **general** partition will be soft drained 24h before the downtime.
* The **daily** and **general** partitions will be soft drained 12h before the downtime.
* The **hourly** partition will be soft drained 1 hour before the downtime.
* The **gpu** partition will be soft drained 1 hour before the downtime.
* The **gpu** and **gpu-short** partitions will be soft drained 1 hour before the downtime.

Finally, **remaining running jobs will be killed** by default when the downtime starts. In some specific rare cases jobs will be
just *paused* and *resumed* back when the downtime finished.
@@ -41,8 +41,8 @@ The following table contains a summary of the draining policies during a Schedul

| **Partition** | **Drain Policy** | **Default Drain Type** | **Default Job Policy** |
|:---------------:| -----------------:| ----------------------:| --------------------------------:|
| **general** | 24h before the SD | soft drain | Kill running jobs when SD starts |
| **daily** | 24h before the SD | soft drain | Kill running jobs when SD starts |
| **general** | 12h before the SD | soft drain | Kill running jobs when SD starts |
| **daily** | 12h before the SD | soft drain | Kill running jobs when SD starts |
| **hourly** | 1h before the SD | soft drain | Kill running jobs when SD starts |
| **gpu** | 1h before the SD | soft drain | Kill running jobs when SD starts |

@@ -52,9 +52,8 @@ The following table contains a summary of the draining policies during a Schedul

The table below shows a description for the next Scheduled Downtime:

| From | To | Affected Service/s | Description |
| :------------: | --------- | :----------------------------- | :----------------------------------------------------------------- |
| *04.05.2020 8h* | *04.05.2020 10h* | Merlin Logan nodes | Outage. YFS (AFS) update and reboot |
| *04.0555555 8h* | *04.05.2020 17h* | Merlin5 computing nodes | Outage. O.S. update, OFED drivers update, YFS (AFS) update. |
| From | To | Service | Description |
| ---------------- | ---------------- |:------------:|:---------------------------------------------------------------|
| 06.07.2020 8am | 02.09.2019 6pm | All services | Upgrade: GPFSv5.0.4-2,OFEDv5.0,YFSv0.195,RHEL7.7,Slurmv19.05.7 |

* **Note**: An e-mail will be sent when the login nodes are fully available. An e-mail will be sent when Merlin5 computing nodes are back.
* **Note**: An e-mail will be sent when the services are fully available.
@@ -10,21 +10,22 @@ permalink: /merlin6/past-downtimes.html

## Past Downtimes: Log Changes

### SD: 04.11.2019

*Pending...*
| From | To | Affected Service/s | Description |
| :------------: | --------- | :----------------------------- | :----------------------------------------------------------------- |
| *04.05.2020 8h* | *04.05.2020 10h* | Merlin Login nodes | Outage. YFS (AFS) update v0.194 and reboot |
| *04.05.2020 8h* | *04.05.2020 17h* | Merlin5 computing nodes | Outage. O.S. update, OFED drivers update, YFS (AFS) update. |

### SD: 02.09.2019

| Service | Clusters | Description | Exceptions |
|:------------:|:---------------:|:--------------------------------------------------------------|:-------------------------------------------:|
| GPFS | merlin5,merlin6 | v5.0.2-3 -> v5.0.3-2 | |
| O.S. | merlin5 | RHEL7.4 (rhel-7.4) -> RHEL7.6 (prod-00048) | merlin-g-40, still running RHEL7.4\* |
| O.S. | merlin6 | RHEL7.6 (prod-00030) -> RHEL7.6 (prod-00048) | |
| Infiniband | merlin5 | OFED v4.4 -> v4.6 | merlin-g-40, still running OFED v4.4\* |
| Infiniband | merlin6 | OFED v4.5 -> v4.6 | |
| PModules | merlin5,merlin6 | PModules v1.0.0rc4 -> v1.0.0rc5 | |
| AFS(YFS) | merlin5 | OpenAFS v1.6.22.2-236 -> YFS v188 | merlin-g-40, still running OpenAFS\* |
| AFS(YFS) | merlin6 | YFS v186 -> YFS v188 | |
| O.S. | merlin5 | RHEL7.4 -> RHEL7.6 (prod-00048) | |
| Slurm | merlin5,merlin6 | Slurm v18.08.6 -> v18.08.8 | |
| From | To | Service | Clusters | Description | Exceptions |
| ---------------- | ---------------- |:------------:|:---------------:|:--------------------------------------------------------------|:-------------------------------------------:|
| 02.09.2019 | 02.09.2019 | GPFS | merlin5,merlin6 | v5.0.2-3 -> v5.0.3-2 | |
| 02.09.2019 | 02.09.2019 | O.S. | merlin5 | RHEL7.4 (rhel-7.4) -> RHEL7.6 (prod-00048) | merlin-g-40, still running RHEL7.4\* |
| 02.09.2019 | 02.09.2019 | O.S. | merlin6 | RHEL7.6 (prod-00030) -> RHEL7.6 (prod-00048) | |
| 02.09.2019 | 02.09.2019 | Infiniband | merlin5 | OFED v4.4 -> v4.6 | merlin-g-40, still running OFED v4.4\* |
| 02.09.2019 | 02.09.2019 | Infiniband | merlin6 | OFED v4.5 -> v4.6 | |
| 02.09.2019 | 02.09.2019 | PModules | merlin5,merlin6 | PModules v1.0.0rc4 -> v1.0.0rc5 | |
| 02.09.2019 | 02.09.2019 | AFS(YFS) | merlin5 | OpenAFS v1.6.22.2-236 -> YFS v188 | merlin-g-40, still running OpenAFS\* |
| 02.09.2019 | 02.09.2019 | AFS(YFS) | merlin6 | YFS v186 -> YFS v188 | |
| 02.09.2019 | 02.09.2019 | O.S. | merlin5 | RHEL7.4 -> RHEL7.6 (prod-00048) | |
| 02.09.2019 | 02.09.2019 | Slurm | merlin5,merlin6 | Slurm v18.08.6 -> v18.08.8 | |